Lecture 5: T-Test

This lecture covers:
- Normal distribution
- One-sample t-test
- Paired-sample t-test
- Two-sample t-test

Review of the Normal Distribution

- A distribution is a collection of scores (values) on a variable, arranged in order from lowest to highest value on the horizontal (X) axis, and in terms of frequency on the vertical (Y) axis. A normal distribution, sometimes referred to as a bell curve, has a distribution that forms the shape of a bell. All you need to know to plot a normal distribution is the mean and standard deviation of the data.
- A normal distribution with a mean of μ and a standard deviation of σ is denoted N(μ, σ). If a set of scores has a distribution of N(15, 2), we would say it is a normal distribution with a mean of 15 and a standard deviation of 2. Normal distributions do not all look alike; their shape depends on the values of the mean and standard deviation. For a given mean, a normal distribution may be tall and thin (if σ is small) or short and flat (if σ is large). See the figures in the next slide.

Major characteristics

- It is symmetrical, meaning that the upper and lower halves of the distribution of scores are mirror images of each other.
- Second, it is unimodal: the mean, median, and mode all fall in the same place, at the center of the distribution (the top of the bell curve); the distribution is thus highest in the middle and curves downward toward the two tails.
- Third, it is asymptotic, meaning that the upper and lower tails of the distribution never actually touch the baseline, known as the X-axis.

- While normal distributions are not all the same, they share an important characteristic: a given standard deviation from the mean always "cuts off" the same proportion or percentage of scores in all normal distributions.
Specifically, one standard deviation above and below the mean includes about 68% of the scores; two (actually, about 1.96) standard deviations above and below the mean include 95 percent; and three include about 99.7 percent of the scores. These percentages are worth committing to memory: 68-95-99.7.

Three kinds of distributions

There are three kinds of distributions that we need to distinguish:
- Population distribution: the distribution of scores in a population; for example, the distribution of height scores for everyone in a country.
- Distribution of a sample: the distribution of scores in a sample; for example, the height scores of the students in this class.
- Sampling distribution: the distribution of some statistic (e.g., the mean) across all possible samples.

The following figure presents a schematic depiction of these three distributions. It is obviously impossible to actually draw all possible samples.

- The sampling distribution of the mean for a random sample has extremely important properties. As the sample size n increases, the sampling distribution of the mean more and more closely resembles a normal distribution.
- Statisticians refer to this tendency as the central limit theorem, one of the most important ideas in statistics.
- In fact, the sampling distribution of the mean approximates a normal distribution fairly closely for sample sizes of 30 or more. This is true regardless of the shape of the variable's distribution in the population. Thus, even if a variable is not normally distributed in the population, the mean of all possible sample means of this variable is the same as the population mean, μ.

Inference for the Mean of a Population

- Confidence intervals and tests of significance for the mean of a normal population are based on the sample mean x̄.
The sampling distribution of x̄ has μ as its mean. That is, x̄ is an unbiased estimator of the unknown μ. The spread of x̄ depends on the sample size and also on the population standard deviation σ.

Assumptions for inference about a mean

- Our data are a simple random sample (SRS) of size n from the population. This assumption is very important.
- Observations from the population have a normal distribution with mean μ and standard deviation σ. In practice, it is enough that the distribution be symmetric and single-peaked unless the sample is very small. Both μ and σ are unknown parameters.

In this setting, the sample mean x̄ has the normal distribution with mean μ and standard deviation σ/√n. Because we do not know σ, we estimate it by the sample standard deviation s. We then estimate the standard deviation of x̄ by s/√n. This quantity is called the standard error of the sample mean x̄.

When we know the value of σ, we base confidence intervals and tests for μ on the one-sample z statistic

    z = (x̄ − μ) / (σ/√n)

This z statistic has the standard normal distribution N(0, 1). When we do not know σ, we substitute the standard error s/√n of x̄ for its standard deviation σ/√n. The statistic that results does not have a normal distribution. It has a distribution that is new to us, called a t distribution.

The one-sample t statistic and the t distributions

- Draw an SRS of size n from a population that has the normal distribution with mean μ and standard deviation σ. The one-sample t statistic

    t = (x̄ − μ) / (s/√n)

has the t distribution with n − 1 degrees of freedom.

- The t statistic has the same interpretation as any standardized statistic: it says how far x̄ is from its mean μ in standard deviation units.
- The degrees of freedom for the one-sample t statistic come from the sample standard deviation s in the denominator of t. We know that s has n − 1 degrees of freedom. We will write the t distribution with k degrees of freedom as t(k) for short.
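The sample mean, standard deviation, standard error, and t statistic are each one line in code. A minimal sketch using only Python's standard library (the five data values and the hypothesized mean μ₀ = 13.0 are invented for illustration; `statistics.stdev` uses the n − 1 divisor, matching s):

```python
import statistics
from math import sqrt

sample = [12.1, 14.3, 11.8, 13.5, 12.9]  # hypothetical data
n = len(sample)
xbar = statistics.fmean(sample)          # sample mean
s = statistics.stdev(sample)             # sample standard deviation (n - 1 divisor)
se = s / sqrt(n)                         # standard error of the sample mean
t = (xbar - 13.0) / se                   # one-sample t for a hypothesized mu0 = 13.0
```

The standard error shrinks as n grows, so the same distance between x̄ and μ₀ yields a larger |t| with more data.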
Figure 1: Density curves for the t distributions with 2 and 9 degrees of freedom and the standard normal distribution.

The figure illustrates these facts about the t distributions:
- The density curves of the t distributions are similar in shape to the standard normal curve. They are symmetric about zero, single-peaked, and bell-shaped.
- The spread of the t distributions is a bit greater than that of the standard normal distribution. The t distributions have more probability in the tails and less in the center than does the standard normal. This is true because substituting the estimate s for the fixed parameter σ introduces more variation into the statistic.
- As the degrees of freedom k increase, the t(k) density curve approaches the N(0, 1) curve ever more closely. This happens because s estimates σ more accurately as the sample size increases, so using s in place of σ causes little extra variation when the sample is large.

Table C gives critical values for the t distributions. Each row in the table contains critical values for one of the t distributions; the degrees of freedom appear at the left of the row. By looking down any column, you can check that the t critical values approach the normal values as the degrees of freedom increase.

Table C: t distribution critical values

The t confidence intervals and tests

- To analyze samples from normal populations with unknown σ, just replace the standard deviation σ/√n of x̄ by its standard error s/√n in the z procedures. The z procedures then become one-sample t procedures. Use P-values or critical values from the t distribution with n − 1 degrees of freedom in place of the normal values.

The one-sample t procedures

- Draw an SRS of size n from a population having unknown mean μ. A level C confidence interval for μ is

    x̄ ± t · s/√n

where t is the upper (1 − C)/2 critical value for the t(n − 1) distribution.
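The interval x̄ ± t·s/√n is easy to wrap in a small helper. A sketch with invented data; note that the critical value t must still be looked up from Table C or software (2.776 below is the 95% critical value for t(4), since the hypothetical sample has n − 1 = 4 degrees of freedom):

```python
import statistics
from math import sqrt

def t_confidence_interval(data, t_crit):
    """One-sample t confidence interval: xbar +/- t_crit * s / sqrt(n)."""
    n = len(data)
    xbar = statistics.fmean(data)
    margin = t_crit * statistics.stdev(data) / sqrt(n)
    return xbar - margin, xbar + margin

# hypothetical sample of size 5; 2.776 is Table C's 95% entry for t(4)
low, high = t_confidence_interval([9.8, 10.4, 10.1, 9.6, 10.6], t_crit=2.776)
```

The interval is always centered at the sample mean; only the margin of error depends on the confidence level.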
- To test the hypothesis H₀: μ = μ₀ based on an SRS of size n, compute the one-sample t statistic

    t = (x̄ − μ₀) / (s/√n)

- In terms of a variable T having the t(n − 1) distribution, the P-value for a test of H₀ against
    H₁: μ > μ₀  is  P(T ≥ t)
    H₁: μ < μ₀  is  P(T ≤ t)
    H₁: μ ≠ μ₀  is  2·P(T ≥ |t|)

Example 1

- An SRS of 5 objects: 55.95  68.24  52.73  21.50  23.78
  To calculate a 95% confidence interval, first calculate x̄ = 44.44 and s = 20.741. The degrees of freedom are n − 1 = 4. From Table C we find that for 95% confidence, t = 2.776.
- The confidence interval is

    x̄ ± t·s/√n = 44.44 ± 2.776 × 20.741/√5 = 44.44 ± 25.75 = 18.69 to 70.19

  The large margin of error is due to the small sample size and the rather large variation among the observations, reflected in the large value of s.
- The one-sample t confidence interval has the form

    estimate ± t × SE(estimate)

  where "SE" stands for "standard error."

Example 2: Sweetness loss

Sweetness losses from 10 tasters: 2.0  0.4  0.7  2.0  -0.4  2.2  -1.3  1.2  1.1  2.3
Are these data good evidence that the drink lost sweetness?
- Step 1: Hypotheses. The null hypothesis is "no loss," and the alternative hypothesis says "there is a loss":

    H₀: μ = 0    H₁: μ > 0

- Step 2: Test statistic. The basic statistics are x̄ = 1.02 and s = 1.196. The one-sample t test statistic is

    t = (x̄ − μ₀) / (s/√n) = (1.02 − 0) / (1.196/√10) = 2.70

- Step 3: P-value. The P-value for t = 2.70 is the area to the right of 2.70 under the t distribution curve with degrees of freedom n − 1 = 9. Figure 2 shows this area.
- Table C shows that the P-value lies between 0.01 and 0.02. Using SPSS, we get the exact result P = 0.0122. There is quite strong evidence for a loss of sweetness.

Matched pairs t procedures

- The taste test in Example 2 is in fact a matched pairs study in which the same 10 tasters rated before-and-after sweetness.
- To compare the responses to the two treatments in a matched pairs design, apply the one-sample t procedures to the observed differences.
- The parameter μ in a matched pairs t procedure is the mean difference in the responses to the two treatments within matched pairs of subjects in the entire population.

Example 3

Claim: listening to Mozart improves students' performance on tests.

Subject  Group A  Group B  Difference     Subject  Group A  Group B  Difference
   1      30.60    37.97      -7.37          12     58.93    83.50     -24.57
   2      48.43    51.57      -3.14          13     54.47    38.30      16.17
   3      60.77    56.67       4.10          14     43.53    51.37      -7.84
   4      36.07    40.47      -4.40          15     37.93    29.33       8.60
   5      68.47    49.00      19.47          16     43.50    54.27     -10.77
   6      32.43    43.23     -10.80          17     87.70    62.73      24.97
   7      43.70    44.57      -0.87          18     53.53    58.00      -4.47
   8      37.10    28.40       8.70          19     64.30    52.40      11.90
   9      31.17    28.23       2.94          20     47.37    53.63      -6.26
  10      51.23    68.47     -17.24          21     53.67    47.00       6.67
  11      65.40    51.10      14.30

To analyze these data, subtract the Group B score from the Group A score for each subject.
- Step 1: Hypotheses. To assess whether listening to Mozart significantly improved performance, we test

    H₀: μ = 0    H₁: μ > 0

  Here μ is the mean difference in the population from which the subjects were drawn. The null hypothesis says that no improvement occurs, while H₁ says that there is an improvement.
- Step 2: Test statistic. The 21 differences have x̄ = 0.9567 and s = 12.5479. The one-sample t statistic is therefore

    t = (x̄ − 0) / (s/√n) = 0.9567 / (12.5479/√21) = 0.349

- Step 3: P-value. Find the P-value from the t(20) distribution. Table C shows that 0.349 is less than the 0.25 critical value, so the P-value is greater than 0.25. SPSS gives the value P = 0.3652.
- Conclusion. The data do not support the claim that listening to Mozart improves performance. The average improvement is small, just 0.96. This small improvement is not statistically significant at even the 25% level.
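Examples 2 and 3 both reduce to a one-sample t test on a single list of numbers: the sweetness losses in one case, the Group A minus Group B differences in the other. A sketch reproducing the slides' t values (Python is used here for illustration; the course's own output comes from SPSS):

```python
import statistics
from math import sqrt

def one_sample_t(data, mu0=0.0):
    """One-sample t statistic for H0: mu = mu0."""
    n = len(data)
    se = statistics.stdev(data) / sqrt(n)   # standard error of the mean
    return (statistics.fmean(data) - mu0) / se

# Example 2: sweetness losses from 10 tasters
losses = [2.0, 0.4, 0.7, 2.0, -0.4, 2.2, -1.3, 1.2, 1.1, 2.3]
t_sweetness = one_sample_t(losses)          # about 2.70, df = 9

# Example 3: matched pairs, so test the A - B differences
group_a = [30.60, 48.43, 60.77, 36.07, 68.47, 32.43, 43.70, 37.10, 31.17, 51.23,
           65.40, 58.93, 54.47, 43.53, 37.93, 43.50, 87.70, 53.53, 64.30, 47.37, 53.67]
group_b = [37.97, 51.57, 56.67, 40.47, 49.00, 43.23, 44.57, 28.40, 28.23, 68.47,
           51.10, 83.50, 38.30, 51.37, 29.33, 54.27, 62.73, 58.00, 52.40, 53.63, 47.00]
diffs = [a - b for a, b in zip(group_a, group_b)]
t_mozart = one_sample_t(diffs)              # about 0.349, df = 20
```

Note that the matched pairs test never sees the raw A and B columns, only their differences, which is exactly the point of the design.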
One-sample t test

Paired-sample t test

Using the t procedures

- Except in the case of small samples, the assumption that the data are an SRS from the population of interest is more important than the assumption that the population distribution is normal.
- Sample size less than 15: use t procedures if the data are close to normal. If the data are clearly nonnormal or if outliers are present, do not use t.
- Sample size at least 15: the t procedures can be used except in the presence of outliers or strong skewness.
- Large samples: the t procedures can be used even for clearly skewed distributions when the sample is large, roughly n ≥ 40.

Comparing two means

- The goal of inference is to compare the responses to two treatments or to compare the characteristics of two populations.
- We have a separate sample from each treatment or each population.
- Unlike matched pairs designs, there is no matching of the units in the two samples, and the two samples can be of different sizes. Inference procedures for two-sample data differ from those for matched pairs.

Assumptions for comparing two means

- We have two SRSs, from two distinct populations. The samples are independent; that is, one sample has no influence on the other. Matching violates independence, for example. We measure the same variable for both samples.
- Both populations are normally distributed. The means and standard deviations of the populations are unknown.

Notation

- Call the variable we measure x₁ in the first population and x₂ in the second, because the variable may have different distributions in the two populations. Notation used to describe the two populations:

    Population   Variable   Mean   Standard deviation
        1           x₁       μ₁           σ₁
        2           x₂       μ₂           σ₂

- There are 4 unknown parameters: the two means and the two standard deviations.
We want to compare the two population means, either by giving a confidence interval for their difference μ₁ − μ₂ or by testing the hypothesis of no difference, H₀: μ₁ = μ₂.
- We use the sample means and standard deviations to estimate the unknown parameters. Notation used to describe the samples:

    Population   Sample size   Sample mean   Sample standard deviation
        1             n₁            x̄₁                 s₁
        2             n₂            x̄₂                 s₂

To do inference about the difference μ₁ − μ₂ between the means of the two populations, we start from the difference x̄₁ − x̄₂ between the means of the two samples.

Two-sample t procedures

- Standardize the observed difference x̄₁ − x̄₂ by dividing by its standard deviation, which is

    √(σ₁²/n₁ + σ₂²/n₂)

- Because we do not know the population standard deviations, we estimate them by the sample standard deviations from our two samples. The result is the standard error, or estimated standard deviation, of the difference in sample means:

    SE = √(s₁²/n₁ + s₂²/n₂)

- When we standardize the estimate by dividing it by its standard error, the result is the two-sample t statistic:

    t = (x̄₁ − x̄₂) / SE

  The statistic t has the same interpretation as any z or t statistic: it says how far x̄₁ − x̄₂ is from 0 in standard deviation units.

The two-sample t procedures

- Draw an SRS of size n₁ from a normal population with unknown mean μ₁, and draw an independent SRS of size n₂ from another normal population with unknown mean μ₂. The confidence interval for μ₁ − μ₂ given by

    (x̄₁ − x̄₂) ± t·√(s₁²/n₁ + s₂²/n₂)

  has confidence level at least C. Here t is the upper (1 − C)/2 critical value for the t(k) distribution, with k the smaller of n₁ − 1 and n₂ − 1.
- To test the hypothesis H₀: μ₁ = μ₂, compute the two-sample t statistic

    t = (x̄₁ − x̄₂) / √(s₁²/n₁ + s₂²/n₂)

  and use P-values or critical values for the t(k) distribution. The true P-value or fixed significance level will always be equal to or less than the value calculated from t(k).
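The two-sample statistic and the conservative degrees of freedom can be packaged as a small function. A sketch with invented data (the function name and the two samples are my own, for illustration only):

```python
import statistics
from math import sqrt

def two_sample_t(x1, x2):
    """Two-sample t statistic with the conservative df = min(n1, n2) - 1."""
    n1, n2 = len(x1), len(x2)
    se = sqrt(statistics.variance(x1) / n1 + statistics.variance(x2) / n2)
    t = (statistics.fmean(x1) - statistics.fmean(x2)) / se
    df = min(n1, n2) - 1
    return t, df

# hypothetical samples of sizes 4 and 5
t, df = two_sample_t([5.1, 4.8, 5.5, 5.0], [4.2, 4.5, 4.1, 4.4, 4.3])
```

Using the smaller of n₁ − 1 and n₂ − 1 is what makes the procedure conservative: the reported P-value can only overstate, never understate, the true one.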
- The two-sample t confidence interval again has the form

    estimate ± t × SE(estimate)

- These two-sample t procedures always err on the safe side, reporting higher P-values and lower confidence than are actually true. The gap between what is reported and the truth is quite small unless the sample sizes are both small and unequal. As the sample sizes increase, probability values based on t with degrees of freedom equal to the smaller of n₁ − 1 and n₂ − 1 become more accurate.

Example 4

- Two SRSs of primary school children aged 7: 5 boys and 5 girls. Their heights (cm):

    Girls:  129  120  126  126  118
    Boys:   110  140  110   98  124

  From the data, calculate the summary statistics:

    Group      n     x̄        s
    1 Girls    5   123.80    4.60
    2 Boys     5   116.40   16.09

- The observed difference in mean heights is

    x̄₁ − x̄₂ = 123.80 − 116.40 = 7.4 cm

  Is this good evidence that girls are taller than boys at age 7? The hypotheses are

    H₀: μ₁ = μ₂    H₁: μ₁ > μ₂

- The test statistic is

    t = (x̄₁ − x̄₂) / √(s₁²/n₁ + s₂²/n₂) = (123.8 − 116.4) / √(4.60²/5 + 16.09²/5) = 7.4 / 7.484 = 0.989

- There are 4 degrees of freedom, the smaller of n₁ − 1 = 4 and n₂ − 1 = 4. Because H₁ is one-sided on the high side, the P-value is the area to the right of t = 0.989 under the t(4) curve. Figure 3 illustrates this P-value.
- Table C (df = 4) shows that the P-value lies between 0.15 and 0.20:

    t    0.941   1.190
    P    0.20    0.15

  SPSS tells us that the actual value is P = 0.189. The survey did not find convincing evidence that girls are taller than boys when they are 7 years old.
- For a 95% confidence interval, Table C shows that the t(4) critical value is t = 2.776. We are 95% confident that the mean height difference μ₁ − μ₂ lies in the interval

    (x̄₁ − x̄₂) ± t·√(s₁²/n₁ + s₂²/n₂) = (123.8 − 116.4) ± 2.776 × √(4.60²/5 + 16.09²/5)
                                       = 7.40 ± 20.78 = −13.38 to 28.18

- That the 95% confidence interval covers 0 tells us that we cannot reject H₀: μ₁ = μ₂ against the two-sided alternative at the α = 0.05 level of significance.
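Example 4's test statistic and 95% confidence interval can be reproduced in a few lines (2.776 is the t(4) critical value from Table C; tiny differences from the slides' figures arise only because the slides round s to two decimals):

```python
import statistics
from math import sqrt

girls = [129, 120, 126, 126, 118]
boys = [110, 140, 110, 98, 124]

diff = statistics.fmean(girls) - statistics.fmean(boys)   # 7.4 cm
se = sqrt(statistics.variance(girls) / 5 +
          statistics.variance(boys) / 5)                  # about 7.48
t = diff / se                                             # about 0.989
margin = 2.776 * se                                       # about 20.78
ci = (diff - margin, diff + margin)                       # covers 0
```

Because the interval covers 0, the two-sided test at α = 0.05 fails to reject H₀, in agreement with the one-sided P = 0.189.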
More accurate levels in the t procedures

- The two-sample t statistic does not have a t distribution. Moreover, the exact distribution changes as the unknown population standard deviations σ₁ and σ₂ change. However, an excellent approximation is available.

Approximate distribution of the two-sample t statistic

- The distribution of the two-sample t statistic is close to the t distribution with degrees of freedom given by

    df = (s₁²/n₁ + s₂²/n₂)² / [ (1/(n₁−1))·(s₁²/n₁)² + (1/(n₂−1))·(s₂²/n₂)² ]

- This approximation is quite accurate when both sample sizes n₁ and n₂ are 5 or larger.
- Back to Example 4: for improved accuracy, we can use critical points from the t distribution with df given by

    df = (4.60²/5 + 16.09²/5)² / [ (1/4)·(4.60²/5)² + (1/4)·(16.09²/5)² ] = 3137.08 / 674.71 = 4.65

- The two-sample t procedures are exactly as before, except that we use a t distribution with more degrees of freedom. The number df is always at least as large as the smaller of n₁ − 1 and n₂ − 1. On the other hand, df is never larger than n₁ + n₂ − 2, the sum of the two individual degrees of freedom. The df is generally not a whole number.
- SPSS reports the results of two t procedures: the general two-sample procedure (unequal variances) and a special procedure that assumes the two populations have the same standard deviation. When Levene's test is significant (P < 0.05), we reject the hypothesis that the variances are equal and use the t test that does not assume equal variances.
- For Example 4, the degrees of freedom are df = 4.65. The two-sided P-value from the t(4.65) distribution is 0.371. We divide by 2 to find that the one-sided result is P = 0.186.

Example 5

- Using the 1997 survey data, we are going to test whether or not there is a statistically significant difference in the number of children ever born to women of Han and of minority nationality.

    H₀: μ₁ = μ₂  (there is no significant difference)
    H₁: μ₁ ≠ μ₂  (there is a significant difference)
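The approximate degrees-of-freedom formula used for Example 4 (commonly known as the Welch-Satterthwaite approximation) is fiddly by hand but short in code; plugging in Example 4's summary statistics reproduces df ≈ 4.65:

```python
def approx_df(s1, n1, s2, n2):
    """Approximate df for the two-sample t statistic (Welch-Satterthwaite)."""
    v1 = s1 ** 2 / n1
    v2 = s2 ** 2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Example 4: girls s = 4.60, boys s = 16.09, n = 5 in each group
df = approx_df(4.60, 5, 16.09, 5)   # about 4.65
```

As the slides state, the result always lies between the conservative choice min(n₁, n₂) − 1 and the pooled n₁ + n₂ − 2, here between 4 and 8.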