Chapter 6: Analysing the Data
Computing the probability
We are now in a position to put an actual probability statement on the obtained sample mean of 9.4. The question we need to answer is whether a deviation of 1.4 in either direction from the null hypothesis is a likely or an unlikely outcome if the null hypothesis is true. One way of answering this question is to first transform the obtained deviation to a test statistic called the t statistic. The t statistic for the problem we are considering is:
$$ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $$
where $\bar{x}$ is the sample mean (9.4 in this case), $\mu_0$ is the population mean under the null hypothesis ($\mu_0$ = 8 in this case), s is the sample standard deviation (s = 8.41 in this case), and n is the sample size (n = 5 in this case). Notice that the denominator of this formula is simply the standard error of the sampling distribution of the sample mean (the S.E.M.).
As you can see, this t statistic simply measures the obtained difference between the sample mean and the null hypothesised population mean (9.4 minus 8) in units of standard errors. Plugging the relevant information into the formula gives t = (9.4 - 8) / (8.41 / √5) = 1.4 / 3.761 = 0.372. So, 9.4 is 0.372 standard error units from the null hypothesised value. Before using this statistic further, let's discuss why we need to do this conversion to a t statistic.
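The arithmetic just described can be sketched in a few lines of Python. This is only an illustration using the values quoted in the text; the variable names are my own and are not part of the course materials.

```python
import math

sample_mean = 9.4   # obtained sample mean
mu_0 = 8.0          # population mean under the null hypothesis
s = 8.41            # sample standard deviation
n = 5               # sample size

# Standard error of the mean: s divided by the square root of n
standard_error = s / math.sqrt(n)

# t statistic: the deviation from the null value, in standard error units
t = (sample_mean - mu_0) / standard_error

print(f"standard error = {standard_error:.3f}")
print(f"t = {t:.3f}")
```

Run as written, this gives a standard error of about 3.761 and a t of about 0.372, in line with the hand calculation.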
We do this conversion because we need to rely on a known sampling distribution in order to compute the probability we need. In the case of testing a hypothesis about a single population mean, we refer to a t distribution whose degrees of freedom equal n-1, or one less than the sample size. Our sample size is 5, so our t statistic is evaluated in reference to the t(4) distribution, or the t distribution with 4 degrees of freedom.
Now that we have expressed our sample mean in terms of standard error units from the null hypothesis, we can answer the following question: What is the probability of getting a sample mean 0.372 standard errors or more from the null hypothesised value if the null hypothesis is true? Because we are doing a two-tailed test, we are interested in 0.372 standard errors in either direction from the null hypothesis (that is, above or below 8). This is the original question we asked, except that it has been rephrased in terms of a t statistic rather than in terms of the original sample mean.
There are two ways of deriving the answer to this question: critical values or p values. Conceptually they are the same. I will start with p values because they are more commonly used and can be derived with a computer, so from now on we will refer only to computer printout.
Our problem is presented graphically in Figure 6.7. The curved line represents the frequency distribution for the t statistic with 4 degrees of freedom. That is, this represents the relative frequencies of the possible t values if the null hypothesis is true. Notice that the t distribution is centred at 0, which is what the t statistic would be if the sample mean exactly equalled the population mean.
Call the total "area" under this curve and above the x-axis "1" to denote the fact that the probability is 1 that you will get a t statistic of some value. The obtained t of 0.372 is marked on this figure. This value of t corresponds to a sample mean of 9.4. The hashed proportion of the area under the t distribution to the right of 0.372 represents the probability of finding a t statistic equal to or greater than 0.372 if the null hypothesis is true, or of finding a mean of 9.4 or greater if the null hypothesis is true. The -0.372 is also of interest because we are doing a two-tailed test. We want to know the probability of getting a t statistic at least as large as 0.372 in absolute value, regardless of the direction of the difference between the sample mean and the null hypothesised value of the population mean. Notice that -0.372 corresponds to a sample mean of 6.6 (you can verify this if you want by plugging 6.6 into the t statistic formula above). The hashed area under the t distribution to the left of -0.372 represents the probability of finding a t statistic equal to or less than -0.372 if the null hypothesis is true, or of finding a sample mean of 6.6 or less if the null hypothesis is true.
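As a check on the claim that a sample mean of 6.6 corresponds to t = -0.372, the substitution can be written out as follows. It uses the same s = 8.41 and n = 5 as before, with the null hypothesised mean of 8 in the numerator:

$$ t = \frac{6.6 - 8}{8.41 / \sqrt{5}} = \frac{-1.4}{3.761} \approx -0.372 $$

The numerator is the same 1.4 as before with its sign reversed, which is why the two marked t values are mirror images of each other around 0.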
Figure 6.7 The t-distribution with df = 4 and with t = +/- .372 marked.
What we want to derive is the sum of these two hashed proportions. The sum of these proportions is the probability of getting a difference of at least 1.4 (in either direction) between the sample mean and the population mean if the null hypothesis is true. The SPSS printout from this problem is displayed in Output 6.3. SPSS computes this probability as .729, so the p value of the obtained result is .729 (or 72.9%). In other words, the probability of finding a difference of at least 1.4 between the sample mean and the null hypothesised value of the population mean is .729 if the null hypothesis is true.
You can see that SPSS prints out all the information we could have computed by hand, including the sample mean (Mean), sample standard deviation, and sample size. The standard error of the sample mean (Std. of Mean), t statistic (under "t"), degrees of freedom (df) and the all-important p value are also printed. The p value is listed under "Sig. (2-tailed)". In the present example, the probability of observing a difference as large as the one we have observed (or greater), given that the null hypothesis is true, is equal to .729. Thus, what we have observed is quite likely to occur if the null hypothesis is true, and so we have no grounds on which to reject H0 as a plausible explanation of our data!
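For readers without SPSS, the two-tailed p value can be approximated with a short Python sketch that uses only the standard library. This is not how SPSS computes the value; it simply integrates the t density numerically with Simpson's rule and adds up the two tails. The function names t_density and two_tailed_p are illustrative choices of my own.

```python
import math

def t_density(x, df):
    """Probability density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value: area beyond +|t| plus area beyond -|t|.

    The area between 0 and |t| is found with Simpson's rule (steps must be
    even). By symmetry, the two tails together hold 1 minus twice that area.
    """
    upper = abs(t)
    h = upper / steps
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_density(i * h, df)
    area_zero_to_t = total * h / 3
    return 1 - 2 * area_zero_to_t

p = two_tailed_p(0.372, df=4)
print(f"p = {p:.3f}")
```

With t = 0.372 and 4 degrees of freedom this gives a p value of about .729, which agrees with the "Sig. (2-tailed)" entry discussed above.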
Output 6.3 Compare means ... One-Sample T Test
First output is the result of taking a sample of size 5 from the population specified in Figure 6.6 (also see notes) and comparing it to the national average of 8.
You can use the PROB20 program that was included with the floppy disk that was sent to you to find p-values for t. It will have installed to C:\202progs\prob20\Prob.exe, and you can use File Manager (in Windows 3.1) or Explorer (in Win95) to locate it. Double-clicking will bring up a DOS window with PROB20 running. Use the space bar to shift between distributions (Z, t, etc.) and tab to shift between windows. For the preceding example, we see
Figure 6.8 The results of using the PROB20 program. A t distribution with 4 degrees of freedom and a t-value of .37 gives the probability of that t-value or larger occurring. Note how this program gives you the upper tail only. If you multiply .3650701 by 2 (for the two-tailed probability on the test above), you obtain .7301, which agrees with the "Sig. (2-tailed)" value of .729 from SPSS to within rounding (PROB20 was given t = .37 rather than .372).
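The upper-tail figure can also be cross-checked without PROB20. For the special case of 4 degrees of freedom, the t distribution has a closed-form cumulative distribution function, and the sketch below uses that standard formula. It says nothing about how PROB20 itself works internally, and the function name t4_upper_tail is an illustrative choice.

```python
import math

def t4_upper_tail(t):
    """Upper-tail area P(T >= t) for the t distribution with 4 degrees of freedom."""
    u = 1 + t * t / 4
    # Closed-form CDF, valid only when df = 4
    cdf = 0.5 + (3 / 8) * (t / math.sqrt(u)) * (1 - (t * t / u) / 12)
    return 1 - cdf

upper = t4_upper_tail(0.37)
print(f"upper tail = {upper:.5f}")
print(f"two-tailed = {2 * upper:.4f}")
```

For t = .37 the upper tail comes out at about .36507, and doubling it gives about .7301, consistent with the figures quoted in the caption above.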
© Copyright 2000 University of New England, Armidale, NSW, 2351. All rights reserved
Maintained by Dr Ian Price