Lab 20

Independent samples t-tests

Hypothesis tests analyzed with independent samples t-tests

Let's start with a brief review. In the last several labs we looked at ways to test samples.

    We used z-scores to test whether a sample was probably from a population for which we knew the population mean μ and the population standard deviation σ.

    We used one-sample t-tests to examine the same situation, except that we didn't know the population σ, so we needed to use an estimate (the sample standard deviation).

    The last lab used a different computational formula to calculate the observed t for two more situations:

    • repeated measures, in which there is one sample, but each individual is tested twice.
    • matched pairs, in which there are two samples, but they are related on a subject by subject basis.

The logic of today's lab should seem similar to that of the last several labs. The overall logic is the same: we still use the t-distribution to find our critical values. However, things get a little more complicated because of the formulas we'll use. Now we are going to look at a situation in which we are interested in the potential difference between two independent populations. Again, we'll deal with situations in which we don't know μ or σ for either population, so we'll have to use estimates.

An experiment that uses a separate independent sample for each treatment condition (or each population) is called an independent-measures research design. Often you'll also see it referred to as a between-subjects or between-groups design.

So we'll use the same logic and steps for hypothesis testing that we used in the previous labs, and fill in the details of the differences as we go.

    Step 1: State your H0 and H1 and choose your criterion: α
    Step 2: Compute the observed t for your samples
    Step 3: Compare observed t to critical t (or p to α) and make a decision

Let's start with Step 1:

    Choosing your criterion is exactly the same process as before: you pick the level of alpha (the chance of making a Type I error) that your field accepts. For our example, let's assume α = 0.05.

    The hypotheses are going to be a bit different, because the situation is different. Remember that we are now making hypotheses about two different populations, not just comparing a treatment to what is known.

    For example, suppose that you want to compare two different treatments (e.g., two ways of studying, two different drugs, etc), or you want to compare two groups of people (e.g., men vs. women, young vs. old, etc.). So now, the hypotheses are about population A (men) and population B (women), and how they are different from one another.

      Suppose that we are interested in how tall men and women are.

      Is this going to be a one-tailed test or a two-tailed test? In this case, we'll conduct a two-tailed test. We won't make a directional prediction.

      So the H0 hypothesis would be that men and women are the same height. That is,

        H0: μMen = μWomen

        - or -

        H0: μMen - μWomen = 0

      Our alternative hypothesis would be that men and women have different mean heights. That is,
        H1: μMen ≠ μWomen

        - or -

        H1: μMen - μWomen ≠ 0

      Note: What might the hypotheses be for a one-tailed test? Suppose we predict that men are taller than women.

        H0: μMen ≤ μWomen
        H1: μMen > μWomen
Step 2:

    We are going to be using two samples, one to represent each population.

    This is a good time to look at some sample data:

      Men's heights: 67, 73, 74, 70, 70, 75, 73, 68, 69
      Women's heights: 69, 63, 67, 64, 61, 66, 60, 63, 63

        Any guesses as to how we'll compute our df?

        Think about it this way: with one sample we used n - 1, because once we know the value of the sample mean, all of the values in the sample but one are free to vary.

        Now consider the current situation. We've got two samples. How many values are free to vary?

          sample 1: nMen - 1
          sample 2: nWomen - 1
          so together there are nMen + nWomen - 2 = df

        We need to know how many individuals we have in our samples.

        nMen = 9 and nWomen = 9

        So the df for our example is: nMen + nWomen - 2 = 9 + 9 - 2 = 16

        Remember that because we're using samples, we can only estimate the values of the population parameters, so we need to take degrees of freedom into account.

      So what is our critical t for a 2-tailed test?

      Critical t = TINV(2α/tails, df) = TINV(2 * .05 / 2, 16) = TINV(.05, 16) = 2.12
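As a rough cross-check on the Excel result, the sketch below recomputes the degrees of freedom from the two height samples and approximates the two-tailed critical t using only Python's standard library. The function names (t_pdf, area_zero_to, critical_t_two_tailed) are made up for this illustration, and the numerical method (Simpson's rule plus bisection) is an assumption chosen to avoid extra packages. In practice a spreadsheet or statistics package would do this step for you.

```python
import math

men = [67, 73, 74, 70, 70, 75, 73, 68, 69]
women = [69, 63, 67, 64, 61, 66, 60, 63, 63]

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def area_zero_to(x, df, steps=2000):
    """Area under the t density between 0 and x (Simpson's rule; steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return total * h / 3

def critical_t_two_tailed(alpha, df):
    """Positive t that leaves alpha / 2 in each tail, found by bisection."""
    target = (1 - alpha) / 2  # area between 0 and the critical value
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if area_zero_to(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# One value in each sample is not free to vary, so df = (n - 1) + (n - 1).
df = (len(men) - 1) + (len(women) - 1)
critical_t = critical_t_two_tailed(0.05, df)
print(df, round(critical_t, 2))
```

With α = 0.05 this should agree with the TINV value above to two decimal places.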


Now comes the big difference. We need to compute our observed t statistic. At the conceptual level, the formula is the same. At the practical level, however, it is much more complex because we have two samples, which means that we have two estimates. Let's break the formula below into several parts.

$$t = \frac{(\bar{X}_A - \bar{X}_B) - (\mu_A - \mu_B)}{s_{\bar{X}_A - \bar{X}_B}}$$


In other words, we're interested in the difference between the two populations, so to compute the t statistic we need to see whether the difference between our two sample means differs from the difference between the two population means that H0 specifies.

So the numerator is straightforward:
$\bar{X}_A - \bar{X}_B$ is the difference between the sample means.

$\mu_A - \mu_B$ is the difference between the population means. Since H0 says the population means are the same, this difference is always 0 under H0.

The denominator is where things become complex:
$s_{\bar{X}_A - \bar{X}_B}$ is the estimated standard error of the difference between sample means. That is, when we estimate the difference in population means using two sample means, our estimate will typically be off somewhat from the true population difference. This error, "on average," is the standard error of the difference between sample means.

The formula is going to be a little tricky, but it is based on the same idea as other standard errors. Let's rewrite the one-sample estimated standard error formula so that both the numerator and the denominator are under the square root, like this:

$$s_{\bar{X}} = \frac{s}{\sqrt{n}} = \sqrt{\frac{s^2}{n}}$$

We see that there is an estimated variance ($s^2$) in the numerator and a sample size n in the denominator. Eventually, we'll do roughly the same thing with the independent samples t-test, but it will require several steps to get there.

In order to calculate the standard error, we are going to need the formulas for the degrees of freedom first.

dfA and dfB are simply the sample sizes of groups A and B minus 1.

The total degrees of freedom is df = dfA + dfB = (nA - 1) + (nB - 1) = nA + nB - 2

Second, we need a sort of average variance in the two samples. We call this "pooled variance." It is a weighted average of both samples' variances. Here is the formula for pooled variance, where SS is a sample's sum of squared deviations from its own mean:

$$s_p^2 = \frac{SS_A + SS_B}{df_A + df_B}$$

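An alternate formula for pooled variance that makes it easier to calculate from SPSS output, which reports each sample's standard deviation s, is:

$$s_p^2 = \frac{df_A \, s_A^2 + df_B \, s_B^2}{df_A + df_B}$$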

The pooled variance is just a step on the way to calculating the estimated standard error of the difference between sample means. Here is the formula for that:

$$s_{\bar{X}_A - \bar{X}_B} = \sqrt{\frac{s_p^2}{n_A} + \frac{s_p^2}{n_B}}$$

You can see that there is a sort of variance in each numerator and a sample size in each denominator. So although the formula looks very weird, at its core it is very much like the formula we've already seen.
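To make the steps concrete, here is a minimal Python sketch of the pooled variance, the estimated standard error, and the observed t, applied to the height data from Step 2. The function names are hypothetical (chosen for this illustration), and the handout itself does not work this example by hand, so treat the printed values as a check you can reproduce rather than as part of the original lab.

```python
import math
from statistics import mean, variance

def pooled_variance(a, b):
    """Weighted average of the two sample variances (weights are the dfs)."""
    df_a, df_b = len(a) - 1, len(b) - 1
    return (df_a * variance(a) + df_b * variance(b)) / (df_a + df_b)

def standard_error_of_difference(a, b):
    """Estimated standard error of the difference between two sample means."""
    sp2 = pooled_variance(a, b)
    return math.sqrt(sp2 / len(a) + sp2 / len(b))

def observed_t(a, b):
    """Observed t; H0 says the population difference is 0, so nothing is subtracted."""
    return (mean(a) - mean(b)) / standard_error_of_difference(a, b)

men = [67, 73, 74, 70, 70, 75, 73, 68, 69]
women = [69, 63, 67, 64, 61, 66, 60, 63, 63]

print(pooled_variance(men, women))        # 8.125
print(round(observed_t(men, women), 2))   # 5.21
```

Because 5.21 is larger than the two-tailed critical t of 2.12 for df = 16, these samples would lead us to reject H0.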

Example:

Suppose we have Groups 1 and 2 with 5 and 4 scores, respectively.

Person Score Group
1 23 1
2 35 1
3 45 1
4 33 1
5 22 1
6 16 2
7 22 2
8 14 2
9 18 2

If we calculate the means, standard deviations, and degrees of freedom, we get the following:

        Group 1   Group 2
Mean    31.6      17.5
s       9.476     3.416
df      4         3



So pooled variance = (4*9.476*9.476 + 3*3.416*3.416) / (4 + 3) = 56.31
and the estimated standard error = sqrt(56.31 / 5 + 56.31 / 4) = 5.03


Observed t = (31.6 - 17.5) / 5.03 = 2.80

Total degrees of freedom = 4 + 3 = 7
Using Excel, we see that a 2-tailed critical t = TINV(2 * 0.05 / 2, 7) = 2.36

Finally, we see that the observed t of 2.80 is larger than the critical t of 2.36 and thus we reject the null hypothesis.
In ordinary language, we would conclude that Group 1 appears to come from a population with a mean that is significantly larger than that of Group 2.
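The worked example can be checked with a short Python sketch. It computes the pooled variance both ways (from sums of squares and from the standard deviations) to show that the two forms agree, then repeats the remaining steps. The helper name sum_of_squares is made up for this illustration, and the critical t is simply copied from the text rather than computed.

```python
import math
from statistics import mean, stdev

group1 = [23, 35, 45, 33, 22]
group2 = [16, 22, 14, 18]

def sum_of_squares(scores):
    """Sum of squared deviations from the sample mean."""
    m = mean(scores)
    return sum((x - m) ** 2 for x in scores)

df1, df2 = len(group1) - 1, len(group2) - 1

# Pooled variance from sums of squares.
pooled_ss = (sum_of_squares(group1) + sum_of_squares(group2)) / (df1 + df2)
# Pooled variance from standard deviations (the form used with SPSS output).
pooled_sd = (df1 * stdev(group1) ** 2 + df2 * stdev(group2) ** 2) / (df1 + df2)

se = math.sqrt(pooled_ss / len(group1) + pooled_ss / len(group2))
t_obs = (mean(group1) - mean(group2)) / se
critical_t = 2.36  # two-tailed, alpha = .05, df = 7 (value taken from the text)

print(round(pooled_ss, 2), round(pooled_sd, 2))
print(round(se, 2), round(t_obs, 2))
print("reject H0" if abs(t_obs) > critical_t else "fail to reject H0")
```

Up to rounding, the printed values should match the pooled variance, standard error, and observed t calculated by hand above.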

Assumptions of all t-tests

(1) The observations are independent (both between and within groups)

(2) The two populations are normally distributed 

** New Assumption ** (3) The two populations have equal variances. This is referred to as the homogeneity of variance assumption. Recall that in the formula we pool our sample variances. This is an okay thing to do if the variances are about the same. However, it isn't okay if they are very different. SPSS provides a test for this assumption. In the output for the Independent Samples Test, you'll see a box labeled Levene's Test for Equality of Variances. If this test is significant (the Sig. value is 0.05 or less), then there is evidence that this assumption has been violated and a corrected formula must be used. We won't deal directly with this corrected formula but we can make use of it from SPSS output.
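For the curious, here is a rough Python sketch of the idea behind Levene's test, using the example data from earlier. It assumes the mean-centered form of the test: take each score's absolute deviation from its own group mean, then test whether the two groups differ in their average absolute deviation. With only two groups, that comparison is the same pooled t-test described in this lab, and the Levene statistic is the square of that t. The function names are hypothetical, and the critical t is copied from the text.

```python
import math
from statistics import mean, variance

group1 = [23, 35, 45, 33, 22]
group2 = [16, 22, 14, 18]

def abs_deviations(scores):
    """Absolute deviation of each score from its own group mean."""
    m = mean(scores)
    return [abs(x - m) for x in scores]

def pooled_t(a, b):
    """Independent samples t using the pooled variance."""
    df_a, df_b = len(a) - 1, len(b) - 1
    sp2 = (df_a * variance(a) + df_b * variance(b)) / (df_a + df_b)
    se = math.sqrt(sp2 / len(a) + sp2 / len(b))
    return (mean(a) - mean(b)) / se

dev1, dev2 = abs_deviations(group1), abs_deviations(group2)
t_dev = pooled_t(dev1, dev2)   # t-test on the absolute deviations
levene_f = t_dev ** 2          # for two groups, the Levene statistic equals t squared
critical_t = 2.36              # two-tailed, alpha = .05, df = 7 (value taken from the text)

print(round(levene_f, 2))
if abs(t_dev) > critical_t:
    print("variances differ significantly")
else:
    print("no significant difference in variances")
```

This is only meant to show what the test measures; for actual analyses, rely on the Sig. value that SPSS reports.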



Using SPSS to compute independent samples t-tests

    Note: To do an independent samples t-test you'll need to have two variables (columns) in your data file. One column will contain the data (your dependent measure). The other column will be an independent variable that specifies which group the subject belongs to (e.g., 1 for group 1, 2 for group 2).
    Person Score Group
    1 23 1
    2 35 1
    3 45 1
    4 33 1
    5 22 1
    6 16 2
    7 22 2
    8 14 2
    9 18 2
    Go to the Analyze menu and select the Compare Means submenu. In this submenu you'll see several tests. The one that we're interested in today is the independent samples t-test.
    After selecting the independent samples t-test, you'll get a dialog window. Here you should select the variables that you are testing. Your test variable is your dependent variable. Your grouping variable is the independent variable that assigns each subject to a group.

    Before you can do the analysis, you must define the groups. Click the Define Groups button and then enter the values that you used to define the groups (e.g., 1 for group 1 and 2 for group 2).
    Here is what the output will look like.


    Notice that SPSS doesn't tell you to reject or fail to reject the H0, nor does it give you the critical t. To make your decision about the H0 you must compare the p-value with your α-level. If the p-value is equal to or smaller than your α-level, then you should reject the H0; otherwise you should fail to reject the H0.
    You may also notice that there are two rows of numbers in the t-test output. One row is labeled "Equal variances assumed"; the other is labeled "Equal variances not assumed." This is related to the assumption of homogeneity of variance discussed above. If Levene's test is not significant (look at the Sig. value), then we can assume equal variances and use the values in the first row. If Levene's test is significant, we must use the values in the second row.

    In this case, Levene's test is not significant (p = .981), so we read the upper row. (Levene's test is not significant most of the time, which means that the homogeneity of variance assumption has usually been met.)
    The p-value of the t-test is .937 so the null hypothesis is retained (i.e., the means are not significantly different).
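The two decisions described above (which row to read, and whether to reject H0) can be written as a small Python sketch. The function names choose_row and decide are hypothetical, and the Sig. values passed in are the ones quoted from the output above.

```python
def choose_row(levene_sig, alpha=0.05):
    """Pick the row of the t-test output based on Levene's test."""
    if levene_sig <= alpha:
        return "Equal variances not assumed"
    return "Equal variances assumed"

def decide(p_value, alpha=0.05):
    """Reject H0 when the p-value is equal to or smaller than alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

row = choose_row(0.981)    # Levene's Sig. value from the output
decision = decide(0.937)   # p-value of the t-test from the chosen row
print(row)
print(decision)
```

Both comparisons use "equal to or smaller than," matching the rules stated in the text.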




Download the worksheet here to answer the questions.
Email it to your GA when you are finished.