Lab 15

Confidence Intervals

With z-tests we learned how to decide if a sample came from a population with a known mean. Now we are going to reverse this process. We are going to use the sample mean to make an estimate of the population mean that is unknown.

You may have heard of the term "margin of error" when you've read or heard about polling data in the news. We are going to learn precisely how these margins of error are calculated but first we are going to discuss a related concept called confidence intervals.

If you have a sample mean and you wish to make a guess as to what the population mean is, you can make two kinds of estimates:
1. Point estimates are guesses that specify an exact number. When you make a point estimate using the sample mean, it is likely your guess is near the true population parameter but it is very unlikely that it will be exactly the same as the parameter. For example, you might make a point estimate that μ is 4 when really it is 3.55.
2. Interval estimates are guesses that specify a range of numbers. With interval estimates, you guess that the true population parameter falls somewhere between 2 numbers. For example, you might make an interval estimate that μ is between 2 and 6.

A confidence interval is an interval estimate that surrounds the point estimate. Also, a confidence interval estimates the probability that the interval is correct. That is, how confident are you that the true population parameter lies within your interval estimate? For example, you might say that you are 90% certain that μ falls between 2 and 6. This means that there is a 10% chance that μ is lower than 2 or greater than 6. In this case, 2 is the lower bound of the confidence interval and 6 is the upper bound.

The distance from the point estimate to the upper bound (or lower bound) is the margin of error. In this case, the point estimate was 4. Thus the margin of error is 6 - 4 = 2 (or, equivalently, 4 - 2 = 2), which is written as ± 2.

A margin of error can be associated with any degree of confidence but typically 90%, 95%, or 99% is chosen. When the margin of error's confidence level is not specified in the news, you can usually assume that it is a 95% margin of error.



With a 2-tailed z-test, we had 2 critical regions associated with ± critical z. For example, if α = .05, the critical regions were associated with critical z = ± 1.96. To get these two critical z values, we had to divide the α of .05 in half and split it into the bottom .025 and the top .025 portions of the normal distribution. Using the NORMSINV function in Excel, we see that NORMSINV(.025) = -1.96 and NORMSINV(1 - .025) = 1.96.
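If you do not have Excel handy, the same lookup can be sketched in Python. This is only an illustration: it assumes Python 3.8 or later, where the standard library's `statistics.NormalDist().inv_cdf` plays the role of NORMSINV. The function name `critical_z` is made up for this example.

```python
from statistics import NormalDist

def critical_z(alpha):
    """Return the (negative, positive) critical z values for a 2-tailed test."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    lower = std_normal.inv_cdf(alpha / 2)      # like NORMSINV(.025) when alpha = .05
    upper = std_normal.inv_cdf(1 - alpha / 2)  # like NORMSINV(1 - .025)
    return lower, upper

lower_z, upper_z = critical_z(0.05)
print(round(lower_z, 2), round(upper_z, 2))  # -1.96 1.96
```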

To calculate lower and upper bounds of a confidence interval, we are going to use critical z values and the z-test formula to solve for μ instead of z. With z-tests we used this formula:

z = (X-bar - μ) / standard error

With a little bit of algebraic manipulation, we can solve for μ by multiplying both sides by the standard error, adding μ to both sides, and subtracting z times the standard error from both sides. Because the critical z can be positive or negative, we get this formula:

μ = X-bar ± z * standard error

The point estimate for μ is always the sample mean (X-bar).
The ± z times the standard error will give us our margin of error.
Here is the formula: margin of error = ± z * standard error = ± z * σ / sqrt(n)

The lower bound of the confidence interval is the sample mean minus the margin of error.
The upper bound of the confidence interval is the sample mean plus the margin of error.
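The pieces above (standard error, critical z, margin of error, lower and upper bounds) can be put together in a short Python sketch. The function name `z_confidence_interval` and the numbers in the usage lines are made up for illustration, and the sketch assumes the population standard deviation σ is known, as it is throughout this lab.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence):
    """Return (lower bound, upper bound, margin of error) for the population mean."""
    standard_error = sigma / sqrt(n)    # sigma divided by the square root of n
    tail = (1 - confidence) / 2         # area left in each tail
    z = NormalDist().inv_cdf(1 - tail)  # positive critical z
    margin = z * standard_error         # margin of error
    return sample_mean - margin, sample_mean + margin, margin

# Hypothetical numbers: n = 9 scores, sample mean 100, sigma = 15, 95% confidence
lower, upper, margin = z_confidence_interval(100, 15, 9, 0.95)
print(round(lower, 2), round(upper, 2), round(margin, 2))
```

Returning the margin of error along with the bounds makes it easy to report the result either as an interval or as "point estimate ± margin."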

Here is a figure showing the various concepts so far.



Let's walk through an example.

Suppose we know that we have a sample of n = 25 scores with a mean 50. The population standard deviation (σ) is 10. What is the 99% confidence interval for μ?

First, we can calculate the standard error just as we have in the past. The standard error is the population standard deviation divided by the square root of the sample size. In this case:
standard error = 10/sqrt(25) = 10/5 = 2

Next, we need to find the 2 critical z scores for a 99% confidence interval. We need the middle 99% of the distribution, meaning that half a percent (.005) is below and half a percent is above (1 - .005 = .995). We can use the NORMSINV function in Excel to find the critical z scores associated with these probabilities:
=NORMSINV(.005) = -2.58
=NORMSINV(.995) = 2.58

You'll notice that you really only need to look up one of these values and then multiply it by -1 to get the other critical z.

Plug in the sample mean, the 2 critical z scores, and the standard error in the formula above and the confidence interval can be calculated like this:

μ = 50 ± 2.58 * 2

μ = 50 ± 5.16

Lower bound = 50 - 5.16 = 44.84
Upper bound = 50 + 5.16 = 55.16

Thus, the point estimate for μ is 50 with a margin of error of ± 5.16.
The 99% confidence interval for μ is between 44.84 and 55.16. This means that we are 99% certain that the true mean of the population from which the sample was drawn is somewhere between 44.84 and 55.16.
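As a check on the arithmetic, here is a small Python sketch of the same example (n = 25, sample mean 50, σ = 10, 99% confidence). The variable names are made up for illustration. It also shows a rounding detail: the walkthrough rounds the critical z to 2.58 before multiplying, which gives a margin of 5.16, whereas carrying the unrounded z (about 2.5758) gives a margin that rounds to 5.15.

```python
from math import sqrt
from statistics import NormalDist

n, sample_mean, sigma = 25, 50, 10

standard_error = sigma / sqrt(n)       # 10 / 5 = 2.0
z_exact = NormalDist().inv_cdf(0.995)  # like NORMSINV(.995), about 2.5758
z_rounded = round(z_exact, 2)          # 2.58, the value used in the walkthrough

margin_rounded_z = z_rounded * standard_error  # margin using the rounded z
margin_exact_z = z_exact * standard_error      # margin using the unrounded z

print(round(sample_mean - margin_rounded_z, 2),
      round(sample_mean + margin_rounded_z, 2))  # 44.84 55.16
print(round(margin_exact_z, 2))                  # 5.15
```

Either version is reasonable as long as you are consistent about when you round.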

Suppose that a sample of n = 36 scores has a mean of 20. The population standard deviation (σ) is 9.
Blackboard 1) What is the standard error of this sample mean? Hint: Standard error = σ / sqrt(n)
Blackboard 2) What is the positive critical z associated with a 90% confidence interval? Note: Enter the positive critical z only and round to 2 decimals. Hint: =NORMSINV((1 - .90)/2) in Excel will give the negative critical z. Enter the absolute value of this critical z. An alternate method is to calculate the positive critical z score directly like this: =NORMSINV(1 - (1 - .90) / 2)
Blackboard 3) What is the margin of error? Hint: Multiply your answer to #1 by your answer to #2.
Blackboard 4) What is the lower bound of the 90% confidence interval for μ? Hint: Sample mean - margin of error
Blackboard 5) What is the upper bound of the 90% confidence interval for μ? Hint: Sample mean + margin of error

Suppose that a sample of n = 81 scores has a mean of 30. The population standard deviation (σ) is 27.
Blackboard 6) What is the lower bound of the 95% confidence interval? (Round to 2 decimals)

Blackboard 7) If the 99% confidence interval for a population mean is 50 to 60, what was the sample mean? Hint: Sample mean = the average of the upper and lower bounds of the confidence interval

Confidence Intervals with Proportions
The margin of error is often mentioned in the news in polling data (e.g., opinion data and candidate preferences). For example, it might be reported that Candidate A is ahead of Candidate B in the polls 55% to 45% with a 3% margin of error.

When Candidate A is said to have 55% support, this is merely a point estimate from the sample. There is always some error in the point estimate and this error can be estimated with the standard error.
Proportions (p) have a special formula for the standard error: σp = sqrt(p(1 - p) / n), where n is the sample size.



Other than this, the confidence interval is calculated in the same way:
μ = p ± zσp

Suppose the 55% support came from a sample of 100 people. What is the 95% confidence interval?

μ = .55 ± 1.96*sqrt(.55(1 - .55)/100)
μ = .55 ± .10
It is 95% certain that Candidate A's support falls between .45 and .65.
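The same calculation can be sketched in Python. The function name `proportion_confidence_interval` is made up for this illustration; it uses the standard error for proportions, sqrt(p(1 - p)/n), and an unrounded critical z, which still rounds to the same answers here.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(p, n, confidence):
    """Return (lower bound, upper bound, margin of error) for a sample proportion p."""
    standard_error = sqrt(p * (1 - p) / n)              # standard error for proportions
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # positive critical z
    margin = z * standard_error                         # margin of error
    return p - margin, p + margin, margin

# Candidate A: 55% support in a sample of 100 people, 95% confidence
lower, upper, margin = proportion_confidence_interval(0.55, 100, 0.95)
print(round(lower, 2), round(upper, 2), round(margin, 2))  # 0.45 0.65 0.1
```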

If the margin of error were reported, I would state, "In a sample of 100 likely voters, Candidate A appears to enjoy 55% support with a margin of error of ± 10%."

Candidate B with 45% support turns out to have the exact same margin of error (look at the formula for the standard error to see why) of ± 10%. Thus, Candidate B's support is
μ = .45 ± .10 or between 35% and 55%.

Notice that the confidence intervals of the candidates overlap (45% to 65% vs. 35% to 55%). This means that we are not confident that Candidate A is really ahead of Candidate B. This is what is referred to as a "statistical dead heat." Even though the sample data suggest that Candidate A is ahead, we cannot make strong conclusions about the race. This is the same as retaining the null hypothesis. We see a difference in the sample but it is not large enough to conclude that the difference would be seen in the population.
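The overlap check itself is simple enough to write out. In this Python sketch, the function name `intervals_overlap` is made up for illustration, and each interval is given as a (lower bound, upper bound) pair. Two intervals overlap when each one starts at or before the point where the other one ends.

```python
def intervals_overlap(interval_a, interval_b):
    """Return True if two (lower, upper) intervals share at least one value."""
    low_a, high_a = interval_a
    low_b, high_b = interval_b
    return low_a <= high_b and low_b <= high_a

candidate_a = (0.45, 0.65)  # Candidate A's 95% confidence interval
candidate_b = (0.35, 0.55)  # Candidate B's 95% confidence interval
print(intervals_overlap(candidate_a, candidate_b))  # True
```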

Blackboard 8) Among 200 chocolate lovers, only 40% preferred Midnight-in-a-Cave Mini Dark Chocolate Wafers over Cow & Cacao Milk Chocolate Delights. Do the 99% confidence intervals of the 2 chocolates overlap? Hint: is the upper bound of Midnight-in-a-Cave Mini Dark Chocolate Wafers higher than the lower bound of Cow & Cacao Milk Chocolate Delights?
Blackboard 9) 1000 likely voters were surveyed about how they were likely to vote in the election the next day. 53.5% said they would vote for the incumbent. Can the incumbent be 90% certain of victory? (Hint: Is the lower bound of the 90% confidence interval higher than 50%?)
Blackboard 10) From the question above, what is the lower bound of the 99% confidence interval? Convert percentages to proportions (between 0 and 1). Round to 2 decimals.