GARY C. RAMSEYER'S ARCHIVES OF STATISTICS FUN

These Were The "Good Old Days" In Statistics

  1. WHEN an instructor spent a third of a course teaching the computation of the mean and standard deviation from a grouped frequency distribution and then apologized to the student that these were only approximations.

  2. WHEN a real luxury was owning a $150 Texas Instruments hand-held calculator that could perform the four fundamental operations.

  3. WHEN a student complained about math anxiety the instructor could compassionately recommend completing a one-week regimen of a paperbacked programmed-instruction book stressing the mathematics necessary for basic statistics.

  4. WHEN angrily dropping a jammed 50 lb Friden rotary calculator to the floor would magically restore the machine to full operation.

  5. WHEN performing a Wherry-Doolittle multiple regression on a rotary calculator produced only a weary statistician.

  6. WHEN a bulky $3000 Monroe programmable electronic calculator was regrettably limited to 32 steps and would just barely compute a standard deviation.

  7. WHEN your Nixie-tube Sony programmable desktop calculator was the envy of the entire department.

  8. WHEN using the mainframe computer for analyzing data necessitated punching Hollerith cards on a humungous 8 ft. wide steel contraption and then storing 100's of these cards in long cardboard boxes which could be lugged from building to building.

  9. WHEN students were awestruck with the notion of an ANOVA replacing six pairwise t-tests to test the equality of four treatment population means.

  10. WHEN the spirit duplicator ("ditto") machine made you appreciate the dangers of glue sniffing and had you begging for a box of latex gloves to protect your hands from the purple plague.

  11. WHEN nonparametric tests, which were the rage of the 1950's, were likened to the discovery of penicillin and forced you to question even the most minute violation in the assumptions of parametric tests and subsequently toss many t or F-tests on the junk heap.

  12. WHEN students were convinced that there was only one unique table of random numbers and were dumbfounded when they did a frequency count of single digits in the table and found them roughly rectangularly distributed rather than normal.

The Above Was Archived on 17 August 2001.


During the month of November or December, graduate students in my multivariate analysis class traditionally pay special homage to the celebrated Cayley-Hamilton theorem. It is accorded this high honor by Professor Maurice Tatsuoka in the chapter on linear transformations, axis rotation, and eigenvalues in his excellent textbook. The role of this theorem in the textbook is rather obscure. It is not, to my knowledge, applied or used in any multivariate technique or employed in the proof of any other theorem or formula in the entire text! It is presented as a stand-alone pillar of mathematical splendor. It is shocking to many students that a theorem can have absolutely no practical application whatsoever and yet be intrinsically elegant in and of itself. However, one can reflect on innumerable things in real life that are of this very same nature.

In my graduate-level statistics classes I have three-, four-, and five-star handouts graded according to importance. The one-page handout on Cayley-Hamilton is a SIX-STAR handout and is the only one I have ever given this highest distinction. This handout is printed in limited quantities and serially numbered to ensure its value as a collector's item. My students are admonished not to discard or in any way bend, fold, or mutilate this work of art. Remember, I tell them..."ASK NOT WHAT THE CAYLEY-HAMILTON THEOREM CAN DO FOR YOU BUT WHAT YOU CAN DO FOR THE CAYLEY-HAMILTON THEOREM."

Here is a spectacular demonstration of how like its eigenvalues a matrix behaves. I am proud to present in gif format the celebrated Cayley-Hamilton theorem. ENJOY!!!

The Above Was Archived on 21 September 1999


This month I would like to focus on perhaps one of the most significant news events that has occurred in my lifetime. It makes the invention of the incandescent light bulb or man's landing on the moon seem very inconsequential. I am of course referring to Mark McGwire of the St. Louis Cardinals and his shattering of the single season home run record on Tuesday, September 8, 1998 at 9:18 PM EDT. This record is the most revered one in all of sport, both in the USA and in many foreign countries. All eyes were fixed on Busch Stadium that evening when Mr. McGwire in the wink of an eye rifled his 62nd home run over the left field wall.

What does this feat have to do with the field of statistics, you ask? I maintain it has everything to do with statistics. Baseball is a game whose very objective and rich heritage is vitally dependent on the art of record keeping and the meaningful manipulation of these records. There is no other sport in the world that breeds the thousands upon thousands of numbers and summaries that baseball does year after year. Indeed, each season there are many new records contrived to fit the particular accomplishments and combinations of skills of certain ballplayers and/or the teams that employ them. Observe the emergence in recent years of the 30-30 or the 40-40 player or the manager's detailed charting of pitches thrown.

My purpose here is not to discuss all these newfangled indices. I will leave that task to the writers and news media who scramble to produce these tidbits to justify their existence. I simply want to capture the wonder of that magical September night and relate to you my observations of what were the important coincidences and facts about that historic day. Here they are:

  1. The stock market made its greatest daily gain ever of 380 points on the Dow.

  2. McGwire's 62nd home run was his shortest up to that point in the season at 341 ft. His longest was 550 ft. Through 144 games his 62 home runs have totalled 25,684 ft. or 4.9 miles.

  3. The record-breaking home run would have been nullified by the umpire had McGwire missed first base (he initially did) but touched all the other bases, and had the Cubs appealed before the first pitch to the next batter.

  4. Tuesday was McGwire's favorite day during this stretch. He hit 15 of his 62 home runs on Tuesdays.

  5. The fourth inning was McGwire's favorite inning. He hit 11 of his 62 home runs in the fourth inning.

  6. None of McGwire's 62 homers was hit on a 3-0 count! Only one was hit on a 3-1 count! Wow, what patience as a hitter!

  7. McGwire hit his 62 homers off of 57 different pitchers. He did a masterful job of spreading the misery around.

  8. McGwire came to bat 451 times officially up to and including his 62nd home run. In other words, through 144 games he could be expected to hit a home run every 7.27 times at bat.

  9. During this stretch the Chicago Cubs and Florida Marlins tied for being victimized the most by McGwire home runs. Each of these teams had 7 homers hit against them.

  10. Of McGwire's 62 home runs so far, 30 were solo homers and 23 of his last 31 homers were solo blasts.

  11. The date and time of McGwire's record home run was 9/8/1998 at 9:18 EDT. If the single digits in these numbers are totaled the sum is 62. TRULY AMAZING!

  12. Finally my wife put the frosting on the cake that day. She found the bezel and crystal which had been lost for several days for my FAKE Rolex watch. A STATISTICIAN JUST COULD NOT ASK FOR A GREATER DAY!
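
For readers who like to audit their baseball trivia, the arithmetic in items 2, 8, and 11 above can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the figures are copied straight from the list rather than from any official box score.

```python
# Figures quoted in the list above (taken from the text, not an official source).
total_feet = 25_684        # combined distance of the 62 home runs
at_bats = 451              # official at-bats through the 62nd home run
home_runs = 62
stamp = "9/8/1998 9:18"    # date and time of the record home run

miles = total_feet / 5280                 # 5,280 feet in a mile
at_bats_per_homer = at_bats / home_runs
digit_sum = sum(int(ch) for ch in stamp if ch.isdigit())

print(f"{miles:.1f} miles")                               # 4.9 miles
print(f"{at_bats_per_homer:.2f} at-bats per home run")    # 7.27 at-bats per home run
print(digit_sum)                                          # 62
```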

You can see how a statistician can easily become obsessed with facts and figures like the above set, particularly when that statistician happens to be a Cardinal fan. However, there is much more to this story. I would truly like to thank both Sammy Sosa of the Cubs and Mark McGwire for the great show that they have put on this year (and the home run derby is not over as of this writing). I think these two fine role models have shown baseball and this country that their close friendship and encouragement of one another is what life is all about! All other athletes should take special note of this relationship.

The Above Was Archived on 16 December 1998


This month we will present another neat probability experiment that can easily be conducted by students as a short assignment or an in-class project. If you have been a regular reader of my home page you may recall the penny-spinning problem whose discussion now resides in my Archives of Statistics Fun. That experiment and the current one represent excellent opportunities for an instructor to highlight the value of the Monte Carlo method in estimating probabilities for unusual variations in experiments that don't lend themselves to the usual formulas.

Let's state the current question in simple language:

If a penny is flipped until a head first appears, what is the probability that this first head occurs on an odd-numbered trial (i.e., first, third, fifth, etc.)?
At first blush, a typical student would reason that since the first head is just as likely to occur on an odd trial as it is on an even trial (second, fourth, sixth, etc.), the probability is obviously .5. But wait! Another student mentions that maybe the probability should be somewhat greater than .5 since the first opportunity for a head to pop up is on the very first trial, and an odd trial continues to precede an even trial after the first two. At this point the band-wagon effect sets in and students begin to incrementally up their estimates slightly from .5. But after many values are offered, a hush settles over the room and students begin to look at one another and shrug their shoulders. No one is really sure!

Enter Captain Sigma (the instructor)! With a flourish of his cape and a wink of his eye, he quietly suggests that this is a problem that just begs for empirical data. He urges each student to take about five minutes at home and repeat the experiment 10 times, tally how many times the first head appears on an odd trial, and bring the data to the next meeting. The students happily consent to this simple task (Someone in the back of the room asks, "How many points is it worth?") and they all eagerly await the pooling of their data at the next meeting.

Two days later the instructor rushes into the classroom and puts all the results from 35 students on the board. The students sit on the edge of their seats in awe as the numbers accumulate. The final tally results in 245 out of 350 replications ending on an odd trial. Zowie! THAT IS 70%! Something is wrong. The pennies must have been seriously flawed.

The instructor, showing no emotion on his face, allows the buzzing and chattering to go on for several minutes. Finally he cracks a grin and informs the students that this result is a very good estimate although it is a tad too high. He proudly states that the actual answer is P=2/3 or 67%. The students are dumbfounded and become quite excitable. They actually all cheer for the instructor and demand a formal proof (Did I say "cheer" in a stat class? I must be delirious from a high fever!).
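
For readers who would rather let a computer flip the pennies, here is a minimal Python sketch of the classroom experiment. It assumes a fair coin simulated with the standard random module; the function names, the seed, and the larger replication count are illustrative choices, not part of the original assignment.

```python
import random


def first_head_trial(rng):
    """Flip a simulated fair coin until the first head; return that trial number."""
    trial = 1
    while rng.random() >= 0.5:   # values below 0.5 count as heads
        trial += 1
    return trial


def estimate_odd_probability(replications, seed=0):
    """Proportion of replications in which the first head lands on an odd trial."""
    rng = random.Random(seed)
    odd = sum(first_head_trial(rng) % 2 == 1 for _ in range(replications))
    return odd / replications


# 350 replications mimics the pooled class data; 200,000 gives a tighter estimate.
print(estimate_odd_probability(350))
print(estimate_odd_probability(200_000))
```

With only 350 replications the estimate can easily wander a few percentage points from the true value, which is consistent with the class landing on 70%.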

Here is what the instructor wrote on the board:

The solution involves the sum of the first n terms of a geometric series expressed as:

S = a + ar + ar^2 + ... + ar^(n-1)

Where
a = first term of the series
n = number of terms
r = the common ratio
S = the sum of the first n terms calculated by

S = a (1 - r^n) / (1 - r)

In our case, a = 1/2 = .5 and r = (1/2)(1/2) = .5^2 or .25, and using the first expression for S we have:

S = .5 + .5^3 + .5^5 + ...

In words, the above states that the probability of getting the first head on an odd trial is the probability of getting a head on the first trial (.5), plus the probability of getting two tails and then the first head on the third trial (.5)(.5)(.5), plus the probability of getting the first head on the fifth trial (.5)(.5)(.5)(.5)(.5), and so on for n terms.

Now to compute what this sum would be for n terms we calculate using the second formula:

S = .5 (1 - .25^n) / (1 - .25)

Finally, taking the limit of this calculation as n approaches infinity (where .25^n shrinks to zero), we arrive at

S = (.5) / (1 - .25) = (1/2) / (3/4) = 2/3 or .67
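
The derivation above can be double-checked with exact fractions. The following is a small sketch using Python's fractions module; the helper names partial_sum and term_by_term are made up for this illustration. It compares the closed-form partial sum with a term-by-term sum and then evaluates the limit.

```python
from fractions import Fraction

a = Fraction(1, 2)   # first term: a head on the very first trial
r = Fraction(1, 4)   # common ratio: two more tails before the next odd trial


def partial_sum(n):
    """Closed form for the first n terms: a(1 - r^n) / (1 - r)."""
    return a * (1 - r**n) / (1 - r)


def term_by_term(n):
    """The same sum, added up one term at a time."""
    return sum(a * r**k for k in range(n))


for n in (1, 2, 3, 5, 10):
    # The two columns of fractions should agree, and the decimal creeps toward 2/3.
    print(n, partial_sum(n), term_by_term(n), float(partial_sum(n)))

limit = a / (1 - r)   # r^n vanishes as n grows without bound
print(limit)          # 2/3
```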

Truly Remarkable! The students all give the instructor a standing ovation and shout "QED" "QED" "QED".... The instructor smiles sheepishly while taking a bow and thinks to himself how rewarding it is to be a statistics professor.

The Above Was Archived on 13 September 1998


Here are the answers to last month's crossword puzzle. As warned previously, some of these statisticians are not exactly household words. Use the following scale to grade your performance as a Statistician Trivialist:

CROSSWORD PUZZLE SOLUTION

Statisticians

Across

4. Likes a good match
6. Bullets that strayed from target
7. Advocate of equal rights for variations
8. Always correcting things
15. Rhymes with macaroni
17. Past tense of feel
18. Delights in a small ratio
20. Homo sapiens
21. A delicious pear
22. Finished
23. Old Faithful
25. Detested a small expected frequency
26. Shopping for sleeping accommodations





Down

1. Inventor of cotton gin
2. French cook
3. Measures agreement in judges
5. Can walk on water in flooded farm plots
9. Honestly different than others
10. Plants thrive in this
11. A swear word
12. Quality control expert at brewery
13. Storage container
14. A fine vodka
16. Rejuvenated male
19. A chocolate covered mint
24. Uses lambda and is sometimes exact

The Above Was Archived on 10 July 1998


One of the most intriguing and frequently mentioned probability questions of all time is the so-called "Birthday Problem." I really do not remember when I first encountered this question but I know it has been around for decades and pops up in many treatments of probability in basic statistics textbooks. Let us revisit this interesting problem and hopefully shed some new light on this time-honored topic.

First, for those readers who are not familiar with this problem, let us pose the question in its simplest form:

Given a room with a random collection of N people, what is the minimum N needed for an observer to state there is greater than a 50-50 chance of at least two people having identical birthdays?
Responses to this question are many and varied depending on a person's exposure to probability topics. However, the three most frequently offered answers are (a) 183 (b) 20 and (c) 23. The correct answer, of course, is (c) 23, which may shock some of you and prompt you to immediately head out and bet some of your buddies on a coincidence of birthdays in rooms with this few people present. Before you make this rash decision read the remainder of this discussion. Note: We shall assume in our discussion that a year has 365 days rather than the 366 in a leap year. We shall also assume that by "identical birthday" or "birthday coincidence" or "duplicate birthday" we mean the same month and day disregarding the year of birth.

The (a) response of 183 has much intuitive appeal for the ordinary person on the street. He or she would reason that in order to be absolutely certain that two birthdays coincide, 366 people would be needed in the room. Now since a probability just greater than .50 of a duplicate is all that is wanted, simply take 1/2 of 366 and arrive at 183. This seems logical but the laws of probability tell us the correct N is dramatically smaller than 183! Just how much smaller?

Many people who have studied a little probability would give the (b) response of 20. Wow! This intuitively seems way too small to give us even a slight chance of a coincidence of birthdays let alone a better than even chance. But the reasoning merits close examination and goes something like this:
Check the birthdays in the room one by one. After the first person has given his or her birthday, the second person will have one chance in 365 of having the same birthday as the first. The third person could have the same birthday as the first or second person, so the third person has 2 chances in 365. Added to the chance from the second person, there are a total of 3 chances in 365. By the same logic, the fourth person has 3 chances of having the same birthday as any of the first three so this needs to be added to the previous 3 to get 6 chances out of 365, and so on. By the time we exceed 183 chances in 365, which is just greater than our 50-50 probability, we will have checked just 20 people. Mathematically, this is more concisely expressed as follows: We want the smallest integer N-1 such that
(0)(1/365)+(1)(1/365)+(2)(1/365)+(3)(1/365)+...+(N-1)(1/365) > 1/2 or
(1 + 2 + 3 +...+(N-1))/365 > 1/2

The required N-1 is 19 since 1 + 2 + 3 +...+ 19 = 190, but don't forget to add one for the first person checked even though a match cannot occur with just one person. Thus N = 20 people are required according to this line of reasoning.

But hold the phone! We have a serious flaw in this argument. The number (1 + 2 + 3 + ...+(N-1))/365 is NOT a probability but the EXPECTED VALUE or MEAN NUMBER of birthday coincidences for N = 20 people in a room. Thus, if many rooms of randomly assembled N's of 20 people were examined, the mean number of coincidences is just greater than 1/2. This is not particularly reassuring to a shrewd betting person!

Although N=20 is an incorrect answer to the original problem it does suggest an alternate approach for betting purposes. Suppose we wanted the expected value of coincidences to be just greater than one. We could continue the above computation for several more terms until the ratio just exceeded one. With a calculator it is easy to see that we must only go out to N-1=27 or N=28 for this to occur. Thus with many rooms of N=28 people we would have a mean number of coincidences just greater than one and many bettors would take greater comfort in this value.
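
The expected-value argument can be sketched in a few lines of Python. Since 1 + 2 + ... + (N-1) equals N(N-1)/2, the mean number of coinciding pairs is N(N-1)/2 divided by 365. The function names below are illustrative, and the 365-day year is the same assumption made in the discussion.

```python
def expected_pairs(n, days=365):
    """Mean number of pairs of people sharing a birthday in a room of n."""
    return n * (n - 1) / 2 / days


def smallest_n(threshold, days=365):
    """Smallest room size whose expected number of pairs exceeds the threshold."""
    n = 1
    while expected_pairs(n, days) <= threshold:
        n += 1
    return n


print(smallest_n(0.5))   # 20
print(smallest_n(1.0))   # 28
```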

Now let us explain the correct answer (c) N=23 for the original problem. The easiest approach is to find the probability of NO duplicate birthdays in a sample of size N and then subtract this result from ONE to get the probability of at LEAST ONE duplicate. Again we shall check the people one by one. After the first person establishes a birthday (P=365/365), the probability of the second person's birthday not duplicating the first is (365/365)(364/365). The probability that the third person also fails to duplicate either of the first two is (365/365)(364/365)(363/365). This multiplicative process goes on and on, and for a given N the product has N factors and becomes (365/365)(364/365)(363/365)...((365-N+1)/365). Since this product is the probability of no duplicates, our task is to determine the value of N that will cause this probability to be as large as possible without exceeding .50. Then when this probability is subtracted from one the probability of at least one duplicate will just exceed .50. With a hand calculator it is easy to show that when N=22 this product is .5243 and 1 - .5243 = .4757 but when N=23 the product is .4927 and 1 - .4927 = .5073. We can thus state that if a room contains 23 randomly assembled people, we stand a slightly better than 50-50 chance of finding a duplicate birthday.
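
If your hand calculator is in the shop, the same product can be sketched in Python. The function name prob_no_duplicate is illustrative, and the code keeps the assumptions of the discussion: 365 equally likely birthdays and a randomly assembled room.

```python
def prob_no_duplicate(n, days=365):
    """Probability that n people all have different birthdays."""
    p = 1.0
    for k in range(n):              # factors 365/365, 364/365, ..., (365-n+1)/365
        p *= (days - k) / days
    return p


for n in (22, 23):
    p_none = prob_no_duplicate(n)
    print(n, round(p_none, 4), round(1 - p_none, 4))
# 22 0.5243 0.4757
# 23 0.4927 0.5073
```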

If you are a conservative bettor and flinch at the above chances, I have developed the following table that allows you to read in the desired probability level for a duplicate birthday and read out the required N in your room. Note that the probability levels should be interpreted as actually those just greater than the listed value.

Required Numbers of People
For Selected Probabilities
of a Birthday Coincidence

Probability     N
    .50        23
    .60        27
    .70        30
    .75        32
    .80        35
    .90        41
    .95        47
    .99        57
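
The table can be regenerated with a short loop. This is a sketch rather than the method actually used to build the table: the function name is hypothetical, and it again assumes 365 equally likely birthdays. For each probability level it returns the smallest N whose chance of at least one duplicate just exceeds that level.

```python
def smallest_n_for(level, days=365):
    """Smallest N whose probability of at least one duplicate exceeds level."""
    p_none = 1.0    # probability of no duplicate among the people counted so far
    n = 0
    while 1 - p_none <= level:
        p_none *= (days - n) / days   # add one more person to the room
        n += 1
    return n


for level in (0.50, 0.60, 0.70, 0.75, 0.80, 0.90, 0.95, 0.99):
    print(f"{level:.2f}  {smallest_n_for(level)}")
```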

Thus if you are a real gambler, when the situation presents itself, you go with N=23 and impress the pants off everyone in the room by hopefully finding a duplicate. If you don't feel that you are an exceptionally lucky person, then you might select the comfortable 75-25 chance of a duplicate and use N=32. On the other hand, if you fall at the other end of the continuum and only bet on close to sure things, then pick the .99 level and go with N=57. Here you are almost certain to find a duplicate but the people will not be that impressed and you won't elicit that wonderful "WOW!" effect.

I tried this experiment last semester in my Statistics I class with N=26 students in attendance that day. I knew my chances were below .60 but I put on an air of absolute certainty with my pronouncement. I confidently started around the room with students stating their birthdays. When I got to the 12th person I had a duplicate. The reaction was electrifying. You would have thought that I had just floated an elephant in mid-air! My student ratings skyrocketed for at least one day!

I hope you have enjoyed my presentation of the above topic and hopefully if you are an instructor you can have some fun and try this in your class. I must fess up to one other assumption that was not mentioned earlier. Not only must you assume a random sample of people are assembled in the room but theoretically you must assume that birthdays are randomly distributed throughout the 365 days of the year. This is probably not satisfied in any strict sense but that is a question involving a whole different ballgame. If you are turned on by the concept of chance and how pervasive it is in our society check out Chance News, a bimonthly newsletter published at Dartmouth College.

The Above Was Archived on 5 April 1998


This is the season of good cheer and merriment. If you know of a lonely statistician please tell him that you love him and truly appreciate all the wonderful methodologies that he has perpetuated and enhanced throughout his career. I am sure any kind remarks directed his way will warm his heart and point him toward the new year with a renewed sense of vitality and dedication.
HAPPY HOLIDAYS to all my readers! May all your Summation Sigmas be operational and all your means be true μ's.

We at RAMO PRODUCTIONS are particularly thankful for this Holiday Season. In fact, we are ecstatic and even giddy over the honor that was recently bestowed on this Web Site. At the 1997 Annual Conference of the Society for the Preservation of Humor In Statistics (SPOHIS) held November 20-22 in Las Vegas, this Home Page was awarded "The Golden Sigma Cup." This highly coveted award signifies the BEST contribution of any Site on the WWW toward the promotion of statistics as a humorous subject. The acceptance of this award was truly a defining emotional moment in my career. I would like to thank all the members of the SPOHIS Academy for the necessary and sufficient consideration given all the nominees for this award and the unbiased selection of this particular site. I will try to be a worthy recipient of this magnificent cup and redirect my energies toward uncovering new tidbits of humor that make statistics the enchanting field that it has now become.

The Above Was Archived on 7 February 1998


This month all my readers will be given a real treat. The World Famous Three Step Method (WFTSM) will be revealed. I have had many requests and pleadings through my guestbook and other personal email to present this marvelous technique to the World Wide Web. This procedure, which I consider the Holy Grail of statistical methodology (just ask my students), is a three-step sequence for calculating the sample standard deviation. Some textbooks emphasize a multi-step, direct or brute force procedure (see formula on right) which for most situations is very painful and tedious. That is, given a set of N raw scores (X-scores) you are advised to (a) compute the mean, (b) subtract the mean from each score, (c) square each of these deviations, (d) sum the squared deviations, (e) divide the sum of squared deviations by the number of scores N, and finally (f) extract the square root of this result. While this technique works rather well when the number of scores is small and the mean is a nice whole number, it is a nightmare in other situations. When the number of scores is say 15 or more and the mean is a decimal (In practice this will be true about 95% of the time), this procedure involves repeated subtracting and squaring of decimals and gets extremely messy even when using a calculator. A better method is needed!

Never fear. A white knight is waiting in the wings. Let us apply some finesse and demonstrate an elegant substitution for all but the last two steps in the above procedure. Please study the animated gif on the right. Observe that Step One is the key to the entire computation. It is the mathematical equivalent of steps (a) through (d) in the "brute force" method. It requires only two basic calculations: ∑X (the sum of the raw scores) and ∑X^2 (the sum of the squares of the raw scores). Once Step One is computed, school is almost out and Steps Two and Three roll out very easily. Note also that Steps Two and Three here are exactly the same as the earlier steps (e) and (f) respectively.

To illustrate this new calculation consider a simple example. Suppose we are given the following set of 15 raw scores (X's):
5, 6, 8, 8, 10, 12, 12, 12, 14, 16, 16, 18, 18, 19, 20
For our data ∑X = 5 + 6 + 8 +...+ 20 = 194 and
∑X^2 = 5^2 + 6^2 + 8^2 +...+ 20^2 = 2838

Now substituting the above results and applying WFTSM:

  1. ∑x^2 = ∑X^2 - (∑X)^2/N (STEP ONE-Sum of Squares of Deviation Scores)
    ∑x^2 = 2838 - (194)^2/15
    ∑x^2 = 2838 - 2509.0667 = 328.9333
  2. s^2 = ∑x^2/N (STEP TWO-Variance)
    s^2 = 328.9333 / 15 = 21.9289
  3. s = Sq Root (∑x^2/N) (STEP THREE-Standard Deviation)
    s = Sq Root (21.9289) = 4.68
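
The three steps translate almost line for line into Python. The following is a minimal sketch using the 15 scores from the example; the variable names are illustrative, and the divisor is N as in the steps above.

```python
import math

scores = [5, 6, 8, 8, 10, 12, 12, 12, 14, 16, 16, 18, 18, 19, 20]
n = len(scores)

sum_x = sum(scores)                       # sum of the raw scores
sum_x_sq = sum(x * x for x in scores)     # sum of the squared raw scores

ss = sum_x_sq - sum_x**2 / n              # STEP ONE: sum of squares of deviation scores
variance = ss / n                         # STEP TWO: variance
sd = math.sqrt(variance)                  # STEP THREE: standard deviation

print(sum_x, sum_x_sq)                                   # 194 2838
print(round(ss, 2), round(variance, 2), round(sd, 2))    # 328.93 21.93 4.68
```
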
VOILA! There you have it ladies and gentlemen. This is the formula that has taken the world by storm all the way from El Paso, Illinois to Paris, France! You ask - Why is it so gosh darn great and globally famous?
Let me enumerate its many advantages and virtues:
  1. It allows the researcher to stop at any of the three stages of the sequence. For a descriptive index of variability you probably want to perform all three steps and get the standard deviation. However, sometimes in different statistical developments you may want to stop at Step Two and obtain the Variance. Most Important, a researcher on many occasions will fold the tent and stop with Step One. This result is probably the single most pivotal calculation in the entire statistical world. It is the cornerstone for such procedures as the t-test, Analysis of Variance, correlation and regression, discriminant analysis, Multivariate Analysis of Variance, etc.
  2. It lends itself well to computation with a scientific calculator. Most of these have hard-wired functions that will give the results of all three steps at the touch of several keys after a one-touch entry for each of the raw scores.
  3. It avoids nasty computations with decimals in the early stages when the raw scores are whole numbers.
  4. It avoids rounding error early since the "brute force" method requires as a first step the computation of the mean and subsequent rounding.
  5. There is something sacrosanct about three-step methods in math and statistics. The third step has the effect of developing closure. Two-, four-, and six-step methods seem out of balance to a statistician! :-)
  6. It is COOL! COOL! COOL!
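
The claim that Step One is the mathematical equivalent of brute-force steps (a) through (d) can be checked numerically. This sketch compares the two routes on the example scores and, as an extra sanity check, compares the final result with statistics.pstdev from the Python standard library, which also divides by N. The variable names are illustrative.

```python
import math
import statistics

scores = [5, 6, 8, 8, 10, 12, 12, 12, 14, 16, 16, 18, 18, 19, 20]
n = len(scores)

# Brute force: mean first, then the squared deviations from it.
mean = sum(scores) / n
ss_brute = sum((x - mean) ** 2 for x in scores)

# Step One of the three-step method: only the two raw sums are needed.
ss_shortcut = sum(x * x for x in scores) - sum(scores) ** 2 / n

sd = math.sqrt(ss_shortcut / n)

print(math.isclose(ss_brute, ss_shortcut))          # True
print(math.isclose(sd, statistics.pstdev(scores)))  # True
```

Note that statistics.stdev, by contrast, divides by N-1, so it would not match the result computed here.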

OK, now that we have proven WFTSM is the greatest thing since sliced bread, how do we accord it high distinction in the folklore of statistical methodology? This is my suggestion to all students: Get a calligrapher to write WFTSM on fine parchment paper. Then roll and tie the scroll and place it in a soft bed of puffed pima cotton. Place the cotton and scroll in a box wrapped in blue holographic foil. Next have an expert gift-wrapper surround the box with white silk ribbon topped with an elegant bow. Finally go to your dining room table and replace the flower arrangement as the centerpiece with the blue box. Now we have given WFTSM its proper respect. It will be the center of attraction for all guests invited to your house for dinner. Just think of the thrills you will experience explaining to your best friends the story of the blue box and WFTSM.

Thanks for reading the development of my favorite statistical procedure. Who says statistics has to be dull when it embraces world renowned technology cradled in elegant blue boxes!

The Above Was Archived on 20 December 1997


October is the month of goblins and ghoulies. Unfortunately, students of basic statistics experience far too many of these creatures on days other than Halloween night. As promised last month, I will offer some general suggestions for teaching the course in basic statistics. Several caveats are in order. First, these ideas have worked for me over several decades of teaching but I make no warranties they will work for other instructors. Secondly, these techniques have been employed in classes with enrollments of between 30 and 40 students and therefore are probably not appropriate for large lecture sections. With this in mind, I present this short list of hints to help rid the statistical learning environment of goblins and ghoulies:

Teaching Tips for the Instructor of Basic Statistics

  1. Utilize group activities in the classroom when it is desired to solidify certain critical skills (e.g., calculating the standard deviation with a computational routine). This is time consuming but most students enjoy a change of pace from the usual lecture or discussion. I have good luck with triads and quads of students working together. Larger groups inhibit the verbal interchange of ideas.
  2. Use open-book examinations! Yes, I said it! This encourages students to study concepts and relationships rather than memorize formulas. Moreover, it is one of the best anxiety-reducing techniques available.
  3. Use power, not timed, examinations. I generally give students two hours for what would typically be a one hour examination. This usually necessitates giving the examination in the evening and accommodating many make-up exams. I believe it is well worth the price. Students really appreciate the extra time and this is another great anxiety-reducing tool. (Consider that for many students it takes thirty minutes just to quit twitching and quaking and get down to business. :-) )
  4. I firmly believe that a comprehensive basic course that covers the waterfront of statistical techniques is WORTHLESS. It is far better to cover fewer topics but cover them in depth rather than jam a plethora of topics into the course and only touch upon the highlights.
  5. Emphasize the handful of recurring themes in basic statistics. For example, with any set of data we always want information about three important characteristics: (a) the form and outstanding features of the data when it is graphed through a histogram, stem and leaf plot, or a box and whisker plot, (b) central tendency or location of the data, and (c) variability or dispersion of the data. This theme MUST be stressed whether you are discussing raw sets of data or sampling distributions of statistics.
  6. Give graded assignments on a weekly basis consisting of one or two problems. This sends an important message to the student that statistics must be practiced on a daily basis and must not be allowed to slide into a single marathon study session over a two or three week period.
  7. Above everything else, maintain a sense of humor and don't take yourself so seriously. Students associate your mood and outlook with how they will perceive the content of the course. Remember you are not teaching a course in human sexuality which is inherently interesting. You must exude excitement and enthusiasm in showing students how statistical methodology can have relevance in their lives.
Thanks everyone for reading my tips for the instructor. Hopefully, some of the above will promote some lively discussion. Let me hear from you!

The Above Was Archived on 11 November 1997


The fall semester has now begun at most universities across this great country. This means that many students are experiencing for the first time an encounter with a basic applied statistics course. Whether the course is taken in business, psychology, education, biology, economics or some other discipline really does not matter. The frequency of application of certain techniques will vary from field to field but the basic concepts remain amazingly the same. If you are an upperclassman, this is the course you have postponed for several years and with great trepidation must now meet head on. If you are an underclassman, the fear is no less since the horror stories already hit the moment you arrived on campus. You must cope with this perceived encirclement by dragons. Your mental outlook and approach to this course will become the single most important determinant of a meaningful positive experience with beginning statistics. I have attempted to put together, from several decades of teaching, a short list of helpful suggestions for the student. I make no warranties that these will work with everyone. I do know, however, that many students over the years have given me feedback that these hints are quite useful. Here they are:

Study Tips for the Student of Basic Statistics

  1. Use distributed practice rather than massed practice. That is, set aside one to two hours at the same time each day for six days out of the week (take the seventh day off) for studying statistics. Do not cram your study into one or two sittings of four or five hours each week. This is a cardinal principle.
  2. Study in triads or quads of students at least once every week. Verbal interchange and interpretation of concepts and skills with other students really cements a greater depth of understanding.
  3. Don't try to memorize formulas (A good instructor will never ask you to do this). Study CONCEPTS CONCEPTS CONCEPTS. Remember, later in life when you need to use a statistical technique you can always look the formula up in a textbook.
  4. Work as many and varied problems and exercises as you possibly can. Hopefully your textbook is accompanied by a workbook. You cannot learn statistics by just reading about it. You must push the pencil and practice your skills repeatedly.
  5. Look for recurring themes in statistics. There are probably only a handful of important skills that keep popping up over and over again. Ask your instructor to emphasize these if need be.
  6. Be a Gestalt Psychologist! In other words, recognize that the whole of statistics is greater than the sum of its parts. It is very easy to get hung up on nit-picking details and fail to see the forest because of the trees.
  7. If you are a victim of math or stat anxiety (probably 70% of the general population are), do something about it! Most universities understand the debilitating nature of this problem and provide excellent counseling programs for the alleviation of this disability. Do yourself a favor and get help. This may very well be the best decision you make in undergraduate school.

If you are a student, I hope the above suggestions prove useful. Next month I will present some tips for the instructor of basic statistics.

The Above Was Archived on 7 October 1997


Merry Christmas everyone! Contrary to popular belief statisticians also believe in Santa Claus and have their wish lists. I thought you might want to see what desires I have had for many years. These are far out so be prepared!

CHRISTMAS WISH LIST

  1. A revolutionary cylindrical statistics classroom with the following features:
    1. A wrap around chalkboard with a dustless, automatic wipe-eraser.
    2. A catwalk next to the board on which the instructor moves.
    3. A circular revolving platform on which the seated class moves.
    4. Finally, if the administration is far sighted, these classrooms can be stacked on top of one another to form the statistics classroom of the future that looks like the Leaning Tower of Pisa.

  2. A complete set of 50 trading cards portraying the Top Statisticians of All Time. The cards would be UV-coated with holographic foil stamping on both sides. The backs would contain the significant publications and contributions of each statistician. A gold insert set would feature R. A. Fisher, Karl Pearson, Harold Hotelling, G. E. P. Box and John W. Tukey.

  3. A year's supply of head-nodding, smiley-faced students who could be strategically seated at the instructor's discretion in any of the statistics classes. These students roar into action when the lectures become the least bit dull or boring.

  4. A 30-minute documentary 3-D movie on bivariate normal distributions with all the accompanying projection equipment. This movie would demonstrate in virtual reality the passing of planes both parallel and perpendicular to the xy-plane through the bivariate surface to yield isodensity contour ellipses and univariate normal distributions respectively. Also, the testing of the equality of the centroids of several bivariate populations could be dramatically illustrated.

  5. Semester evaluations of my statistics classes by my students that would compare to the glowing ratings received by a professor who teaches human sexuality where the material is intrinsically interesting and not steeped in mathematics.

Hope you enjoyed the above. Have a happy holiday season! If you are a statistician, don't take yourself seriously and laugh at yourself. If you are a student, make a New Year's resolution to attempt to understand the poor statisticians of this world who are only trying to eke out a living.

The Above Was Archived on 28 February 1997


For the month of November we have a very special report for you! From our home office high atop the grain elevator in Fooseland, Illinois we are proud to bring you: THE TOP TEN REASONS WHY STATISTICIANS ARE MISUNDERSTOOD. These are not listed in any particular order of importance but represent all those nagging suspicions you have always harbored against statisticians but were always afraid to ask about. Fasten your seat belts and here we go!

  1. They speak only the Greek language.

  2. They usually have long threatening names such as Bonferroni, Tchebycheff, Schatzoff, Hotelling, and Godambe. Where are the statisticians with names such as Smith, Brown, or Johnson?

  3. They are fond of all snakes and typically own as a pet a large South American snake called an ANOCOVA.

  4. For perverse reasons, rather than view a matrix right side up they prefer to invert it.

  5. Rather than moonlighting by holding Amway parties they earn a few extra bucks by holding pocket-protector parties.

  6. They are frequently seen in their back yards on clear nights gazing through powerful amateur telescopes looking for distant star constellations called ANOVA's.

  7. They are 99% confident that sleep can not be induced in an introductory statistics class by lecturing on z-scores.

  8. Their idea of a scenic and exotic trip is traveling three standard deviations above the mean in a normal distribution.

  9. They manifest many psychological disorders because as young statisticians many of their statistical hypotheses were rejected.

  10. They express a deep-seated fear that society will someday construct tests that will enable everyone to make the same score. Without variation or individual differences the field of statistics has no real function and a statistician becomes a penniless ward of the state.

The Above Was Archived on 20 December 1996


We are quickly approaching election day and throughout the entire month of October you can expect to be bombarded with the results of many presidential polls. The pollsters of today (Gallup, Roper, etc.) use highly sophisticated techniques that employ samples of about 1,600 or fewer registered voters who are likely to vote. If these samples are drawn at random, the public can expect the percentages that favor the candidates to fall within a 3% or 4% margin of error. This all sounds great to the typical citizen (except if your candidate is trailing). What happens, however, if the sample is biased or, in some pernicious way, nonrandom? In short, incorrect inferences may be drawn and widely disseminated, the public may lose faith, and entire polling organizations or their sponsors may go out of business! Following is what I consider to be the worst case in history of a biased presidential poll, which resulted in just such a devastating effect (no folks, I am not going to rehash the Truman-Dewey election of 1948).

Here are the basic concepts of a random sample and bias. A sample is considered random if each member of the population from which it is drawn has an equal chance of being selected. I like to think of a random sample as an equal opportunity employer. A table of random numbers or a computer is usually employed to draw a random sample. A sample becomes biased when certain members of the population have a greater chance of being selected than others. The sample then tends to systematically overestimate or underestimate a certain characteristic of the population such as the percentage of a particular class of individuals. Thus, serious inferential errors may occur.
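To put a rough number on the margin of error that a truly random sample buys, here is a minimal sketch. It assumes simple random sampling, the worst-case split of 50/50, and the usual 95% multiplier of 1.96; the function name and the sample sizes tried are illustrative choices, not anything the pollsters publish.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a sample proportion.

    Assumes a simple random sample of size n; p = 0.5 is the worst case.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Illustrative sample sizes only.
for n in (600, 1000, 1600):
    print(f"n = {n:5d}: about {100 * margin_of_error(n):.2f} percentage points")
```

Under these assumptions, samples of roughly 600 to 1,000 voters land in the 3% to 4% range, and 1,600 voters does a little better than that.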

Now go back to the year 1936. This was the year that pitted the Republican, Alf Landon, against the Democratic incumbent, Franklin Roosevelt. This year was during the Great Depression. It also should be remembered for the record extreme temperatures and dust storms that hit the Midwest. In fact, it was so hot that a statistician could not even calculate a standard deviation without working up a sweat. Moreover, the high humidity that year forced the Goudey Baseball Card Company to print only black and white cards and many of the cards came off the printing press hopelessly bowed.

A prestigious weekly news periodical called The Literary Digest continued that year a tradition of conducting a national presidential poll through the mail. Supreme faith was placed in a humongous sample of 10,000,000 prospective voters drawn primarily from telephone directories. When the returns were tallied, an easy win for Landon was indicated. This highly respected periodical staked its reputation on this outcome. With such a huge sample how could anything go wrong?

Well, as strange as it may seem, the poll was a miserable failure. Two sources of bias that were unfortunately in the same direction played havoc with the results. First, the poll excluded non-telephone owners and hence also included a disproportionate number in the older age groups. Since many more Republicans (the wealthy) in these depression years owned phones than Democrats (the poor), it is not surprising that the returned ballots would favor the GOP candidate. Secondly, it is a fairly well known fact that members of the party out of power are far more likely to return ballots through the mail than members of the "in" party. This again produced more Republican ballots. These two sources of bias formed a potent combination that pointed toward a Landon victory. It is interesting to note that in the actual election, Roosevelt swept all states except two and won in a landslide. It is also of historical note that several years after this debacle The Literary Digest went out of business.

As an ironic footnote to the above story, The Literary Digest, using the same sampling technique, correctly called the outcome of the 1932 election. Recall that Roosevelt ran against the incumbent Republican, Hoover. Again, economic problems were the prime issues in this campaign. However, the above two sources of bias were in opposite directions and tended to cancel one another out. That is, the use of telephone directories favored Republican returns but the "out of office" phenomenon favored Democratic returns. Thus, through sheer luck, The Literary Digest correctly predicted a win for FDR.

Here are some important lessons from these historic presidential polls:

  1. If bias is present, a huge sample does nothing to guarantee an accurate survey result. Even millions in a sample cannot overcome a nonrandom procedure.
  2. One should not use the mail service for a scientific poll. Too much depends on the whims of the prospective respondents.
  3. Baseball cards should not be stored under humid conditions. :-)
  4. Democrats should make sure they are listed in the phone directory. :-)
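
The first lesson can be sketched with a small simulation. Everything here is invented for illustration: the electorate of 200,000, the 30% who own phones, and the support rates of 40% among phone owners and 70% among everyone else are assumptions, not figures from the 1936 poll.

```python
import random

random.seed(1936)  # fixed seed so the sketch is repeatable

# Hypothetical electorate of (has_phone, favors_a) pairs; all rates invented.
population = []
for _ in range(200_000):
    has_phone = random.random() < 0.30
    favors_a = random.random() < (0.40 if has_phone else 0.70)
    population.append((has_phone, favors_a))

def share_for_a(voters):
    """Proportion of the given voters who favor candidate A."""
    return sum(favors for _, favors in voters) / len(voters)

true_share = share_for_a(population)

# A huge sample drawn only from phone owners (biased) ...
phone_owners = [voter for voter in population if voter[0]]
big_biased = random.sample(phone_owners, min(50_000, len(phone_owners)))

# ... versus a small sample drawn at random from everyone.
small_random = random.sample(population, 1_600)

print(f"true share for A:        {true_share:.3f}")
print(f"50,000 phone owners:     {share_for_a(big_biased):.3f}")
print(f"1,600 chosen at random:  {share_for_a(small_random):.3f}")
```

In this made-up electorate the enormous phone-only sample misses badly, while the small random sample lands close to the truth.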

The Above Was Archived on 10 November 1996


If you have taken a basic statistics course, when the topic of probability was introduced you no doubt heard the instructor mention the time-honored coin flipping example. Flip a penny and the chance of getting a tail (or head) is 1/2. No problem: this is a concept that a primary-school child can understand.

But change the scenario slightly. Suppose a penny held vertically on a table by the index finger of one hand is spun vigorously with a flick of the other index finger and allowed to come to rest flat on the table. Is the probability still 1/2 of either a head or a tail facing up? Well, the head-nodders in the first few rows generally smile warmly and shake their heads up and down in agreement (by the way, all you aspiring statistics instructors should enlist at least five head-nodders prior to the second week of class to offer you constant supportive feedback during the entire semester :-)). However, the more the students reflect on this situation, the more uncertain they become. Is spinning really the same as flipping? Finally, a carefully planted confederate toward the back of the room timidly suggests that maybe we should replicate the experiment a number of times and see what happens. Yes! Yes! Yes! Just what you want as an instructor. You quickly seize this opportunity to introduce the class to Monte Carlo type probability. You announce an extra credit assignment for everyone in the class. Each student is instructed to select a relatively shiny penny without noticeable wear, spin the penny on a table 100 times, and record the number of tails that face up. A deadly silence settles over the classroom! The students now realize they have been hoodwinked into performing a rather embarrassing act, particularly if their dorm roommates are watching that evening. Spin a penny 100 times on a table and watch it land... what type of looniness is this? Before any student utters another word, you quickly remind them that all the results will be posted and discussed next meeting, and then you grudgingly dismiss them two minutes early.

Next meeting the fruits of your well-planned operation are realized. Thirty of your 35 students complete the experiment. Wow, you gleefully think to yourself! That is 3000 replications of the penny-spinning experiment. Methodically, you start around the room asking each student to report the number of tails he or she obtained on the 100 trials. The numbers roll in and you write them on the chalkboard: 65, 59, 57, 64, 52, 70,... The students sit in utter amazement as a definite pattern unfolds. Overall, the numbers appear to be much larger than 50! When all the numbers are collected, a mystified student in the rear of the room suggests that we average the 30 results. Obligingly, you ask a student with a fancy calculator and a pocket protector in the front row to add up the results and find the mean. In the wink of an eye, the student blurts out 62.12. This is totally unreal! Can we place any faith at all in this finding? Does this mean that if we spin a penny on a table many times the coin will fall with tails facing up about 62% of the time?
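
One hedged way to judge how much faith the class result deserves is to ask how far 62.12% sits from 50% in standard-error units. The sketch below assumes the 3000 spins are independent and treats the class average as a single sample proportion; it is a back-of-the-envelope check, not a full analysis.

```python
import math

spins = 3000          # 30 students x 100 spins each
observed = 0.6212     # class average of 62.12 tails per 100 spins
fair = 0.5            # proportion expected if spinning behaved like flipping

# Standard error of a proportion when the true value is 0.5.
standard_error = math.sqrt(fair * (1 - fair) / spins)
z = (observed - fair) / standard_error

print(f"standard error = {standard_error:.4f}")
print(f"z = {z:.1f} standard errors above 0.5")
```

A result more than 13 standard errors from 0.5 is far beyond anything chance alone would be expected to produce with a fair 50/50 spin.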

*DISCUSSION*
This is no aberration. Experts refer to this phenomenon as the "pop bottle cap effect". Find a cap from an old 16 oz. bottle of Pepsi or Coke and spin it in the same fashion we did the penny. About 90% of the time or more, the cap will fall with its top facing down and the sides facing up. Now how does this relate to the penny? If you examine closely a relatively new shiny penny, you will observe that the edge around the penny protrudes further on the tails side than on the heads side. Thus, the extra edge on the tails side simulates the side of the pop bottle cap, although it is certainly not as pronounced visibly. The experts proclaim that the extra edge produces results that in the long run converge on 60% tails facing up. Of course, if you use a worn penny, this advantage in favor of tails disappears. I can indeed attest to these results. In the four or five years that I have used this experiment in class, the results have hovered right around 60%. Amazing but true. I then tell my students they have a sure way of winning some money. Engage a friend (maybe your nosey roommate from last night) in a four or five hour penny-spinning game and bet on tails each time!

Gosh Henry! It really does work!

The Above Was Archived on 4 October 1996


The field of statistics is replete with technical terms or jargon that I prefer to call "club words" in my classes. We have a lot of fun with these since I tell my students that they can derive much satisfaction from mastering these and joining a unique club. They are then able to throw these terms around in casual conversation and blow the sox off of their friends who are not in the "club". Let me give you a few examples of some real humdingers.

Homoscedasticity
Homogeneous elasticity between different sizes of rubber bands. NOT!
Equal population variances.
Interpolate
Breeding a statistician with a clergyman to produce the much sought "honest statistician". NOT!
The linear approximation of an unlisted value in a statistical table by using two listed values.
Kurtosis
A debilitating foot disease producing pungent odors. NOT!
The degree of peakedness in a graph of a distribution of scores.
Type II Error
An error message that pops up on my Mac when an unstable browser freezes. NOT!
Retaining a false null hypothesis in inferential statistics.
Standard normal deviates
A comparison group of sociopaths who were formerly normal people. NOT!
The z-scores that make up the standard normal distribution.
Geez Albert! It wasn't that bad was it?

The Above Was Archived on 4 September 1996.

     
From the First Internet Gallery of Statistics Jokes

OUR MONTHLY SERIOUS BUSINESS

DEC 2007........DEGREES OF FREEDOM

Degrees of Freedom is a very slippery concept that always seems on the verge of being mastered only to slither through the fingers of the beginning student. I will attempt to give it a very simple conceptualization and then present a working rule of thumb for counting the degrees of freedom in a variety of situations. In its simplest form the degrees of freedom (df-value) of a situation is the number of variables that you are allowed to vary freely without restriction. Thus, if I tell you X1 is a variable and you are perfectly free to assign it any number in our number system, you would have 1 df (not an infinity as you might think). Remember it is the number of variables, not the number of values you can give a variable. OK, now suppose I expand the situation slightly. If I now give you the variables X1, X2, X3, and X4 and I again tell you that you can assign each of the four variables any number in our number system, you now have increased your df-value to 4. Get the idea? Now one more example will set this basic definition in stone (you can pick marble if you so desire). Suppose I again give you the same four variables of the previous example and again offer you the opportunity to assign each variable any value to your heart's content, but with one catch: the mean of the four numbers must be 8 when you end up. So you go on your merry way and give X1 the value 10, X2 the value 5, X3 the value 8, and without hesitation you go with the value 12 for X4. But whoa, the mean of your four numbers is 8.75, not 8! Now you suddenly realize you can't just give that 4th variable any value. You must give it a value that makes the sum of the four values 32 so that when you divide by 4 you get 8. Oh my gosh, you are locked into the value 9 for X4! YOU HAVE LOST A DEGREE OF FREEDOM and really only have df=3 in this situation. In other words, giving you the opportunity to assign four variables any value but forcing the mean to be 8 is tantamount to losing a df. 
Knowledge of the mean counts as a restriction and subtracts one from the total df.
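
The lost degree of freedom in the example above can be sketched in a few lines. The function name is a hypothetical helper; it simply solves for the one value that is no longer free once the mean is fixed.

```python
def forced_last_value(free_values, required_mean):
    """Return the only value the last variable can take once the mean is fixed."""
    n = len(free_values) + 1            # total number of variables
    required_sum = required_mean * n    # e.g. 8 * 4 = 32
    return required_sum - sum(free_values)

# X1 = 10, X2 = 5, X3 = 8 are chosen freely; the mean must be 8.
x4 = forced_last_value([10, 5, 8], 8)
print(x4)  # prints 9

df = 4 - 1  # four observations minus one restriction (the mean)
```

Whatever three values are picked first, the fourth is dictated by the mean, which is why only three of the four variables are free.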

Now I shall move to a more generalized definition of degrees of freedom. The degrees of freedom of a statistic is the number of observations minus the number of necessary auxiliary values, which are themselves based on the observations. This is kind of a nasty statement and somewhat flaky but don't panic. The rule works for 95% of the situations and that isn't bad statistically, is it? In the last example the variables are the observations and the auxiliary value is the mean (note it is based on the four observed values), and therefore the df = 4 - 1 = 3. Finally, one more example using this better rule and I shall close up shop for the month. When correlated pairs are present, what is the df for that situation? If I give you the correlated pairs (X1, Y1), (X2, Y2), (X3, Y3), (X4, Y4) and again allow you to assign the values, the df is not 8 but 4 because a correlated pair is an observation (now don't be rigid and not allow this). Also if you set the X mean and the Y mean, these count as two auxiliary values and the df = 4 - 2 = 2. In other words, in a correlational situation, the df is N-2 where N is the number of paired scores. With this definition you must expand your notion of an observation and be cautious about auxiliary values. Next month I will use this new rule to explain the df for the standard errors of some test statistics.

HAPPY HOLIDAYS


JAN 2008........COUNTING DEGREES OF FREEDOM FOR A TEST STATISTIC

Now that we all know something about the concept of degrees of freedom (df), I will show you how it plays a critical role in many statistical hypothesis tests. As you will soon see, the df-value is usually a function of the sample size N. Many test statistics like t, F, and Chi Square have distinct df-values associated with them that must be determined in a given situation (in the case of F, it even has TWO distinct df-values...can you imagine that!). The df-value(s) is then entered into a table in the appendix of your book along with the level of significance to determine what critical value is needed to declare statistical significance. I shall use the dear old Student t given to the right to illustrate how you count this df-value for a few tests. The key with any t-test is to count the df-value of the ESTIMATED standard error, which is the denominator of the ratio given to the right (indicated by the tilde sign). This then becomes the df-value for the test itself. Recall that a standard error is nothing more than a highly specialized standard deviation: that of the sampling distribution of the test statistic. Think of this formula as a generic template for ANY t-test with T standing for the test statistic in any given situation. In words, this formula is saying that to calculate a t-ratio, you take the observed value of the test statistic T, subtract the hypothesized population mean of the test statistic T, and then divide this result by the ESTIMATED standard error of the test statistic T. Notice the emphasis on "ESTIMATED". If this were the EXACT standard error (no tilde), the statistic would become the well known standard normal z-test. Remember again that for a test to be considered a t-test, it must be capable of being placed in the general format of this formula and the denominator must be an ESTIMATED standard error. 
As you might guess, the t-ratio at first glance could easily be mistaken for a z-ratio except for that little wiggle above the standard error indicating ESTIMATION. In fact, the distribution curves of both t and z are very similar (bell-shaped) except that the t curve has more area in the tails, with the amount depending on the df-value. The greater the df-value, the more alike the two curves become. Don't tell anyone this but a standard normal z-curve is really a special case of a t-curve with df=infinity. Now isn't that the cat's meow!
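
The claim that the t curve carries more tail area than z, and less of it as df grows, can be checked numerically. This is a sketch using only the standard library: it codes the textbook Student t density and integrates it with Simpson's rule. The function names and the particular df-values tried are illustrative assumptions.

```python
import math

def t_density(t, df):
    """Student t probability density, computed on the log scale for stability."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(t * t / df))

def area_outside(cutoff, df, steps=2000):
    """Total area in both tails beyond +/-cutoff (steps must be even)."""
    h = 2 * cutoff / steps
    total = t_density(-cutoff, df) + t_density(cutoff, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2          # Simpson's rule weights
        total += weight * t_density(-cutoff + i * h, df)
    middle = total * h / 3                  # area between -cutoff and +cutoff
    return 1 - middle

for df in (1, 5, 30, 1000):
    print(f"df = {df:4d}: area beyond +/-1.96 = {area_outside(1.96, df):.4f}")
```

For the z curve the same two tails hold .05, so the printed areas show the t curve shedding its extra tail area as the df-value climbs.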

I shall now turn to the most basic t-test of them all displayed to the right, the t for testing a hypothesis about a single population mean. Recall that basically degrees of freedom is the number of variables that are allowed to vary freely without restriction. In hypothesis testing we usually work with a random sample(s) of scores of some sort. Here think of each score in the sample as a variable that is capable of taking on any value. Thus, each score becomes an observation and the total number of observations is the sample size N. Now the only necessary auxiliary value in this case is the sample mean. Hence, invoking the principle from last month that the df-value of a statistic is the number of observations minus the number of necessary auxiliary values, the df-value of the estimated standard error in the denominator of this ratio is N-1, which becomes the df-value for this basic test. This ratio for example might be used to test the null hypothesis that a population mean of IQ scores is 100, which would be substituted on the right in the numerator. Of course, the sample mean and standard deviation s would be calculated from the data and plugged in also.
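
Here is a minimal sketch of that single-mean t-test for the IQ example. The five scores are made up for illustration, and the code uses the N-1 version of s, so the estimated standard error is written as s divided by the square root of N; a book that divides by N when defining s would write the same quantity as s divided by the square root of N-1.

```python
import math
import statistics

scores = [96, 100, 104, 108, 112]   # hypothetical IQ scores
hypothesized_mean = 100             # null hypothesis value

n = len(scores)
sample_mean = statistics.mean(scores)
s = statistics.stdev(scores)        # sample standard deviation, divides by N-1
estimated_se = s / math.sqrt(n)     # estimated standard error of the mean

t = (sample_mean - hypothesized_mean) / estimated_se
df = n - 1                          # N observations minus one auxiliary value

print(f"mean = {sample_mean}, t = {t:.3f}, df = {df}")
```

The t-value is then compared with the tabled critical value for df = N-1 at the chosen level of significance.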

Several interesting observations are in order. Last month we determined that the sample standard deviation s had df=N-1, which is the same df as the estimated standard error of the mean in this test. Also if you suddenly go brain dead and forget the df-value for this test, it is staring you right in the face in the denominator of the formula. This is not by chance but occurs quite frequently with t-tests. Pretty nifty huh?

OK, since things are going so smoothly, I next want to discuss the most widely used t-test in the literature...the so-called independent samples t. It is used to test the hypothesis that there is no difference in the means of two distinct populations (i.e., the null hypothesis value 0 is plugged into the right side of the numerator). The formula to the right admittedly looks a little scary but again it is nothing more than an iteration of the basic t template with the difference in the sample means serving as the test statistic. Here the observations are the scores in both samples and the auxiliary values are the two separate sample means. Thus, by our rule the df-value for the estimated standard error and the test is N1+N2-2. Now that is pretty slick! The ingredients you need to calculate this t are the two sample means and the two sample variances and of course the two sample sizes. Again popping out like a neon light from the formula is the df-value from the denominator to jog your memory. This test finds many applications. When you have two separate random samples of scores as in Experimental and Control groups or two different treatment groups and you desire to test the significance of the difference in the two means, this t becomes the star of your stat world.
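
A sketch of the independent samples t follows. The two small groups are invented, and the code assumes the usual pooled-variance form of the estimated standard error, which is the version whose denominator carries N1+N2-2.

```python
import math
import statistics

experimental = [10, 12, 14]   # hypothetical scores
control = [4, 6, 8]           # hypothetical scores

n1, n2 = len(experimental), len(control)
mean1, mean2 = statistics.mean(experimental), statistics.mean(control)

# Sum of squared deviations about each group's own mean.
ss1 = sum((x - mean1) ** 2 for x in experimental)
ss2 = sum((x - mean2) ** 2 for x in control)

df = n1 + n2 - 2                          # observations minus two sample means
pooled_variance = (ss1 + ss2) / df
estimated_se = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))

t = ((mean1 - mean2) - 0) / estimated_se  # 0 is the hypothesized difference
print(f"t = {t:.3f}, df = {df}")
```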

Finally, another relatively important t-test is presented that tests the hypothesis that there is no difference in the means of two correlated populations. To conduct this test, we must use the framework of correlated pairs of (X1, X2) scores, the kind of pairs discussed last month. Fortunately in this situation you are allowed to compute a difference score (D) for each related pair in the sample and subsequently work with the sample D's from then on out. In essence you have reverted back to the simple t test with D's taking the place of X's. Thank God for little favors. Without this simple move, you must use an alternate method which requires that you compute the correlation coefficient and treat the X1's and X2's separately. Believe me, unless you do this on a computer it is a statistician's nightmare and requires three times the work. Returning to the main problem of getting the df-value here, an observation becomes a D-value, of which we have N, and we have one auxiliary value, which is the mean D. Thus the df for the estimated standard error is N-1, which becomes the df-value for the test. Beware of something with the calculation of t. You are working with a sample of D's, so each difference must be computed in the same order, and you will probably end up with both positive and negative D's whose signs must be kept. The sample mean D and the sample standard deviation of D along with 0 for the hypothesized value of the population mean D are substituted in the formula and the value of t rolls out. This test is employed when you have a pre-test and post-test situation for a number of subjects or when you have subjects that are matched on another variable prior to administering two treatments. A common mistake with this test is to treat the X1's and the X2's as independent samples, use N+N-2 or 2N-2 as the df-value (too large), and employ the independent samples t-test above. That test ignores the correlation between the pairs, so its standard error is wrong as well. With the positively correlated pairs typical of these designs it usually overstates the standard error and misses real differences; with negatively correlated pairs it would produce too many Type I errors.
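
The move to difference scores can be sketched as follows. The pre-test and post-test scores are hypothetical, and the code again uses the N-1 version of the standard deviation.

```python
import math
import statistics

pre_test = [10, 12, 14, 16]    # hypothetical scores for four subjects
post_test = [13, 14, 18, 19]   # the same four subjects, in the same order

# Always subtract in the same order so the sign of each D is kept.
differences = [post - pre for pre, post in zip(pre_test, post_test)]

n = len(differences)
mean_d = statistics.mean(differences)
s_d = statistics.stdev(differences)   # standard deviation of the D's (N-1)
estimated_se = s_d / math.sqrt(n)

t = (mean_d - 0) / estimated_se       # 0 is the hypothesized mean D
df = n - 1                            # N difference scores minus the mean D

print(f"D's = {differences}, t = {t:.3f}, df = {df}")
```

Once the D's are in hand, the calculation is the single-mean t-test all over again.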

Well that concludes my ramblings for January. I hope you are realizing that statistics has many recurring themes. Certainly the principle for counting degrees of freedom is one of them. You all now should be experts in counting degrees of freedom, at least when you perform William Sealy Gosset's celebrated t-test.


FEB 2008........ N VS. N-1

What you say! You have to be kidding. You are making an issue of the number of scores in the sample and ONE LESS THAN THE NUMBER OF SCORES? How can that make a penny's worth of difference except in situations where the sample size is extremely small? How in the world can this be classified as a Sticky Wicket?

Well I understand where you are coming from in the moderate to large sample situation, but these two quantities have caused students of statistics more problems and confusion than a barrel of monkeys, particularly when the students have used several textbooks in a course or in different statistics courses. The crux of the issue, which authors generally fail to mention, is that the sample variance and standard deviation can be defined two different ways. This in turn makes subsequent formulas such as estimated standard errors (or error variances) look "seemingly" different depending on the definition when in reality the formulas are equivalent. The reason I feel that this issue requires discussion is that, to my knowledge, no statistics textbook gives a good explanation of this problem, and understanding it will save you from questioning whether there are typos on many pages of the book. Let us then examine each definition, see where it takes us, and talk about the positives and negatives of each choice. Here you are going to see an issue that many statisticians are split on. If I were to guess I would say that the statistical community is about 50/50 on this one!

Now look at the two methods that are labeled (A) and (B) to the right. One thing both methods have in common is the sum of the squared deviations of the scores about the mean (∑x²). Three Cheers! In other words, statisticians pretty much agree that in most situations in order to measure how variable a set of scores is, you first must take into account each and every score in the sample. That is, you find out how far each score is above or below the mean of the sample (a deviation score). Then you square each of these deviation scores and sum the squared deviations. This is the direct or "brute force" method of computing this quantity and it involves far too many messy decimals. It is far easier to make this computation with only the raw scores and not fuss with the mean. You get ∑X and ∑X² and employ STEP ONE of the World Famous Three Step Method (i.e., ∑x² = ∑X² - (∑X)²/N). See Step One WFTSM for an example of this calculation.
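
A quick sketch confirms that the raw-score shortcut and the brute force method agree. The five scores are made up for illustration.

```python
scores = [3, 5, 7, 9, 11]   # hypothetical raw scores (the X's)
n = len(scores)

# Brute force: square each deviation from the mean and add them up.
mean = sum(scores) / n
direct = sum((x - mean) ** 2 for x in scores)

# Shortcut: only the sum of X and the sum of X squared are needed.
sum_x = sum(scores)
sum_x_squared = sum(x * x for x in scores)
shortcut = sum_x_squared - sum_x ** 2 / n

print(direct, shortcut)
```

Both routes give the same sum of squared deviations; the shortcut simply never has to touch the mean.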

Now the two methods part company. In (A) we divide the ∑x² by the sample size N and this produces the sample variance s². Since this index is in squared units, if we want an index in the original score units we extract the square root and have the sample standard deviation s. Division by N in this process makes sense logically because then we are able to state that the sample variance is the average squared deviation of each of the scores in the sample about the mean. This just shouts that it is measuring variation and it also just feels like a meaningful way of getting at the spread of a set of scores. Also it is valid when you have an N of 1 since the variance and standard deviation would be 0, which upon reflection is exactly what it should be.

Turning to method (B), we divide ∑x² by N-1 to get s² and then take the square root if desired to obtain s. However, the N-1 just seems nonintuitive. You cannot now neatly interpret the sample variance as an average and the two formulas seem to lose their logical appeal. In addition, if N is 1 then the variance and standard deviation are both undefined because you are dividing by 0. Why then would anyone employ (B) to define the sample variance and standard deviation? I am going to whisper this but there is one slight advantage of (B). The reality of the matter is that with (B) you really have calculated an unbiased estimate of the population variance and a very nearly unbiased estimate of the population standard deviation. So some authors feel that this method bypasses the sample index and moves directly to the population estimate. Thus, when authors label s² = ∑x²/(N-1) and the subsequent square root as the sample variance and sample standard deviation, they are really somewhat disingenuous in doing so.

I will illustrate the confusion that the two definitions can create when you are reading different books. If the author uses (A), the estimated error variance of the mean is given by s²/(N-1) whereas if the author prefers (B) the same estimated error variance of the mean is s²/N...Two seemingly different results! But wait, two different definitions have been used for s². REALLY THE TWO RESULTS ARE IDENTICAL! To show this, using the former result and substituting (A) for s², we have ∑x²/[N(N-1)]. Now using the latter result and substituting (B) for s², we have ∑x²/[(N-1)N]...precisely identical results. Sooo...(what Steve Jobs would utter) what does all this mean? THE FIRST THING A PERSON SHOULD CHECK UPON OPENING A STATISTICS TEXTBOOK IS SEE WHAT STANCE THE AUTHOR TAKES ON THE N VS N-1 ISSUE IN DEFINING THE SAMPLE VARIANCE AND STANDARD DEVIATION. My opinion favors division by N but about half of the textbooks use division by N-1 so be prepared to make adjustments in your thinking. Statisticians end up at the same place on this one but sure create some illusions along the way. Thanks for reading my blurb and see you next month.
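
For readers who like to check such claims at the keyboard, here is a minimal Python sketch of STEP ONE and of definitions (A) and (B). The eight scores are made up purely for illustration and the function names are hypothetical labels, not anything from a textbook or library. Under those assumptions, the two "different" error variances of the mean should come out the same.

```python
import math

def sum_sq_dev(scores):
    """STEP ONE: sum of squared deviations, from raw sums only."""
    n = len(scores)
    return sum(x * x for x in scores) - sum(scores) ** 2 / n

def variance_a(scores):
    """Definition (A): divide the sum of squared deviations by N."""
    return sum_sq_dev(scores) / len(scores)

def variance_b(scores):
    """Definition (B): divide by N - 1 (fails with a zero division when N is 1)."""
    return sum_sq_dev(scores) / (len(scores) - 1)

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up illustrative scores
n = len(scores)

# Estimated error variance of the mean under each author's convention.
err_var_a = variance_a(scores) / (n - 1)  # author uses (A)
err_var_b = variance_b(scores) / n        # author uses (B)

print("variance (A):", variance_a(scores))
print("variance (B):", variance_b(scores))
print("same error variance of the mean?", math.isclose(err_var_a, err_var_b))
```

The comparison uses math.isclose rather than == because the two routes divide in a different order and may differ in the last floating-point digit.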


MAR & APR 2008........THE DEMISE OF THE CONFIDENCE INTERVAL

In inferential statistics, there have been two primary methodologies for gaining knowledge about population parameters. However, hypothesis testing has become the dominant force over confidence intervals throughout the latter half of the 20th century and into the 21st century. In fact in most disciplines, testing null hypotheses has become the exclusive method of choice in almost all of the research literature. The current textbooks have very little to say about confidence intervals. If they do it is in the form of a token short discussion or footnote. What has happened to a procedure that once was favored by mathematical statisticians and had an entire chapter devoted to it? Let us take a look at this procedure and see what difficulties have caused it to fall out of favor.

We will present a simple example of calculating upper and lower limits of a 95% confidence interval for a population mean μ. The figure at the right displays a standard normal curve of z-scores with two examples of useful percentiles that would be needed to obtain a 95% confidence interval. The first is called z.025 = -1.96 and by definition is the point on the z-scale such that 2.5% (.025) of the area falls below it (Remember the total area under this curve is 1 so areas correspond to probabilities). Now at the upper end we have z.975 = +1.96 or the point on the z-scale such that 97.5% (.975) of the area falls below it (upper blue area is therefore .025). The -1.96 and +1.96 come from the standard normal curve table and were perhaps memorized by some of you. Also the middle white area (called Δ or the confidence coefficient) then becomes 95% or .95. Note that in building a confidence interval, Δ is selected first and the tail-areas are always equal. Some other commonly used percentiles that may be dear to your heart from the tables are z.005 = -2.58 and z.995 = 2.58 with a middle area of 99% or Δ = .99. Also z.05 = -1.64 and z.95 = 1.64 with a middle area of 90% or Δ = .90. Great memories, huh? Now returning to the pictured example: If a random z is drawn from this distribution, the probability that it will fall between -1.96 and +1.96 is .95 or mathematically, P(-1.96 ≤ z ≤ +1.96) = .95.
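
If the normal curve table is not within reach, the percentiles can be looked up in software instead. The sketch below assumes Python 3.8 or later and uses the standard library's statistics.NormalDist; the helper names z_percentile and middle_area are hypothetical and exist only for this illustration.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def z_percentile(area_below):
    """Point on the z-scale with the given area falling below it."""
    return standard_normal.inv_cdf(area_below)

def middle_area(z_low, z_high):
    """Area under the curve between two z values."""
    return standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

# Tail percentiles for 99%, 95%, and 90% confidence coefficients.
for area in (0.005, 0.025, 0.05, 0.95, 0.975, 0.995):
    print(f"area below = {area:.3f}  ->  z = {z_percentile(area):+.3f}")

print("middle area between -1.96 and +1.96:", round(middle_area(-1.96, 1.96), 3))
```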

Next moving to another 3-Step Procedure displayed to the right (notice never 2, never 4, always 3 steps for nice psychological closure), draw a random sample of size N from a population with known σ. Then convert the sample mean to a z in the previous probability statement and get statement (1) for a result. Then solving this three-way inequality with some simple algebra and getting μ smack dab in the middle by itself and everything else on the ends we arrive exactly where we want to be with statement (2). These end expressions are indeed the formulas for the lower and upper limits of a 95% confidence interval for μ. They are pulled out and stated for emphasis in statements (3). To cement these formulas in our minds let's do a simple example. Suppose we have a population of IQ scores with an unknown μ and σ = 16. We want to generate a 95% confidence interval for the population mean μ. If a random sample of N=64 is drawn and the sample mean X̄ is computed to be 98.7, we substitute into statements (3):

    LL = 98.7 -1.96(16/sq rt(64)) = 98.7 -1.96(2) = 98.7 - 3.92 = 94.78
    UL = 98.7 +1.96(16/sq rt(64)) = 98.7 +1.96(2) = 98.7 + 3.92 = 102.62
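
The same substitution can be sketched in a few lines of Python. The function name z_confidence_interval is a hypothetical label for this illustration, and the sketch assumes, as the example does, that σ is known and that 1.96 is the tabled z for a 95% interval.

```python
import math

def z_confidence_interval(sample_mean, sigma, n, z=1.96):
    """Lower and upper limits for mu when the population sigma is known."""
    half_width = z * sigma / math.sqrt(n)  # z times the standard error of the mean
    return sample_mean - half_width, sample_mean + half_width

# The IQ example: sample mean 98.7, sigma 16, N = 64.
lower, upper = z_confidence_interval(sample_mean=98.7, sigma=16, n=64)
print(f"LL = {lower:.2f}")
print(f"UL = {upper:.2f}")
```

Passing a different z (for instance the tabled value for a 99% or 90% interval) widens or narrows the limits without touching the rest of the calculation.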

Now the fun begins folks when we try to interpret these results. But you say, "This is a snap. We simply say the probability that the population mean μ is between 94.78 and 102.62 is .95." But wait, I hate to inform you that the population μ is a fixed parameter and it is either between 94.78 and 102.62 ahead of time in which case the probability is one or the population μ is not between 94.78 and 102.62 ahead of time in which case the probability is zero. Keep in mind probabilities refer to random variables and the mean μ is a fixed constant even though we don't know what it is. In other words, we cannot associate a probability with any single pair of limits. This seems like a minor problem, but to many experts it is a real deterrent for using confidence intervals. Now we could replicate the experiment and obtain several sets of limits. Would this add any information? Certainly it would, but each pair of limits would be subject to the same criticism. But if I did collect an infinity of limits from N's of 64, 95% of the limits would contain the true value of μ. This is a true statement but many would deem this fact essentially useless.
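
An infinity of limits is out of reach, but a few thousand are easy to simulate. The sketch below is only an illustration under loud assumptions: it pretends the true μ is 100 (in the example μ is unknown), draws normally distributed scores with σ = 16, and uses a fixed random seed so the run is repeatable. The function name coverage_rate is hypothetical.

```python
import math
import random

def coverage_rate(mu=100.0, sigma=16.0, n=64, reps=10_000, z=1.96, seed=1):
    """Fraction of simulated 95% intervals that contain the (assumed) true mu."""
    rng = random.Random(seed)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        # Draw one random sample of size n and compute its mean.
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # Each replication produces its own pair of limits.
        if sample_mean - half_width <= mu <= sample_mean + half_width:
            hits += 1
    return hits / reps

rate = coverage_rate()
print("proportion of intervals containing mu:", rate)
```

The proportion should land near .95. Notice that the statement is about the long run of intervals; any single pair of limits either contains μ or it does not.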

The big advantage of a hypothesis test where an H0 is tested against a two-tailed alternative is that you do end up with an observed test statistic that has a probability associated with it when you reject or retain the null hypothesis. This method appears to appeal to many researchers even though we all know one hypothesis test does not prove anything. It is my speculation that the language itself with hypothesis testing has a certain degree of strength and finality associated with it. Expressions such as "Reject H0: μ1 - μ2 = 0 at the .05 level of significance and Accept the alternative that H1: μ1 > μ2" have a ring of authority linked to them. Recall also, thanks to Pearson and Neyman, we have our dear old friends Type I Error, Type II Error, and the Power of the test. It is indeed sad that the confidence interval approach has no such counterparts. In addition, the terminology of reject or retain H0 seems to mesh with complex ANOVA's where multiple comparisons are performed following a significant overall test. For these reasons and perhaps others that I have overlooked, hypothesis testing currently is the KING of the HILL with statisticians.

I would like to give you one advantage for the lonely confidence interval before I close shop. Assume the limits of the previous example where Δ = .95. If another reader reads these results and desires to hypothesis test instead, the results can be predicted very easily. Remember confidence intervals by nature are two-tailed and must be compared with a two-tailed hypothesis test. If the reader wants to test H0: μ = 100 with .05 as level of significance against two alternatives, retention of H0: μ = 100 would be predicted because 100 is contained between the limits of 94.78 and 102.62. If the reader desires to test H0: μ = 104 against two alternatives, rejection of H0: μ = 104 would be predicted and acceptance of H1: μ < 104 would be supported since 104 is above both limits of 94.78 and 102.62. This may be continued on and on. Thus, the reader may very quickly and easily test any null hypotheses that his heart desires with the single set of data and limits given. Mathematicians have always thought this was pretty neat. However it has not caught on in other disciplines and this interpretation has not helped the cause for confidence intervals.
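
This correspondence can be checked with a short Python sketch. It compares the decision read off the limits with the decision from an ordinary two-tailed z-test at the .05 level, using the numbers of the IQ example. The function names are hypothetical and chosen just for this illustration.

```python
import math

def z_statistic(sample_mean, mu0, sigma, n):
    """Observed z for testing H0: mu = mu0 with known sigma."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

def decision_from_interval(mu0, lower, upper):
    """Retain H0 when the hypothesized mean lies between the limits."""
    return "retain H0" if lower <= mu0 <= upper else "reject H0"

def decision_from_z_test(sample_mean, mu0, sigma, n, z_crit=1.96):
    """Two-tailed z-test at the .05 level (critical values -1.96 and +1.96)."""
    z = z_statistic(sample_mean, mu0, sigma, n)
    return "retain H0" if abs(z) <= z_crit else "reject H0"

for mu0 in (100, 104):
    via_limits = decision_from_interval(mu0, 94.78, 102.62)
    via_test = decision_from_z_test(98.7, mu0, 16, 64)
    print(f"H0: mu = {mu0}: limits say {via_limits}, z-test says {via_test}")
```

For H0: μ = 100 the observed z is -0.65 and for H0: μ = 104 it is -2.65, so both routes agree with the predictions made from the limits.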

Thus, we conclude our cases for both methodologies of inference. I must admit I also favor hypothesis testing but who knows where we will be in ten years. Maybe we will turn to Tukey's Exploratory Data Analysis and refine sampling procedures to such a point where we do not even have to use inferential statistics. Now that would be a monumental advance. Meanwhile, thanks again for reading this presentation and HAPPY INFERRING!


MAY & JUNE 2008........A LEAN AND MEAN BASIC STATISTICS COURSE

In this month's Sticky Wicket we shall discuss one of my pet peeves in the area of statistics education. Just how many and what topics should be covered in the basic applied statistics course at the undergraduate level? This has been a troubling problem among the experts throughout my entire career but I have not shifted my position one iota in the last 30 years. I do not subscribe to the so-called comprehensive or "waterfront" course where you try to survey most of the statistical techniques and touch upon almost every imaginable topic. This is next to worthless. A good beginning course will teach the small set of recurring statistical themes that are invariant over a wide variety of fields such as psychology, biological science, political science, and yes even art and music. Statistics is statistics. There are only slight nuances between different fields with certain applications employed more often in some fields than in others (e.g., multiple regression in economics). Surprisingly, in the basic course, there are a FEW critical topics and skills that are fundamental and require mastery with a ton of practice. In this plan FEW is MORE! This type of basic course has the advantage of giving the student confidence and a solid footing in a core of topics that pop up over and over again in statistical analysis. Think of this course as the surface of a ball. You are flying a plane above the surface and darting up and down, in and out erratically, at varying heights. You want to land and enjoy the commonality of the surface (WOW, what a stimulating analogy).

Now let us look at this small glob of critical topics and skills that should be the focus of the course. I will present these in a sequential fashion but there is some flexibility in how they are ordered:

TOPICS FOR A BASIC APPLIED STATISTICS COURSE

(1) Collecting and Organizing Data.

(2) Picturing Distributions of Scores through Polygons, Histograms, Stem and Leaf Designs, and Box-and-Whisker Plots.

(3) Describing the Central Tendency of Distributions (Mean, Median, and Mode) and Examining Skewness and Kurtosis.

(4) Variability - What Makes the Whole Field of Statistics Tick. The Most Important Skill of All--- Applying WFTSM Which is The World Famous Three Step Method Used to Calculate the Standard Deviation. (The Golden Key is Step1 which is at the top of the heap as far as important formulas in Statistics go)

(5) Interpreting a Score's Location in a Distribution - Percentiles and Standard Scores (Primarily z-Scores)

(6) The Normal Curve and Reading Out Probabilities from Under the Curve.

(7) Simple Hypothesis Testing with the z-Test using LFFSM Which is the Locally Famous Five Step Method, Another Critical Skill Almost as important as WFTSM.

(8) The t-Test and Reading the Table. Coverage of the Related and Independent Samples Tests.

(9) Correlation and the Importance of Step1's Cousin (Sum of products of the deviation scores) in the Calculation of the Correlation Coefficient.

(10) Simple Regression Analysis with One Predictor Variable.

Well, there you have my Ten Super Topics that give a student the solid underpinnings of statistical thought and allow him to easily move into more advanced areas. But wait you say, there are so many topics being left out such as One and Two Way Analysis of Variance, Confidence Intervals, the Chi-square Statistic, the Power of the Test, Non-parametric Statistics, Follow-up Tests in ANOVA and on and on. No doubt these are important but not core in the sense of lower level themes. If you made time for some or all of these more advanced topics, the course would evolve into a hodgepodge of techniques with only the surface being scratched on each one. Precisely what you don't want at the basic level. You want depth in the above TEN topics. After all, there are entire courses devoted to Analysis of Variance and Covariance called Experimental Design and also semesters directed at Nonparametric techniques. There is a time and a place for these courses but don't muddle the beginning student's mind with the whole ball of wax in one semester. Allow the student to have some fun and ensure that he walks away with a good impression of the statistics field. Thanks for your attention.


JULY & AUGUST 2008........STEP ONES COUSIN AND HOW IT SPAWNS COVARIANCE AND CORRELATION

We have repeatedly praised in these pages the World Famous Three Step Method (WFTSM) and its gold studded STEP ONE ∑x² = ∑X² - (∑X)²/N as perhaps the single most important computing sequence in basic statistics (See STEP ONE AND WFTSM for an example of this calculation). For a small set of scores, this procedure is really cool. If a scientific calculator is employed, the student enters each score one at a time in the calculator and the N, ∑X, and ∑X² are stored in separate memories of the calculator. Then when all the scores are entered, the student can punch a single key, and the standard deviation s will pop up on the display. You even have a choice of division by N or N-1 depending on your definition of s. Now that has to be so SLICK! What is going on here is the calculator is just moving through WFTSM when the last key is pushed. Just so you don't get too big a high on this recent news, if you are faced with large sets of data and many groups with ANOVAS and other procedures to carry out, the mainframe or a smaller computer is probably your best choice. Programs such as SPSS and SAS are then very useful. However, we now want to show you how STEP ONE can logically lead you into another critical computational routine.

Consider the score format that has been visited before and we have termed related or correlated pairs. That is, given (X1, Y1), (X2, Y2),...(Xi, Yi),...(XN, YN). Now your dear little $15 calculator can still easily handle the task of entering the X member of the pair with one key and the Y member of the pair with another key until all pairs are entered. Then we can retrieve the following descriptive indices by pushing 4 different keys: X̄, Ȳ, sX, and sY. Now that is a pretty impressive array of indices. But recall that each of these pertains to either the separate X scores or the separate Y scores. We have no information on how high (or low or intermediate) the X score is relative to its mean compared with how high (or low or intermediate) the paired Y is relative to its mean. Putting this in very crude language, do the pairs of scores tend to be high together, low together and intermediate together or a completely different pattern such as the pairs being high and low together or low and high together? I hope you can see that the 4 basic indices do not touch on this type of "togetherness or covariability" relationship. Let us try out a calculation that may get at what we want...The sum of the products of the X and Y deviations about their respective means or in formula form: ∑xy = ∑(X - X̄)(Y - Ȳ)

This calculation can either be a positive number or a negative number unlike ∑x² and ∑y² which can NEVER BE NEGATIVE! So this has great promise for doing what we want it to. However, this is usually referred to as a "thinking" formula because it allows you to see exactly how to calculate it directly but a direct calculation often is a very messy creature. Here we generally get decimals for both means, then we must subtract a decimal from each raw score for both the X's and Y's resulting in signed decimals, next we must find the products of these decimals again paying close attention to the signs, and finally sweating profusely we add up the whole batch of signed decimal products to arrive at the final ∑xy. Whew!!!

Fortunately, we are blessed with a neat computational formula, ∑xy = ∑XY - (∑X)(∑Y)/N, displayed to the right where the ingredients are stored in memories in the calculator as you enter the pairs of scores (The proof will be omitted). Some calculators will (some won't) allow you to push still another key and ∑xy will appear upon the display. If not, you can still pull out from memories the sum of raw score products and the sums of the raw scores (i.e., ∑XY, ∑X, and ∑Y) and finish the simple calculation on the right. Remember that in this formula most raw scores will be whole numbers so this formula will be comparatively easy if done separately by hand on the calculator. Oh, I must mention we have finally arrived at what I call "STEP ONES COUSIN" because the procedure is so "analogously similar" (neat expression Huh?) to "STEP ONE". In other words, the right hand side of ∑x² involves the sum of the raw squares and the square of the raw sum whereas here the right hand side of ∑xy involves the sum of the raw products and the product of the raw sums. Hope you can see how similar they are! If STEP ONE is gold studded then STEP ONES COUSIN must rate silver studded!!!
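
Since the proof is omitted, a quick numerical check may be reassuring. The Python sketch below computes the sum of products both ways, by the messy "thinking" route through the deviations and by the raw-score route, for four made-up pairs. The pairs and the function names are hypothetical and serve only as an illustration, not a proof.

```python
import math

def sum_products_direct(pairs):
    """'Thinking' formula: sum of products of deviations about the two means."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in pairs)

def sum_products_raw(pairs):
    """Computational formula: sum of raw products minus product of raw sums over N."""
    n = len(pairs)
    sum_x = sum(x for x, _ in pairs)
    sum_y = sum(y for _, y in pairs)
    sum_xy = sum(x * y for x, y in pairs)
    return sum_xy - sum_x * sum_y / n

pairs = [(1, 2), (2, 1), (3, 4), (4, 3)]  # made-up illustrative pairs

direct = sum_products_direct(pairs)
raw = sum_products_raw(pairs)
print(direct, raw, math.isclose(direct, raw))
```

Both routes give 3 for these pairs: the raw products sum to 28 and (10)(10)/4 is 25.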

Now we present two widely publicized formulas that are just tiny steps away from STEP ONES COUSIN and will give the measures of "togetherness" that we want for the X and Y pairs. Examine the formulas that are labeled (A) and (B) below:

    (A) COVXY = ∑xy / N
    (B) r = COVXY / (sXsY)

To obtain result (A), we simply divide STEP ONES COUSIN by N, the number of pairs of scores and this produces the widely known Covariance of the X and Y pairs. In simple language, the Covariance is the mean product of the deviations of the X and Y scores about their respective means. Some authors refer to this as the mean cross product of the deviation scores. Recall that ∑xy/N can either be positive or negative and can range between negative infinity and positive infinity. If your calculator is top of the line it possibly has a button that will recall this result. But don't count on it. The Covariance is a very crucial index when you have a multiple number of variables. For example, with 4 variables we would arrange the 4 Variances down the main diagonal of a matrix with the 6 possible Covariances located in the off diagonal positions in the matrix. A wealth of information is contained in this 4x4 variance-covariance matrix with the Variance of each individual variable and the Covariance of all possible pairs of variables being displayed. Matrix algebra becomes the mode of operation when you delve into multivariate analysis.
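
To make the 4x4 arrangement concrete, here is a small Python sketch that builds a variance-covariance matrix from STEP ONES COUSIN with division by N. The four variables and their five scores each are invented for illustration, and the function names are hypothetical.

```python
def sum_products(a, b):
    """STEP ONES COUSIN for two lists of scores (STEP ONE when a and b are the same list)."""
    n = len(a)
    return sum(p * q for p, q in zip(a, b)) - sum(a) * sum(b) / n

def variance_covariance_matrix(variables):
    """Variances on the main diagonal, covariances off it (division by N)."""
    n = len(variables[0])
    return [[sum_products(a, b) / n for b in variables] for a in variables]

# Four made-up variables measured on the same five individuals.
variables = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 10],
    [5, 4, 3, 2, 1],
    [1, 3, 2, 5, 4],
]

matrix = variance_covariance_matrix(variables)
for row in matrix:
    print([round(value, 2) for value in row])
```

The matrix is symmetric, so although it has 12 off-diagonal entries only 6 of them are distinct covariances, matching the count given above.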

Now in result (B) we move one more tiny step and divide the Covariance of X and Y by the product of the standard deviations of X and Y. Putting it in statistical language, we are simply standardizing the covariance with this maneuver. Lo and Behold, the result may surprise you. We have now arrived at one of the most celebrated statistics ever employed...The Pearson Product-Moment Correlation Coefficient. This index, of course, behaves very well and a full range of values between -1 and +1 may occur and includes the value of 0 as a possibility. A high positive index such as +.90 or +.80 would suggest that high X's occur very frequently with high Y's, intermediate X's occur often with intermediate Y's, and low X's tend to be paired with low Y's. An inverse or negative correlation such as -.85 or -.90 would suggest low X's being paired with high Y's and high X's being paired with low Y's. A 0 index suggests no linear correspondence whatsoever. That is, given a high or low X value, it is impossible to predict where the Y will be. In terms of a scientific calculator, a high-end unit will almost always give you a button that will crank out the (B) result after all pairs are entered.

Finally, we shall calculate an example to show you how things work but will use a reasonably small set of pairs so you can use any type of calculator including a basic $5 unit. Please realize you will have to make three passes at the data if you use an el cheapo unit but it is still doable. Here are the paired data or the (Xi, Yi)'s which you may think of as pretest posttest scores for 10 individuals:

(10, 8) (8, 6) (5, 4) (12, 12) (4, 5) (3, 5) (14, 9) (12, 8) (6, 8) (12, 10)

After all the pairs of scores are entered, we recall from the calculator memories the following basic calculations and statistical indices:

∑X=86, ∑X²=878, ∑Y=75, ∑Y²=619, ∑XY=717, X̄=8.6, Ȳ=7.5, sX=3.72, sY=2.38 (both using division by N; WFTSM steps will be omitted to show the last two)

Now for the three new calculations of this entire blurb and substituting from the above

STEP ONES COUSIN
∑xy = ∑XY - (∑X)(∑Y) / N
        =717 - (86)(75) / 10
        =717 - 645
        =72

(A) COVARIANCE
COVXY = ∑xy / N
              =72 / 10
              =7.2

(B) CORRELATION COEFFICIENT
r = COVXY / (sXsY)
  =7.2 / [(3.72)(2.38)]
  =7.2 / 8.85
  =.814   This r of .814 would indicate there is a fairly strong tendency for high X's to be paired with high Y's and low X's to be paired with low Y's.
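
For those who would rather let a computer make the three passes, the whole example can be sketched in Python. The variable names are hypothetical labels for the quantities above, and the standard deviations use division by N so that they match the definition of the Covariance used here.

```python
import math

# The ten (X, Y) pairs from the example.
pairs = [(10, 8), (8, 6), (5, 4), (12, 12), (4, 5),
         (3, 5), (14, 9), (12, 8), (6, 8), (12, 10)]
n = len(pairs)

# Raw sums, as a calculator would hold them in its memories.
sum_x = sum(x for x, _ in pairs)
sum_y = sum(y for _, y in pairs)
sum_x2 = sum(x * x for x, _ in pairs)
sum_y2 = sum(y * y for _, y in pairs)
sum_xy_raw = sum(x * y for x, y in pairs)

# STEP ONE for each variable, then standard deviations with division by N.
ss_x = sum_x2 - sum_x ** 2 / n
ss_y = sum_y2 - sum_y ** 2 / n
s_x = math.sqrt(ss_x / n)
s_y = math.sqrt(ss_y / n)

# STEP ONES COUSIN, then (A) the covariance and (B) the correlation.
sp_xy = sum_xy_raw - sum_x * sum_y / n
cov_xy = sp_xy / n
r = cov_xy / (s_x * s_y)

print("sum of products:", sp_xy)
print("covariance:", round(cov_xy, 2))
print("sX, sY:", round(s_x, 2), round(s_y, 2))
print("r:", round(r, 3))
```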

With this example we now finish our presentation of three very useful formulas that were an outgrowth of a single set of scores representing a single variable to a format of pairs of scores representing two different variables. We have seen that statistical indices are important for each set of scores separately but now we also need indices that measure so called "togetherness" relationships between the two variables. This type of need has brought into play STEP ONES COUSIN and sequentially the Covariance and the Correlation Coefficient. This wicket has become indeed a little more sticky. In fact, a point of confusion is that these three formulas take on many different forms and in each case its equivalent may look nothing like the original symbolically. These formulas require much study and practice to develop depth of understanding. I will leave you with a neat little exercise just for fun. The formula for the correlation coefficient r can be thought of as STEP ONES COUSIN divided by the product of the square roots of two different STEP ONES! See if you can verify this goofy statement in your mind. Thanks for reading this somewhat rambling presentation and please tune in again.
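
If verifying the closing exercise in your mind proves stubborn, a numerical check is a fair fallback. The reasoning is that the N dividing the Covariance cancels against the two square roots of N hiding in sX and sY. The Python sketch below, with hypothetical variable names and the ten pairs from the example, compares the two forms of r.

```python
import math

pairs = [(10, 8), (8, 6), (5, 4), (12, 12), (4, 5),
         (3, 5), (14, 9), (12, 8), (6, 8), (12, 10)]
n = len(pairs)
xs = [x for x, _ in pairs]
ys = [y for _, y in pairs]

# Two STEP ONES and one STEP ONES COUSIN.
ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / n
ss_y = sum(y * y for y in ys) - sum(ys) ** 2 / n
sp_xy = sum(x * y for x, y in pairs) - sum(xs) * sum(ys) / n

# Form 1: covariance divided by the product of the standard deviations.
r_standardized = (sp_xy / n) / (math.sqrt(ss_x / n) * math.sqrt(ss_y / n))

# Form 2: the cousin over the product of the square roots of the two STEP ONES.
r_from_step_ones = sp_xy / (math.sqrt(ss_x) * math.sqrt(ss_y))

print(round(r_standardized, 3), round(r_from_step_ones, 3))
print("same value?", math.isclose(r_standardized, r_from_step_ones))
```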


Thank You For Visiting The Archives Of Statistics Fun.

For more statistics humor please visit the First Internet Gallery of Statistics Jokes.

Please email comments about this page to gcramsey@ilstu.edu
Page last revised on 2 SEPTEMBER 2008

Copyright ©1997-2008 Ramo Productions. All Rights Reserved.