Teaching Chi squared test

I tried something new this time when I taught the Chi squared test. Instead of focusing on the formal procedure that one must follow in order to use the test correctly, I focused on what we were actually DOING when we carried out the test. As a result, my students understood the test much more easily, and I had far fewer questions about how to actually use it.

First, we talked about the expected values, starting with the sums of the expected values. See the following table. Here the "Yes" and "No" refer to whether or not the particular person being surveyed watches the TV show Glee.

	Yes, I watch Glee	No, I do not watch Glee	Total
Male			100
Female			100
Total	120	80	200

Essentially this is backwards from where I normally start, with the observed values table, because I wanted students to understand how we construct this table rather than rely on rote memorization of the formulas for the expected values. I started with the fact that our data is 50% male and 50% female. Hence, I said, if gender is independent of our question about Glee, how many males should we expect to answer Yes to the question about Glee? Students pretty easily came up with 60 males, reasoning that 50% of 120 is 60. I focused their attention on what they did to get this answer, and then we filled in the rest of the table.

	Yes, I watch Glee	No, I do not watch Glee	Total
Male	60	40	100
Female	60	40	100
Total	120	80	200
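The students' reasoning (50% of 120 is 60) amounts to multiplying a row total by a column total and dividing by the grand total. Here is a minimal Python sketch of that idea; the function name `expected_counts` is only an illustrative label, not something used in the lesson.

```python
def expected_counts(row_totals, col_totals):
    """Expected cell counts if rows and columns are independent.

    Each cell is (row total * column total) / grand total, which is the
    same as taking the row's share of the grand total and applying it
    to the column total (e.g. 100/200 = 50% of 120 = 60).
    """
    grand_total = sum(row_totals)
    if grand_total != sum(col_totals):
        raise ValueError("row and column totals must share one grand total")
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Totals from the table above: rows are Male/Female, columns are Yes/No.
expected = expected_counts([100, 100], [120, 80])
print(expected)  # [[60.0, 40.0], [60.0, 40.0]]
```

Only the totals are needed, which is why this table can be built before looking at any actual survey answers.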

I then set up an observed values table and filled it in using very different numbers. I made sure students understood that the previous table represented the expected survey results if Gender wasn’t a factor in people choosing to watch Glee, but that the following table represented our actual survey results.

	Yes, I watch Glee	No, I do not watch Glee	Total
Male	25	75	100
Female	95	5	100
Total	120	80	200

Right away, one of the students said that this second table obviously meant that there was a relationship between Gender and choosing to watch Glee. The reason he gave: "Well, those numbers are way off the expected values."

I then asked a really important question. "How much different do they have to be before the results are significantly different than our expected results?"

Students realized that we’d probably want to start by subtracting our two sets of information like so.

O	E	O – E
25	60	-35
75	40	35
95	60	35
5	40	-35

I pointed out that we don’t really care if the difference is positive or negative, since either way the results are "way off" if the difference is big. So we squared the observed minus the expected values to make the answer positive. Note that we haven’t really done much work with absolute value, so I chose to ignore it for this example, which helped make my case for the calculation, but probably needs some discussion.

O	E	O – E	(O – E)²
25	60	-35	1225
75	40	35	1225
95	60	35	1225
5	40	-35	1225

Next I pointed out that the numbers were too big, and that I wanted to be able to make a comparison between the size of the difference between observed and expected values in one chart with the size of the same type of difference in another chart. Normalization (which is not a word I used with the students) is a useful way to do this, and hence we want to divide by the expected values.

O	E	O – E	(O – E)²	(O – E)²/E
25	60	-35	1225	20.42
75	40	35	1225	30.63
95	60	35	1225	20.42
5	40	-35	1225	30.63
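The three columns we built up can be reproduced with simple list operations. The following is a rough Python sketch; the variable names are illustrative, and the cell order (Male Yes, Male No, Female Yes, Female No) is assumed to match the rows of the table above.

```python
# Observed and expected counts, in the same row order as the table above.
observed = [25, 75, 95, 5]
expected = [60, 40, 60, 40]

# Step 1: how far is each observed value from its expected value?
differences = [o - e for o, e in zip(observed, expected)]

# Step 2: square each difference so the sign no longer matters.
squared = [d ** 2 for d in differences]

# Step 3: divide by the expected value to put the cells on a comparable scale.
contributions = [s / e for s, e in zip(squared, expected)]

print(differences)  # [-35, 35, 35, -35]
print(squared)      # [1225, 1225, 1225, 1225]
# Shown to three decimals here; the table above shows them to two.
print([round(c, 3) for c in contributions])  # [20.417, 30.625, 20.417, 30.625]
```

Dividing by the expected value is what makes the same difference of 35 count for more in a cell where only 40 were expected than in a cell where 60 were expected.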

Someone noticed that this was a bunch of different numbers and that it would be more useful if we had a single number, so I suggested adding up all of these normalized differences. I labelled this number Χ²_calculated, and we had our result. Note that my objective here was not to formalize the calculation, but more to justify the calculation informally, so that the students would feel like they understood what they were doing.
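For this example, adding up the normalized differences can be sketched in Python as follows. The function name `chi_squared_statistic` is an illustrative choice, and the sketch applies no continuity correction, matching the hand calculation in the lesson.

```python
def chi_squared_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells of the table."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Same cell order as the tables above.
stat = chi_squared_statistic([25, 75, 95, 5], [60, 40, 60, 40])
print(round(stat, 2))  # 102.08
```

Adding the two-decimal values from the table instead (20.42 + 30.63 + 20.42 + 30.63) gives 102.10, so a small difference from rounding is to be expected when working by hand.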

From there the rest of the lesson was relatively easy. We had a discussion about how to compare Χ² results, which led to the table of critical values, and with this table I introduced the notion of degrees of freedom. I didn’t try to explain why the critical values increase with the degrees of freedom or as the significance level gets stricter (smaller), but the students were okay with these additional steps because the initial calculation, and the reasons for it, made a lot more sense. I was able to talk about the test at a deeper level than I had before, and when the time came for the students to practice the calculation themselves, they had a lot fewer difficulties than I had anticipated.
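The comparison against the table of critical values might look like the Python sketch below. It assumes the usual rule for a contingency table, degrees of freedom = (rows - 1) × (columns - 1), and it hard-codes a few entries from a standard chi-squared table; the names `CRITICAL_VALUES`, `degrees_of_freedom` and `is_significant` are illustrative.

```python
# A few entries from a standard chi-squared critical value table,
# keyed by significance level and then by degrees of freedom.
CRITICAL_VALUES = {
    0.05: {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488},
    0.01: {1: 6.635, 2: 9.210, 3: 11.345, 4: 13.277},
}


def degrees_of_freedom(n_rows, n_cols):
    """Degrees of freedom for a table with n_rows by n_cols data cells."""
    return (n_rows - 1) * (n_cols - 1)


def is_significant(statistic, df, level=0.05):
    """True if the calculated statistic exceeds the critical value."""
    return statistic > CRITICAL_VALUES[level][df]


df = degrees_of_freedom(2, 2)  # Male/Female by Yes/No
stat = 2 * 1225 / 60 + 2 * 1225 / 40  # the calculated value for the Glee data

print(df)                                    # 1
print(is_significant(stat, df))              # True
print(is_significant(stat, df, level=0.01))  # True
```

With a calculated value of roughly 102 against a critical value of 3.841, the Glee data is far past the cutoff, which fits the student's first impression that the numbers were "way off".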

One observation a student had was, "Using those list operations you taught us on the calculator sure would make this a lot easier," and a bunch of the students had an "Aha!" moment as they realized why I bothered to show them how to do the list operations on their calculator.

The essential difference in teaching here was using a conceptual framework versus my old method of rote memorization of the process. I think I know which way I’ll try in the future, even with something which seems so mechanical in nature.

5 Comments

Dvora Geller says:

I love this way to approach chi squared! It is easy to pick something the kids are interested in as a starting point. I think this would help their understanding greatly especially if they are then doing an IB Math Studies Stats project. They will be better able to explain and choose when to do a chi squared test.

Thanks for sharing!

May 29, 2010 — 8:33 am

- David Wees says:
  
  Yeah, I guess I didn’t even realize that I had automatically chosen an example that my students would appreciate. They either love Glee or hate it, so either way it is an issue of interest for them. I’m actually teaching them this unit on statistics in preparation for their IB Math Studies projects, which we will be starting in earnest in a week or so.
  
  May 29, 2010 — 1:00 pm
George Woodbury says:

That’s a great approach. I think my students would “understand” the Chi-Squared test with this approach. Thanks for sharing.
@georgewoodbury on Twitter

May 29, 2010 — 12:56 pm

Lauren Meyer says:

Thanks for posting this great example! I’ll be using a modified version in the stats lab I’m teaching tomorrow 🙂

April 30, 2013 — 12:12 am

James Davis says:

A great way to introduce Chi-squared and using a graphical calculator

March 28, 2015 — 9:45 am
