Error bars on grading

Educators make mistakes when grading. It happens. Sometimes we mark a student’s work lower than we should, compared to their peers, and sometimes we mark it higher than we should. The question is, what effect does this have on a student’s overall mark?

Here are some sample grades. The Sample column is the original grade, the Low column is the same work marked one step lower, and the High column is the same work marked one step higher. A step is 1 mark for quizzes, 2 marks for homework, and 5 marks for tests.

Grades	Sample	Low	High
Quizzes	5	4	6
	6	5	7
	7	6	8
	5	4	6
	6	5	7
	7	6	8
	5	4	6
	6	5	7
	5	4	6
	6	5	7
average	5.8	4.8	6.8

Homework	5	3	5
	5	3	5
	3	1	5
	3	1	5
	1	1	3
	1	1	3
	3	1	5
	5	3	5
	3	1	3
average	3.222222	1.666667	4.333333

Tests	40	35	45
	45	40	50
	35	30	40
	40	35	45
	30	25	35
average	38	33	43

Overall Grade	70.1	55.9	82.5

The overall grade was calculated here by finding the averages of the three categories (quizzes, homework, and tests, which are standard categories in many classes), with quizzes worth 20%, homework worth 20%, and tests worth 60% of the overall grade. These aren’t particularly unusual grades. Note, however, how wide the possible error in the final grade is: it could range from 55.9% to 82.5%, a spread of 26.6 percentage points, which is a HUGE amount in any grading system.
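As a rough check on the arithmetic, here is a minimal Python sketch of the weighted calculation. The post does not say what each item is marked out of; the maximum marks below (10 for quizzes, 5 for homework, 50 for tests) are an assumption, chosen because they reproduce the three overall grades in the table. The names `AVERAGES`, `MAX_MARK`, `WEIGHT`, and `overall_grade` are made up for this illustration.

```python
# Category averages taken from the Sample, Low and High columns of the table.
AVERAGES = {
    "quizzes":  {"sample": 5.8, "low": 4.8, "high": 6.8},
    "homework": {"sample": 29 / 9, "low": 15 / 9, "high": 39 / 9},
    "tests":    {"sample": 38.0, "low": 33.0, "high": 43.0},
}

# ASSUMPTION (not stated in the post): the maximum mark for one item.
MAX_MARK = {"quizzes": 10, "homework": 5, "tests": 50}

# Category weights as given in the post.
WEIGHT = {"quizzes": 0.20, "homework": 0.20, "tests": 0.60}


def overall_grade(column):
    """Weighted overall percentage for one column of the table."""
    return sum(
        WEIGHT[cat] * 100 * AVERAGES[cat][column] / MAX_MARK[cat]
        for cat in WEIGHT
    )


for column in ("sample", "low", "high"):
    print(column, round(overall_grade(column), 1))
```

Under those assumed maximum marks, the three columns come out at 70.1, 55.9, and 82.5, matching the Overall Grade row.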

Of course, teachers aren’t likely to mark everything low, or everything high. If we assume that marking too low and marking too high are equally likely, then instead of looking at the minimum and maximum possible marks, we can look at the range within 2 standard deviations of the mean of the possible grading outcomes. In other words, what’s a likely range?

I created a script (warning: takes a while to run in some browsers) which randomly generates a sample of 10,000 overall grades. It starts with the baseline above and randomly adds a grading error to each assignment, assuming that a teacher is equally likely to assign a grade that is too low, too high, or exactly correct (this assumption is probably false, but I had to start somewhere). For one sample of 10,000 grades, the minimum grade is 60.2 and the maximum grade is 77.5, suggesting that the distribution of grades isn’t symmetrical (teachers are more likely to assign a grade which is too low to students who are above 50% overall, and too high to students who are below 50% overall). The standard deviation of these scores is 2.32 and the mean of the data set is 69.2, which means that about 95% of the time the grade will fall between 64.6% and 73.9%. This is a range of likely values of over 9 percentage points!
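The simulation described above can be sketched in Python along these lines. This is not the script from the post, and its numbers need not match the ones quoted. It assumes that each mark moves one step down, stays put, or moves one step up with equal probability, that a step is 1, 2, or 5 marks for quizzes, homework, and tests, and that marks are clamped to an assumed scale (0 to 10, 1 to 5, 0 to 50). All names here (`SCALE`, `noisy_mark`, `simulate_overall`, `run`) are hypothetical.

```python
import random
import statistics

# Original marks from the Sample column of the table.
MARKS = {
    "quizzes":  [5, 6, 7, 5, 6, 7, 5, 6, 5, 6],
    "homework": [5, 5, 3, 3, 1, 1, 3, 5, 3],
    "tests":    [40, 45, 35, 40, 30],
}
# ASSUMPTION (not stated in the post): (lowest mark, highest mark, step size).
SCALE = {"quizzes": (0, 10, 1), "homework": (1, 5, 2), "tests": (0, 50, 5)}
# Category weights as given in the post.
WEIGHT = {"quizzes": 0.20, "homework": 0.20, "tests": 0.60}


def noisy_mark(mark, lowest, highest, step, rng):
    """Move a mark one step down, keep it, or move it one step up
    (equally likely), then clamp it to the scale."""
    shifted = mark + rng.choice((-1, 0, 1)) * step
    return min(highest, max(lowest, shifted))


def simulate_overall(rng):
    """One simulated overall percentage with random grading errors."""
    total = 0.0
    for category, marks in MARKS.items():
        lowest, highest, step = SCALE[category]
        noisy = [noisy_mark(m, lowest, highest, step, rng) for m in marks]
        total += WEIGHT[category] * 100 * statistics.mean(noisy) / highest
    return total


def run(trials=10_000, seed=0):
    """Simulate many overall grades and summarise their spread."""
    rng = random.Random(seed)
    grades = [simulate_overall(rng) for _ in range(trials)]
    mean = statistics.mean(grades)
    sd = statistics.stdev(grades)
    return {
        "min": min(grades),
        "max": max(grades),
        "mean": mean,
        "sd": sd,
        "low_2sd": mean - 2 * sd,   # lower end of the "likely range"
        "high_2sd": mean + 2 * sd,  # upper end of the "likely range"
    }


if __name__ == "__main__":
    for name, value in run().items():
        print(f"{name}: {value:.1f}")
```

In this sketch the clamping is one source of asymmetry: four homework marks already sit at the top of the assumed scale and cannot go up, while only two sit at the bottom and cannot go down, so the simulated mean tends to land a little below the baseline.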

Note that this script doesn’t account for a host of other reasons that the grades for this individual student could be in error. It doesn’t account for lost assignments, misread names, addition errors, and so on.

How many teachers know that there are error bars on the percentages they are expected to give to students? Maybe if we reported this student’s grade as 70.1% ± 4.6%, students and parents might recognize that grading is more subjective than they realize? Maybe we could stop the practice of assigning letter grades to students’ work based on strict boundaries?

I remember that in grade 12, I was assigned a grade of 84% overall in English 12, with an A being an 86% in my school. This meant that I missed out on a major award at university (it was my only B in grade 12) and that I had to write an entrance exam to get into my first year English course (I passed). I’ve obviously done fine despite this grade, but I remember it often, and it is a reminder to me of the often arbitrary nature of teacher-assigned grades.

3 Comments

John at TestSoup says:

I absolutely love the idea of including a +/- with grades. Everybody knows that grades are somewhat arbitrary — this would finally acknowledge it.

November 2, 2011 — 12:40 pm

Kathleen Wilhelm says:

Mr. Wees,
Hello, my name is Kathleen Wilhelm. I am currently a student at the University of South Alabama where I am taking a class about technology in schools. I enjoyed reading your blog post about grading. To be honest, I never really cared what the number was in my grade, just the letter. However, in many of my college classes my teachers use a grading scale where 92.5 and above is an A and 92.4 and below is a B (or a lower letter grade). After getting a 92.4, and so a B, in one of my classes, I now care about every point I receive in my grades. This is why I think it is important for teachers to be careful how they grade. Teachers can become tired and give slightly lower or higher scores at different times of day. This post was interesting because it reminded me (a teacher in training) how important each number is in a grade.

November 13, 2011 — 6:25 pm

- David Wees says:
  
  It should also highlight the importance of focusing more on the bands, and less on the exact percentages. Thanks for sharing your experience, Kathleen.
  
  November 13, 2011 — 7:25 pm
