On Grading

I got to thinking about grading the other day after seeing a question on Facebook. We’ll get to that question in a minute, but first I want to try to outline a grading scheme I used towards the end of my teaching career. It is based on how free-response questions on AP Calculus exams are graded, but the ideas are usable in any course. Here are some suggestions and examples of how to do that. There are also some suggestions for grading multiple-choice and True-False questions.

Free-response questions

The AP Calculus scoring standards serve here as a guide for awarding partial credit. Partial credit is earned for taking correct steps on the way to the solution. Points are earned, not deducted. The examples that follow expand on these principles:

  • Each step is worth 1 or 2 points. For 2-point steps, it must be possible to earn only one point.
  • Students earn the point(s) for showing they are doing a good thing.
  • Once earned, the point cannot be lost by some later mistake. (It’s “in the bank,” as readers say.)
  • Since a mistake will affect the final answer, the student may earn later points, including the answer point, for continuing correctly. However, some mistakes are so bad that earning the rest of the points is not possible. Mistakes must not simplify the remaining work.
  • The standard must allow for different methods of solution.
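
These principles can be sketched in a few lines of Python. This is only an illustration, not the AP readers' procedure: the Step record, its field names, and the sample steps are all made up for the example. Points are added as they are earned, nothing is subtracted later, and a step flagged as a major mistake stops any further points from being earned.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    max_points: int      # 1 or 2, as in the principles above
    earned: int          # points the student earned on this step
    fatal: bool = False  # hypothetical flag: a mistake so bad that later points are out of reach


def score(steps):
    """Add up earned points; nothing is deducted for later mistakes."""
    total = 0
    for step in steps:
        if not 0 <= step.earned <= step.max_points:
            raise ValueError(f"bad score for {step.name!r}")
        total += step.earned  # banked: later steps cannot take this away
        if step.fatal:
            break  # the remaining points cannot be earned
    return total


# Made-up sample: partial credit on a 2-point step, answer consistent with the work.
work = [
    Step("set up", 1, 1),
    Step("main computation", 2, 1),
    Step("answer consistent with work", 1, 1),
]
print(score(work))  # 3
```

A student who makes an ordinary mistake on the 2-point step still banks the set-up point and can still earn the answer point by continuing correctly.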

Example 1: Consider a typical volume problem worth four points. Students are required to write a definite integral. By the washer method the work should look like

\pi \int_{a}^{b} \left( \left( f(x) \right)^{2} - \left( g(x) \right)^{2} \right) dx = a numerical answer.

  • 1-point is earned for the constant and both limits of integration.
  • 2-points are earned for the integrand. If the integrand is of the form something squared minus something else squared, they earn 1-point; if both correct quantities are squared, they earn the second point. If the integrand is something squared plus something else squared, this is considered a calculus (major) mistake: they earn 0 points and are not eligible for the answer point. (No deduction for a missing dx.)
  • 1-point for the answer from their calculator. Writing an incorrect integral such as \pi \int_{a}^{b} \left( \left( g(x) \right)^{2} - \left( f(x) \right)^{2} \right) dx as equal to the correct answer is a mistake (negative = positive) and does not earn the answer point. However, the reversed integrand followed by the correct answer, not connected by an equal sign, recoups the integrand point and earns the answer point (i.e., full credit). (Subtracting in the wrong order and taking the opposite of your (negative) answer is a correct algorithm, even if inefficient and “ugly.”)

You can see how much consideration goes into setting the grading standards.

There is no reason you must use the exact AP exam standard. In your class you may want to be more specific in hopes of helping your students be more precise. You may make this question worth more points. So, you could use this standard:

  • 1-point for the \pi .
  • 2-points for the limits of integration (one each)
  • 1-point for the form of the integrand (square minus square)
  • 1-point for the first squared quantity
  • 1-point for the second squared quantity
  • 1-point for dx
  • 1-point for the answer from their calculator

Example 2: An example from Algebra 1. Find the solution of 4x-\left( {x-3} \right)=x+7. The expected solution is

4x-x+3=x+7

3x+3=x+7

2x=4

x=2

You could count this as 3-points:

  • 1-point for removing parentheses
  • 1-point for collecting like terms
  • 1-point for the answer

Or you could count it as

  • 1-point for knowing to remove parentheses
  • 1-point for removing parentheses correctly
  • 1-point for collecting the x-terms
  • 1-point for collecting the constant terms
  • 1-point for the answer – any arithmetic mistake in collecting terms fails to earn the answer point

Example 3: from Algebra 1: Solve {{x}^{2}}-8x=9

Expected solution:

{{x}^{2}}-8x-9=0

\left( {x-9} \right)\left( {x+1} \right)=0

x=9\text{ or }x=-1

As a 3-point standard

  • 1-point for setting equal to zero
  • 1-point for factoring
  • 1-point for answers

Or a 5-point standard

  • 1-point for setting equal to zero
  • 2-points, one for each correct factor
  • 2-points, one for each answer.

However, whatever standard you use should allow for a solution by the quadratic formula, by completing the square, or even by graphing. (Unless the directions specifically read “Solve by factoring.”)
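
For instance, a standard for Example 3 would need to credit work like either of the following, both of which reach the same answers as the factoring solution. (These are my worked versions, shown only to illustrate the alternatives.)

By the quadratic formula, applied to x^{2}-8x-9=0:

x=\frac{8\pm \sqrt{(-8)^{2}-4(1)(-9)}}{2}=\frac{8\pm \sqrt{100}}{2}=\frac{8\pm 10}{2}

x=9\text{ or }x=-1

By completing the square:

x^{2}-8x+16=9+16

\left( x-4 \right)^{2}=25

x-4=\pm 5

x=9\text{ or }x=-1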

Example 4: This is the question that got me started on this post. It is from a September 20, 2018 post on the AP Calc TEACHERS – AB/BC Facebook page. The teacher’s question is at the top.

The teacher is right about being concerned with proper notation and right about requiring students to use it.

On an AP exam this limit would be a multiple-choice question (see below), so notation does not enter in. Even on a free-response question, judging from past exams, only the answer would be required. But just because it’s an AP class does not mean that you must do things only as they are done on the exams.

To provide for notation, this could be scored as a 3- or 4-point question:

  • 1-point for knowing what algebra to use to find the limit
  • 1-point for doing the algebra correctly (For a 3-point value, this could be included in the answer point, but there is a fair amount to do and it’s not straightforward, so an additional point here is reasonable.)
  • 1-point for the answer (If the student does not earn both of the first two points, then the answer should agree with their work.)
  • 1-point for correct use of limit notation throughout the problem.

The student in the example does not earn the last point. The solution shown earns 3 of 4 points (or 2 of 3).

But there is more here. Suppose the student did not write “lim” in the second through fifth lines of the solution; there is no reason they must, since they are just doing some algebra. Suppose also that the lines are not connected with equal signs. Then he or she has not misused the notation and should earn full credit.

Some comments on the Facebook post were also concerned about dividing out the x’s and not mentioning that x\ne 0. If that is a concern for you, then another point could be included for that.

So, the idea is to be very precise about what earns a point and what does not. Seeing a “+3” instead of a “-1” next to their work encourages students. Noticing that a number of your students are not earning the same point will help you see where the class is confused. One of the reasons you give tests is to help you see where your class as a whole is missing some idea. Consider that when giving multiple-choice and True-False questions.


Multiple-choice questions – forget scan sheets.

As a teacher, you need to see the students’ work, so you can find their mistakes and help them do better (a/k/a formative assessment). When giving a multiple-choice question, require students to show their work and award partial credit for incorrect answers. Two or three points seem to work well: one-point for knowing what to do, one-point for doing it, and one-point for the answer.

Examples:

  • Find where a function is increasing: one-point for knowing to examine the derivative, one-point for finding the derivative, one point for the answer.
  • Find the acceleration: one-point for finding the derivative of velocity, one-point for the answer.
  • Questions with statements I, II, and III: one-point for each statement identified correctly as true or false. (Think of the answer “I only” as T,F,F etc.)
  • Set up the integral: one-point for limits of integration, two-points for integrand (Algebra/notation mistake loses one point, calculus mistake loses both points).
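
The I, II, III idea can be sketched as follows. The list of answer choices and the function name are hypothetical, chosen only for illustration; each choice is read as three true/false calls, and the student earns one point for each call that matches the key.

```python
# Hypothetical answer choices for a I/II/III question, written as
# (I, II, III) truth values.
CHOICES = {
    "I only": (True, False, False),
    "II only": (False, True, False),
    "I and II only": (True, True, False),
    "I and III only": (True, False, True),
    "I, II, and III": (True, True, True),
}


def statement_points(student_choice, correct_choice):
    """One point for each statement the student classified correctly."""
    student = CHOICES[student_choice]
    key = CHOICES[correct_choice]
    return sum(s == k for s, k in zip(student, key))


# Key is "I and III only"; the student chose "I only" (T, F, F).
# Statements I and II are classified correctly, III is not.
print(statement_points("I only", "I and III only"))  # 2
```

A scan sheet would mark that answer simply wrong; scored this way, it shows that the student's only trouble was with statement III.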

True or False questions

For many years I used a textbook (I think it was Larson and Hostetler 2nd edition – I’m showing my age) that had True-False questions for each set of exercises. I really liked them. Newer textbooks rarely have them. One exception is the new Calculus for AP by Stewart and Kokoska, which starts almost every exercise set with a few True-False questions.

For two-points, have students say if the statement is true or false, AND require them to explain why the statement is true or false: what theorem or idea is illustrated; what hypothesis is not met, give counterexamples, etc.

Another approach to True-False questions is to change them to Always, Sometimes, or Never True questions.

For example: Is the statement “If f''(a) = 0, then (a, f(a)) is a point of inflection” sometimes, always, or never true?

Answer: Sometimes. If f(x) = x^3, then (0, 0) is a point of inflection, but if f(x) = x^4, then (0, 0) is not a point of inflection.

Another answer: Sometimes. If f''(x) changes sign at x = a, then (a, f(a)) is a point of inflection; if f''(x) does not change sign at x = a, then it is not a point of inflection.
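
The two functions in the first answer can be checked quickly with a sign test on the second derivative. This is a rough numerical sketch, not a proof: the helper name and the step size h are my own choices, and the second derivatives (6x for the cubic, 12x^2 for the quartic) are entered by hand.

```python
def changes_sign(second_derivative, a, h=1e-3):
    """Compare the sign of f'' just left and just right of x = a."""
    left = second_derivative(a - h)
    right = second_derivative(a + h)
    return left * right < 0


def f2_cubic(x):
    return 6 * x        # second derivative of x^3


def f2_quartic(x):
    return 12 * x**2    # second derivative of x^4


# Both second derivatives are 0 at x = 0, but only one changes sign there.
print(changes_sign(f2_cubic, 0))    # True
print(changes_sign(f2_quartic, 0))  # False
```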

You can then have the class discuss and criticize each other’s answers. These become good writing questions and good preparation for the “Justify your answer” and “Explain your reasoning” questions on the AP exams. For you, they help you see what the student is thinking and, if wrong, help them correct it (a/k/a formative assessment again).


Next Friday some thoughts On Scaling – and all tests are scaled.


Percentages Don’t Make the Grade

Well, the AP exams have been written and the dust has settled. Folks are posting their answers on the Community Bulletin Boards. (I never post mine – too many mistakes.) The other thing that always gets discussed at this time of year is whether this year’s exam is more difficult or less difficult than last year’s.

I am sure this year’s was more difficult or less difficult than last year’s because it is impossible to make two exams of the same difficulty.

But it doesn’t matter.

The grades will reflect, as best as possible, that a student knows as much calculus as students with the same score did last year. That’s the important thing.

Because it is impossible for anyone or any group to make two exams of the same difficulty, percentages tell you nothing. The percentage of the number of points that a student earns out of the number possible tells you just that and nothing more. If the tests are not of the exact same difficulty, then percentages are meaningless.

What to do?

The Educational Testing Service (ETS), which writes and administers the AP exams for the College Board, carefully pretests each question. Also, there are a number of questions from last year’s exam on this year’s exam. These questions, called equators, allow ETS to judge the difficulty of the other questions on this year’s exam compared to last year’s. They also allow ETS to judge the ability of this year’s student cohort compared to last year’s. Each question is considered individually. Questions that score poorly, or questions on which identifiable groups of students do far worse than the entire group taking the test, are not counted in the final score. (For example, in 2008 question AB 19 was not counted; too many missed it.) They compare the results of questions within each exam. With this information they “scale” the exams and decide on the cut points, the high and low raw scores that earn a 5, 4, 3, 2, or 1.
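
To see what cut points do, here is a small sketch. The cut points below are invented for illustration and are not real AP values; the function name is also mine. The same raw score (and so the same percentage) can earn different grades on two forms of different difficulty.

```python
from bisect import bisect_right

# Hypothetical minimum raw scores for grades 2, 3, 4, and 5.
# These numbers are made up; they are not actual AP cut points.
CUTS_EASIER_FORM = [30, 45, 60, 75]
CUTS_HARDER_FORM = [25, 38, 52, 66]


def ap_grade(raw_score, cuts):
    """Return 1 through 5: one more than the number of cut points reached."""
    return 1 + bisect_right(cuts, raw_score)


# The same raw score on two forms of different difficulty.
print(ap_grade(68, CUTS_EASIER_FORM))  # 4
print(ap_grade(68, CUTS_HARDER_FORM))  # 5
```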

A teacher on a day-to-day basis cannot do so detailed an analysis. Yet we still need to give students grades. We need to scale the exams.  I was quite happy this year using a scheme Dan Kennedy suggested some years ago (see resources tab above). This worked quite well for me in BC Calculus and in 8th grade Algebra 1. Perhaps you have another system.

Percentages just don’t make the grade.

Update September 22, 2014: Matthew Braddock, Mathematics Instructor & Webmaster, at the Dr. Henry A. Wise, Jr., High School in Prince George’s County, Maryland sent me a GeoGebra applet that will calculate the grades using Dan Kennedy’s scheme described in the link above. It runs at a website so you do not need GeoGebra on your computer or iPad to use it. Simply enter the information and it will do the rest. Thank you Matthew. 

Update December 3, 2018. The link above is no longer active. This link is to a similar app by Dan Anderson on Desmos. Thank you Dan. For more on this scaling test see the post: On Scaling.


Updated: September 22, 2014; Kennedy link fixed February 9, 2018