SAT CURVE

Can someone please explain how the curve works?
Is the curve predetermined, or determined after students take the test?

It’s determined after students take the test. From there, they judge how people did and the difficulty of each section (CR, M, and W). Then the curve is released. If the CR was relatively easy, for example, one question wrong might bring you down to a 790, two wrong to a 780 or 770, and so forth.

If the CR was hard, then the curve would be more lenient. Instead of 3 wrong bringing you down to a 760 or 770, you might still have a 790, or, with 2 wrong, an 800.
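To put numbers on that contrast, here is a tiny sketch; the figures are hypothetical, just echoing the examples above, not real released curves:

```python
# Hypothetical raw-to-scaled tables echoing the examples above (not real curves).
# Each dict maps CR questions wrong to a scaled score.
easy_cr_curve = {0: 800, 1: 790, 2: 770, 3: 760}  # harsh: every miss costs points
hard_cr_curve = {0: 800, 1: 800, 2: 800, 3: 790}  # lenient: early misses forgiven

for wrong in range(4):
    print(f"{wrong} wrong -> easy test: {easy_cr_curve[wrong]}, "
          f"hard test: {hard_cr_curve[wrong]}")
```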

It’s a pattern for the most part, and can be predictable.

Thanks!

@FutureDoctor2028 -

Do you have a source for this? In my experience, recycled tests have kept the same curve as the initial sitting, suggesting that curves are predetermined.

This too: Erikthered’s data suggests that there is no such pattern; the array of curves and sittings shows harsh curves appearing with something approaching genuine randomness.

@marvin100 But May’s test was not recycled for some reason.

Which May test? The one for Korea/Taiwan/HK was recycled, I believe.

@sonofgod908 - I was right, btw. May’s Korea/HK/Taiwan test was the Dec ’14 / Nov ’13 US exam (pretty sure I got those dates right).

@marvin100

Do you happen to know what they used for the curve? I have always assumed that once they establish the curve the first time they use a test, the curve is then “set” for good. To me, that seems like the logical thing to do. But I don’t know for sure whether that is the case. So for students taking a domestic, first-time-offered SAT, I feel confident saying that the curve is not set until after the test. International, I’m not sure…

And I was under the impression that the curve was determined before the test. There is a whole document the College Board has available online somewhere that describes the process of “equating” and how they arrive at the curve. And I am pretty sure that it is based on their judgment of the slight differences in the level of difficulty between sections, given the particular spread of questions that appears on each test, not on how students actually perform on the test. I never thought that made great sense, but that was my understanding.

Just pulled this from the CB website, but this is not from the full document that I had once seen…

“In our statistical analysis, equating adjusts for slight differences in difficulty between test editions and ensures that a student’s score of, say, 450 on one edition of a test reflects the same ability as a score of 450 on another edition of the test. Equating also ensures that a student’s score does not depend on how well others did on the same edition of the test.”

OK, I found the document (link below). It is old, so maybe the process is not exactly the same anymore. I just skimmed it, but it seems to suggest that the curve is determined afterwards, though not based on how students perform in general on that test. They use the experimental section to anchor the test to a previous test that had the same experimental section (the explanation in the document is long and confusing) and then “equate” the tests accordingly. So I would guess that once they judge the level of difficulty of the new test using that process, the curve would remain the same if they ever recycled that test.
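If I’m reading the research note right, the logic behind that anchoring is roughly the following. This is only a sketch with invented numbers and an invented function name, not the College Board’s actual procedure, which uses far more sophisticated statistical machinery:

```python
# Rough sketch of the anchor-section idea described above. All numbers and the
# function name are invented; the real procedure in the research note is far
# more sophisticated than simple mean differences.

def estimate_form_difficulty_shift(anchor_old, anchor_new, total_old, total_new):
    """Split a score change into a group-ability part and a form-difficulty part.

    anchor_*: mean raw score on the shared equating items
    total_*:  mean raw score on the full test form
    """
    group_shift = anchor_new - anchor_old    # stronger or weaker testing population
    observed_shift = total_new - total_old   # total movement in raw scores
    return observed_shift - group_shift      # leftover attributed to the new form

# Equal anchor performance but lower overall scores: the new form is harder,
# so its raw-to-scaled conversion is made correspondingly more lenient.
shift = estimate_form_difficulty_shift(anchor_old=7.5, anchor_new=7.5,
                                       total_old=42.0, total_new=39.0)
print(f"difficulty shift attributed to the new form: {shift:+.1f} raw points")
```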

The main point, if I understand this all correctly, is that the curve is not determined by the performance of everyone who takes the test. The curve just ensures that a 700 on one test equals a 700 on another, so it is really something that students should just ignore!

Whoops, here is the link:

https://research.collegeboard.org/sites/default/files/publications/2012/7/researchnote-2001-14-ensuring-comparable-scores-sat.pdf

Yep, this.

It took me a while to understand the process. A while back, @Fignewton explained it and now it totally makes sense. This is my internalized version of what he said:

Suppose, as a physics teacher, I wrote a brand-new final exam every year (ha!) but I wanted to make sure that it was no harder or easier to get an A each year. In a year where the scores were extra high or extra low, how could I know whether it was the result of the new test’s difficulty level or the result of a stronger or weaker testing population? But suppose I embedded, say, 10 questions that I re-used each year. By seeing how each year’s group did on those questions compared to previous groups, I’d know how their skill level compared. If they did as well (or better) on those 10 “equating questions” but then worse on the test overall, I’d know that it was the test that was harder than in previous years.
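With invented numbers, that comparison looks something like this:

```python
# Invented numbers for the classroom analogy above.
equating_last, equating_this = 81.0, 81.0  # % correct on the 10 re-used questions
overall_last, overall_this = 78.0, 70.0    # % correct on the whole final

skill_gap = equating_this - equating_last  # 0.0 -> the two classes are comparable
score_gap = overall_this - overall_last    # -8.0 -> yet overall scores dropped

# Same skill on the equating questions plus lower overall scores means the new
# exam itself is harder, so the A cutoff should be lowered to compensate.
print(f"estimated test-difficulty effect: {score_gap - skill_gap:+.1f} points")
```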

Also, I would not have to keep re-using the same 10 equating questions. Each year, I could swap in some new ones as long as I could compare the new group to the previous generation (and in turn the generations before that). I believe that is what they mean by the “pedigree” of an equating section.
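And the chaining might look like this, again with invented numbers:

```python
# Sketch of the "pedigree" idea: each year shares some equating questions with
# the year before, so group comparisons can be chained across generations even
# when no single question survives from the first year to the last.
ability_shift = {
    ("year1", "year2"): +0.5,  # estimated via questions shared by years 1 and 2
    ("year2", "year3"): -0.2,  # estimated via questions shared by years 2 and 3
}

# Chain the links to compare year 3's group to year 1's, no common questions needed:
total_shift = ability_shift[("year1", "year2")] + ability_shift[("year2", "year3")]
print(f"year 3 group vs year 1 group: {total_shift:+.1f} raw points")
```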

When you think of it this way, it becomes clear that the curve can’t be set until after the new group takes the test. You need to be able to compare overall results to equating section results in order to set a curve. But then suppose I had 3 kids who were absent from the final. I may choose to give them a final from a few years ago. But I would have no reason to re-set the curve. The process that had already been done would be sufficient for the equating process. And besides, 3 kids would not be a large enough sample to establish a scale anyway. That’s why I suspect that re-used international SATs still retain their curve from when they were first given.

@Fignewton does not hang out here much any more so I can’t say for sure that I am right, but this is how I think of it anyway.

@pckeller that is a great explanation, and now I feel pretty confident that I understand the process. I was in the habit of thinking that the curve was predetermined in the sense that it was not dependent on the crop of students who take the test. But yeah, obviously it needs to be done afterwards, and then only once: as you say, once the exam has been equated, it doesn’t need to be done again.

Excellent analogy you gave there!