An 'Easy' SAT and Terrible Scores

@sunnyschool

Yeah, tell me about it. I did not want him to take it again. It became a big argument until I decided it wasn’t worth arguing. I gave him my strong advice, but it’s his life to live. If he had NOT retaken it and subsequently gotten rejected from his top choices, who do you think he’d blame? He’d always wonder whether, had he taken it again and gotten, say, a 1580, he would have gotten into his top choices.

A large majority does, actually. Even back in 2010 they did. More than took AP courses.

https://nces.ed.gov/pubs2013/2013001.pdf

And frankly, not everyone has the time or desire or resources to prep endlessly for a one-off college admissions test either.

I’d guess math geniuses probably enjoy math teams and/or online self-study like the Art of Problem Solving, math circles, or whatever is in their community more than they’d enjoy a harder SAT.

“And frankly, not everyone has the time or desire or resources to prep endlessly for a one-off college admissions test either.”

  • No argument there!

“A large majority does, actually. Even back in 2010 they did. More than took AP courses.”

  • Sure, but that doesn't mean equal access for all students. Not to mention the 18% - 31% of high schools that DON'T offer dual enrollment or AP/IB. Should those students be left out of the loop? That was the entire original purpose of using a standardized test administered to everyone - it remedied inequities of opportunity.

“Discouraging retakes disadvantages those kids with test anxiety, of which there are many with real anxiety.”

  • Yes, and those students wouldn't do well via AP or subject testing either. BTW, anxiety can be managed. Believe me when I say that :). Also, this wouldn't "discourage retakes" so much as encourage those who already scored high not to pursue perfection. Big difference. If you bombed your first individualized test because wrong answers kept you from advancing, why wouldn't you retake?

“Philosophically, sure. But practically, not possible. First, how do you determine the “right” test? Think about IQ tests, for example. The basic ones aim for a mean of 100, but they all have different nuances that are better for one kid or another. Then there are numerous other tests that can test the differences between a 132 and a 133 or a 160 and 165+. But those kids are so rare, there is no money to be made developing such tests on a global scale.”

  • Would tend to agree with most of this; however, you need not "tailor" the test to one individual or another. The same overall content can still be administered, just with increasingly difficult material as you progress.

Again, as long as the colleges are continuing to rely on these tests, this is one way to ensure that the tails are more accurate. But yes, totally agree that some of this can also be remedied with subject tests and other demonstrations of ability. Individualized progress was a thought just because I suspect that “equating” is probably relatively poor at the tails of the distribution. But someone who’s been through the equations in the CB manual is free to correct or clarify on that.
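Since individualized progress was my thought, here’s a toy sketch (in Python) of the basic mechanic - a staircase-style adaptive test. Every item, difficulty level, and probability below is invented for illustration; none of it reflects how CB actually builds or scores anything.

```python
import random

random.seed(1)

# Hypothetical item bank: 20 items at each difficulty from 1 (easy) to 10 (hard).
bank = {d: [f"q{d}-{i}" for i in range(20)] for d in range(1, 11)}

def p_correct(ability, difficulty):
    """Chance that a taker of the given ability answers the item correctly."""
    return 1 / (1 + 2 ** (difficulty - ability))

def adaptive_test(ability, n_items=20):
    """Serve a harder item after a right answer, an easier one after a miss."""
    difficulty, asked = 5, []
    for _ in range(n_items):
        item = random.choice(bank[difficulty])
        correct = random.random() < p_correct(ability, difficulty)
        asked.append((item, difficulty, correct))
        difficulty = min(10, difficulty + 1) if correct else max(1, difficulty - 1)
    # Crude "score": the average difficulty level the taker settled at.
    return sum(d for _, d, _ in asked) / n_items

print(adaptive_test(ability=9))  # a strong student drifts up toward the hard items
print(adaptive_test(ability=4))  # an average student settles near the middle
```

Everyone starts from the same content and the test sorts itself toward each taker’s level; real computerized adaptive tests (the GMAT, for example) do the item selection and scoring with far more sophisticated machinery.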

The SAT (and ACT) was never designed to identify small differences in the tails. Equating has nothing to do with that fact.

If CB – really the colleges, since they are the customer, not test takers – wanted more differentiation in the upper tail, CB could easily add a handful of more difficult questions in lieu of easy ones without making the test longer. Of course, the other end of the distribution would pay for it, as the average kid would miss the hard ones and have fewer easy ones to ace.
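One way to put a toy model behind that trade-off: psychometricians describe an item as measuring best near its own difficulty. Here’s a sketch using a simple one-parameter IRT (item response theory) model - my framing, not anything from CB, and all the item difficulties are invented:

```python
import math

def item_information(theta, b):
    """Fisher information of a 1PL item with difficulty b at ability theta."""
    p = 1 / (1 + math.exp(-(theta - b)))
    return p * (1 - p)

# Invented item difficulties on a standard ability scale.
current_form = [-2, -1, -1, 0, 0, 0, 1, 1, 2]   # mix of easy/medium/hard
harder_form  = [-1, 0, 0, 0, 1, 1, 2, 2, 3]     # easy items swapped for hard

for theta in (-1, 0, 1, 2, 3):                  # ability levels, low to high
    cur = sum(item_information(theta, b) for b in current_form)
    hard = sum(item_information(theta, b) for b in harder_form)
    print(f"theta={theta:+d}  current={cur:.2f}  harder={hard:.2f}")
```

Run it and the harder form delivers more information (i.e., measurement precision) at the upper ability levels, while the current mix wins at and below the middle. With a fixed number of items, precision gained at the top is precision given up elsewhere.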

The current SAT is geared to “college readiness” as opposed to aptitude. But you are incorrect that it was never designed to identify small differences in the tails. Prior versions definitely did a much better job with that. Whether most colleges need that at this point is a separate question. They are strongly signaling that they don’t.

Equating really should be accurate for all areas of the curve or else CB completely loses credibility. For the most part colleges don’t require or even recommend subject testing in order to flesh out a math score more thoroughly. If equating isn’t done accurately, then students at the high end truly ARE damaged by a super easy exam and it becomes a matter of luck in your test date. Needless to say, CB does NOT represent equating in this manner!

Which is my point: colleges don’t care for the extra detail, so there is no market for it.

Do you have any statistical evidence that equating does not represent the full spectrum fairly?

Undoubtedly, a big time screw up by CB. Dropping questions (‘unscoreable’) has to impact equating, perhaps at all levels, but noticeably more at the high end? Probably why they have offered a retake at no charge. (I think I read that online.)

Fortunately, such screw ups have been rare, i.e., once every xx years. Sure, June sux…

“Do you have any statistical evidence that equating does not represent the full spectrum fairly?”

Not at all. In fact, the technical manual stresses the importance of accuracy along the entire curve so they seem to prioritize a method that better supports that. They definitely pay attention to the top portion; in particular, two things “avoided” in that region are score gaps and the conversion of more than two raw scores into the same scaled score. So to the extent that these seem to exist on the raw score distribution, they are probably “smoothed” out of the scaled curve (they do prioritize smoothing when the solution appears otherwise). None of that implies any “non-fair” treatment of any particular part of the curve, though it’s likely that the ends get a bit more massaging than does the middle.

The technical manual covers this detail on pages 82 - 90 (Section 6: Equating).
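For anyone who would rather not wade through those pages, here’s a bare-bones sketch of the equipercentile idea at the heart of equating. The manual’s actual procedures (including the smoothing) are far more elaborate, and every score below is invented:

```python
import bisect

def percentile_ranks(scores):
    """Cumulative proportion of test takers at or below each raw score."""
    ordered = sorted(scores)
    return {s: bisect.bisect_right(ordered, s) / len(scores) for s in set(scores)}

# Invented raw scores from two randomly equivalent groups of test takers.
base_form = [28, 30, 31, 33, 34, 36, 38, 39, 41, 43, 45, 48]  # harder base form
new_form  = [30, 33, 35, 35, 38, 40, 42, 44, 45, 47, 50, 52]  # "easier" new form

new_pr = percentile_ranks(new_form)
base_sorted = sorted(base_form)

def equate(raw_on_new):
    """Map a new-form raw score to the base-form score at the same percentile."""
    pr = new_pr[raw_on_new]
    idx = min(max(int(pr * len(base_sorted)), 1), len(base_sorted)) - 1
    return base_sorted[idx]

print(equate(52))  # a top raw score on the easier form ~ a 48 on the base form
print(equate(40))  # a middling 40 on the easier form ~ a 36 on the base form
```

Both equated scores then pass through the base form’s raw-to-scale table, which is how a 40 on an easier form and a 36 on a harder one can land on the same 200-800 number.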

“Undoubtedly, a big time screw up by CB. Dropping questions (‘unscoreable’) has to impact equating, perhaps at all levels, but noticeably more at the high end?”

  • Have no idea how dropping questions impacts equating - that issue is definitely beyond my pay grade LOL.

Am I too late in asking if CB will rescore June SAT?

They won’t IMO. The best you can hope for is a free October test.

Above my pay grade, too! But, to my way of (simplistic?) thinking: if the two unscored items happened to both be difficult problems - and they likely were, since this was an ‘easier’ test - then not only does CB have a problem differentiating the high tail, but the remaining difficult questions should also take on a higher ‘equating’ value.

@acadecathlete my DD got 2 wrong on reading and 2 wrong on writing on the June SAT - just one fewer than you. Her verbal score was 770 and yours was 720. 50 pts for one question? Seems worse than the Math grading curve.

^^ Maybe another way of saying the same thing as @bluebayou in #169 is that removing two difficult questions would cause a steeper decline in the scaled scoring since a relatively higher percentage of “easier” - and correctly answered - questions remain. In that case a dumb mistake or two can really hose you.
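A back-of-the-envelope version of that “steeper decline” (toy numbers only - with no QAS, the real June conversion table isn’t public):

```python
# Each EBRW section's "test score" runs roughly 10-40, and the two test
# scores are summed and multiplied by 10 for the 200-800 EBRW score.
# These are crude averages; real conversions are nonlinear, so the top
# end can be steeper still.
SCALED_POINTS = 40 - 10

for n_scored in (44, 42):  # e.g., Writing: 44 questions vs. 42 after dropping two
    per_question = SCALED_POINTS / n_scored * 10  # in 200-800 EBRW terms
    print(f"{n_scored} scored questions -> ~{per_question:.1f} EBRW points per raw point")
```

Roughly 6.8 points per question becomes roughly 7.1 - and if the dropped items were indeed hard ones, as speculated above, a single careless miss up top can cost well more than that average, which would square with the 50-point swing @Winter2018 describes.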

Is there a way to know exactly which questions were removed? Did someone earlier post that no score verification or validation will be offered for the June test?

@Winter2018 at #170 - and weren’t two questions also omitted for EBRW? (can’t remember if that was in the reading or writing section or both . . . ).

Is there a connection between a higher-than-usual number of omitted questions and the harshness of the curve?

^My understanding is that two were omitted in each of reading and writing, but nothing from math.

The QAS (that would include the questions) is not available for June.

I can’t imagine they could rescore it. I haven’t seen anything suggesting a free retake - that would be an admission by CB that there was something wrong with the June test, which I doubt will ever happen.

The question I have goes back to the technical manual and the need for the tests to be as parallel as possible in difficulty in order for equating to be effective. At some point, I imagine differences in difficulty would become too great. It would be interesting if some psychometric expert could explain this to us.

Ultimately, all this controversy does is reinforce concerns about quality at the College Board under Coleman. It seems to me that writing the new test in-house was a mistake; CB should have left it to ETS.

@JBStillFlying yes 2 were unscorable on reading and 2 on writing. Her report showed 0 omitted, and did not mention unscorable at all, but she said the correct and incorrect did not add up to however many there should be (I think one was 44?) and the difference was 2 in each section.

Your explanation makes sense, but I still feel bad for the kids.

“I can’t imagine they could rescore it. I haven’t seen anything suggesting a free retake - that would be an admission by CB that there was something wrong with the June test, which I doubt will ever happen.”

@evergreen5 - correct, because it would make no sense to rescore it. The equating was highly likely to have been done completely correctly. The problem - assuming there is a problem - would originate elsewhere, and not in the scoring methodology.

“The question I have goes back to the technical manual and the need for the tests to be as parallel as possible in difficulty in order for equating to be effective.”

  • Yes. To get nit-picky, the sampling etc. was probably fine. Test construction itself obviously wasn't, if it turned out that four questions were "unscorable" (whichever section(s) happened to be hit). What's curious is that Reading has its own equating, as does Writing. So why on earth did they pair two tests together, each with multiple unscorable questions? Makes no sense.

The technical manual has a little blurb on page 88:

“The characteristics of a new raw-to-scale score conversion are influenced by several factors that are predisposed before equating, such as the range and the distribution of test takers’ ability, the quality of form construction, the characteristics of the base score scale, the magnitude of the difference in difficulty between the base and new forms, and the quality of form spiraling in case of the RG [random grouping] design.”

Difficult to translate from CB-speak, but “quality of form construction” might refer to the overall clarity and answerability of the questions they put in a particular test. If they ended up yanking two on Reading and another two on Writing, those tests had crappy questions and weren’t constructed properly, because such an elimination can easily change the balance of easy/difficult. Caveat: it might also refer to a larger issue about how well “Reading” or “Writing” measures the things they want to measure; however, that seems more of a “design” issue than a “construction” issue. If I’m reading the technical manual correctly, “construction” refers to how the form (i.e., test) is put together. By the way, in case anyone was wondering, “spiraling” refers to how the tests were assigned to sample groups (i.e., the schools involved in the sampling study) during the original field studies.

“At some point, I imagine differences in difficulty would become too great. It would be interesting if some psychometric expert could explain this to us.”

  • Amen!

“Ultimately, all this controversy does is reinforce quality issues of the College Board under Coleman. It seems to me that writing the New test in-house was a mistake; CB should have left it to ETS.”

  • Were these problems non-existent with ETS? Back in the dark ages, when I took the ETS-administered SAT, QAS didn't exist, so it's hard to compare. Unfortunately, we are living through CB's "learning curve" (which, speaking of "steep", is not as steep as it should be). Some of these earlier forms might be pretty crappy, and until CB pulls them out of service they will probably torment another group of testers someday.

Not to float a “conspiracy theory,” but how CONVENIENT that QAS isn’t available for the June test date, when a larger-than-usual number of questions were found to be “unscorable”.

I’m not a cheerleader for the ACT but I don’t recall any of these issues with it…

“yes 2 were unscorable on reading and 2 on writing. Her report showed 0 omitted, and did not mention unscorable at all, but she said the correct and incorrect did not add up to however many there should be (I think one was 44?) and the difference was 2 in each section.”

@Winter2018, the “unscorable” marking might only show up when you order test verification. My son did that for the March 2018 test, and once the payment was processed he was able to click a new link called “Test Questions” on his online report and see which ones he missed (the booklet showed up a few weeks later). “Unscorable” is definitely included in the scoring key: it’s a U. There’s also a green check for correct and a null (a zero with a slash) for omitted. Anything answered incorrectly shows the correct answer - the right letter for bubbled questions or the right number for grid-ins.

@OHMomof2: the ACT is the reason colleges have dropped the Essay requirement! The degree to which they kept messing up the scoring was remarkable.

Really? I hadn’t read that anywhere - where did you?