FWIW, there are about 800–850 of these each year (meaning, single-sitting 1600 scores). The ~500 estimate is erroneous: it applies to the old number of perfect 2400s, but the current 1600 is the equivalent of both a 2390 and a 2400 on the previous scale… which, according to the College Board, was about 850 tests per year when that data was released, around 2014/2015.
One of the controls was for year of sample. Students from a particular year were compared to other students from the same year. REA kids were compared to RD kids from the same year, which only includes years for which REA was available.
As touched on in my post, the strength of the REA boost likely varies across applicants, for a wide variety of reasons. You can get some idea of this variation by comparing the magnitude of the regression coefficient to its standard error, although the standard error also depends on sample size. The specific coefficient and standard error are +1.44 (0.11). This suggests the boost is applied to a large portion of applicants, rather than a strong boost for a rare few and no advantage for anyone else. The coefficient-to-SE ratio is not as large as URM's, so the REA advantage may be applied less consistently than the URM advantage. However, the REA advantage may be more consistently applied than certain other hooks, such as legacy. With a standard error this small, I believe the REA advantage is statistically significant at roughly the 99.99999999999999999999999999999999999999% level, for the baseline sample with ALDC hooks removed.
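To put a rough number on that significance claim, here's a back-of-the-envelope z-test using only the quoted coefficient and standard error (a sketch under a simple normal approximation; the report's own inference is more involved):

```python
from math import erfc, sqrt

# Values quoted in the post: REA coefficient +1.44, standard error 0.11
coef, se = 1.44, 0.11

z = coef / se                    # roughly 13 standard errors from zero
p_two_sided = erfc(z / sqrt(2))  # two-sided p-value, normal approximation
```

A p-value on the order of 1e-39 is what produces that string of nines.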
The referenced model had an r of 77.7%, meaning that it explained 60.4% of the variance in admission decisions. The model could explain the majority of variance in admission decisions, but not everything. The lack of distinction between +/- reader scores is one possible factor, as is the fact that admission officers are not slaves to a simple formula based on scores. They may admit a kid with worse reader scores over a kid with better reader scores for reasons that are not captured by the control variables. It is possible that the admission models are missing a key control that would explain why REA kids with similar ratings, hooks, and other controls have a ~4x higher average admit rate than RD kids; but this seems unlikely given the relatively high r of 77.7%.
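For reference, the r → variance-explained step is just squaring, using the figures quoted above:

```python
# r as quoted in the post; R^2 (variance explained) is simply r squared
r = 0.777
r_squared = r ** 2  # about 0.604, i.e. ~60% of variance in admit decisions
```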
Ah, I see this in the table… and while I’m still not precisely clear on how to translate logit estimates into odds ratios, I think this is the best answer to the OP’s question: it depends entirely on the year.
In Table B.7.1R, for Model 6, the specific coefficients and standard errors for “Early” by year are:
Early X Year=2016 : -0.303 (0.135)
Early X Year=2017 : -0.188 (0.134)
Early X Year=2018 : 0.247 (0.134)
Given the sign changes and the smaller numbers associated with the “by year” break-outs, doesn’t this imply that one can’t really state that REA improves one’s chances at Harvard?
(Frankly, I can’t quite understand how these three numbers line up with the earlier one that is labelled “Early Decision”: 1.44 (0.110).)
The ‘X’ signifies an interaction regression coefficient. You have one coefficient for the influence of applying REA, one coefficient for the influence of the year applied, and an interaction coefficient between REA and year applied. This allows one to estimate whether the REA boost was applied consistently each year or not. There is no interaction coefficient for 2019 because that is the default reference year. You can add up the regression coefficients to get a rough estimate, as summarized below. The analysis suggests that the REA advantage was notably stronger in 2018 than in 2019, on average. This fits with 2018 being the year of the sample in which non-ALDC REA applicants had the highest admit rate (16% REA vs 3% RD). The REA advantage does not appear to be the same degree of advantage every year, but it appears to be a very significant advantage, on average, in each year that was analyzed.
2019: RD = -1.350, REA = (-1.350) + (1.440) -------------- REA is 1.440 more than RD
2018: RD = -1.392, REA = (-1.392) + (1.440) + (0.247) ---- REA is 1.687 more than RD
Average across all 4 REA years – REA is 1.4 more than RD for a 4.0x odds ratio advantage
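The per-year sums above can be sketched in a few lines (a reconstruction from the quoted Table B.7.1R numbers; 2019 is the reference year, so its interaction term is zero):

```python
from math import exp

# Coefficients quoted from Table B.7.1R, Model 6
early = 1.440  # main "Early" coefficient
interaction = {2016: -0.303, 2017: -0.188, 2018: 0.247, 2019: 0.0}

def rea_advantage(year):
    """REA-minus-RD log-odds gap for a given year (2019 = reference)."""
    return early + interaction[year]

for year in sorted(interaction):
    gap = rea_advantage(year)
    print(year, round(gap, 3), "-> odds ratio", round(exp(gap), 2))
```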
To convert a regression coefficient to an odds ratio, raise e to the power of the coefficient. So a regression coefficient of 1.33 = e^1.33 = an odds ratio of 3.78. To convert odds to a probability, take the odds and divide by 1 + odds (note this applies to odds, not odds ratios, unless the baseline odds are 1).
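Those two conversions in code (a quick sketch; odds/(1+odds) turns odds into a probability, which coincides with an odds ratio only when the baseline odds are 1):

```python
from math import exp

def odds_ratio(coef):
    """Logit regression coefficient -> odds ratio."""
    return exp(coef)

def odds_to_probability(odds):
    """Odds -> probability, e.g. odds of 3 (3:1) -> 0.75."""
    return odds / (1 + odds)
```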
There are much higher odds ratios in the Arcidiacono report. Students on the athlete lists had a staggering odds ratio of 5070. So an OR of 3.7 is good, but nowhere near the boost from being an athlete.
The recruited athlete acceptance rate is misleading, because by the time a recruit submits their application they have already been vetted both athletically by the coach and academically by admissions. Many, many student-athletes seek those recruiting slots (full support by a coach) and don’t make the cut either athletically or academically. The real ‘acceptance’ rate, when one considers all of the student-athletes vying for the slots set aside for recruits, is below 10%.
Sorry if I was not clear. The “Athlete List” refers to those sent to Harvard admissions from the varsity coaches. These are students who generally had a pre-read from an academic point of view. Once cleared, they still have to apply formally.
So the applicants in this category generally have 1) coach support and 2) a passed academic pre-read. It still doesn’t guarantee admission, because the full application also includes letters of recommendation and the guidance counselor/school report. I imagine there is a small percentage of students who get rejected at this point because of disciplinary problems or some other issue. That’s why the OR is so high. Not 100%, but pretty darn close.
That’s still much lower than the corresponding percentage for non-athletes, if you want to count people who considered Harvard but didn’t actually apply. That’s why the relevant comparison is among Harvard applicants.