<p>Review books say that “calculator talk” is not what graders want to see, but I looked back at the scoring guidelines and they mention that you can write down what you put in your calculator if, like jerrry4445 said, you state the parameter. Personally, I write everything out because the formulas are given and I can just plug in the numbers to get full credit.</p>
<p>You might want to take a look at
AP Central - AP FAQs (Supporting Students from Day One to Exam Day – AP Central | College Board)</p>
<p>This is a response by Chris Olsen to a question about using the notation normalcdf. Olsen says “calculator-speak” is always risky, and says that “this notation is not consistent with the mathematical definition of the normal cumulative distribution function.”</p>
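<p>For anyone wondering what a calculator's normalcdf actually computes, here is a minimal Python sketch. The function name <code>normal_prob</code> and the numbers (mean 100, SD 15) are made up for illustration; the point is that the answer is a difference of two values of the normal cumulative distribution function, which is what you would write out along with the distribution and its parameters.</p>

```python
from statistics import NormalDist

def normal_prob(lower, upper, mean, sd):
    """Return P(lower < X < upper) for X ~ Normal(mean, sd)."""
    dist = NormalDist(mu=mean, sigma=sd)
    # The probability of an interval is a difference of two CDF values.
    return dist.cdf(upper) - dist.cdf(lower)

# Hypothetical example: X ~ Normal(mean = 100, SD = 15).
p = normal_prob(85, 115, mean=100, sd=15)
print(f"P(85 < X < 115) = {p:.4f}")
```

<p>On paper this would read something like “X is normal with mean 100 and SD 15, so P(85 &lt; X &lt; 115) = P(–1 &lt; Z &lt; 1) ≈ 0.68,” rather than just the calculator command.</p>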
<p>Will see if I can hunt down a little more info–the guidelines I posted earlier were ones that we had developed, based on scanning graded FRQ’s on the College Board site.</p>
<p>(The one above is faqid 1156.)</p>
<p>A really useful source is the AP Statistics Teacher Guide, which is available at
AP Central - AP Statistics Course Home Page (AP Statistics Course – AP Central | College Board)</p>
<p>It’s the second item on the list. The full guide is 260 pages. However, the most useful part, in terms of understanding what the graders are looking for, is just 11 pages, from p. 226 to p. 236 of the download. These pages are numbered 212 to 222, if you are looking at the page numbers themselves.</p>
<p>There are also some useful examples on pages numbered 24 to 38 and 55 to 56. (Add 14, to get the corresponding page numbers of the download.)</p>
<p>Most of the document you don’t need to read–it has sample syllabi and suggestions for AP Stats teachers.</p>
<p>The scoring guidelines for the FRQ’s on the 1997 released AP Stat Exam are available at
AP Central - Statistics – Previously Released Materials (Supporting Students from Day One to Exam Day – AP Central | College Board).
Pages 62, 66, 70, 74, and 79 of the download provide scoring criteria. (These are numbered as pages 58, 62, 66, 70, and 75 at the bottom of the pages of the document.)
Scoring is “holistic.” This means that your commentary in English winds up being pretty important to the assessment–CB states that communication and statistical knowledge are weighted equally in scoring the FRQ’s.</p>
<p>I think byubound suggested a reliable approach in #21.</p>
<p>One of the useful things I found is to always state the obvious. Even things like “males have a higher mean score than females” or summat is considered statistical analysis.</p>
<p>Agree with EightforEight.</p>
<p>In terms of the style of answer the graders are looking for, a comparison of responses and scores on pages 76-78 of the download of the 1997 released exam is useful. One of the students who scored a 4 on the question computed n<sub>1</sub>p<sub>1</sub>, etc. explicitly, even though all of these values are larger than 190.</p>
<p>Not specifically related to the FRQ, but a few points that might be useful to note. (Some of these are obvious–I haven’t tried to weed through the info.)</p>
<p>Correlation does not imply causation.
Correlation (or the correlation coefficient) measures the strength of linear relationships (only).
Changing units does not change the correlation.
The correlation is not affected by interchanging which variable is labeled X and which is labeled Y.
r always lies in the range [–1, 1]. Watch for the possibility of negative r values if you have r^2 and the question refers to r.
The slope of the least squares regression line is r s<sub>y</sub>/s<sub>x</sub>, where s<sub>y</sub> is the sample standard deviation of the y values and s<sub>x</sub> is the sample standard deviation of the x values. If you plot z-scores for the y variable vs. z-scores for the x variable, the slope of the regression line is r.</p>
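<p>Several of the facts above can be checked numerically. The following is a small Python sketch on made-up data (the helper names are my own, not from any review book): it confirms that the least squares slope equals r times s_y/s_x, that swapping X and Y or changing units leaves r unchanged, and that the regression of z-scores on z-scores has slope r.</p>

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def z_scores(values):
    m, s = mean(values), sample_sd(values)
    return [(v - m) / s for v in values]

def correlation(x, y):
    # r is the sum of products of z-scores, divided by n - 1.
    return sum(zx * zy for zx, zy in zip(z_scores(x), z_scores(y))) / (len(x) - 1)

def ls_slope(x, y):
    # Least squares slope computed directly from the deviations.
    mx, my = mean(x), mean(y)
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))

# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x, y)
slope = ls_slope(x, y)
slope_from_r = r * sample_sd(y) / sample_sd(x)     # r * s_y / s_x
r_swapped = correlation(y, x)                      # interchange X and Y
r_rescaled = correlation([100 * a for a in x], y)  # change the units of x
z_slope = ls_slope(z_scores(x), z_scores(y))       # regression on z-scores

print(f"r = {r:.4f}, slope = {slope:.4f}, r*s_y/s_x = {slope_from_r:.4f}")
print(f"r after swapping = {r_swapped:.4f}, after rescaling = {r_rescaled:.4f}")
print(f"slope of z-score regression = {z_slope:.4f}")
```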
<p>When examining residuals,
Check whether or not the residuals are randomly distributed (roughly)
Look at the pattern of positive and negative residuals
Compare the size of the residuals with the associated y values</p>
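<p>As a rough illustration of looking at the pattern of positive and negative residuals, here is a Python sketch that fits a least squares line to made-up curved data (y = x^2) and prints the sign of each residual. The data and the function name <code>fit_line</code> are hypothetical.</p>

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least squares line for y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

# Made-up curved data: y = x^2, so a straight line is the wrong model.
x = [1, 2, 3, 4, 5, 6]
y = [a ** 2 for a in x]

intercept, slope = fit_line(x, y)
# Residual = observed y minus predicted y.
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
signs = ["+" if e > 0 else "-" for e in residuals]

for a, e, s in zip(x, residuals, signs):
    print(f"x = {a}: residual = {e:6.2f} ({s})")
```

<p>The residuals come out positive at both ends and negative in the middle. A run of signs like that, instead of a random mix, is the kind of pattern that suggests a linear model is not appropriate.</p>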
<p>Quick question: how does the removal of an outlier affect r^2? And the removal of an influential point?</p>
<p>Removal of an outlier (a point with a large residual) should increase r^2.
Note that a point might be an outlier in terms of its x value relative to the other x values and in terms of its y value relative to the other y values, and yet not be an outlier in the context of regression, because it lies near the straight line fit.</p>
<p>When you remove an influential point, I think the situation is not so clear. Removal of an influential point would sharply change the regression line. However, it is possible for an influential point to have a small residual–in which case, I think that r^2 could actually drop when you remove it.</p>
<p>I am going on logic and recollection, here–so anyone who has corrections/refinements should please jump in.</p>
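<p>A quick numerical check of the two cases, as a sketch on made-up data (not from any exam or review book). The first data set has a point with a large residual in the middle of the x range; the second has a weakly related cluster plus one far-out point that lies close to the fitted line.</p>

```python
def r_squared(points):
    """Return r^2 for a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    return sxy ** 2 / (sxx * syy)

# Case 1 (made-up): nearly linear data plus an outlier with a large residual.
linear = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.9), (5, 5.1), (6, 6.0)]
outlier = (3.5, 8.0)
r2_with_outlier = r_squared(linear + [outlier])
r2_without_outlier = r_squared(linear)

# Case 2 (made-up): a scattered cluster plus an influential point far out
# in x that ends up with a small residual.
cluster = [(1, 3), (2, 1), (3, 4), (4, 2), (5, 3)]
influential = (20, 20)
r2_with_influential = r_squared(cluster + [influential])
r2_without_influential = r_squared(cluster)

print(f"Outlier:     r^2 with = {r2_with_outlier:.3f}, without = {r2_without_outlier:.3f}")
print(f"Influential: r^2 with = {r2_with_influential:.3f}, without = {r2_without_influential:.3f}")
```

<p>In this particular example, removing the outlier raises r^2 and removing the influential point lowers it sharply. Other data sets can behave differently, so this only shows that a drop is possible, not that it always happens.</p>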
<p>^Alright, that’s what I figured for an outlier. I’m not as sure about the effect of removing an influential point though. Logically, I would guess that removing it also increases r^2, but I could have sworn I read somewhere/did a practice exam in which it said that r^2 increased.</p>
<p>I’ll chime in; speaking of outliers, how do you identify them in a regression model? For a 1-Var situation it’s just anything more than 1.5(Q3-Q1) below Q1 or above Q3, but for the regression model how do we get it?</p>
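<p>For reference, the 1-Var rule can be sketched in Python as follows. The data are made up, and the quartiles here are taken as the medians of the lower and upper halves of the sorted data (excluding the middle value when n is odd), which is an assumption; other quartile conventions can give slightly different fences.</p>

```python
def median(sorted_vals):
    n = len(sorted_vals)
    mid = n // 2
    if n % 2:
        return sorted_vals[mid]
    return (sorted_vals[mid - 1] + sorted_vals[mid]) / 2

def iqr_outliers(data):
    """Return (outliers, (lower_fence, upper_fence)) using the 1.5*IQR rule."""
    vals = sorted(data)
    n = len(vals)
    half = n // 2
    lower_half = vals[:half]
    # Skip the middle value when n is odd.
    upper_half = vals[half + 1:] if n % 2 else vals[half:]
    q1, q3 = median(lower_half), median(upper_half)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    flagged = [v for v in vals if v < low_fence or v > high_fence]
    return flagged, (low_fence, high_fence)

# Made-up data, for illustration only.
data = [2, 4, 5, 6, 7, 8, 9, 10, 25]
outliers, fences = iqr_outliers(data)
print("fences:", fences)
print("outliers:", outliers)
```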
<p>A reference stating that the presence of influential points may increase the correlation coefficient:
AP Statistics Tutorial: Residuals, Outliers, and Influential Points (Residual Analysis in Regression)</p>
<p>Also, I’m pretty sure that Barron’s agrees on this point. Although you might often have an increase in the correlation coefficient when you remove an influential point, it’s not guaranteed. (Correction: I really mean an increase in r^2, in both spots.)</p>
<p>As far as identifying outliers in a regression model goes: Other than “outliers have large residuals,” at the moment, I can’t come up with a formula analogous to the 1-Var one. Anyone else know?</p>
<p>^^remember that that formula is only a very rough estimate. there is no such formula for a two-variable [regression] model. you just have to eye-ball it to see if it ‘falls outside the general pattern’. [yuck to the word eye-ball D:]</p>
<p>oh my lawdd, i just knew the answer to something.</p>
<p>~am i the only one who is not very far in her reviewing?! chapter 4 out of like 13. sheeet. i’ll be cramming o’plenty the weekend before :/</p>
<p>Not the answer to anything, but a couple of additional observations (again, some obvious).</p>
<p>An influential score may have a small residual, yet still have a greater effect on the regression line than scores that have larger residuals. Points at the extremes of the x range are often influential scores.</p>
<p>If a scatterplot shows a distinctive curved pattern, or if the residuals show a distinctive pattern rather than random scatter, a nonlinear model for the data may be best. The nonlinear model can sometimes be obtained by transforming one or both of the variables, and then looking for a linear relation between the transformed variables. </p>
<p>Examples of transformations to obtain a linear relation:<br>
Take the logarithm of the y data.<br>
Take the logarithm of the y data and the logarithm of the x data.<br>
Take the square root of y.<br>
Take the reciprocal of y.<br>
Leave y alone, but take the logarithm of x.</p>
<p>(In applications in science and engineering, there would usually be some
logical reason for checking for a linear relation between some functions f(x) and g(y).)</p>
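<p>To make the first transformation on the list concrete, here is a Python sketch with made-up exponential data (y = 2 · 3^x). Against x, the raw y values are clearly not linear, but log(y) is exactly linear in x, and the fitted slope and intercept recover the base and the multiplier. The helper <code>line_and_r</code> is a hypothetical name.</p>

```python
from math import log10, sqrt

def line_and_r(x, y):
    """Return (intercept, slope, r) for the least squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope, sxy / sqrt(sxx * syy)

# Made-up exponential data: y = 2 * 3^x.
x = [0, 1, 2, 3, 4]
y = [2 * 3 ** a for a in x]

_, _, r_raw = line_and_r(x, y)
log_y = [log10(b) for b in y]
intercept, slope, r_log = line_and_r(x, log_y)

print(f"r for (x, y):      {r_raw:.4f}")
print(f"r for (x, log y):  {r_log:.4f}")
# Undo the transformation: log y = intercept + slope * x
# means y = 10^intercept * (10^slope)^x.
print(f"recovered model: y = {10 ** intercept:.2f} * {10 ** slope:.2f}^x")
```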
<p>No worries, thisgirlisaG, as long as you’ve covered the chapters once and you’re reviewing, there is plenty of time.</p>
<p>haha, thanks, quantum. as for the transformations…we spent very little time on that section of the book. i don’t even think we got quizzed/tested on it. is there usually a lot [or, an open-ended question] on the test about those?</p>
<p>btw, thanks for sharing all of your awesome information, guys!</p>
<p>Barron’s covers the topic. My personal impression is that it would be good to be able to recognize that a nonlinear fit is appropriate, either from a scatter plot or from the pattern of the residuals. However, it wouldn’t make sense in the context of an FRQ for you to have to come up with a nonlinear fit–too many options to try, and too high odds of some students lucking out with a guess, and others not getting it. I could see the possibility of a question that suggested a particular transformation, vs. the simple linear fit. Princeton Review is usually pretty accurate about the level and type of the exam; Barron’s is a bit harder, but good prep if you understand it. Anyone know what Princeton says about this issue?</p>
<p>Soooo wait, what if we think something in a regression model is an outlier and the test graders don’t? If our “eyeballings” don’t agree with their “eyeballings” we get marked off?</p>
<p>Wow, the Stat exam is one week from today. We really need to cram it all in now. Any more helpful tips? :)</p>
<p>Chapter 5 of Barron’s:
Not too much that is noteworthy, for this crowd. Review vocab: “two-way contingency table, marginal frequencies, marginal distribution, lurking variables.”
It’s worthwhile to review Simpson’s paradox (although I don’t know if that’s part of the AP syllabus–you probably know). Info can be found on pages 116 and 117 of Barron’s. Other examples are given on p. 108-109 of Barron’s and in question 12 on p. 112 (answer on p. 113).</p>
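<p>For a quick feel for Simpson’s paradox, here is a Python sketch using a made-up two-way table (these counts are invented for illustration and are not the Barron’s example). Method A has the higher success rate within each difficulty level, yet method B has the higher rate once the levels are combined, because difficulty acts as a lurking variable.</p>

```python
# Made-up counts: (successes, total) for each method at each difficulty level.
table = {
    "A": {"easy": (18, 20), "hard": (24, 80)},
    "B": {"easy": (64, 80), "hard": (4, 20)},
}

def rate(successes, total):
    return successes / total

def overall_rate(groups):
    # Combine the groups by adding up successes and totals (marginal counts).
    successes = sum(s for s, _ in groups.values())
    total = sum(t for _, t in groups.values())
    return successes / total

rates = {
    method: {level: rate(*counts) for level, counts in groups.items()}
    for method, groups in table.items()
}
overall = {method: overall_rate(groups) for method, groups in table.items()}

for method in table:
    print(f"Method {method}: easy = {rates[method]['easy']:.0%}, "
          f"hard = {rates[method]['hard']:.0%}, overall = {overall[method]:.0%}")
```

<p>With these counts, A wins in both rows (90% vs. 80% on easy cases, 30% vs. 20% on hard cases) but loses overall (42% vs. 68%), because A was used mostly on hard cases and B mostly on easy ones.</p>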