Wednesday, October 29, 2008

trendlines

I have been a fan of http://www.fivethirtyeight.com/ for a couple of months now. Being numbers-oriented, I have thoroughly appreciated the systematic analyses the authors perform on poll numbers. The quantitative sophistication and rigor are at a level rarely seen in political analysis reported to the general public.

So I was pretty taken aback by one of today's posts: "In Oregon, Turnout is Down, But Especially in Red Counties." Here is the plot toward the bottom of the post:

Forget what the graph is plotting; what the heck is that line?? Nate Silver, whom I normally greatly admire, goes on to very proudly explain:


"Note also that I have extended the regression line to predict the behavior of hypothetical counties consisting entirely of Bush voters, or entirely of Kerry voters. In [sic] regression line predicts that, in a county consisting of 100 percent Bush voters, turnout would be off by about 40 percent. Conversely, in a county consisting entirely of Kerry voters, it would be essentially unchanged."


Thanks, Nate, for making what you did in 5 minutes in Microsoft Excel sound sophisticated. Thanks, too, for explaining to me how to read a line on a graph. However, the line you refer to comes nowhere close to tracking the green dots. Fitting a straight line to that data is just plain wrong. Did you check the R^2 value of that fit, Nate?

My qualitative interpretation is that the data are better described by a step function. The change in turnout vs. 2004 is flat for every county where Bush's share of the vote is under ~55%; above that share, there is a precipitous drop. So the more interesting question to me is why there would be a threshold around 55%. I didn't need to hit the "Trendline" button in Excel to tell me that.
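For anyone who wants to check this kind of claim with numbers rather than by eye, here is a rough sketch in Python. It fits both a straight line and a one-threshold step function to the same points and compares the R^2 values. I don't have the county data from the post, so the points below are synthetic, made up to have a step near 55%. The function names, the noise level, and the size of the drop are all my own assumptions, not anything from fivethirtyeight.

```python
import numpy as np


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least-squares straight line; returns fitted values."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope * x + intercept


def fit_step(x, y):
    """Best single-threshold step function by brute-force search.

    Tries every midpoint between neighboring x values as the threshold,
    predicts the mean of y on each side, and keeps the split with the
    smallest sum of squared errors.
    """
    xs = np.sort(np.unique(x))
    candidates = (xs[:-1] + xs[1:]) / 2.0
    best = None
    for t in candidates:
        left = x < t
        y_hat = np.where(left, y[left].mean(), y[~left].mean())
        sse = np.sum((y - y_hat) ** 2)
        if best is None or sse < best[0]:
            best = (sse, t, y_hat)
    return best[1], best[2]


# SYNTHETIC data (an assumption, NOT the Oregon county numbers):
# flat turnout change below a 55% Bush share, a drop of 0.25 above it.
rng = np.random.default_rng(0)
x = np.linspace(0.30, 0.80, 36)  # hypothetical Bush vote share per county
y = np.where(x < 0.55, 0.0, -0.25) + rng.normal(0.0, 0.03, x.size)

r2_line = r_squared(y, fit_line(x, y))
threshold, y_step = fit_step(x, y)
r2_step = r_squared(y, y_step)

print(f"linear fit R^2: {r2_line:.3f}")
print(f"step fit   R^2: {r2_step:.3f} (threshold near {threshold:.2f})")
```

One caveat: the step function has three free parameters (threshold plus two levels) against the line's two, so some improvement in R^2 comes for free. The comparison is only suggestive, and plotting the residuals of each fit would be a sensible next check.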

For the first time, I am disappointed in http://www.fivethirtyeight.com/.

2 comments:

geekhiker said...

All this math... I think my head just exploded...

Hadley said...

Agreed--I'm a postdoc at Caltech, and we got an Institute-wide email a few months ago directing us to this site for high-quality, quantitative analysis.

I won't ding him too much since this graph is not representative of most of the content on the site, but yeah...that "fit" is outrageously bad.