How Good Was 538? (theatlantic.com)
71 points by robg on Nov 6, 2008 | hide | past | favorite | 43 comments



That portfolio.com link is misleading because it only shows the EV totals. If we assume, as he does, that McCain wins MO, then the betting markets got the right EV total only because their two state errors canceled each other out, whereas 538's total was off because Nate missed just one state. So which one was actually more accurate?

I posted the predictions broken out by state here on HN the night before the election: http://news.ycombinator.com/item?id=353182


I tend to agree that 538 was better; I shouldn't have editorialized that the race was too close to call. I really just meant to summarize the two articles I'd come across that were on exactly this topic.

Thanks for collecting that data.


Wow, that BayesianInvestor link is completely insane. According to the guy's own data, on Oct 1 Intrade predicted a 51% chance of Democratic victory in North Carolina, while FiveThirtyEight predicted a 50% chance. In other words, their predictions were identical to within the margin of error. Here's how he characterizes that:

Intrade got North Carolina right on that date (just barely) while FiveThirtyEight rated it a toss-up.

Uh, yeah.

As for the fact that the trading markets are, to within a small margin of error, perfectly reflective of the best possible sources of empirical data (e.g. FiveThirtyEight, as of the night before the election): Yawn. Google "efficient market hypothesis". Or read up on the concept of "equilibrium".


The author of that blog and I both know what the EMH and equilibrium are, and the problem is not as simple as you claim it is. Please don't be condescending.

While I agree that he seems to have overvalued the 1% difference between Intrade and 538 re: North Carolina, do note that he concludes that 538 was the better election predictor. That one sentence does not make the whole article "completely insane", and I'd like to know what else from the article you thought was insane.

Furthermore, your last paragraph is completely and utterly incoherent. For one thing, you claim that a complicated, proprietary, untested statistical model, whose data is gathered by a person of unknown sympathies, represented the best available model of the empirical data. (How can you even argue that now, after a single election result?)

Then you claim that markets should be efficient in reflecting that data, except that they had quite large differences in some states. Which somehow adds up to "yawn".

Color me unimpressed.


"by a person of unknown sympathies"

That's not quite true. 538 was run by a self-confessed Obama-phile.


Electoral-vote.com was pretty accurate, too.

http://electoral-vote.com/evp2008/Pres/Maps/Nov03.html

Chris Bowers wrote, "So much information is publicly available now that a few nerds obsessed with poll numbers are much better sources for election information than you will ever get from big media." http://www.openleft.com/showDiary.do;jsessionid=16AE5E72EE66...


Yeah, Electoral-vote's results were pretty much the same as 538's. And I consider Tanenbaum's site better overall: while it's not as pretty, he's completely open about how he arrives at his results and makes all the data he uses available.


I also like that it is heavy on data.

Also, I get tempted by their master's program that I don't qualify for.


Meta news.yc comment: when 538 called it for Obama, the thread was killed: http://news.ycombinator.com/item?id=354128

That was a mistake. People paying attention knew 538 was right, and consistently better than the major news outlets.

Also, I feel the geek behind 538 deserved to be the top story, not some play-it-safe news outlet.


I thought it was killed for being politics.

Unfortunately the later winner-of-the-election thread stayed up.


Politics that matters should be on HN. Winning an election matters. He called it with certainty.


To be statistically meaningful, 538 will have to be this good for several more elections.


I don't think so. They successfully predicted quite a few elections on Tuesday. I think it's a statistics error to group them into a single datum simply because they were all held on the same day.


Or an information-theoretic error: At the very least, we have one bit of information per state, and one per Senate race. You could probably eke out some more by comparing win percentages, although that's problematic without knowledge of his algorithm.
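The extra information in win percentages can be made concrete with the logarithmic score, which charges a forecaster -log2(p) bits for the probability p it assigned to what actually happened. A minimal sketch with illustrative probabilities (not actual 538 numbers):

```python
import math

def log_score_bits(prob, outcome):
    """Surprisal in bits: -log2 of the probability the forecaster
    assigned to what actually happened (outcome 1 = win, 0 = loss).
    A 50/50 call always costs exactly 1 bit; a confident correct
    call costs much less, and a confident wrong call costs far more."""
    p = prob if outcome == 1 else 1.0 - prob
    return -math.log2(p)

print(log_score_bits(0.5, 1))    # toss-up: exactly 1 bit
print(log_score_bits(0.98, 1))   # confident and right: ~0.03 bits
print(log_score_bits(0.98, 0))   # confident and wrong: ~5.6 bits
```

Summing this over all states and Senate races would reward a forecaster for calibrated confidence, not just for picking the right winner.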


But you can't assume that these events are independent of each other, can you?


Can you assume that the next elections are independent of the current ones?


Yes.

Which is not to say that this year's election outcomes don't influence future events, but rather that the information flow from this year's elections will be so insignificant compared to future information by the time the next elections occur that we may accept them as independent variables.

(Certainly as more independent than dependent, but I'd conjecture that the effects of the current election are so small that they may be safely handwaved away. Interesting arguments that this year's elections on any level are a major determining factor in future elections are of course welcome.)


I'll take a stab at it, but I'm gathering data now. I have a feeling it's pretty common for a president to get elected to a second term, and for incumbents to be reelected to offices in general.


A person getting elected to repeat offices doesn't mean that their elections are dependent on each other - it's easy to argue that a party usually allows an incumbent to run, and each election is usually between only two people, so a high re-election rate can be modeled without making elections dependent on each other.

What you'd need to somehow show in order to claim that elections are not independent is that the data from the previous election itself influenced the next election.

(right? I'm no statistician so if I'm being dumb somebody please correct me)


But if I am elected to an office and there is a 70% chance that, if I run again, I'll be reelected, doesn't that mean the latter is dependent on the former?

Anyway, I did some analysis of the presidential election data using Wikipedia, and it turns out that 8 presidents ran for reelection and lost, while 16 presidents served more than one term in office.

I'll write this up in a blog post to show the data.



I fell in love with this site over the last few months. Not only were the numbers almost dead-on but the daily updates were all well written and insightful.

Great work!


The real question is: How good were the betting markets?


I don't think any of them had more than a few wagers. President, maybe overall Senate control, etc.

They all predicted an Obama win, though they generally gave him a chance in the low 80s the day before, whereas 538 said 98%. From a single outcome it's impossible to tell who was more accurate.


http://news.ycombinator.com/item?id=353182

Everyone got IN wrong. MO was almost 50/50, but Nate had it going to McCain and the markets had it going to Obama. Right now, pre-recount, McCain won MO by 0.2%.

On the morning of the election the markets flipped IN and gave it to Obama.


> FiveThirtyEight got Indiana right on Oct 1

from http://www.bayesianinvestor.com/blog/index.php/2008/11/06/in...

(OTOH, intrade got NC when 538 didn't)


Why is Oct 1 significant? On the night before the election Nate had IN as a 64% McCain win. Also, Nate, intrade, and betfair all predicted that Obama would win NC. See the link above for all the numerical details.


I'm sorry I can't give specific dates, but in some of 538's most recent models (say, the last week of October), Silver was projecting NC as a very light Carolina blue. :)


538 indeed kicked ass. They also got all of the Senate races right, except for the Stevens race in Alaska (way off, apparently, but who'd have expected a felon to win?) and possibly Franken/Coleman in Minnesota, where they gave Franken a 52% chance.


Evaluating effectiveness by 'how many they got right' can be pretty flawed. Take Franken/Coleman: how 'wrong' are they when something they projected as a 48% chance happens? If you had 10 races that were similarly close, you would expect to get about 4 or 5 of them 'wrong'. You may actually find that, according to their own probabilities, they were somewhat 'lucky' to have so many results tip to their most favoured outcome. Or perhaps they were 'unlucky' to miss Minnesota.

A better way would be to check their results against other 'odds producing' predictive mechanisms. So if another method has Stevens at 20% and 538 has Stevens at 30%, then that actually counts as a win for 538, because they were 'less wrong'. The bigger the difference in their odds, the bigger the win. In the real world this would translate to 538 winning money from the other method.

I'm sure there are better and more valid statistical techniques for doing this though. I'm hoping that someone gets around to it because I'd be keen to see the results.
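The 'less wrong' comparison described above is essentially what the Brier score measures: the mean squared error between stated win probabilities and actual outcomes. A minimal sketch, using made-up probabilities (not the actual 538 or Intrade numbers):

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted win probabilities and
    actual outcomes (1 = win, 0 = loss). Lower is better; a coin
    flip scores 0.25, a perfect forecaster scores 0."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)

# Hypothetical probabilities for five races
outcomes = [1, 1, 0, 1, 0]
model_a  = [0.98, 0.90, 0.30, 0.64, 0.48]   # e.g. a 538-style model
model_b  = [0.82, 0.85, 0.20, 0.55, 0.52]   # e.g. a betting market

print(brier_score(model_a, outcomes))
print(brier_score(model_b, outcomes))
```

Note that a forecaster can score better overall even while 'missing' an individual race, which is exactly the point: a 48% call that happens isn't much of a miss.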


As Silver himself wrote yesterday, Alaska's voting pattern was nothing short of bizarre, with Young, Stevens, and McCain getting votes at 12 to 14 percent each above their pre-election polls, and turnout about 14 percent below 2004 levels.

"Indeed, it seems possible that the number of "questionable" ballots could be quite high. So far, about 220 thousand votes have been processed in Alaska. This compares with 313 thousand votes cast in 2004. After adding back in the roughly 50,000 absentee and early ballots that Roll Call accounts for, that would get us to 270 thousand ballots, or about a 14 percent drop from 2004. It seems unlikely that turnout would drop by 14 percent in Alaska given the presence of both a high-profile senate race and Sarah Palin at the top of the ticket."

http://www.fivethirtyeight.com/2008/11/what-in-hell-happened...

Read on for more of his WTF explanation. We should all stay tuned for this one.


I'm not sure how it works, but I assume the polls in Alaska close quite a bit later than those in the other states?

Perhaps seeing the coverage caused some voters to change their behaviour (i.e. not vote)?

(Nate muses on the same possible explanation in his post).


Now that the "Bradley effect" has been truly buried as a meme we should start a new one: the "Stevens effect" is the unwillingness of people to admit to a pollster that they are going to vote for a crook because he occasionally brings home pork to their remote little welfare state.


What's with the Alaska hate? Oh, and that "little" state is nearly two and a half times the size of Texas. You may call it a "welfare state", but clearly, we have a very strong interest in making sure it's somewhat populated.


Land mass isn't relevant, it's little because it has a small population.


The Alaska hate is partly because of the Palin "horror stories" (her wanting to ban books, her abuse-of-power controversies, and her being popular despite that), and because Ted Stevens is a convicted felon who Alaska reelected.


Politicians are corrupt...more at 11.


I was trying to explain why people are hating on Alaska. Politicians aren't why Ted Stevens got reelected. The people of Alaska did that themselves.


Ah, I see what you're saying. If you judge a people based on their elected leaders, then the Bush administration reflects pretty negatively on Americans in general. But I don't judge a people on these terms: Otherwise, I'd think very poorly of Russians.


Has he open-sourced his model for election predictions?

I would be curious to see if it outperformed a naive averaging of the major polls.


No, and he probably won't. His baseball projections are based on a proprietary algorithm (PECOTA).

They did seem to do better than Pollster, which uses a naive average.


From Wikipedia (http://en.wikipedia.org/wiki/FiveThirtyEight.com):

The site compiles polling data through a unique methodology derived from Silver's experience in baseball sabermetrics to "balance out the polls with comparative demographic data"[1] and "weighting each poll based on the pollster's historical track record, sample size, and recentness of the poll."
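Silver's actual weights are proprietary, but the general shape of such a scheme can be sketched. Everything below is an assumption for illustration: the sqrt(n) term and the 14-day half-life are made-up parameters, not 538's real ones.

```python
import math

def poll_weight(sample_size, days_old, half_life=14.0):
    """Weight grows with sample size (sqrt, since sampling error
    shrinks as 1/sqrt(n)) and decays exponentially with the poll's
    age. A real model would also fold in a pollster quality rating."""
    return math.sqrt(sample_size) * 0.5 ** (days_old / half_life)

def weighted_average(polls):
    """polls: list of (candidate_share_pct, sample_size, days_old)."""
    total = sum(poll_weight(n, d) for _, n, d in polls)
    return sum(x * poll_weight(n, d) for x, n, d in polls) / total

# Three hypothetical polls for one candidate
polls = [(52.0, 800, 1), (49.0, 600, 10), (51.0, 1200, 5)]
print(round(weighted_average(polls), 2))
```

Even this crude version behaves sensibly: a big, fresh poll pulls the average harder than a small, stale one.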



