Olympic overfitting?

According to William Heuslein, “The man who predicts the medals”, Forbes Magazine, 1/19/2010

Daniel Johnson makes remarkably accurate Olympic medal predictions. But he doesn’t look at individual athletes or their events. The Colorado College economics professor considers just a handful of economic variables to come up with his prognostications.

The result: Over the past five Olympics, from the 2000 Summer Games in Sydney through the 2008 Summer Games in Beijing, Johnson’s model demonstrated 94% accuracy between predicted and actual national medal counts.

First question: what do you think it means to “demonstrate 94% accuracy between predicted and actual national medal counts”?

If you guessed that it means “the predicted number of medals matched the actual number of medals 94 times out of 100”, I’m sorry, you’re wrong.

The next sentence of the Forbes article suggests what it really means:

For gold medal wins, the correlation is 87%.

And we can confirm from reading one of the papers cited on Johnson’s web site (Daniel Johnson and Ayfer Ali, “A Tale of Two Seasons: Participation and Medal Counts at the Summer and Winter Olympic Games”, Social Science Quarterly 85(4) 2004) that “94% accuracy between predicted and actual national medal counts” means that “the vector of predicted numbers of medals by country has a correlation of r=0.94 with the actual numbers of medals by country”. (N.B. the text between the double quotes there is my summary, not Johnson & Ali’s wording. And if you’re a bit rusty on what correlation actually means, the Wikipedia page is not bad; or just keep in mind that if you can turn one sequence of numbers into another by some combination of adding constants and multiplying by positive constants, their correlation will be “100%”.)
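To see that invariance concretely, here is a minimal R sketch (the medal counts are made up for illustration): an affine transform with a positive slope leaves the correlation at exactly 1, even though none of the individual numbers match.

```r
# Hypothetical "actual" medal counts for five countries
actual <- c(37, 30, 26, 23, 19)
# "Predictions" obtained by multiplying by a positive constant
# and adding a constant -- no single value agrees with actual...
predicted <- 2 * actual + 10
# ...yet the correlation is exactly 1, i.e. "100% accuracy"
cor(predicted, actual)
```

Flip the sign of the multiplier and the correlation becomes exactly −1, which is why the “positive constants” caveat matters.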

This still sounds like pretty impressive predicting, but it could be true without any of the medal-count numbers actually coinciding.  I wonder what proportion of Forbes’ readers understood that? In fairness, Heuslein did slip in that “correlation”, but then at the bottom of the piece, he lists what he calls the “Accuracy rate of Johnson’s predictions” for the summer and winter games from 2000 to 2008, which (for total medals) vary from “93%” to “95%”.

Anyhow, how well did “the man who predicts the medals” do this time around?

Forbes gives Johnson’s “In Depth: Medal Predictions for Vancouver”. And now that the Olympics are over, we can compare them with the actual medal counts, as documented by the New York Times. So I entered the data into a table, and calculated the correlations using this trivial R script:

# Read the table of predicted and actual medal counts, one row per country
X <- read.table("Olympics.table", header=TRUE)
# Pearson correlations between predicted and actual counts
totalcor <- cor(X[,"PredictedTotal"], X[,"ActualTotal"])
goldcor  <- cor(X[,"PredictedGold"],  X[,"ActualGold"])

The correlation for total medals? 0.625. The correlation for gold medals? 0.279.

How did 94% and 87% turn into 63% and 28%?

I’m not sure, and I don’t have time this morning to nail it down — I’ve got to finish my laundry, and hike over to the train station to catch the 8:30 Regional Rail for NYC, where I’m giving a talk at noon. But four possible explanations come to mind, all of which might simultaneously be true:

(1) Scribal error. Maybe Forbes’ list of Johnson’s predictions is wrong, or the NYT list of medal totals is wrong, or I made a mistake copying the numbers into my table. If so, someone will probably tell us in the comments.

(2) Regression to the mean. Maybe Johnson’s luck ran out, like a mutual-fund manager whose string of good years was based more on good fortune than on good information.

(3) False advertising. Maybe the 94% correlation only applies to Johnson’s predictions if you include not only the 13 countries in the Forbes list, but also all the other countries in the world, most of which can trivially be predicted to win no medals at all, or almost none. If so, then r=0.94 may not really be very impressive.

(4) Overfitting. Although the cited paper does claim “out of sample” predictions — that is, they calculate the model parameters on historical data, and then look at the fit to recent data which was not used in “training” the model — it’s possible that they made some adjustments in the model structure in order to get it to work well, and perhaps a different set of adjustments would be needed to make it work well for this year’s data.

My prediction: some combination of (3) and (4). With respect to (3), note that just padding the medal vectors with 70 zeros brings the total correlation up from r=0.63 to r=0.91:

# A vector of 70 zeros, standing in for no-medal countries
padding <- vector(mode="numeric", 70)
# Correlation after appending the zeros to both vectors
totalcor1 <- cor(c(X[,"PredictedTotal"],padding), c(X[,"ActualTotal"],padding))
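The inflation is easy to reproduce with invented numbers: take two weakly correlated count vectors and append a shared block of zero-medal countries. (The counts below are hypothetical, chosen only to illustrate the effect.)

```r
# Invented, weakly correlated "predicted" and "actual" counts
pred   <- c(25, 20, 15, 12, 10, 8, 6)
actual <- c(10, 22, 8, 18, 5, 14, 7)
r_raw <- cor(pred, actual)                             # about 0.31
# Append 70 countries correctly "predicted" to win zero medals
padding <- vector(mode="numeric", 70)
r_padded <- cor(c(pred, padding), c(actual, padding))  # about 0.86
r_padded > r_raw
```

All the zeros agree perfectly, so they pull the correlation up without any real predictive work being done on the countries that actually win medals.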

My tentative conclusion: this is more evidence that when an economist is talking about numbers, you should put your hand on your wallet.

And, of course, when a journalist is interpreting a press release about a technical paper, you may need both hands and some help from your friends to avoid getting intellectually mugged.

[Via Phil Birnbaum, “An economist predicts the Olympic medal standings”, Sabermetric Research 2/18/2010]