Podium Ownage: Early results
With all of the hand-wringing about the Own the Podium program, and whether the money was worth it, I thought I might apply some basic statistics to the problem. Testing whether or not a technique works is one of the basic things we do in Computer Science; if you make a change to a piece of software you test to see if the change improves it. Otherwise it’s not worth doing.
Methodology: each class of sport will be considered a sample. The OTP program has sports classified into 14 categories, but Wikipedia has them grouped slightly differently. Where possible, and where the sample size is large enough, I’ve separated Men’s and Women’s events just to see if there is a difference in bang for the buck. A Did Not Finish (DNF), Did Not Start (DNS) or Disqualified (DSQ) result is considered no measurement of performance, and is discarded.
Data Sources: Wikipedia’s articles “Canada at the 2006 Winter Olympics” and “Canada at the 2010 Winter Olympics”
I imported all of the placements for each athlete, and averaged the placements per sport. I also calculated the standard deviations, which as you can imagine were quite large. The following chart compares the results per sport between the 2006 and 2010 Winter Olympics:
As you can see, there are some improvements and some “disprovements”: biathlon, bobsleigh and cross country skiing show the largest improvements overall, even though we won no medals in those sports.
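For anyone who wants to repeat the averaging step, here is a minimal sketch of how it could be done in Python. The placements below are invented for illustration and are not the real results; the names placements and summarize are my own, and a DNF, DNS or DSQ is represented as None so it can be discarded.

```python
from statistics import mean, stdev

# Hypothetical placements, invented purely for illustration (not real results).
# None marks a DNF, DNS, or DSQ, which is discarded as "no measurement".
placements = {
    "Biathlon": [10, 25, None, 40],
    "Bobsleigh": [4, 6, 15],
}

def summarize(results):
    """Return (mean, sample standard deviation, count) of the finished placements."""
    finished = [p for p in results if p is not None]
    # stdev needs at least two finished results to be defined.
    return mean(finished), stdev(finished), len(finished)

summary = {sport: summarize(results) for sport, results in placements.items()}

for sport, (avg, sd, n) in summary.items():
    print(f"{sport}: mean={avg:.1f} sd={sd:.1f} n={n}")
```

Running the same function over the 2006 and 2010 placements for each sport would give the two numbers being compared per sport.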
Now, I understand that the analysis is flawed in many ways, not the least of which is that the standard deviation is greater than the difference in results for every event; this means that the analysis is not statistically significant, but then again neither is the number of gold medals:
However, as a simple comparison of improvement over one season for a team of people it does make a point: improvement is not about being first. Ultimately the name for the OTP program is probably the cause of much of the concern (and pressure) over whether the athletes were winning enough metal. I think the blame for the name can be put squarely on the ultra-competitive nature of corporations, and the idea that first is the only place worth mentioning.
By the way, did anyone notice how well Erik Guay did at the alpine events? 5th in both the Super-G and the Downhill, and possibly the most deserving athlete with the least press coverage during the games. 0.33 and 0.34 seconds from the Gold Medal time.
Now I’m not a fan of the “everyone gets a medal” stuff that’s happening in schools these days, but 0.34 seconds is pretty damn close to Gold, and in the words of my mother “Someone should give him a medal”.
Actually, on further thought, the relation to the standard deviation does not make the comparison statistically insignificant. It just means that the effect is lower than one std dev. Statistical significance is a function of the sample size in relation to the population, and in this case the sample is the population.
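One way to put a number on "lower than one std dev" is to divide the change in average placement by the pooled standard deviation of the two years. This is only a sketch: the pooled standard deviation is my choice of yardstick here, the function name standardized_difference is hypothetical, and the placements are made up.

```python
from math import sqrt
from statistics import mean, stdev

def standardized_difference(before, after):
    """Change in mean placement, in units of the pooled standard deviation.

    Positive values mean the average placement number dropped,
    i.e. the group finished closer to first.
    """
    n1, n2 = len(before), len(after)
    pooled_var = (
        (n1 - 1) * stdev(before) ** 2 + (n2 - 1) * stdev(after) ** 2
    ) / (n1 + n2 - 2)
    return (mean(before) - mean(after)) / sqrt(pooled_var)

# Hypothetical placements for one sport, invented for illustration.
placements_2006 = [20, 30, 40]
placements_2010 = [15, 25, 35]

effect = standardized_difference(placements_2006, placements_2010)
print(f"effect = {effect:.2f} standard deviations")
```

With these invented numbers the average improves by 5 places against a spread of 10, so the effect is half a standard deviation.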
A friend of mine has called into question my use of the "average" as a descriptive statistic when comparing the group of athletes from 2006 and 2010. He states the following:
"Let's say last year we were able to send 3 athletes to downhill skii but this year we qualified more so we sent 5 athletes. The mean might go down."
And the mean might go up, and the mean might stay the same. The average is a good tool for comparing numbers that are generated through the same process, in this case the selection and performance of an athlete at a given sport in the winter games.
In fact, if we qualify more athletes and the mean goes down then this is a valid effect on the performance of the group, and is reflected in the average. Thus the average measures what I intended it to measure: the average performance of the Canadian team.
It also works to compare the performances.
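My friend's scenario is easy to try out. The sketch below starts with three athletes and adds two more in three different ways; all of the placements are invented, and the names last_games and scenarios are just for this example.

```python
from statistics import mean

# Hypothetical placements for the three athletes sent last time.
last_games = [8, 12, 16]

# Three invented ways the two extra qualifiers could place.
scenarios = {
    "two stronger additions": last_games + [4, 6],
    "two weaker additions": last_games + [20, 30],
    "two average additions": last_games + [10, 14],
}

baseline = mean(last_games)
print(f"three athletes: mean placement {baseline}")
for label, results in scenarios.items():
    print(f"{label}: mean placement {mean(results)}")
```

Depending on who qualifies, the mean placement moves toward first, away from first, or not at all, and in each case it still describes how the group as a whole performed.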
However, this brings me to 2 valid criticisms of my use of statistics:
1) Are the groups doing the same thing? In the case of the Olympics I have grouped all alpine events, all ski events and all short track, long track, snowboard, and freestyle events.
2) Is it valid to separate the men and women?
For #1 I've lumped some very different sports into one group. The rationale is that these sporting groups are close enough that they are comparable. For instance, Lindsey Vonn participates in all of the alpine events, as do most of her teammates. This is also true of short track, long track, and many of the Nordic events. Biathlon is different because it is indeed a very different skill than the rest of the Nordic events.
On separating out the women: I haven't looked into the funding arrangement, but I do know that for Hockey the amount given to each sport (men's and women's) is different. I thought it would be interesting to separate some of them out to see if the difference in performance was significant on a gender basis.
I'll do some more work on this presently.