Again, I have to credit Chris Malumphy for originally making this suggestion. The Pearson correlation coefficient is described here, and can be thought of as a measure of how tightly a data set clusters around a line. At a Pearson coefficient of right around 0.4, a data set takes on a roughly elliptical shape, as in this diagram, borrowed from the Wikipedia site.

Diagram from Wikipedia article on correlation.

This intro matters because, in the data set from 1994 to 2010, winning percentage is correlated with draft picks per year, with a Pearson correlation coefficient of 0.378. Yes, that’s significant.
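For anyone who wants to check a coefficient like this by hand, here is a minimal sketch of the Pearson calculation in plain Python. The function name `pearson_r` and the two short lists are my own placeholders for illustration. They are made-up numbers, not the actual 1994 to 2010 draft and winning percentage data.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations (the covariance, up to a factor of n).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (the variances, up to the same factor).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values, one pair per team. NOT the real data set.
picks_per_year = [6.5, 7.0, 7.5, 8.0, 8.5]
win_pct = [0.40, 0.55, 0.45, 0.60, 0.50]

r = pearson_r(picks_per_year, win_pct)
print(f"r = {r:.3f}")
```

With the real data you would pass in 32 pairs, one per team: average draft picks per year and winning percentage over the same span.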

The statistical test for significance of the data set is found, among other places, here. For a data set with 32 points, the critical value for the two-tailed test at the 0.05 significance level is about 0.35. In other words, a correlation coefficient of 0.378 would turn up by chance less than 5 percent of the time if there were no real relationship, so I have better than 95 percent confidence that the effect is real, and not a product of chance.
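The 0.35 cutoff can be reproduced from the usual t-test for a correlation coefficient, where t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom. Here is a rough sketch, assuming the standard table value of about 2.042 for the two-tailed 0.05 critical t at 30 degrees of freedom. The constant and function names are mine, and this is not necessarily how the linked calculator does it.

```python
import math

# Assumption: two-tailed 0.05 critical t for 30 degrees of freedom,
# taken from a standard t table (roughly 2.042).
T_CRIT_DF30 = 2.042


def t_from_r(r, n):
    """Convert a correlation r on n points to a t statistic with n - 2 df."""
    return r * math.sqrt((n - 2) / (1 - r * r))


def r_from_t(t, n):
    """Invert t_from_r: the correlation that corresponds to a given t."""
    return t / math.sqrt(t * t + (n - 2))


n_teams = 32        # one data point per team
r_observed = 0.378  # correlation reported above

t_observed = t_from_r(r_observed, n_teams)
r_critical = r_from_t(T_CRIT_DF30, n_teams)
significant = abs(t_observed) > T_CRIT_DF30

print(f"t = {t_observed:.3f}, critical r = {r_critical:.3f}, significant: {significant}")
```

The critical r comes out near 0.35, which matches the cutoff quoted above, and the observed t clears the table value.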

Total correlations and summed round per year correlations

Note that this is a small correlation over a very long period of time. If it really is a small effect, then when you divide the data set into smaller chunks, the correlation should grow smaller. And if you calculate correlations for the data sets from 1994-2004 and 2000-2010, you do in fact get smaller correlations than for the 17-year data set. In fact, in the first set, the correlation is positive *only* if you throw out the Texans, who are a huge outlier.

My gut reaction: in the sciences, they call this a “publishable result”. I could start making the posters now, if there were some kind of Georgia Academy of Sciences. Who knows, it might not hold up under the scrutiny of more hard-core analysis, but for now, it’s nice to know that the trend I saw in the data has a physical representation. In geometric terms, the data set is elliptical.

The lifespan of the effect is interesting. It’s not viewable in the typical lifespan of a coach, or a player. It’s only viewable on the time scale of a dynasty. It’s a small effect of dynastic scale. Pretty cool, that.

Update: editing some bad sentences. Replaced plot that cut off some poorly performing teams.
