It’s a classic Bill James formula, and yet another tool suggesting that scoring is a better indicator of winning potential than the win-loss record itself. The formula goes:

win percent = (points scored)**2/((points scored)**2 + (points allowed)**2)
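As a quick illustration, the formula can be written with the exponent as a parameter, so that 2, 2.37, or any other value can be swapped in. This is a minimal Python sketch rather than the Perl module's code; the function name and the sample point totals are made up for the example.

```python
def pythagorean_expectation(points_scored, points_allowed, exponent=2.0):
    """Expected winning fraction from points scored and points allowed."""
    scored = points_scored ** exponent
    allowed = points_allowed ** exponent
    return scored / (scored + allowed)


# Hypothetical team: 400 points scored, 300 allowed (not real data).
print(pythagorean_expectation(400, 300))        # classic exponent of 2
print(pythagorean_expectation(400, 300, 2.37))  # the football exponent under test
```

A team that scores exactly as many points as it allows comes out at 0.5 whatever the exponent, which is a handy sanity check.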

Wikipedia writes about the formula here, and Pro Football Reference writes about it here. And, well, is it really true that the exponent in football is 2.37, and not 2? One of the advantages of having an object that calculates these things (i.e. version 0.2 of Sport::Analytics::SimpleRanking, which I’m testing) is that I can just test.

What my code does is compute the exponent that best fits, in a least squares sense, each club's Pythagorean prediction to its actual winning percentage. It uses a golden section search to find that exponent. And as Doug Drinen has noted, the Pythagorean expectation translates better into next year's winning percentage than actual winning percentage does.
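For readers who want to see the idea in miniature, here is a sketch of a least squares fit driven by a golden section search. It is written in Python and is not the module's actual implementation. The function names, the search bracket of 1 to 5, and the synthetic "league" are all assumptions made for the example.

```python
import math


def pythag(pf, pa, k):
    """Pythagorean winning fraction with exponent k."""
    return pf ** k / (pf ** k + pa ** k)


def sum_sq_error(k, teams):
    """Sum of squared differences between actual and predicted win fractions."""
    return sum((win_pct - pythag(pf, pa, k)) ** 2 for pf, pa, win_pct in teams)


def golden_section_min(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on [lo, hi] by golden section search."""
    inv_phi = (math.sqrt(5) - 1) / 2  # about 0.618
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            # Minimum lies in [lo, d]; the old c becomes the new d.
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            # Minimum lies in [c, hi]; the old d becomes the new c.
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2


# Synthetic league (not real data): win fractions generated with k = 2.5,
# so the fit should recover roughly that exponent.
raw = [(420, 310), (380, 350), (300, 300), (280, 390), (350, 260)]
teams = [(pf, pa, pythag(pf, pa, 2.5)) for pf, pa in raw]

best_k = golden_section_min(lambda k: sum_sq_error(k, teams), 1.0, 5.0)
print(best_k)
```

Golden section search needs only function values, not derivatives, and each pass reuses one of the two previous evaluations. The bracket has to contain a single minimum for it to work.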

Real percentage versus the predicted percentages in 2010.

Anyway, the best fit exponent values I calculate for the years 2001 through 2010 are:

  • 2001: 2.696
  • 2002: 2.423
  • 2003: 2.682
  • 2004: 2.781
  • 2005: 2.804
  • 2006: 2.394
  • 2007: 2.509
  • 2008: 2.620
  • 2009: 2.290
  • 2010: 2.657

No, not quite 2.37, though I differ from PFR by only about 0.02 in 2006. Just glancing at the list, and knowing how approximate these things are, 2.5 probably works in a pinch. The difference between an exponent of 2 and 2.37 for, say, the Philadelphia Eagles in 2007 amounts to about 0.2 games in predicted wins over the course of a season.
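To put rough numbers on both remarks, the sketch below averages the ten yearly exponents listed above and measures how far the predicted win total moves when the exponent changes. It is Python, not the module's code. The helper name and the sample team are hypothetical, and the 16-game default reflects the NFL regular season in those years.

```python
from statistics import mean

# Best-fit exponents as listed above, by season.
exponents = {
    2001: 2.696, 2002: 2.423, 2003: 2.682, 2004: 2.781, 2005: 2.804,
    2006: 2.394, 2007: 2.509, 2008: 2.620, 2009: 2.290, 2010: 2.657,
}

average_exponent = mean(exponents.values())
print(round(average_exponent, 3))  # 2.586


def predicted_wins(pf, pa, k, games=16):
    """Predicted wins over a season of `games` games, for exponent k."""
    return games * pf ** k / (pf ** k + pa ** k)


# Hypothetical team (not real data): 350 points scored, 300 allowed.
gap = predicted_wins(350, 300, 2.37) - predicted_wins(350, 300, 2.0)
print(gap)  # a fraction of a game
```

For a team near .500 the exponent barely matters. The gap widens as the ratio of points scored to points allowed moves away from 1.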

Code snippet, showing how the Perl module is used.