When I was an undergrad at the University of Guam, all the science majors hung out in the Biology Department office. In part, this was because some of the biologists had licenses to fish and scuba outside the coral reef of Guam, and so you never knew what would be dragged into the building. Another reason was a small but efficient library of science books, one of which was by George Gamow. I wish I recalled the title, as one topic in this book had a powerful influence on me.

It discussed dimensional analysis, and showed an example of using dimensional analysis to derive a formula for some physical process. I’ve long forgotten the analysis and the page, but it left an indelible impression of the power of accurately accounting for the physical dimensions of the components of a formula.

On August 15th, Pro Football Focus introduced a new passer rating formula. It is:

Ranking = 4.66667*[ 20*Completions + 20*Drops + Yards in Air +20*Tds - 45*Ints ]/(Attempts – Spikes – Throw Aways)
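
As a minimal sketch, the formula as quoted can be written as a small Python function. The function and argument names are mine, not PFF's, and the stat line at the bottom is made up purely to show the arithmetic.

```python
def pff_rating(completions, drops, yards_in_air, tds, ints,
               attempts, spikes, throw_aways):
    """Rating exactly as quoted in the text (names are illustrative)."""
    numerator = (20 * completions + 20 * drops + yards_in_air
                 + 20 * tds - 45 * ints)
    # Spikes and throw aways are removed from the attempt count.
    equivalent_attempts = attempts - spikes - throw_aways
    return 4.66667 * numerator / equivalent_attempts


# Hypothetical stat line, not real data.
example_rating = pff_rating(completions=20, drops=2, yards_in_air=150,
                            tds=2, ints=1, attempts=32, spikes=1,
                            throw_aways=1)
print(round(example_rating, 2))
```

Writing it out this way makes the dimensional problem easier to see: counts (completions, drops) and yards (yards in air) are summed in the same numerator.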

There are some interesting ideas in this formula, but it seems seriously flawed from my point of view. My complaints, in order, are:

1. It is double counting yards.

2. It is trying to add two different kinds of yardage metrics in the same formula.

3. It doesn’t seem to understand the origin of the TD and interception terms it is actually using.

4. Items 1 and 3 interact in ways that I suspect the author never intended, yielding a scoring model that seriously undervalues turnovers.

We’ll address each of these issues in turn. As Brian Burke has pointed out and we’ve discussed in more detail here, completions and yardage are related through the equation yardage = completions*yards per completion. If we note that YPC in the modern NFL is actually 11.4 yards, within a relative error of 9%, the first two terms in the numerator can be rewritten:

20/11.4*[ Yards + Extra Yards] = 20/11.4*Equivalent yards = 1.75*U*Yards

Yards is equal to 11.4*Catches. Extra Yards would be defined as 11.4*Drops, and is equal to the yards a QB would have gotten if those passes hadn’t been dropped. The sum 11.4*(Catches + Drops) can be defined as Equivalent Yards, the total yards a QB would have gotten without any dropped passes. U, a dimensionless parameter, is Equivalent Yards/Yards. U, pretty much by definition, is greater than or equal to 1.0.

The third term in the numerator, by contrast, is Yards in the Air, the yards a QB is responsible for, or Yards – Yards after the catch. If V is YIA/Yards, then V is a dimensionless positive valued term less than 1. So, not only are there two yardage terms, there are two different kinds of yardage terms. This touches on items 1 and 2. Item 3 will be discussed in a footnote.
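
The two dimensionless ratios can be sketched in a few lines of Python. This is only an illustration under the approximation used above (Yards = 11.4*Catches); the function name and the sample inputs are mine, and the inputs are invented so that the ratios come out to round numbers.

```python
YPC = 11.4  # average yards per completion quoted in the text


def yardage_multipliers(catches, drops, yards_in_air):
    """Return (U, V) as defined in the text, using Yards = YPC * catches."""
    yards = YPC * catches                       # approximation from the text
    equivalent_yards = YPC * (catches + drops)  # yards had nothing been dropped
    u = equivalent_yards / yards                # at least 1 when drops >= 0
    v = yards_in_air / yards                    # share of yards gained in the air
    return u, v


# Hypothetical inputs, not real data.
u, v = yardage_multipliers(catches=20, drops=2, yards_in_air=114)
print(round(u, 3), round(v, 3))
```

Because both ratios divide yards by yards, they carry no units, which is what lets them act as plain multipliers on the yardage term.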

To get to item 4, the yardage components in this formula can be combined into a term like this:

20*Completions + 20*Drops + YIA = [1.75*U + V]*Yards

Leading to a numerator like this

4.6667*[ (1.75*U + V)*Yards +20*TDs -45*Ints]

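whose functional scoring model, after dropping the overall scale factor and writing Equivalent Attempts for Attempts – Spikes – Throw Aways, becomes this: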

(Yards +20/[1.75*U + V]*Tds -45/[1.75*U + V]*Ints)/Equivalent Attempts

I don’t think that was the intended result of the author of this model.

I suspect that U is in the vicinity of 1.1 and V, who knows? Call it 0.5 for the sake of argument. The term 1.75*U + V is then 2.425 (which might as well be 2.4), and the core formula becomes

(Yards + 8*Tds – 19*Ints)/Equivalent Attempts

So to ask the question that occurs to me, does the author think an interception is only worth about 2 points?
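
The arithmetic behind the 8 and the 19 can be checked with a short Python sketch. The function name is mine, and U = 1.1 and V = 0.5 are the guesses made above, not measured values. The sketch uses the unrounded 20/11.4 rather than 1.75, so the multiplier comes out a hair above 2.425.

```python
YPC = 11.4  # average yards per completion quoted in the text


def effective_weights(u, v, td_weight=20.0, int_weight=45.0):
    """Divide the TD and interception weights by the combined yardage multiplier."""
    multiplier = (20.0 / YPC) * u + v
    return multiplier, td_weight / multiplier, int_weight / multiplier


# U and V here are guesses for the sake of argument.
multiplier, td_yards, int_yards = effective_weights(u=1.1, v=0.5)
print(round(multiplier, 3), round(td_yards, 1), round(int_yards, 1))
```

Any larger guess for V only pushes the effective interception weight lower, since V sits in the denominator.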

Solutions?

My gut feeling is that this is a formula trying to do too many things. You don’t want to add two different kinds of yardage metrics. So, initially, either dropping the completion + drops terms or getting rid of the YIA terms would yield a formula logically and algebraically sound in its treatment of yardage. A formula like

[11.4*(Completions + Drops) + 20*TDs - 45*Ints]/Equivalent Attempts

or

[YIA + 20*TDs - 45*Ints]/Equivalent Attempts

or better yet, since Brian Burke’s expected points formulas linearize to a surplus value for TDs of 23.3 yards, and the value of a turnover in yards is about 67 yards, use this:

[YIA + 23.3*TDs - 60*Ints]/Equivalent Attempts [1]

An even better formula, since PFF must have excellent data on how many yards an interception is run back, would be:

(YIA + 23.3*TDs – [ 67 - average net field position relative to original LOS]*Ints)/Equivalent Attempts [2]
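
A minimal Python sketch of the last two formulas follows. The function and parameter names are mine, the inputs are invented, and reading "average net field position" as yards beyond the original line of scrimmage is my assumption about the sign convention.

```python
def rating_fixed_penalty(yia, tds, ints, equivalent_attempts):
    """YIA-based rating with a flat 60-yard interception penalty."""
    return (yia + 23.3 * tds - 60 * ints) / equivalent_attempts


def rating_with_runback(yia, tds, ints, equivalent_attempts,
                        avg_net_field_position):
    """Same rating, but the penalty is 67 yards less the average net field
    position after the interception (assumed: yards beyond the original LOS)."""
    int_penalty = 67 - avg_net_field_position
    return (yia + 23.3 * tds - int_penalty * ints) / equivalent_attempts


# Hypothetical inputs, not real data.
flat = rating_fixed_penalty(yia=150, tds=2, ints=1, equivalent_attempts=30)
adjusted = rating_with_runback(yia=150, tds=2, ints=1, equivalent_attempts=30,
                               avg_net_field_position=7.0)
print(round(flat, 3), round(adjusted, 3))
```

With an average net field position of 7 yards the second function reproduces the flat 60-yard version, since 67 - 7 = 60.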

So there you have it. With a little work, PFF can have a self consistent formula encompassing many of the new ideas they wish to add to a modern passer rating.

Update 9/27/2011: just noted that the average YPC I previously calculated is actually 11.4 ± 0.96, instead of the originally published 14.7. Correcting the math (which I’ve done) doesn’t affect the argument.

~~~~~

[1] I say this because Chase Stuart’s “derivation” of 20 yards, while it turns out to be a fairly good number, goes through too many concepts that do not make sense in a world where football is treated as a Markov chain, or alternatively, a finite state machine. Seriously, does anyone believe yardage gained running and yardage gained passing differ? That completely breaks the notion of path independence in a Markov chain. Further, as we explain here and here, the idea that the TD term is “the value of the touchdown” is broken. It’s not something you can measure on the field by calculating, say, the net value of a touchdown relative to the one yard line, as it’s related to total scoring (i.e. TDs plus field goals) of all kinds.

Likewise, the 45 yard term for the interception is based on the THGF model. It’s the THGF value of a turnover (4 points or 50 yards) less the net value of field position after the runback (estimated at 5 yards beyond the original LOS).

[2] I’m hesitant to point this out, but yet another variation on these formulas would be to use the dimensionless parameter U or the dimensionless parameter V as a multiplier into the yardage term. Something like

U*YIA or V*11.4*(Catches + Drops)

comes to mind. But in these instances you’re not really measuring what was actually left on the field. You’re measuring what could have been. The use solely of YIA appeals to me, if the idea is to have a formula that measures the quarterback’s real contribution to scoring.

Update 9/29/2011: U simplifies to (Catches + Drops)/Catches, and as such, U*YIA has a particularly simple, appealing form.
