Ok, this whole article is a kind of speculation on my part. DVOA is generally sold as a kind of generalization of the success rate concept, translated into a percentage above (or below) the norm. Components of DVOA include success rate, turnover adjustments, and scoring adjustments. For now, that’s enough to consider.

Adjusted yards per attempt, as we’ve shown, is derived from scoring models, in particular expected points models, and could be considered to be the linearization of a decidedly nonlinear EP curve. But if I wanted to, I could call AYA style stats the generalization of the yardage concept, one in which scoring and turnovers are all folded into a single number valued in terms of yards per attempt.

So, if I were to take AYA or its fancier cousin ANYA, and replace yards with success rate, and then refactor turnovers and scoring so that turnovers and scoring were scaled appropriately, I would end up with something like the “V” in DVOA. I could then add a SRS style defensive adjustment, and now I have “DV”. If I now calculate an average, and normalize all terms relative to my average, I’d end up with “Homemade DVOA”, wouldn’t I?

The point is, AYA or ANYA formulas are not really yardage stats; they are scoring stats whose units are yards. So if DVOA really is ANYA in sheep’s clothing, where yardage has been replaced by success rate, with some after-the-fact defensive adjustments and a normalization done in success rate “units”... well, yes, then DVOA is a scoring stat, a kind of sophisticated and normalized “adjusted net success rate per attempt”.

Summary: The NFL passer rating can be considered to be the sum of two adjusted yards per attempt formulas, one cast in units of yards and the other using catches as a measure of yards. We show, in this article, how to build such a model by construction.

My previous article has led to some very nice emails back and forth with the Pro Football Focus folks. In thinking about ways to explain the complexities of the original NFL formula, it occurred to me that there are two yardage terms because the NFL passer rating can be regarded as the sum of two adjusted yards per attempt formulas. Once you begin thinking in those terms, it’s not all that hard to derive an NFL style formula.

Our basic formula will be

<1> AYA = (yards + α*TDs – β*Ints)/Attempts

The Hidden Game of Football’s new passer rating is a formula of this kind, with α = 10 and β = 45. Pro Football Reference’s AY/A has an α value of 20 and a β value of 45. On this blog, we’ve shown that these formulas are tightly associated with scoring models.
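As a minimal sketch of formula <1>, assuming the totals are supplied as plain numbers: the function name `aya` and the sample stat line are made up for illustration and are not taken from any real player.

```python
def aya(yards, tds, ints, attempts, alpha=20.0, beta=45.0):
    """Adjusted yards per attempt, formula <1>.

    The defaults alpha=20 and beta=45 are the Pro Football Reference values
    quoted in the text; pass alpha=10 for the Hidden Game of Football version.
    """
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return (yards + alpha * tds - beta * ints) / attempts


# Hypothetical stat line: 4000 yards, 30 TDs, 10 interceptions, 500 attempts.
pfr_value = aya(4000, 30, 10, 500)             # alpha = 20, beta = 45
thgf_value = aya(4000, 30, 10, 500, alpha=10)  # alpha = 10, beta = 45
print(pfr_value, thgf_value)
```

The only difference between the two published formulas is the touchdown bonus, so the two results differ by exactly 10*TDs/Attempts.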

Using the relationship Yards = YPC*Catches, we then get

<2> AYA = (YPC*Catches + α*TDs – β*Ints)/Attempts

Since the point of the exercise is to end up with an NFL-esque formula, we’ll multiply both sides of equation <2> by 20/YPC.

<3> 20*AYA/YPC = (20*Catches + 20*α*TDs/YPC – 20*β*Ints/YPC)/Attempts

Adding equations <1> and <3>, we have

<4> (20/YPC + 1)*AYA = (20*Catches + Yards + [20/YPC + 1]*α*TDs – [20/YPC + 1]*β*Ints)/Attempts

and if we now define RANKING as the left hand side of equation <4>, A as [20/YPC + 1]*α and B as [20/YPC + 1]*β, formula <4> becomes

RANKING = (20*Catches + Yards + A*TDs – B*Ints)/Attempts
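The algebra can be checked numerically. The sketch below uses a hypothetical stat line (the numbers and function names are assumptions chosen for illustration) and confirms that the RANKING form equals (20/YPC + 1)*AYA when A and B are defined as above.

```python
def aya(yards, tds, ints, attempts, alpha, beta):
    """Formula <1>: adjusted yards per attempt."""
    return (yards + alpha * tds - beta * ints) / attempts


def ranking(catches, yards, tds, ints, attempts, alpha, beta):
    """Right-hand side of formula <4>, with A and B built from alpha and beta."""
    ypc = yards / catches          # yards per catch
    k = 20.0 / ypc + 1.0           # conversion coefficient (20/YPC + 1)
    a_coef = k * alpha             # A in the text
    b_coef = k * beta              # B in the text
    return (20 * catches + yards + a_coef * tds - b_coef * ints) / attempts


# Hypothetical stat line, not a real player.
catches, yards, tds, ints, attempts = 320, 4000, 30, 10, 500
alpha, beta = 20.0, 45.0

ypc = yards / catches
lhs = (20.0 / ypc + 1.0) * aya(yards, tds, ints, attempts, alpha, beta)
rhs = ranking(catches, yards, tds, ints, attempts, alpha, beta)
print(lhs, rhs)  # the two values should agree up to floating point error
```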

Look familiar? This is the same form as the NFL passer rating, when stripped of its multiplier and its additive constant. To complete the derivation, multiply both sides of the equation by 100/24 and then add 50/24 to both sides. Relabeling the rescaled left hand side as RANKING, you end up with

RANKING = 100/24*[(20*Catches + Yards + A*TDs - B*Ints)/Attempts] + 50/24

which is the THGF form of the NFL passer rating, when A = 80 and B = 100.
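For comparison, here is a sketch that computes the A = 80, B = 100 form next to the commonly published four-component version of the NFL rating. The component version, with each piece clamped to the range 0 to 2.375, comes from the standard published definition rather than from this derivation, and the stat line is hypothetical. The two agree whenever no component hits a clamp.

```python
def thgf_form(catches, yards, tds, ints, attempts, a_coef=80.0, b_coef=100.0):
    """Single-line form: 100/24 * (core ranking) + 50/24."""
    core = (20 * catches + yards + a_coef * tds - b_coef * ints) / attempts
    return 100.0 / 24.0 * core + 50.0 / 24.0


def nfl_component_form(catches, yards, tds, ints, attempts):
    """Standard four-component NFL rating, each component clamped to [0, 2.375]."""
    def clamp(x):
        return max(0.0, min(2.375, x))

    comp = clamp((catches / attempts - 0.3) * 5.0)
    ypa = clamp((yards / attempts - 3.0) * 0.25)
    td = clamp(tds / attempts * 20.0)
    intc = clamp(2.375 - ints / attempts * 25.0)
    return (comp + ypa + td + intc) / 6.0 * 100.0


# Hypothetical stat line where no component is clamped.
line = dict(catches=320, yards=4000, tds=30, ints=10, attempts=500)
single = thgf_form(**line)
official = nfl_component_form(**line)
print(single, official)
```

For extreme stat lines (a very small number of attempts, say) the clamps kick in and the two forms diverge, which is why the single-line form is best thought of as the unclamped core of the rating.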

If YPC equals 11.4, then the conversion coefficient (20/YPC + 1) is about 2.75. The relationship between the scoring model coefficients α and β and the NFL style passer model coefficients A and B becomes

A = 2.75*α
B = 2.75*β

Just for the sake of argument, we’re going to set α to 25, pretty close to the 23.3 that we get from a linearized Brian Burke model, and β to 60, 6.7 yards less than the 66.7 yards we calculated from the same linearized scoring model. Using those values, we get 68.75 for A and 165 for B. Rounding A to the nearest 10 and rounding B down a little, our putative NFL style model becomes:

RANKING = (20*Catches + Yards + 70*TDs – 160*Ints)/Attempts
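The conversion from scoring-model coefficients to NFL style coefficients is easy to script. This is a sketch only: the function name is made up, and it uses the unrounded coefficient 20/11.4 + 1 rather than 2.75, so the results differ slightly from the hand-rounded 68.75 and 165.

```python
def nfl_style_coefficients(alpha, beta, ypc=11.4):
    """Return (k, A, B), where k = 20/YPC + 1, A = k*alpha and B = k*beta."""
    if ypc <= 0:
        raise ValueError("ypc must be positive")
    k = 20.0 / ypc + 1.0
    return k, k * alpha, k * beta


# alpha = 25, beta = 60, the values chosen for the sake of argument.
k, a_coef, b_coef = nfl_style_coefficients(25.0, 60.0)
print(round(k, 3), round(a_coef, 2), round(b_coef, 2))
```

Because A and B both scale with the same factor, the ratio B/A always equals β/α, whatever value of YPC is assumed.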

Note that formulas <1> and <2> do not contribute equally to the final sum. Equation <2>, which enters the sum in its rescaled form <3>, is weighted by the factor (20/YPC)/(20/YPC + 1), and equation <1> is weighted by the factor 1/(20/YPC + 1). When YPC is about 11.4 yards, the contribution of equation <2> to the total is about 63.6% and equation <1> adds about 36.4%. Complaints that the NFL formula is heavily driven by completion percentage are correct.
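A short sketch of how the split between the catch-based and yard-based formulas moves with yards per catch. The function name and the YPC values other than 11.4 are assumptions picked for illustration, and the unrounded coefficient is used, so the last digit can differ from figures based on 2.75.

```python
def component_weights(ypc):
    """Return (catch_weight, yard_weight) for a given yards-per-catch value.

    catch_weight = (20/YPC) / (20/YPC + 1)  -> share from the catch-based formula
    yard_weight  = 1 / (20/YPC + 1)         -> share from the yard-based formula
    """
    if ypc <= 0:
        raise ValueError("ypc must be positive")
    k = 20.0 / ypc
    return k / (k + 1.0), 1.0 / (k + 1.0)


for ypc in (10.0, 11.4, 13.0):  # 10.0 and 13.0 are illustrative values only
    catch_w, yard_w = component_weights(ypc)
    print(f"YPC {ypc:5.1f}: catches {catch_w:.1%}, yards {yard_w:.1%}")

catch_w_114, yard_w_114 = component_weights(11.4)
```

The two weights always sum to one, and the catch-based share only drops to one half when YPC reaches 20, far above any realistic league average.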

Using the values α = 20 and β = 45, which are values found in Pro Football Reference’s version of adjusted yards per attempt, we then get values of A and B that are 55 and 123.75 respectively. Rounding down to the nearest 10, and plugging these values into the NFL style formula yields

RANKING = (20*Catches + Yards + 50*TDs – 120*Ints)/Attempts

Note that the two models in question have smaller A values than the core of the traditional NFL model (80) and larger B values than the traditional NFL model (100). This probably reflects the times. The 1970s were a defensive era. It was harder to score then. As it becomes harder to score, the magnitude of the TD term should increase. TD/Interception ratios were smaller in the 1950s, 1960s, and 1970s. As interceptions were more a part of the job, perhaps their effect wasn’t as valued when the original NFL formula was constructed.

Afterword: in many respects, this article is just the reverse of the arguments here. However, the proof by construction yields some useful formulas and, in my opinion, is easier to explain.

Update: more exhaustive derivation of the NFL passer rating.

