After the previous post in this series, I realized there is a scoring model buried within the NFL passer rating formula. Pretty much any equation of the form

RATE = (yards + a*TDs – b*(INTS + FUMBLES) – sacks)/plays

implies the existence of one of these models. Note that this form suggests a single barrier potential for touchdowns, while there equally well could be one for the 0 yardage side (“the sack side”) of the equation. If we plot the model suggested by Pro Football Reference’s adjusted yards per attempt formula,

RATE = (yards + 20*TDs – 45*Ints)/attempts

we see this

The refactored NFL passer rating has the form

RATE = 100/24*2.75[( yards + 29.1*TDs - 36.4*Ints)/attempts] + 50/24

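when the completion and yards terms are combined using yards per completion as a constant. The term in brackets is a scoring model. To work out the model, some algebra is needed to determine x, the value of the line at 100 yards. Assuming the line passes through -2 points at 0 yards, its slope is (x + 2)/100, so the 29.1 yard touchdown term is worth 0.291(x + 2) points. Adding that to x must give the 6.4 point value of a touchdown, and adding 2 to both sides gives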

0.291(x + 2 ) + (x + 2) = 6.4 + 2 = 8.4

1.291 x + 2.582 = 8.4

1.291x = 5.818

x ≈ 4.5

This yields a slope of 0.065, a barrier potential of 1.9 points or so, and a value for a turnover of 2.5 points. Plotted, it looks like this

and is not all that much different from the implied model in the PFR aya formula.
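
The algebra above is easy to check in a few lines of Python. This is only a sketch: the function name `implied_model` and its parameters are invented for illustration, and it assumes what this post assumes, namely a touchdown worth 6.4 points, a straight line, and -2 points at 0 yards. The turnover value is taken as x - 2, the same convention used in this post.

```python
FIELD_LENGTH = 100.0  # yards


def implied_model(td_yards, td_value=6.4, value_at_zero=-2.0):
    """Recover the linear scoring model implied by a TD coefficient in yards.

    Hypothetical helper. It solves
        td_value = x + td_yards * slope,  slope = (x - value_at_zero) / 100
    for x, the value of the line at 100 yards.
    """
    # Rearranged: (x - value_at_zero) * (1 + td_yards / 100) = td_value - value_at_zero
    span = (td_value - value_at_zero) / (1.0 + td_yards / FIELD_LENGTH)
    x = value_at_zero + span
    slope = span / FIELD_LENGTH          # points per yard
    barrier_points = td_yards * slope    # same as td_value - x
    turnover_points = x + value_at_zero  # x - 2 when value_at_zero is -2
    return {
        "value_at_100": x,
        "slope": slope,
        "barrier_points": barrier_points,
        "turnover_points": turnover_points,
    }


if __name__ == "__main__":
    # 29.1 is the TD term of the refactored NFL formula, 20 is PFR's AY/A term.
    for label, td_yards in (("NFL passer rating", 29.1), ("PFR AY/A", 20.0)):
        model = implied_model(td_yards)
        print(label, {k: round(v, 3) for k, v in model.items()})
```

With 29.1 yards this should reproduce x ≈ 4.5, a slope near 0.065, and a barrier potential near 1.9 points. With PFR’s 20 yards the same function gives x = 5.0, a slope of 0.07, and a barrier potential of 1.4 points, which is why the two implied models look so similar.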

To get to the idea that the barrier potential represents the difference between a model that allows a 100% chance to score and a model that has an imperfect chance of scoring, we’re going to build a scoring potential model from just a single data point. Since two points determine a line, and -2 points at 0 yards is generally assumed, the slope of the line can be found by solving for the expected points at a single yard line.

If, on first down at the 1 yard line, you have an 80% chance of scoring a touchdown, a 15% chance of scoring a field goal, and a 5% chance of just losing possession, then solving for the expected points on first and one, you get

*expected points = 0.8 × 6.4 + 0.15 × 3 = 5.57 points*

*value of yards at 100 = 5.57 × 100/99 ≈ 5.63 points*

*barrier potential = 6.4 – 5.63 = 0.77 points ≈ 10.1 yards*

*turnover value = 5.63 – 2 = 3.63 points ≈ 47.6 yards*

and expressed as a passer rating formula, you might get something like

RATE = (yards + 10.1*TDs – 48*Int)/attempts

and plotted, it looks something like this:

The synthetic first and one data above differ little from the real first and one data given here, but PFR’s adjusted yards per attempt is a formula that averages data over all downs, as opposed to being the data for a single down.
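
The single data point construction can be sketched in Python as well. The function name `model_from_first_and_one` and its arguments are hypothetical, the probabilities are the synthetic ones from this post rather than real data, and the scaling by 100/99 follows the arithmetic used above.

```python
def model_from_first_and_one(p_td, p_fg, yard_line=99.0,
                             td_value=6.4, fg_value=3.0, value_at_zero=-2.0):
    """Build a linear scoring model from one first-and-one data point.

    Hypothetical helper; yard_line is measured from the offense's own goal,
    so first and one at the opponent's 1 is yard_line = 99.
    """
    expected_points = p_td * td_value + p_fg * fg_value
    # Scale the single observation up to the 100 yard mark, as in the post.
    value_at_100 = expected_points * 100.0 / yard_line
    slope = (value_at_100 - value_at_zero) / 100.0   # points per yard
    barrier_points = td_value - value_at_100
    turnover_points = value_at_100 + value_at_zero   # value_at_100 - 2
    return {
        "expected_points": expected_points,
        "value_at_100": value_at_100,
        "slope": slope,
        "barrier_points": barrier_points,
        "barrier_yards": barrier_points / slope,     # the TD coefficient
        "turnover_points": turnover_points,
        "turnover_yards": turnover_points / slope,   # the interception coefficient
    }


if __name__ == "__main__":
    model = model_from_first_and_one(p_td=0.80, p_fg=0.15)
    for name, value in model.items():
        print(f"{name}: {value:.2f}")
```

The yardage figures come out within rounding of the 10.1 and 48 used in the formula above. Changing `p_td` and `p_fg` is a quick way to see how the coefficients move as scoring from the 1 yard line gets harder or easier.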

**Conclusions**

The size of the barrier potential is a measure of how hard it is to score. The smaller the barrier potential, the easier it is to score. When the barrier potential is zero, the chance of scoring approaches 100% as the team approaches the goal line. Since no real offense scores every time from the goal line, barrier potentials tend to appear in more realistic scoring models.

It is entirely possible that the larger barrier potentials of the NFL passer formula merely reflect the times in which the model was created. The 1970s was an era dominated by defense and a running game. It was harder to score then. It would be interesting to calculate scoring rates for first and one situations from, say, 1965 to 1971, when the NFL passer formula was created, and see if the implied formula actually matches the data of the times.

Other issues these models suggest: since they are easy to construct from very modest data sets, they can be individualized for college and high school conferences, leagues, and even teams. They suggest trends that can be useful for analyzing particular eras. Note that as scoring gets harder and barrier potentials grow larger, the value of the turnover shrinks. It’s also not hard to set up an equation representing a high scoring team playing one that doesn’t score much at all. Since the slope of the line of the low scoring team is less than that of the high scoring team, the slopes don’t cancel, and turnover value becomes dependent on field position. The turnover becomes more valuable towards the goal line of the low scoring team.
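
The two-team case can be sketched the same way. This is an illustration under stated assumptions, not a fitted model: the function names are made up, both teams are given a line through -2 points at 0 yards, the turnover is valued as what the offense gives up plus what the opponent gains at the mirrored spot on the field, and the slopes 0.0763 and 0.065 are simply borrowed from the two models discussed earlier in this post.

```python
def expected_points(yards_from_own_goal, slope, value_at_zero=-2.0):
    """Linear scoring model: points expected at a given field position."""
    return value_at_zero + slope * yards_from_own_goal


def turnover_value(yard_line, offense_slope, defense_slope):
    """Points swung by a turnover at yard_line (measured from the offense's goal).

    Assumption: the offense loses its own expected points and the opponent
    takes over at the same spot, which is 100 - yard_line from its own goal.
    """
    lost = expected_points(yard_line, offense_slope)
    gained_by_opponent = expected_points(100.0 - yard_line, defense_slope)
    return lost + gained_by_opponent


if __name__ == "__main__":
    HIGH, LOW = 0.0763, 0.065  # illustrative slopes, points per yard
    for yard_line in (20, 50, 80):
        same = turnover_value(yard_line, LOW, LOW)
        mixed = turnover_value(yard_line, HIGH, LOW)
        print(f"yard line {yard_line}: equal slopes {same:.2f}, "
              f"high vs low scoring {mixed:.2f}")
```

When both slopes are equal the yard line drops out and the turnover is worth the same everywhere (2.5 points for a slope of 0.065). When the high scoring team has the ball, the value grows as it moves toward the low scoring team’s goal line.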

September 7, 2011 at 9:18 am

[...] value merge in Figure 1, but remain apart in Figure 2. This value, which I’ve called a barrier potential previously, is the product of a chance to score that’s less than a 1.0 probability as you [...]

September 7, 2011 at 9:32 am

[...] 3/4/5, 2011 Weekend Recap: Over the holiday weekend, Code and Football looked at passer rating as a scoring model… Outside the Hashes determined what a yard is worth in terms of expected points… Keith [...]

September 26, 2011 at 9:15 am

[...] breaks the notion of path independence in a Markov chain. Further, as we explain here and here, the idea that the TD term is “the value of the touchdown” is broken. It’s not [...]

September 28, 2011 at 9:13 am

[...] The Hidden Game of Football’s new passer rating is a formula of this kind, with α = 10 and β = 45. Pro Football Reference’s AY/A has an α value of 20 and a β value of 45. On this blog, we’ve shown that these formulas are tightly associated with scoring models. [...]

August 23, 2012 at 9:29 am

[...] references, for those who need them: here and here and here. Pro Football Reference’s AYA statistic as a scoring potential model. The barrier [...]