Summary: with some calculations based on adjusted yards per attempt, Matt Ryan’s value as a passer in the 2016 season can be shown to be almost 9 points a game more than the average QB.

Mark Zinno hosts a sports talk show on 92.9 the Game, in the 7pm ET time slot. Often booted out of the slot by Atlanta Hawks games, he has nonetheless been a dogged supporter of Matt Ryan. This isn’t new, btw. Even in years when Matt Ryan wasn’t at his best, he would argue that Matt Ryan was an elite quarterback, and said repeatedly that, compared to an average NFL team, Atlanta was blessed.

So, we’re dedicating this blog post to Mark Zinno.

It’s hard to understand the scope of what Matt Ryan has done until you look at his adjusted yards per attempt in 2016. Pro Football Reference lists it as 10.1, which is one of the highest I’ve seen, and comparable to Peyton Manning’s 2004 season, where PM’s AYA was 10.2. Looking a little further, you can see that PFR ranks this the 4th best performance in history. Aaron Rodgers is in the top 4, and for some reason, so is Nick Foles.

The value in using AYA is that you can build an expected points curve that satisfies all the requirements of the AYA function, and then use the slope of that curve to relate yards to points. Don’t worry, I did that long ago, and the result is documented here. The simple take home is the magic conversion 2.25, which converts AYA from yards to “expected points generated per 30 passes”.

Then, using the 2016 annual data from Pro Football Reference, you can calculate what the average QB did, by calculating an AYA using the overall season’s statistics. So the formula is:

(123639 yards + 20*786 TD – 45*415 Ints) / 18295 attempts

(123639 yards + 15720 “TD” yards – 18675 “Int” yards) / 18295 attempts

120684 yards / 18295 attempts

6.60 AYA to 3 significant digits.
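For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The function name aya and its parameter names are my own labels for illustration; the totals are the 2016 league figures quoted above.

```python
def aya(yards, tds, ints, attempts, td_bonus=20, int_penalty=45):
    """Adjusted yards per attempt, using the Pro Football Reference weights."""
    return (yards + td_bonus * tds - int_penalty * ints) / attempts

# 2016 league-wide passing totals, as quoted in the text
league_aya = aya(yards=123639, tds=786, ints=415, attempts=18295)
print(f"{league_aya:.2f}")  # 6.60
```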

Now things become simpler. Matt Ryan generated 10.1*2.25 = 22.7 points per 30 attempts, while Joe QB, at 6.60 AYA, generated 14.8 points per 30 attempts. The difference, rounded to a whole number, suggests that Matt Ryan was worth about 8 more points in 30 attempts than the average NFL QB this season.

That doesn’t entirely encompass his per game value. Matt threw 534 passes this season, an average of 33.4 per game. So his per game value, to the nearest tenth of a point, was more like 8.8 points a game more than the average quarterback.
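The whole chain, from AYA to points per game, can be sketched as follows. The variable names are mine, and the 16 games is my assumption, inferred from 534 passes at 33.4 per game.

```python
POINTS_PER_AYA_PER_30 = 2.25      # conversion factor quoted in the text

ryan_aya = 10.1                   # Pro Football Reference, 2016
league_aya = 120684 / 18295       # about 6.60, from the league totals

# Edge over the average QB, per 30 attempts
edge_per_30 = (ryan_aya - league_aya) * POINTS_PER_AYA_PER_30

# Scale from 30 attempts to his actual workload (16 games is an assumption)
attempts_per_game = 534 / 16
edge_per_game = edge_per_30 * attempts_per_game / 30

print(f"{edge_per_30:.1f} points per 30 attempts")  # 7.9
print(f"{edge_per_game:.1f} points per game")       # 8.8
```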

But if the numbers baffle you, then the simple take home is that Matt’s statistical efficiency in 2016 is comparable to the best single season Peyton Manning ever had.

There are two well known adjusted yards per attempt formulas, which easily reduce to simple scoring models. The first is the equation introduced by Carroll et al. in “The Hidden Game of Football”, which they called the New Passer Rating.

(1) AYA = (YDs + 10*TDs – 45*INTs) / ATTEMPTS

The second is the Pro Football Reference formula currently in use.

(2) AYA = (YDs + 20*TDs – 45*INTs) / ATTEMPTS

Scoring model corresponding to the THGF New Passer Rating, with opposition curve also plotted. Difference between curves is the turnover value, 4 points.

Formula (1) fits well to a scoring model with the following attributes:

  • The value at the 0 yard line is -2 points, corresponding to scoring a safety.
  • The slope of the line is 0.08 points per yard.
  • At 100 yards, the value of the curve is 6 points.
  • The value of a touchdown in this model is 6.8 points.

The difference between the touchdown value and the curve at 100 yards, 0.8 points, translated by the slope of the line (i.e., 0.8/0.08), is equivalent to 10 yards. The value of a turnover, 4 points, is equal to 50 yards. The 45 in the formula was presumably selected to approximate a 5 yard runback.

Pro Football Reference AYA formula translated into a scoring model. Difference in team and opposition curves, the turnover value, equals 3.5 points.

Formula (2) fits well to a scoring model with the following attributes:

  • The value at the 0 yard line is -2 points, corresponding to scoring a safety.
  • The slope of the line is 0.075 points per yard.
  • At 100 yards, the value of the curve is 5.5 points.
  • The value of a touchdown in this model is 7.0 points.

The difference between the touchdown value and the curve at 100 yards, 1.5 points, translated by the slope of the line (i.e., 1.5/0.075), is equivalent to 20 yards. The value of a turnover, 3.5 points, is equal to 46.67 yards. 45 remains in the INT term for reasons of tradition, and because of the simple fact that this kind of interpretation of the formulas wasn’t available when Pro Football Reference introduced their new formula. Otherwise, they might have preferred 40.
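Both translations from scoring model to yardage terms follow the same recipe, so they can be sketched with one small function. The name aya_terms and its parameters are hypothetical, chosen for this illustration; the numbers are the model attributes listed above.

```python
def aya_terms(slope, value_at_goal, td_value, turnover_value):
    """Convert a linear scoring model into AYA-style yardage terms.

    slope          -- points per yard
    value_at_goal  -- value of the curve at 100 yards
    td_value       -- value of a touchdown in the model
    turnover_value -- gap between team and opposition curves, in points
    """
    td_yards = (td_value - value_at_goal) / slope
    turnover_yards = turnover_value / slope
    return td_yards, turnover_yards

# Formula (1): The Hidden Game of Football model
thgf = aya_terms(slope=0.08, value_at_goal=6.0, td_value=6.8, turnover_value=4.0)
# Formula (2): Pro Football Reference model
pfr = aya_terms(slope=0.075, value_at_goal=5.5, td_value=7.0, turnover_value=3.5)

print([round(x, 2) for x in thgf])  # [10.0, 50.0]
print([round(x, 2) for x in pfr])   # [20.0, 46.67]
```

The TD term comes out as the coefficient actually used in each formula (10 and 20), while the turnover term comes out as 50 and 46.67 rather than the 45 both formulas carry.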

Adjusted yards per attempt or adjusted expected points per attempt?

Because these models show a clearly evident relationship between yards and points, you can calculate expected points from these kinds of formulas. The conversion factor is the slope of the line. If, for example, I wanted to find out how many expected points Robert Griffin III would generate in 30 passes, that’s pretty easy, using the Pro Football Reference values of AYA. RG3’s AYA is 8.6, and 0.075 x 30 = 2.25. So, if the Skins can get RG3 to pass 30 times, against a league average defense, he should generate 8.6 x 2.25 = 19.35 points of offense. Matt Ryan, with his 7.7 AYA, would be expected to generate 17.33 points of offense in 30 passes. Tony Romo? His 7.6 AYA corresponds to 17.1 expected points per 30 passes.

Peyton Manning, in his best year, 2004, with a 10.2 AYA, could have been expected to generate 22.95 points per 30 passes.
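The conversion is simple enough to tabulate for any list of passers. A minimal sketch, with the function and variable names being my own, and the AYA values being the ones quoted above:

```python
SLOPE = 0.075   # points per yard, from the Pro Football Reference scoring model
ATTEMPTS = 30

def expected_points(aya, attempts=ATTEMPTS, slope=SLOPE):
    """Expected points generated over a number of pass attempts."""
    return aya * slope * attempts

quarterbacks = {
    "Robert Griffin III": 8.6,
    "Matt Ryan": 7.7,
    "Tony Romo": 7.6,
    "Peyton Manning (2004)": 10.2,
}

for name, qb_aya in quarterbacks.items():
    print(f"{name}: {expected_points(qb_aya):.3f} points per {ATTEMPTS} passes")
```

Changing ATTEMPTS to a passer’s actual attempts per game turns the same function into a per game estimate.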

This simple relationship is one reason why, even if you’re happy with the correlation between the NFL passer rating and winning (which is real but isn’t all that great), you should sometimes consider thinking in terms of AYA.

A Probabilistic Rule of Thumb.

If you think about these scoring models in a simplified way, where there are only two results, either a TD or a non-scoring result, an interesting rule of thumb emerges. The TD term in equation (1) is equal to 10 yards, or 0.8 points. 0.8/6.8 x 100 = 11.76%, suggesting that the odds of *not* scoring, in formula (1), are about 10%. Likewise, for equation (2), whose TD term is 20 yards, or 1.5 points, 1.5/7 x 100 = 21.43%, suggesting the odds of *not* scoring, in formula (2), are about 20%.
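The rule of thumb amounts to expressing the TD term, in points, as a share of the touchdown’s value. A short sketch, where td_term_share is a hypothetical helper name:

```python
def td_term_share(td_term_yards, slope, td_value):
    """TD term converted to points, as a fraction of the touchdown's value."""
    return td_term_yards * slope / td_value

print(f"{td_term_share(10, 0.08, 6.8):.2%}")   # formula (1): 11.76%
print(f"{td_term_share(20, 0.075, 7.0):.2%}")  # formula (2): 21.43%
```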

Ok, this whole article is a kind of speculation on my part. DVOA is generally sold as a kind of generalization of the success rate concept, translated into a percentage above (or below) the norm. Components of DVOA include success rate, turnover adjustments, and scoring adjustments. For now, that’s enough to consider.

Adjusted yards per attempt, as we’ve shown, is derived from scoring models, in particular expected points models, and could be considered to be the linearization of a decidedly nonlinear EP curve. But if I wanted to, I could call AYA style stats the generalization of the yardage concept, one in which scoring and turnovers are all folded into a single number valued in terms of yards per attempt.

So, if I were to take AYA or its fancier cousin ANYA, and replace yards with success rate, and then refactor turnovers and scoring so that they were scaled appropriately, I would end up with something like the “V” in DVOA. I could then add an SRS style defensive adjustment, and now I have “DV”. If I now calculate an average, and normalize all terms relative to my average, I’d end up with “Homemade DVOA”, wouldn’t I?

The point is, AYA or ANYA formulas are not really yardage stats, they are scoring stats whose units are in yards. So, if DVOA is really ANYA in sheep’s clothing, where yardage has been replaced by success rate, with some after the fact defense adjustments and normalization from success rate “units”... well, yes, then DVOA is a scoring stat, a kind of sophisticated and normalized “adjusted net success rate per attempt”.

Ed Bouchette has a good article, with Steelers defenders talking about Michael Vick. Neil Paine has two interesting pieces (here and here) on how winning early games is correlated with the final record for the season.

Brian Burke has made an interesting attempt to break down EP (expected points) data to the level of individual teams. I’ve contributed to the discussion there. There is a lot to the notion that the slope of the EP curve reflects the ease with which a team can score, and the more shallow the slope, the easier it is for a team to score.

Note that the defensive contribution to an EP curve will depend on how expected points are actually scored. In a Keith Goldner type Markov chain model (a “raw” EP model), a defense cannot affect its own EP curve. It can only affect an opponent’s curve. In a Romer/Burke type EP formulation, the defensive effect on a team’s EP curve and the opponent’s EP curve is complex. Scoring by the defense has an “equal and opposite” effect on team and opponent EP, the slope being affected by the frequency of the scoring as a function of yard line. Various kinds of stops could affect the slope as well. Since scoring opportunities increase for an offense the closer to the goal line the offense gets, an equal stop probability per yard line would end up yielding nonequal scoring chances, and thus slope changes.

The Fifth Down blog features an article about a new phone app, one that says it will give you the winning chances of every play during the game. In the article, we get this little gem:

According to his analysis, a team that returns a kickoff to its 40-yard line can be expected to score an average of 3 more points on the drive than if it had started at the 20-yard line.

“If you make it to the 40, you essentially just made a field goal, even if you don’t realize that immediately,” Bessire said.

I seriously doubt this. A gain of 3 points over 20 yards is a slope of 0.15 points per yard, and 3 points * 100 yards / 20 yards = 15 points over the length of the field. I don’t know of anyone who scales an expected points curve to be worth 15 points. I don’t know of a single reliable EP solution with slopes routinely greater than 0.08 points per yard in between the 20 yard lines.
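The sanity check is short enough to write down. In this sketch the variable names are mine, and the 0.08 ceiling is the figure I quoted above, not a hard limit from any particular EP model.

```python
points_gained = 3.0    # claimed gain from starting at the 40 instead of the 20
yards_gained = 20.0

implied_slope = points_gained / yards_gained   # points per yard
implied_field_value = implied_slope * 100      # points across 100 yards

typical_max_slope = 0.08                       # rough ceiling cited in the text

print(f"implied slope: {implied_slope:.2f} points per yard")          # 0.15
print(f"implied value of the field: {implied_field_value:.1f} points")  # 15.0
print(f"ratio to typical ceiling: {implied_slope / typical_max_slope:.2f}")
```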

You pay your money, and you take your chances. Simply put, I can’t recommend this app.

I’ve been looking at this model recently, and thinking.

Backstory references, for those who need them: here and here and here.

Pro Football Reference’s AYA statistic as a scoring potential model. The barrier potential represents the idea that scoring chances do not become 100% as the opponent’s goal line is neared.

If the odds of scoring a touchdown approach 100% as you approach the goal line, then the barrier potential disappears, and the “yards to go” intercept is equal to the value of the touchdown. The values in the PFR model appear to always increase as they approach the goal line. They never go down, the way real values do. Therefore, the model as presented on their pages appears to be a fitted curve, not raw data.

The value they assign the touchdown is 7 points. The EP value of first and goal on the 1 is 6.97 points. 6.97 / 7.00 * 100 = 99.57%. How many of you out there think the chances of scoring a touchdown on the 1 yard line are better than 99%?

Moreover, the EP value for 1st and goal on the 2 yard line is 6.74. Ok, if the fitting function is linear, or perhaps quadratic, then how do you go from 6.74, to 6.97, to 7.00? The difference between 6.74 and 6.97 is 0.23 points. Assuming linearity (not true, as first and 10 points on the other end of the curve typically differ by 0.03 points per yard), you get an extrapolated intercept of 6.97 + 0.23 = 7.20 points.
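Both checks on the PFR values can be reproduced in a few lines. This is only a sketch: the variable names are mine, and reading EP divided by touchdown value as a touchdown chance assumes the simplified picture where the only outcomes are a touchdown or nothing.

```python
td_value = 7.00          # PFR's value for a touchdown
ep_first_goal_1 = 6.97   # EP, 1st and goal on the 1
ep_first_goal_2 = 6.74   # EP, 1st and goal on the 2

# Implied touchdown chance on the 1, under the TD-or-nothing assumption
implied_td_chance = ep_first_goal_1 / td_value

# Straight-line extrapolation one more yard, to the goal line itself
step = ep_first_goal_1 - ep_first_goal_2
linear_intercept = ep_first_goal_1 + step

print(f"{implied_td_chance:.2%}")  # 99.57%
print(f"{step:.2f}")               # 0.23
print(f"{linear_intercept:.2f}")   # 7.20
```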

The PFR model has its issues. The first down intercept seems odd, and it lacks a barrier potential. To what extent this is an artifact of a polynomial (or other curve) fitted to real data remains to be seen.

Update: added a useful Keith Goldner reference, which has a chart giving probabilities of scoring a touchdown.

After watching one or another controversy break out during the 2011 season, I’ve become convinced that the average “analytics guy” needs a source of play-by-play data on a weekly basis. I’m at a loss at the moment to recommend a perfect solution. I can see the play-by-play data on, but I can’t download it. Worst case, you would think you could save the page and get to the data, but that doesn’t work. I suspect the use of AJAX or an equivalent technique that writes the data into the page after the HTML has been delivered. Good for business, I’m sure, but not good for Joe Analytics Guy.

One possible source is Pro Football Reference (PFR), which now has play by play data in their box scores, and has tended to present their data in an AJAX free, user friendly fashion. I doubt Joe Analytics Guy can do more than use those data personally. PFR is purchasing their raw data from another source, and whatever restrictions the supplier puts on PFR’s data legally trickle down to us.

Further, PFR is now calculating expected points (EP) along with the play by play data. Thing is, what expected points model is Pro Football Reference actually using? Unlike win probabilities, which have one interpretation per data set, EP models are a class of related models which can be quite different in value (discussed here, here, here). If you need independent verification, please note that Keith Goldner has now published 4 separate EP models (here and here): his old Markov Chain model, the new Markov Chain model, a response function model, and a model based on piecewise fits.

That’s question number one. Questions that have to be answered in order to answer question one are things like:

  • How is PFR scoring drives?
  • What is their value for a touchdown?
  • If PFR were to eliminate down and distance as variables, what curve do they end up with?

This last would define how well Pro Football Reference’s own EP model supports their own AYA formula. After all, that’s what an AYA formula is, a linearized approximation of an EP model where down and to go distance are ignored, with yards to score as the only independent variable.

Representative Pro Football Reference EP Values

Down    EP, 1 yard to go    EP, 99 yards to go
1       6.97                -0.38
2       5.91                -0.78
3       5.17                -1.42
4       3.55                -2.49


My recommendation is that PFR clearly delineate their assumptions in the same glossary where they define their version of AYA. Make it a single click lookup, so Joe Analytics Guy knows what the darned formula actually means. Barring that, I’ve suggested to Neil Paine that they publish their EP model data separately from their play by play data. A blog post with 1st and ten, 2nd and ten, 3rd and ten curves would give those of us in the wild a fighting chance to figure out how PFR actually came by their numbers.

Update: the chart that features 99 yards to go clearly isn’t 1st and 99, 2nd and 99. Those are 1st and 10 values, 2nd and 10, etc at the team’s 1 yard line. The only 4th down value of 2011, 99 yards away, is a 4th and 13 play, so that’s what is reported above.