This is going to be a mixed bag of a post, talking about anything that has caught my eye over the past couple weeks. The first thing I’ll note is that on the recommendation of Tom Gower (you need his Twitter feed), I’ve read Josh Katzowitz’s book: Sid Gillman: Father of the Passing Game.


I didn’t know much about Gillman as a young man, though the 1963 AFL Championship was part of a greatest games collection I read through as a teen. The book isn’t a primer on Gillman’s ideas. Instead, it is more a discussion of his life and the issues he faced growing up (it’s clear Sid felt his Judaism affected his marketability as a coach in the college ranks). Not everyone gets the same chances in life, but Sid was a pretty tough guy in his own right, and clearly the passion he felt for the sport drove him to a lot of personal success.

Worth the read. Be sure to read Tom Gower’s review as well, which is excellent.

ESPN is dealing with the football off season by slowly releasing a list of the “20 Greatest NFL Coaches” ( does its 100 best players, for much the same reason). I’m pretty sure neither Gillman nor Don Coryell will be on the list. The problem, of course, lies in the difference between the notions of “greatest” and “most influential”. The influence of both these men is undeniable. However, the greatest success for both these coaches has come as part of their respective coaching (and player) trees: Al Davis and Ara Parseghian come to mind when thinking about Gillman, with Don having a direct influence on coaches such as Joe Gibbs and Ernie Zampese. John Madden was a product of both schools, and folks such as Norv Turner and Mike Martz are clear disciples of the Coryell way of doing things. It’s easy to go on and on here.

What’s harder to see is the separation (or fusion) of Gillman’s and Coryell’s respective coaching trees. Don never coached under or played for Gillman. And when I raised the question on Twitter, Josh Katzowitz responded with these tweets:

Josh Katzowitz : @smartfootball @FoodNSnellville From what I gathered, not much of a connection. Some of Don’s staff used to watch Gillman’s practices, tho.

Josh Katzowitz ‏: @FoodNSnellville @smartfootball Coryell was pretty adament that he didn’t take much from Gillman. Tom Bass, who coached for both, agreed.

Coaching clinics were popular then, and Sid Gillman appeared from Josh’s bio to be a popular clinic speaker. I’m sure these two mixed and heard each other speak. But Coryell had a powerful Southern California connection in Coach John McKay of USC, and I’m not sure how much Coryell and Gillman truly interacted.

Pro Football Weekly is going away, and Mike Tanier has a nice article discussing the causes of the demise. In the middle of the discussion, a reader who called himself Richie took it upon himself to start trashing “The Hidden Game of Football” (which factors in because Bob Carroll, a coauthor of THGF, was also a contributor to PFW). Richie seems to think, among other things, that everything THGF discussed was “obvious” and that Bill James invented all of football analytics wholesale by inventing baseball analytics. It’s these kinds of assertions I really want to discuss.

I think the issue of baseball analytics encompassing the whole of football analytics can easily be dismissed by pointing out the solitary nature of baseball and its stats, their lack of entanglement issues, and the lack of a notion of field position, in the football sense of the term. Since baseball doesn’t have any such thing, any stat featuring any kind of relationship of field position to anything, or any stat derived from models of relationships of field position to anything, cannot have been created in a baseball world.

Sad to say, that’s almost any football stat of merit.

On the notion of obvious, THGF was the granddaddy of the scoring model for the average fan. I’d suggest that scoring models are certainly not obvious, or else every article I have with that tag would have been written up and dismissed years ago. What is not so obvious is that scoring models have a dual nature, akin to that of quantum mechanical objects, and the kind of logic one needs to best understand scoring models parallels what a chemistry major might encounter in his junior year of university, in a physical chemistry class (physicists might run into these issues sooner).

Scoring models have a dual nature. They are both deterministic and statistical/probabilistic at the same time.

They are deterministic in that for a given down and distance to go, and with a specific play by play data set, you can calculate the expected score down to a hundredth of a point. They are statistical in that they represent the sum of dozens or hundreds of unique events, all compressed into a single measurement. When the models, and the formulas derived from them, are divorced from the parent data set, the logic you use to analyze their meaning must take into account the statistical nature of the model involved.

It’s not easy. Most analysts turn models and formulas into something more concrete than they really are.

And this is just one component of the THGF contribution. I haven’t even mentioned the algebraic breakdown of the NFL passer rating they introduced, which dominates discussion of the rating to this day. It’s so influential that to a first approximation, no one can get past it.

Just tell me: how did you get from the formulas shown here to the THGF formula? And if you didn’t figure it out yourself, then how can you claim it is obvious?

The value of a turnover is a topic addressed in The Hidden Game of Football, noting that the turnover value consists of the loss of value by the team that lost the ball and the gain of value by the team that recovered the ball. To think in these terms, a scoring model is necessary, one that gives a value to field position. With such a model, the value is

Turnover = Value gained by the team that recovers the ball + Value lost by the team that gives it up

In the case of the classic models of THGF, that value is 4 points, and it is 4 points no matter where on the field the ball is recovered.
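A quick way to see the invariance is to add the two pieces at several yard lines. This is a minimal Python sketch, assuming the classic THGF line that is worth 0 points at a team’s own 25 and rises 0.08 points per yard (so -2 at its own goal line and +6 at the opponent’s). The function names are illustrative and do not come from the book.

```python
# Sketch of the classic THGF linear scoring model (names are illustrative).
SLOPE = 0.08          # points per yard
ZERO_YARD_LINE = 25   # field position worth 0 points


def field_value(yards_from_own_goal):
    """Point value of a first down this many yards from your own goal line."""
    return SLOPE * (yards_from_own_goal - ZERO_YARD_LINE)


def turnover_value(yards_from_own_goal):
    """Value given up by the team losing the ball plus the value picked up
    by the team recovering it at the same spot."""
    lost = field_value(yards_from_own_goal)
    gained = field_value(100 - yards_from_own_goal)  # same spot, other team's view
    return lost + gained


for spot in (5, 25, 50, 75, 95):
    print(spot, round(turnover_value(spot), 2))
```

Every spot comes out at 4 points because the yard-line terms cancel when the two values are added, leaving only the constant part of the line.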

That invariance is a product of the invariant slope of the scoring model. The model in THGF is linear, the derivative of a line is a constant, and the slopes, because this model doesn’t take into account any differences between teams, cancel. That’s not true in models such as the Markov chain model of Keith Goldner, the cubic fit to a “nearly linear” model of Aaron Schatz in 2003, and the college expected points model (he calls his model equivalent points, but it’s clearly the same thing as an expected points model) of Bill Connelly on the site Football Study Hall. Interestingly, Bill’s model and Keith’s model have a quadratic appearance, which guarantees better than constant slope throughout their curves. Aaron’s cubic fit has a clear “better than constant” slope beyond the 50 yard line or so.

Formulas with slopes exceeding a constant result in turnover values that maximize at the end zones and minimize in the middle of the field, giving plots that Aaron calls the “Happy Turnover Smile Time Hour”. As an example, this is the value of a turnover on first and ten (ball lost at the LOS) for Keith Goldner’s model

First and ten turnover value from Keith Goldner’s Markov chain model

And this is the piece of code you can use to calculate this curve yourself.
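The original listing is not reproduced in the text here. As a stand-in, this is a minimal Python sketch of the same calculation for any expected points curve. The quadratic curve below is made up purely to show the shape; it is not Keith Goldner’s data, and the function names are hypothetical.

```python
def toy_expected_points(yards_from_own_goal):
    """Hypothetical convex EP curve: 0 at your own goal line, 6 at the
    opponent's. A placeholder only, not a fitted model."""
    return 6.0 * (yards_from_own_goal / 100.0) ** 2


def turnover_curve(ep, spots=range(1, 100)):
    """Turnover value at each spot: what the offense had, plus what the
    defense gets from the same spot seen from the other direction."""
    return {x: ep(x) + ep(100 - x) for x in spots}


curve = turnover_curve(toy_expected_points)
for spot in (1, 25, 50, 75, 99):
    print(spot, round(curve[spot], 2))
```

With any curve whose slope keeps increasing, the printed values are smallest at midfield and largest near either goal line, which is the “smile”. Swapping in a real first and ten expected points table for `toy_expected_points` would give the actual plot.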

Note also, the models of Bill Connelly and Keith have no negative expected points values. This is unlike the David Romer model and also unlike Brian Burke’s expected points model. I suspect this is a consequence of how drives are scored. Keith is pretty explicit about his extinction “events” for drives in his model, none of which inherit any subsequent scoring by the opposition. In contrast, Brian suggests that a drive for a team that stalls inherits some “responsibility” for points subsequently scored.

A 1st down on an opponent’s 20 is worth 3.7 EP. But a 1st down on an offense’s own 5 yd line (95 yards to the end zone) is worth -0.5 EP. The team on defense is actually more likely to eventually score next.

This is interesting because this “inherited responsibility” tends to linearize the data set except inside the 10 yard line on either end. A pretty good approximation to the first and ten data of the Brian Burke link above can be had with a line that is valued at 5 points at one end and -1 points at the other. The slope becomes 0.06 points per yard, and the value of the turnover becomes 4 points in this linearization of the Advanced Football Stats model. The value of the touchdown is 7.0 points minus the value of the opponent’s subsequent field position, which is often assumed to be their own 27 yard line. That field position is worth

27*0.06 – 1.0 = 1.62 – 1.0 = 0.62 points, which leaves approximately 6.4 points for a TD.

This would yield, for a “Brianized” new passer rating formula, a surplus yardage value for the touchdown of (6.4 – 5.0) points / 0.06 = 1.4 points / 0.06 = 23.3 yards.
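The arithmetic of this linearization can be checked in a few lines. This is a sketch using only the endpoint values, the 7.0 point touchdown, and the 27 yard starting field position quoted above; the variable names are illustrative.

```python
# Eyeball linearization of the first-and-ten EP data (values from the text).
EP_OWN_GOAL = -1.0    # points at a team's own goal line
EP_OPP_GOAL = 5.0     # points at the opponent's goal line
RAW_TD = 7.0          # touchdown plus extra point
KICKOFF_START = 27    # assumed starting yard line after the kickoff

slope = (EP_OPP_GOAL - EP_OWN_GOAL) / 100  # points per yard


def ep(yards_from_own_goal):
    """Expected points on the straight line between the two endpoints."""
    return EP_OWN_GOAL + slope * yards_from_own_goal


turnover_points = ep(50) + ep(50)        # the same at any spot on a line
td_points = RAW_TD - ep(KICKOFF_START)   # subtract the opponent's field position
td_surplus_yards = (td_points - EP_OPP_GOAL) / slope
turnover_yards = turnover_points / slope

print(round(slope, 2), round(turnover_points, 2), round(td_points, 2))
print(round(td_surplus_yards, 1), round(turnover_yards, 1))
```

Carrying the unrounded touchdown value of 6.38 points through gives a surplus of about 23.0 yards rather than the 23.3 yards obtained from the rounded 6.4. The difference is well inside the precision of an eyeball fit.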

The plot is below:

Eyeball linearization of BB’s EP plots yields this simplified linear scoring model. The surplus value of a TD = 23.3 yards, and a turnover is valued 66.7 yards.

Update 9/29/2011: No matter how much I want to turn the turnover equation into a difference, it’s better represented as a sum. You add the value lost to the value gained.

The value of a touchdown is a phrase used in formulas like this one

PASSER RANKING = (yards + 10*TDs – 45*Ints)/attempts

where the first thing that comes to mind is that the TD is worth 10 yards and the interception is worth 45 yards. But is it? A TD after all, is worth about 7 points, and in The Hidden Game of Football formulation, a turnover is worth 4 points. Therefore, a TD is worth considerably more than a turnover, but the formula values the TD less. How is that?

Well, let me reassure you that in the new passer rating of the Hidden Game of Football, the value of a touchdown is a constant, equal to 6.8 points or 85 yards. The interception of 4 points is usually valued at 45 yards instead of 50, because most interceptions don’t make it back to the line of scrimmage.

The field itself is zero valued at the 25 yard line. That means once you get to the one yard line, you have one yard to go of field and the TD is worth an additional 10 yards of value. That’s where the 10 comes from. It’s not the value of the touchdown, but the additional value of the touchdown not measured on the field itself.
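The conversion between points and yards in this model is just a division by the slope. A short Python sketch, assuming the 0.08 points per yard slope of the THGF linear model and the point values quoted above:

```python
SLOPE = 0.08          # points per yard in the THGF linear model
ZERO_YARD_LINE = 25   # field position worth zero
TD_POINTS = 6.8       # touchdown value quoted above
TURNOVER_POINTS = 4.0

td_yards = TD_POINTS / SLOPE               # touchdown expressed in yards
field_yards = 100 - ZERO_YARD_LINE         # yards of value available on the field
td_bonus_yards = td_yards - field_yards    # what is left over for the TD itself
turnover_yards = TURNOVER_POINTS / SLOPE   # before any runback adjustment

print(round(td_yards, 1), field_yards, round(td_bonus_yards, 1), round(turnover_yards, 1))
```

The touchdown comes to 85 yards, 75 of which are already counted on the field between the 25 and the goal line, which leaves the 10 yard bonus. The turnover comes to 50 yards before it is trimmed to 45.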

But what does this additional term actually mean?

Figure 1. The basic linear scoring model of THGF. TD = 6, linear slope = 0.08 points/yard. The probability of a score goes to 1.0 as the goal line is approached.

Figure 2. The model of THGF's new passer rating. The difference between y value at 100 yards and TD equals 0.8 points or 10 yards. Maximum probability of a score approaches 75/85.

If you check out the figures above, Figure 1 is introduced in The Hidden Game of Football on page 102, and features in just about all the descriptions of worth up until page 186, where we run into this text. The authors appear to be carving out a new formula from the refactored NFL formula they introduce in their book.

Awarding a 80 yard bonus for a touchdown pass makes no sense either. It’s like treating every TD pass as though it were a 80-yard bomb. Yet, the majority of touchdown passes are from inside the 25 yard line.

It’s not the bonus we’re objecting to-after all, the whole point of throwing a pass is to get the ball into the end zone-but the size of the bonus is way out of kilter. We advocate a 10 yard bonus for each touchdown pass. It’s still higher than the yardage on a lot of TD passes, but it allows for the fact that yardage is a lot harder to get once a team gets inside the opponent’s 25.

and without quite saying so, the authors introduce the model in Figure 2. Note that the value of the touchdown and the yardage value merge in Figure 1, but remain apart in Figure 2. This difference, which I’ve called a barrier potential previously, is the product of a chance to score that’s less than a 1.0 probability as you reach the goal line. If your chances maximize at merely 80%, you’ll end up with a model with a barrier potential.

If I have an objection to the quoted argument, it’s that it encourages the whole notion of double counting the touchdown “yardage”. The appropriate way to figure out the slope of any linear scoring model is by counting all scoring at a particular yard line, or within a particular part of the field (red zone scoring, for example, which could be normalized to the 10 yard line). These are scoring models, after all, not touchdown models.

Where did 6.8 come from, instead of 7?

Whereas before I was thinking it was 6 points for the TD and 0.8 points for the extra point, I’m now thinking it came from the same notions that drove the score value of 6.4 for Romer and 6.3 for Burke. It’s 7 points less the value of the runback. I’ve used 6.4 points to derive scoring models for PFR’s aya and the NFL passer rating, but in retrospect, those aren’t appropriate uses. These models tend to zero in value around 25 yards, whereas the Romer model has much higher initial slopes and reaches positive values faster than these linear models.

This value can be calculated, but the formula that results can’t be calculated directly. It can be solved iteratively, though, with a pretty short piece of code

Figure 3. Perl code to solve for slope, effective TD value and y value at 100 yards in linear scoring models.

Figure 4. Solving for barriers of 10 and 20 yards.
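The Perl listing from Figure 3 does not survive in this text. As a rough stand-in, here is a Python sketch of one way such an iteration could look. The setup is an assumption chosen to be consistent with the figure captions, not the original code: the line is pinned at -2 points on a team’s own goal line, the effective touchdown is 7 points minus the value of the opponent’s field position at an assumed 27 yard line, and the touchdown sits one barrier (in yards) above the line’s value at 100 yards.

```python
def solve_linear_model(barrier_yards, raw_td=7.0, own_goal_value=-2.0,
                       kickoff_start=27.0, tolerance=1e-9, max_iter=100):
    """Iterate to a self-consistent slope, effective TD value and y at 100.

    Assumptions (illustrative, not taken from the original listing):
      * the line is worth own_goal_value points at a team's own goal line
      * effective TD = raw_td minus the opponent's value at kickoff_start
      * the TD sits barrier_yards above the line's value at 100 yards
    """
    td_value = raw_td  # starting guess
    for _ in range(max_iter):
        slope = (td_value - own_goal_value) / (100.0 + barrier_yards)
        new_td = raw_td - (own_goal_value + slope * kickoff_start)
        converged = abs(new_td - td_value) < tolerance
        td_value = new_td
        if converged:
            break
    slope = (td_value - own_goal_value) / (100.0 + barrier_yards)
    y_at_100 = own_goal_value + slope * 100.0
    return slope, td_value, y_at_100


for barrier in (10, 20):
    slope, td, y100 = solve_linear_model(barrier)
    print(barrier, round(slope, 4), round(td, 2), round(y100, 2))
```

Under these assumptions the 10 yard barrier settles at a slope of about 0.080 points per yard and a touchdown of about 6.83 points, and the 20 yard barrier at about 0.075 and 6.98.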

And the solution is close enough to 6.8 that it’s easy enough to ignore the difference. Plugging 7 points for the touchdown, 20 and 29.1 yards respectively for the barrier potential yields almost no changes in the touchdown value for the PFR aya model and the NFL passer rating formula, and we end up with these scoring model plots.

Figure 5. PFR aya amended model. TD = 7 points, slope = 0.075 points/yard, y at 100 = 5.5 points.

Figure 6. Amended NFL prf scoring model. TD = 7.05 points, slope = 0.07 points/yard, y at 100 = 5.0 points.

ESPN has unveiled a new passer rating formula (see also here and here, discussion of the ratings here, here, and here), one that is complex and, to be plain, not very straightforward to interpret. In the age of stats that purport to give a player’s contribution to winning in terms of wins per season above replacement (i.e. WARP), one really has to wonder about the value of an arbitrary 0 to 100 scale. It’s in all honesty as meaningless as the NFL’s original scale, which maxes at something less than 160.

But in order to critique the new scale at all, in anything other than emotional terms, perhaps it’s best to step back and look at some of the previous critiques of the NFL’s old formula. The one we’ll start with is Brian Burke’s 2007 critique, where he points out that TDs are a pretty arbitrary criterion, and removes them from his formula. He finally decides that the best formula he can come up with is:

QB Wins Added = (Comp% * 0.18) - (Int/Att * 50.5) - (Sack Yds/Att * 1.57) - 8

This formula has the advantage of being scaled properly. It is also simple, not as sophisticated as other formulas. How well it works is beyond the scope of this survey, but we note it for those digging for more details.

Football Outsiders uses a method called DVOA to rank quarterbacks. Again, the scale is measured in terms of “success points”, and this is abstract. But it attempts to treat the game of football as something of a state machine, using NFL play by plays as the fundamental data source, and therefore is potentially a better stat than stateless formulas. However, DVOA is a rate stat, not a cumulative stat, and there can be times when a rate stat lies to you (e.g. a high performing player who can’t stay on the field can have a very high DVOA and a very low real value to a team). Nonetheless, this is FO’s attempt to improve on the QBR.

The best and most thorough critique is also an old one, the critique of the NFL QBR by Carroll, Palmer and Thorn in the book “The Hidden Game of Football“. They devote the whole of Chapter 11 to the various formulas the NFL has used, why they were busted, and why the NFL went to the formula they do use. They then critique the formula and offer two ranking formulas of their own. We’re going to spend a lot of time on the THGF critique. To be plain, those who really want to understand it should buy the book, as used copies are cheap.

One thing to note about Carroll et al.’s historical introduction to this problem is that a stat a lot of analysts drool over, YPA, was once used as the sole criterion to judge quarterbacks. When in 1957 Tommy O’Connell won the passing trophy, it became pretty obvious that a rate criterion alone was not enough, and that a cumulative statistical component was necessary as well. YPA alone isn’t a good way to rate quarterbacks.

Original and refactored NFL ratings formulas

Later in the chapter, Carroll et al give the NFL formula as the NFL gives it to others, and then refactor the formula so that analyzing the components is easier to do. The original formula is:

RATE = 100 x [ (Completion % - 30)/20 + (Average_Gain - 3)/4 + TD%/5 + (9.5 - INT%)/4 ] / 6

and after some mathematical gyrations, they break the formula down into the form RATE = A x [ (Completion_term + Yards + TD_term – INT_term)/attempts ] + B

and that formula is (results in the same points, but easier to conceptualize)

RATE = 100/24 * [ (Completions * 20 + yards + Tds * 80 - ints * 100)/attempts] + 50/24
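It is easy to confirm numerically that the two forms agree. The Python sketch below implements both exactly as written here. The official rating also caps each of the four terms, and that step is left out because the formula as quoted does not include it. The stat line is hypothetical and is there only to compare the two forms.

```python
def rate_original(completions, yards, tds, ints, attempts):
    """NFL formula as written above (no caps on the individual terms)."""
    comp_pct = 100.0 * completions / attempts
    avg_gain = yards / attempts
    td_pct = 100.0 * tds / attempts
    int_pct = 100.0 * ints / attempts
    total = ((comp_pct - 30) / 20 + (avg_gain - 3) / 4
             + td_pct / 5 + (9.5 - int_pct) / 4)
    return 100 * total / 6


def rate_refactored(completions, yards, tds, ints, attempts):
    """THGF refactoring: every term expressed as yards per attempt."""
    numerator = completions * 20 + yards + tds * 80 - ints * 100
    return 100 / 24 * (numerator / attempts) + 50 / 24


# Hypothetical stat line, used only to compare the two forms.
line = dict(completions=300, yards=3600, tds=25, ints=12, attempts=500)
print(round(rate_original(**line), 2), round(rate_refactored(**line), 2))
```

The two functions return the same number for any stat line, since the refactoring only multiplies out the percentages and collects the constants into the trailing 50/24.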

Once the easier-to-understand formula is established, they begin their critique in earnest. The critical passage is as follows:

How do you feel about giving a 20 point bonus for each completion? Not sure? Think of this. If one passer throws 2 passes and completes them both for 10 yards each, he’ll have 60 points. Another passer misses his first toss and then hits his second for 40 yards. He also has 60 points. Both passers rate the same even though the second guy moved his team twice as far!

The NFL system favors the high percentage, nickel passer. It always did, but that wasn’t nearly so obvious until lately, when several teams began to use short passes out in the flat as, in effect, running plays. If Joe Montana dumps off to Roger Craig and the play loses 5 yards, Joe still gets 15 points.
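The three examples in the quote follow directly from the bracketed term of the refactored formula. This is a small Python sketch of that bookkeeping; the helper name and the (completed, yards) representation are illustrative choices.

```python
def rating_points(passes):
    """Sum the bracketed numerator of the refactored formula over a list
    of (completed, yards) passes: 20 per completion plus yards gained.
    TDs and interceptions are left out because the examples have none."""
    return sum((20 if completed else 0) + yards for completed, yards in passes)


short_passer = [(True, 10), (True, 10)]   # two completions of 10 yards
long_passer = [(False, 0), (True, 40)]    # one miss, then a 40 yard completion
dump_off = [(True, -5)]                   # completion that loses 5 yards

print(rating_points(short_passer), rating_points(long_passer), rating_points(dump_off))
```

The first two passers both total 60 points and the losing dump-off still earns 15, which is the point of the critique.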

Note that the example in the first paragraph of the quote is stateful. If the example had started at the 20 yard line, then the final state of the short passer would have been a first down on the team’s 40 yard line, while the final state of the “long” passer would have been a first down on the opponent’s 40 yard line. The net expected points (see also here) from the improved field position is higher, so the second scenario should be rewarded more thoroughly. But to get that kind of evaluation requires at the least, play by play stats and to the highest level of detail, video of the game itself.

Finally, Carroll et al give two formulas they regard as superior to the NFL formula:

RATE = ( yards + TD x 10 – int x 45 ) / att

RATE = ( yards – sacks allowed + TD x 10 – int x 45 ) / (att + sacks)
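For anyone who wants to try these on real stat lines, here is a minimal Python sketch of both formulas. Reading “sacks allowed” in the numerator as yards lost to sacks is an assumption, as are the function and parameter names.

```python
def thgf_rate(yards, tds, ints, attempts):
    """First THGF formula: adjusted yards per attempt."""
    return (yards + 10 * tds - 45 * ints) / attempts


def thgf_rate_with_sacks(yards, tds, ints, attempts, sacks, sack_yards):
    """Second THGF formula. 'sacks allowed' in the numerator is read here
    as yards lost to sacks, which is an assumption."""
    return (yards - sack_yards + 10 * tds - 45 * ints) / (attempts + sacks)


# Hypothetical stat line for illustration only.
print(round(thgf_rate(3600, 25, 12, 500), 2))
print(round(thgf_rate_with_sacks(3600, 25, 12, 500, sacks=30, sack_yards=200), 2))
```

Both results are in yards per play, so they can be compared directly with plain yards per attempt.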

We’re not here to analyze these formulas either, but to present them to those who might be looking at ESPN’s QBR and trying to figure out alternatives.

Note: An NFL QBR calculator is here.

This book, by Carroll, Palmer, and Thorn, can be regarded as Deep Stats 1.0, a serious attempt to get past raw numbers and generate a Theory of Everything. Well, football Everything.

For a statistically minded crew, it’s an absolute must read, because they completely destroy the NFL’s passer rating formula. They had thought a lot about the formula, and their critique is penetrating and incisive. It can also be treated as a critique of any goof who stands up and claims that today’s passers are superior because their ratings are better than the players of yesteryear, because, yes, Carroll et al have taken that whole argument and flayed it open on the written page as well.

That it is an older theory can be seen by the units the authors choose to use. They reduce everything to yards. Yards? Any self respecting creator of a theory of Football Everything knows that the unit du jour is wins. This has been true at least since Bill James’s Win Shares, and more so as stats like WARP (i.e. wins above replacement player) have become common. This need to express everything in terms of wins, or better yet, playoff wins, is part of what is fueling the current micro-revolution in football stats (see, for example, this recent Fifth Down Blog article by Brian Burke). We don’t need no steenkin’ points, no yards. How does taking the head off the secondary receiver and separating him from the ball translate into wins, padre? What things does my team need to do to win games, win playoff games, and win championships? That’s what any self respecting data geek wants to know.

Any other issues? I note that they have a rather unique description, in their “how the game evolved” pages, of Earle Neale’s Eagle defense and Steve Owen’s umbrella defense, differing from the descriptions given by Dr Z in Thinking Man’s or Jene Bramel in the Fifth Down blog. And no, I don’t think the Eagle was a 6-2 or that Steve Owen’s “Umbrella” was a 7-diamond. I think Dr Z and Jene are correct and this otherwise fine book wrong.

That said, they go over all aspects of the game, analyze them in terms of yards... yes, they even convert scoring to... yards, and then present their version of football Everything to the reader. It’s actually a fine first attempt, and were it not for the trends of the day, to think and eat and breathe in terms of wins, we might still be rating offenses by how many yards they “score”, and defenses by how many “yards” they prevent.