October 2011


It’s easy to claim that any offense that puts the quarterback 4-5 yards behind the line of scrimmage and has a running orientation must descend from the single wing formation. In the case of the spread option, I don’t know how comfortable I am with this idea. For one, the name spread option suggests a lineage that comes from the spread itself, or the shotgun, which Y.A. Tittle once compared to the short punt formation.

Many teams had put the quarterback in a Short Punt Formation before, but Hickey’s version apparently caught everyone’s fancy. It was an overnight sensation.

That, in a nutshell, is the idea I’m interested in developing: that shotgun + option = spread option, and that signs of single wing descent aren’t as easily established as people claim.

A critical point in thinking about this is how someone like Urban Meyer or Gus Malzahn could have been taught single wing principles in the first place. By the early 1970s, when I first became aware of football, the single wing was a dead offense, rendered functionally obsolete in the 1940s. Fritz Crisler and the invention of platooning notwithstanding, Clark Shaughnessy’s version of the T was just too explosive for the old single wing to survive. By the 1970s, the only formation where the quarterback wasn’t behind center was the shotgun, and the shotgun, in those days, was primarily a passing formation.

Single Wing ca 1945. Line spacing 6 inches, except for wingback and ends.

By contrast, the single wing was a poor downfield passing formation. Linemen were all squished together, perhaps 6 inches apart. A “flexed” end, as Knute Rockne might have put it, was no more than a yard away from his compatriots. Play development was slow, as plays couldn’t begin until the ball actually reached the tailback. The centers of the 1930s hiked the ball with their heads down, looking at the person they hiked it to. This was necessary because they could hike it to any one of three people. Blind hikes, freeing the center to block, weren’t common until the Shaughnessy T. And to quote Dana X. Bible:

Except for the spinner cycle, it does not afford much opportunity for deception.

Now, to note, as the site Hickok Sports points out, there really were 5 formations in common use before the Shaughnessy T came into prominence: the single wing, the double wing, the short punt, the Notre Dame box, and the old T formation (played largely by the Chicago Bears). We’ll show some photos of the double wing and the short punt from Dana X. Bible’s book, followed by a sample of a spread option formation.

double wing formations

Short Punt formation

A modern spread option formation

So of the formations above, which does the modern spread option most resemble? The “A” version of the double wing, by my eyes.

What passing trends are of note between the 1930s and today? A more aerodynamic ball, and the ability to pass anywhere behind the line of scrimmage (rule change, 1933), helped power an ever growing passing explosion into the 1940s. In the 1950s, Paul Brown introduced timing patterns, by carefully watching how Don Hutson played. The late 1950s gave us, via Johnny Unitas and Raymond Berry, the 2 minute drill. The 1960s gave football Sid Gillman and his foray into attacking the whole field. In the 1970s, the Dallas Cowboys revived the shotgun, and one of the elements introduced then was a blind shotgun hike. By the early 1980s, with the more wide open passing games of the San Diego Chargers and later the Washington Redskins, formations (pro I, pro T) that had been almost etched in stone began to evolve. Also in the 1980s, the West Coast Offense emerged, with the ideas of stretching a passing defense horizontally and, further, of passing substituting for running as a ball control weapon. By the late 1990s and into the 2000s, “ace” backfields became more common and the shotgun was used more and more. And as teams pushed for more and more wideouts, to spread the defense and force defenders to cover more and more of the field, the counterbalancing question began to emerge: how do I get more running out of an essentially passing formation?

Consider the running game, from single wing to now. The single wing excelled in power off tackle running, perhaps exemplified by the cutback. Blocking was sustained, double teams by the wingback and tackle forming a crucial part of the game.  Once the Shaughnessy T was introduced, blocks weren’t nearly as enduring. Away from the play, brush back blocks were enough. Because the blocks were fast, and the play started earlier (blind hikes), the game became faster.

The single wing cutback later formed the archetype for the Green Bay sweep. But nuances introduced around this time span include area or do-dad blocking, and the whole notion of running to daylight.

The option itself dates back as far as Don Faurot and the Split T offense he developed for Missouri. With Don’s notion of keying off unblocked defenders, and getting the ball to the man the opposition can’t defend, football now had a running game that resembled a 2 on 1 fast break in basketball. This was only reinforced when the wishbone triple option, created by Emory Bellard, became a dominant offense in the late 1960s and early 1970s. Adding zone run concepts a la Alex Gibbs (check out, for example, John T. Reed’s zone run entry in his dictionary) to unblocked keys leads to the zone read:

The first read of a “zone-read,” it will be recalled, is by the quarterback: he reads the backside defensive end, who typically goes unblocked in a zone-rushing scheme to free up blockers for double-teams on the frontside. If the defensive end sits where he is or rushes upfield, the quarterback simply hands the ball off to the runner. But if he chases the runningback, the quarterback pulls the ball. On the base zone-read, the quarterback just looks for any crease to the backside.

The zone read is the backbone of the spread option, and simply put, the option, much less the blocking patterns of the zone read, didn’t exist back in 1936.
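The read described in that quote is, at bottom, a two-branch decision rule. A minimal sketch, with hypothetical action labels standing in for what is really a read of the end’s first steps:

```python
def zone_read(backside_end_action):
    """Decide the quarterback's action on a base zone read.

    The key is the unblocked backside defensive end: if he sits or
    rushes upfield, hand the ball off; if he chases the running back,
    pull the ball and run backside.  Labels here are illustrative only.
    """
    if backside_end_action in ("sits", "rushes upfield"):
        return "hand off to the running back"
    return "pull the ball and run backside"
```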

Q: If the two offenses don’t come from a common origin, why so many apparent commonalities?

In explanation, consider how biology offers cases of convergent evolution. Though of unrelated origin, the eyes of squids and mammals are structurally very similar, with the interesting exception that in the squid eye the nerves attach behind the retina, while in the mammalian eye the nerve fibers run in front of it. Often, little details tell the story when distinguishing lineages.

Or, as Chris Brown, of Smart Football, has said when examining pretty much this same question:

Certainly, the coaches who developed today’s modern offenses, like Rodriguez and Malzahn, did not spend their time meticulously studying the single-wing tapes of yesteryear. Instead, if there are similarities it’s because those coaches stumbled onto the same ideas through trial and error.

Update: Coach Wyatt has a nice summary of direct snap formations (and some history) at this link

To explain the columns below, Median is a median point spread, and can be used to get a feel for how good a team is without overly weighting a blowout win or blowout loss. HS is Brian Burke’s Homemade Sagarin, as implemented in Maggie Xiong’s PDL::Stats. Pred is the predicted Pythagorean expectation. The exponent for this measure is fitted to the data set itself. SOS, SRS, and MOV are the simple ranking components. MOV is margin of victory, or point spread divided by games played. SOS is strength of schedule. SRS is the simple ranking.
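For readers curious how the simple ranking components relate to one another, here is a toy sketch. The teams and margins are invented, not this week’s data; the point is only that MOV is average point differential, SRS solves rating = MOV + average opponent rating, and SOS is the leftover:

```python
from statistics import median

# Each result is (team, opponent, team's margin of victory).
# Invented scores, purely for illustration.
results = [("A", "B", 7), ("B", "C", 3), ("C", "A", -10)]

games = {}
for team, opp, margin in results:
    games.setdefault(team, []).append((opp, margin))
    games.setdefault(opp, []).append((team, -margin))

# Median point spread and MOV (point spread divided by games played).
med = {t: median(m for _, m in gs) for t, gs in games.items()}
mov = {t: sum(m for _, m in gs) / len(gs) for t, gs in games.items()}

# SRS: rating = MOV + average opponent rating, solved by iteration.
srs = dict(mov)
for _ in range(1000):
    srs = {t: mov[t] + sum(srs[o] for o, _ in gs) / len(gs)
           for t, gs in games.items()}

# SOS is strength of schedule: whatever SRS adds beyond MOV.
sos = {t: srs[t] - mov[t] for t in srs}
```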

At this juncture, Baltimore was upset by Jacksonville 12-7, and consequently, there are changes at the top of the various stats. New Orleans sits atop the HS rankings, while Baltimore still has edges in median point spread and Pythagorean expectation. But I suspect for many the statistical darling of the moment is San Francisco, with solid Pythagoreans, a good record, and the best simple ranking of them all.

This is something I’ve wanted to test ever since I got my hands on play-by-play data, and to be entirely honest, doing this test is the major reason I acquired play-by-play data in the first place. Linearized scoring models are at the heart of the stats revolution sparked by the book The Hidden Game of Football, as its scoring model was a linearized model.

The simplicity of the model they presented, and the ability to derive it from pure reason (as opposed to hard core number crunching), makes me want to name it in some way that denotes the fact: perhaps Standard model or Common model, or Logical model. Yes, scoring the 0 yard line as -2 points and the 100 as 6, with everything in between linearly interpolated between those two values, has to be regarded as a starting point for all sane expected points analysis. Further, because it can be derived logically, it can be used at levels of play that don’t have 1 million fans analyzing everything: high school play, or even JV football.
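That interpolation fits in a single line of code; this is just the model restated, nothing more:

```python
def thgf_expected_points(yardline):
    """The Hidden Game of Football linear model: your own goal line (0)
    is worth -2 points, the opponent's goal line (100) is worth 6, and
    value grows linearly in between -- 0.08 points per yard."""
    return -2.0 + 0.08 * yardline
```

Midfield, for example, comes out worth 2 points under this model.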

From the scoring models people have come up with, we get a series of formulas that are called adjusted yards per attempt formulas. They have various specific forms, but most operate on an assumption that yards can be converted to a potential to score. Gaining yards, and plenty of them, increases scoring potential, and as Brian Burke has pointed out, AYA style stats are directly correlated with winning.

With play-by-play data, converted to expected points models, some questions can now be asked:

1. Over what ranges are expected points curves linear?

2. What assumptions are required to yield linearized curves?

3. Are they linear over the whole range of data, or over just portions of the data?

4. Under what circumstances does the linear assumption break down?

We’ll reintroduce data we described briefly before, but this time we’ll fit the data to curves.

Linear fit is to formula Scoring Potential = -1.79 + 0.0653*yards. Quadratic fit is to formula Scoring Potential = 0.499 + 0.0132*yards + 0.000350*yards^2. These data are "all downs, all distance" data. The only important variable in this context is yard line, because this is the kind of working assumption a linearized model makes.

Fits to curves above. Code used was Maggie Xiong's PDL::Stats.
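The fits above were done with PDL::Stats; for readers who want to reproduce the exercise, numpy’s polyfit does the same job in Python. The data points below are a synthetic stand-in generated from the quadratic fit itself, not the actual play-by-play averages:

```python
import numpy as np

# yards[i] is the yard line, ep[i] the scoring potential observed there.
# Synthetic stand-in data; the real values come from play-by-play averages.
yards = np.arange(5, 100, 5)
ep = 0.499 + 0.0132 * yards + 0.000350 * yards**2

# polyfit returns coefficients highest degree first.
lin = np.polyfit(yards, ep, 1)    # [slope, intercept]
quad = np.polyfit(yards, ep, 2)   # [a2, a1, a0]
```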

One simple question that can change the shape of an expected points curve is this:

How do you score a play using play-by-play data?

I’m not attempting, at this point, to come up with “one true answer” to this question; I’ll just note that the different answers to this question yield different shaped curves.

If the scoring of a play is associated only with the drive on which the play was made, then you get curves like the purple one above. That would mean punting has no negative consequences for the scoring of a play. Curves like this I’ve been calling “raw” formulas, “raw” models. Examples of these kinds of models are Keith Goldner’s Markov chain model, and Bill Connelly’s equivalent points models.

If a punt can yield negative consequences for the scoring of a play, then you get into a class of models I call “response” models, because the whole of the curve of a response model can be thought of as

response = raw(yards) – fraction*raw(100 – yards)

The fraction would be a sum of things like fractional odds of punting, fractional odds of a turnover, fractional odds of a loss on 4th down, etc. And of course in a real model, the single fractional term above is a sum of terms, some of which might not be related to 100 – yards, because that’s not where the ball would end up  – a punt fraction term would be more like fraction(punt)*raw(60 – yards).
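The raw-versus-response distinction is small enough to state in code. A sketch, using the quadratic raw fit from above and a single illustrative fraction; a real model would replace that one fraction with the sum of punt, turnover, and failed 4th down terms:

```python
def raw(yards):
    # Quadratic raw model, coefficients from the fit in the post.
    return 0.499 + 0.0132 * yards + 0.000350 * yards**2

def response(yards, fraction=0.3):
    """Raw value, less a fraction of what the opposition would get at
    the mirrored field position.  The single `fraction` is an arbitrary
    illustrative value standing in for a sum of terms."""
    return raw(yards) - fraction * raw(100 - yards)
```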

Raw models tend to be quadratic in character. I say this because Keith Goldner fitted first and 10 data to a quadratic here. Bill Connelly’s data appear quadratic to the eye. And the raw data set above fits nicely to a quadratic throughout most of the range.

And I say mostly because the data above appear sharper than quadratic close to the goal line, as if there is “more than quadratic” curvature inside the 10 yard line. And at the risk of fitting to randomness, I think another justifiable question to look at is how scoring changes the closer to the goal line a team gets.

That sharp upward kink plays into how the shape of response models behaves. We’ll refactor the equation above to get at, qualitatively, what I’m talking about. We’re going to add a constant factor to the last term in the response equation, because people will calculate the response differently.

response = raw(yards) – fraction*constant*raw(100 – yards)

Now, in this form, we can talk about the shape of curves as a function of the magnitude of “constant”. As the constant grows larger, the back end of the curve takes on more of the character of the last 10 yards. A small constant yields a curve less than quadratic but more than linear; a mid sized constant yields a linearized curve; a potent response function yields curves more like those of David Romer or Brian Burke, with more than linear components within 10 yards of both ends of the field. Understand, this is a qualitative description. I have no clue as to the specifics of how they actually did their calculations.

I conclude, though, that linearized models are specific to response function depictions of equivalent point curves, because you can’t get a linearized model any other way.

So what is our best guess at the “most accurate” adjusted yards per attempt formula?

In my data above, fitting a response model to a line yields an equation. Turning the values of that fit into an equation of the form:

AYA = (yards + α*TDs – β*Ints)/Attempts

Takes a little algebra. To begin, you have to make a decision on how valuable your touchdown is going to be. Some people use 7.0 points, others use 6.4 or 6.3 points. If TD = 6.4 points, then

delta points = 6.4 + 1.79 – 6.53 = 1.79 – 0.13 = 1.66 points

α = 1.66 points / 0.0653 = 25.4 yards

turnover value = (6.53 – 1.79) + (-1.79) = 6.53 – 2*1.79 = 2.95 points

β = 2.95 / 0.0653 = 45.2 yards

If TD = 7.0 points, you end up with α = 34.6 yards instead.
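The algebra can be checked mechanically from the fit coefficients; this sketch just replays the derivation above, starting from the linear fit in the post:

```python
# Derive AYA coefficients from the linear expected points fit:
#   EP(y) = a + b*y, with a = -1.79 and b = 0.0653 from the post.
a, b = -1.79, 0.0653
td_value = 6.4                 # pick 6.4, 6.3, or 7.0 to taste

ep_100 = a + 100 * b           # expected points at the goal line
delta = td_value - ep_100      # extra credit for actually scoring
alpha = delta / b              # TD bonus converted into yards

# For a linear model a turnover costs EP(y) + EP(100 - y) = 2a + 100b,
# a constant independent of field position.
turnover_value = 2 * a + 100 * b
beta = turnover_value / b      # interception penalty in yards
```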

It’s interesting that this fit yields a value for an interception (in yards) almost identical to the original THGF formula. Touchdowns come out closer in value to the NFL passer rating than to THGF’s new passer rating. And although I’m critical of Chase Stuart’s derivation of the value of 20 for PFR’s AYA formula, the adjustment they made does seem to be in the right direction.

So where does the model break down?

Inside the 10 yard line. It doesn’t accurately depict the game as it gets close to the goal line. It’s also not down and distance specific in the way a more sophisticated equivalent points model can be. A stat like expected points added gets much closer to the value of an individual play than does an AYA style stat. In terms of a play’s effect on winning, you then need win stats, such as Brian’s WPA or ESPN’s QBR, to break things down (though I haven’t seen ESPN give us the QBR of a play just yet, which WPA can do).

Update: corrected turnover value.

Update 9/24/11: In the comments to this link, Brian Burke describes how he and David Romer score plays (states).

To repeat the nature of these stats, Median is a median point spread, and can be used to get a feel for how good a team is without overly weighting a blowout win or blowout loss. HS is Brian Burke’s Homemade Sagarin, as implemented in Maggie Xiong’s PDL::Stats. Pred is the predicted Pythagorean expectation. The exponent for this measure is fitted to the data set itself. SOS, SRS, and MOV are the simple ranking components. MOV is margin of victory, or point spread divided by games played. SOS is strength of schedule. SRS is the simple ranking.

Baltimore, by many measures, is now the #1 team in all of football, despite Green Bay’s 6-0 record. I’d tend to favor Green Bay in any matchup, because Aaron Rodgers has been playing in an exceptional groove since about mid 2010. He evokes statements from QB savvy commenters like Trent Dilfer about how phenomenal his ball placement is.

In contrast to Baltimore, look at Kansas City. By many measures this team is one of the worst in football. But they are at 2-3, a better record than Philadelphia, and I suspect the stats of the day aren’t particularly good at handling teams that start poorly but improve. Stats are tools, no more, and as anyone who has ever read, say, “The Boy Who Harnessed the Wind” can tell you, one man’s junk is another man’s wind powered electric generator.

Or, as Dallas Operator 7G once said, it’s not the wand, it’s the wizard.

Possession of a ball in a ball game is a binary act. You either have it or you don’t. That means that the total value of stats associated with possession is also binary. This is true regardless of whether the sport splits the value of a turnover in two or not, and notions of shared blame can cause issues when thinking about football. Football isn’t like other sports. Some of its “turnovers”, the punt especially, aren’t as easily quantifiable in the terms of other sports.

As an example of shared blame, we’ll take on the turnover in basketball. The potential value of a shot in the NBA is one point. This is easy to see, because a shot is worth 2 points and a typical NBA shooting percentage is about 50 percent (or, for a 3 point shot, a percentage around 33%). That said, the value of the possession is two points, and the total value of the turnover is also two points.

Wait a minute, you say. The STL stat is generally only valued at 1 point. How can it be two? Well, there are two stats associated with a turnover in basketball. There is the TO stat, and the STL stat. And in metrics like the NBA Efficiency metric, each of  these stats is valued at a point. TO + STL = total value of 2 points. The turnover in basketball is worth 2 points, and thus the possession is worth two points. The sum gets hidden because half of it is credited to the thief, and half is debited from the one who lost the ball.
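Stated as accounting, the split looks like this; the point values follow the NBA Efficiency convention described above:

```python
# NBA Efficiency-style accounting of a live-ball turnover.  The ball
# handler is debited a point (the TO stat) and the thief credited a
# point (the STL stat); together the event moves two points of value,
# the value of the possession.
TO_VALUE = -1          # debit to the player losing the ball
STL_VALUE = +1         # credit to the player stealing it
possession_value = 2   # ~50% shooting at 2 points per shot

swing = STL_VALUE - TO_VALUE   # total value changing hands: 2 points
```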

The value of the turnover is the difference in value between the curves.

The classic description of the turnover in football derives from  the Hidden Game of Football, and because their equivalent points metric is linear and independent of down and to go measures, the resultant value for the turnover is a constant. This isn’t easy to see in traditional visual depictions, but becomes easy to see when you flip the opposition values upside down.

See how the relative distance between the lines never changes? By the way, you can do the same thing for basketball, though the graph is a bit on the trivial side.

This curve probably should have some distance dependence, actually.

These twin plots are a valuable way to think about the game, turnovers, and for that matter, the game of football as a series of transitions between states. For now, by way of example, we’ll use the raw NEP data I calculated for my “states” post. We’ll plot an opposition set of data upside down and show what a state transition walk might look like using these data.

The game of football can be described as a "walk" along a pair of EP curves.

Not that complicated, is it? You could visualize these data two ways: as a kind of “YouTube video”, where the specific value for the game changes as plays are executed and the view remains 2D, or as a 3D stack of planes, each with one graph, each plane representing the game at a single play in the game.

Even in football, though, you could attempt to split the blame for the turnover into two parts: there is the person who loses the ball, and the person who recovers it. So the value of the state transition from one team to the next could be split in two, a la basketball, credit given to the recovering side and a debit taken from the side losing the ball.

So what about  the punt? It has no equivalent in basketball or baseball, and in general, looks just like a single state transition.

The punt, in this depiction, is a single indivisible state transition from one team to the other.

It’s a single whole, and therefore, you can get yourself into logical conundrums when you attempt to split the value of the punt in two.

This whole discussion, by the way, is something of an explanation for Benjamin Morris and the folks who saw his live blog on October 9, 2011. It wasn’t easy getting this point across using the graphics on his site. My point is more fully developed above, and why I was saying the things I did is more evident from the graphics above.

Ben, btw, is an awesome analytics blogger. Please don’t take this discussion as any kind of indictment of his work, which is of a very high quality.

This is the week 5 edition of my NFL stats. Median is the median point spread, HS is Brian Burke’s Homemade Sagarin, as implemented through Maggie Xiong’s PDL::Stats module. Pct is the calculated win percentage, Pred is the Pythagorean expectation for the various teams. SRS, MOV, and SOS are the simple ranking statistics, as calculated by this module.

To see the potential value of a median point spread, please compare the median point spread of San Francisco to the MOV and Pythagorean expectations. It is relatively insensitive to one or two blowouts. Green Bay, by contrast, has a median point spread and a MOV far more closely aligned. Kansas City, notably, has a median point spread much better than the team MOV.

The formal phrase is “finite state automaton”, which is imposing and mathy and often too painful to contemplate, until you realize what kinds of things are actually state machines [1].

Tic-Tac-Toe is a state machine. The diagram above, from Wikimedia, shows the partial solution tree to the game.

Tic-tac-toe is a state machine. You have 9 positions on a board, each of which can be empty or hold an X or an O, marks that are placed on the board by a defined set of rules, and you have a defined outcome from those rules.

Checkers is also a state machine.

Checkers (draughts) is a state machine. You have 32 playable positions on a 64-square board, pieces that move through the positions via a set of defined rules, with a defined outcome from those rules.

Chess is a state machine.

Chess is a state machine. You have 64 positions on a board, pieces that move through the positions via a set of defined rules, with a defined outcome from those rules.

If you can comprehend checkers, or even tic-tac-toe, then you can understand state machines.

To treat football as a state machine, start with the idea that football is a function of field position. There are 100 yards on the field, so 100 positions to begin with. Those positions have states (1st and 10, 2nd and 3, etc), there are plays that lead to a transition from position to position and state to state, there is a method of scoring, and there is a defined outcome that results from position, states, plays, scoring and the rules of the game of football.
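A minimal encoding of that state space might look like the sketch below. The names and the transition rule are hypothetical and deliberately simplified: no scores, turnovers, or penalties, just downs, distance, and field position:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    """One football state: field position plus down and distance."""
    yardline: int   # yards from your own goal line
    down: int       # 1..4
    to_go: int      # yards needed for a first down

def advance(state, gain):
    """Transition on an ordinary play.  Scores, turnovers, and
    penalties are omitted -- a deliberately simplified rule set."""
    yardline = state.yardline + gain
    if gain >= state.to_go:                  # moved the chains
        return State(yardline, 1, 10)
    if state.down == 4:                      # turnover on downs:
        return State(100 - yardline, 1, 10)  # flip the field for the opponent
    return State(yardline, state.down + 1, state.to_go - gain)
```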

A lot of the analytical progress that has been made over the past several years comes from taking play by play data, breaking it down into things like games, drives, scoring, and so forth, compiling that info into a state (i.e. down and distance) database, and then asking questions of that database of interest to the analyst.

You can analyze data in a time dependent or a time independent manner. Time dependence is important if you want to analyze for things like win probability. If you’re just interested in expected points models (i.e. the odds of scoring from any particular point on the field), a time independent approach is probably good enough (that’s sometimes referred to as the “perpetual first quarter assumption”).

Net expected point models, all downs included. The purple curve does not account for response opposition drives, the yellow one does. The yellow curve was used to derive turnover values.

Take, for example, Keith Goldner’s Markov chain model. As explained here, a Markov chain is a kind of state machine. The same kinds of ideas that are embedded in simple state machines (such as tic-tac-toe) also power more sophisticated approaches such as this one.
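The Markov chain machinery itself is compact. This toy collapses the field into three transient states plus two absorbing outcomes, with invented transition probabilities, purely to show the shape of the computation; Goldner’s actual model uses a much finer down-and-distance state space:

```python
import numpy as np

# States: own territory, midfield, red zone (transient), plus two
# absorbing outcomes: touchdown (worth 7) and possession lost (0).
# Transition probabilities are invented for illustration.
#              own   mid   red    TD   lost
P = np.array([[0.30, 0.35, 0.05, 0.02, 0.28],   # from own territory
              [0.10, 0.30, 0.30, 0.05, 0.25],   # from midfield
              [0.02, 0.08, 0.40, 0.35, 0.15]])  # from the red zone

Q, R = P[:, :3], P[:, 3:]             # transient vs absorbing blocks
N = np.linalg.inv(np.eye(3) - Q)      # fundamental matrix
absorb = N @ R                        # probability of each final outcome
exp_points = absorb @ np.array([7.0, 0.0])   # expected points per state
```

Each row of `absorb` sums to one (the drive always ends somehow), and expected points rise as the starting state moves downfield.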

Once a set of states is defined, a game becomes a path through all the states that occur during the course of the game, meaning an analyst can also bring graph theory (see here for an interesting tutorial) into the picture. Again, it’s another tool, one that brings its own set of insights into the analysis.

[1] More accurately, we’re going to be looking at the subset of finite state automata (related to cellular automata) that can be represented as 1 or 2 dimensional grids.  In this context, football can be mapped into a 1 dimensional geometry where the dimension of interest is position on the football field.

Notes: The checkers board is a screen capture of a game played here. The chess game above is Nigel Short-Jan Timman Tilburg 1991, and the game diagram (along with some nice game analysis) comes from the blog Chess Tales.
