June 2011

I can’t say for certain if the 1991 Super Bowl (highlights here, DVD here) contains the oldest nickel front in the world, as there is a side of me that thinks the Miami 4-3 is a thinly disguised 2-3-6 – think about it in terms of what kinds of players are placed where, as opposed to what the positions are called. Isn’t a Miami 4-3 equivalent to this:

And not all that far removed from this:

Just sayin’.

In “The Education of a Coach”, David Halberstam’s book about Bill Belichick and a decent read, Halberstam goes into great detail about the base nickel front that Belichick used in the 1991 Super Bowl. And yes, isn’t this, the first offensive play of the Bowl, an argument that Belichick is your nickel front daddy?

I say, who is your nickel front daddy?

Halberstam says this defense was, in modern terms, a 2-3 dime. Of course,  with Lawrence Taylor as the rush linebacker, it was a rather stout 2-3.

Miami 4-3 notes:

  • This thread from Football Futures, I think, is one of the better reads on the Miami 4-3.
  • Coach Hoover: Miami 4-3 versus the flexbone.
  • Coach Huey: Miami 4-3 compared to the K State 4-3.
  • Fifth Down Blog on the 4-3 (including the Miami). The whole guide summarized here.
  • Linebackers in the Miami 4-3.

I ran into it via Google somehow, while searching for ideas on the cost of an offense, then ran into it again, in a much more digestible form, through Benjamin Morris’s blog. Brian Burke has at least 4 articles on the Massey-Thaler study (here, here, here and, most importantly, here). Incidentally, the PDF of Massey-Thaler is available through Google Docs.

The surplus value chart of Massey-Thaler

Pro Football Reference talks about Massey-Thaler here, among other places. LiveBall Sports, a new blog I’ve found, talks about it here. So  this idea, that you can gain net relative value by trading down, certainly has been discussed and poked and prodded for some time. What I’m going to suggest is that my results on winning and draft picks are entirely consistent with the Massey-Thaler paper. Total draft picks correlate with winning. First round draft picks do not.

One of the  points of the Massey-Thaler paper is that psychological factors play in the evaluation of first round picks, that behavioral economics are heavily in play. To quote:

We find that top draft picks are overvalued in a manner that is inconsistent with rational expectations and efficient markets and consistent with psychological research.

I tend to think that’s true. It’s also an open question just how well draft assessment ever gets at career performance (or even whether it should). If draft evaluation is really only a measure of athleticism and not long term performance, isn’t that simply encasing in steel the Moneyball error? Because, ultimately, BPA (best player available) only works the way its advocates claim if the things that draft analysts measure are proportional enough to performance to disambiguate candidates.

To touch on some of the psychological factors, and for now, just to show, in some fashion, the degree of error in picking choices, we’ll look at the approximate  value of the first pick from 1996 to 2006 and then the approximate value of possible alternatives. To note, a version of this study has already been done by Rick Reilly, in his “redraft” article.

Year | #1 pick | AV | Alternatives (draft slot) | Alternative AVs
1996 | Keyshawn Johnson | 74 | #26 Ray Lewis | 150
1997 | Orlando Pace | 101 | #66 Ronde Barber, #73 Jason Taylor | 114, 116
1998 | Peyton Manning | 156 | #24 Randy Moss | 122
1999 | Tim Couch | 30 | #4 Edgerrin James | 114
2000 | Courtney Brown | 28 | #199 Tom Brady | 116
2001 | Michael Vick | 74 | #5 LaDainian Tomlinson, #30 Reggie Wayne, #33 Drew Brees | 124, 103, 103
2002 | David Carr | 44 | #2 Julius Peppers, #26 Ed Reed | 95, 92
2003 | Carson Palmer | 69 | Antonio Gates (undrafted), #9 Kevin Williams | 88, 84
2004 | Eli Manning | 64 | #126 Jared Allen, #4 Philip Rivers, #11 Ben Roethlisberger | 75, 74, 72
2005 | Alex Smith | 21 | #11 DeMarcus Ware | 66
2006 | Mario Williams | 39 | #60 Maurice Jones-Drew, #12 Haloti Ngata | 60, 55
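
For anyone who wants to poke at the table, here is a minimal sketch that tallies it. The AV numbers are copied straight from the rows above; the "shortfall" measure (best listed alternative minus the first pick) and the variable names are my own illustration, not anything from Pro Football Reference.

```python
# (year, first overall pick, first-pick AV, best alternative AV listed in the table)
picks = [
    (1996, "Keyshawn Johnson", 74, 150),
    (1997, "Orlando Pace", 101, 116),
    (1998, "Peyton Manning", 156, 122),
    (1999, "Tim Couch", 30, 114),
    (2000, "Courtney Brown", 28, 116),
    (2001, "Michael Vick", 74, 124),
    (2002, "David Carr", 44, 95),
    (2003, "Carson Palmer", 69, 88),
    (2004, "Eli Manning", 64, 75),
    (2005, "Alex Smith", 21, 66),
    (2006, "Mario Williams", 39, 60),
]

# Shortfall: how much AV the best listed alternative has over the first pick.
# A negative number means the first pick beat every alternative in the table.
gaps = {year: best_alt - av for year, _, av, best_alt in picks}

beat_all = [year for year, gap in gaps.items() if gap < 0]
over_100 = [name for _, name, av, _ in picks if av >= 100]
mean_gap = sum(gaps.values()) / len(gaps)

print("Years the #1 pick beat every listed alternative:", beat_all)
print("First picks with 100 AV or more:", over_100)
print("Average shortfall in AV:", round(mean_gap, 1))
```

The alternatives were hand picked with hindsight, so the average shortfall is a rough indicator of how much was left on the board, not a fair estimate of scouting error.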

If drafting were accurate, then the first pick should be the easiest. The first team to pick has the most choice, the most information, the most scrutinized set of candidates. This team has literally everything at its disposal. So why aren’t the first overall picks better performers? Why is it that across the 11 year period depicted, there are only 2 sure fire Hall of Famers (100 AV or more) and only 1 pick that was better than any alternative? Why?

My answer is (in part) that certain kinds of picks, QBs, are prized as a #1 (check out the Benjamin Morris link above for why), and that the QB is the hardest position to accurately draft. Further, though teams know and understand that intangibles exist, they’re not reliably good at tapping into them. Finally, drafting in any position in the NFL, not just number 1, has a high degree of inaccuracy (here and here).

In the case of Tom Brady, the factors are well discussed here. I’d suggest that decoy effects, as described by Dan Ariely in his book Predictably Irrational (p 15, pp 21-22), affected both Tom Brady (comparisons to Drew Henson) and Drew Brees (compared to Vick). Further, Vick was so valued the year he was drafted that he surely affected the draft position of Quincy Carter and perhaps Marques Tuiasosopo (i.e. a coattail effect). If I were to estimate the coattail effect for Quincy Carter, it would be about two rounds of draft value.

How to improve the process? Better data and deeper analysis help. There are studies that suggest, for example, that the completion percentage of college quarterbacks is a major predictor of professional success. As analysts dig into factors that more reliably predict future careers, modern in-depth statistics will aid scouting.

Still, 10 year self studies of draft patterns are beyond the ken of NFL management teams with 3-5 year plans that must succeed. Feedback to scouting departments is going to have to cycle back much faster than that. For quality control of draft decisions, some metric other than career performance has to be used. Otherwise, a player like Greg Cook would have to be treated as a draft bust.

At some point, the success or failure of a player is no longer in the scout’s hands, but in those of the coaches, and the Fates. Therefore, a scout can only be asked to deliver the kind of player his affiliated coaches are asking for and defining as a model player. It’s in the ever-refined definition of this model (and how real players can fit this abstract specification) that progress will be made.

Now to note, that’s a kind of progress that’s not accessible from outside the NFL team.  Fans consistently value draft picks via the tools at hand – career performance – because that’s what they have. In so doing, they confuse draft value with player development and don’t reliably factor the quality of coaching and management out of the process. And while the entanglement issue is a difficult one in the case of quarterbacks and wide receivers, it’s probably impossible to separate scouting and coaching and sheer player management skills with the kinds of data Joe Fan can gain access to.

So, if scouting isn’t looking directly at career performance, yet BPA advocates treat it as if it is, what does it mean for BPA advocates? It means that most common discussions of BPA theory incorporate a model of value that scouts can’t measure. Therefore, expectations don’t match what scouts can actually deliver.

I’ve generally taken the view that BPA is most useful when it’s most obvious. In subtle cases of near equal value propositions, the value of BPA is lost in the variance of draft evaluation. If that reduces to “use BPA when it’s as plain as the nose on your face, otherwise draft for need”, then yes, that’s what I’m suggesting. Empirical evidence, such as the words of Bobby Beathard, suggests that’s how scouting departments do it anyway. Coded NFL draft simulations explicitly work in that fashion.

Update 6/17: minor rewrite for clarity.

When trying to value drafts, we tend to think only in one direction: how to get as much talent as possible for the kinds of draft picks we have. There is another kind of optimization that often goes under the radar, and that is having the coaching talent and foresight to construct a winning offense that doesn’t require extreme athletes. If, for example, you can get the same caliber running game out of 4th round draft choices as other coaches would with a mid first round choice, you’ve lowered the cost of the offense (Mike Shanahan and his zone blocking-cutback running scheme). If you can get high quality play out of quarterbacks with modest physical skills by making their reads simpler and jobs easier, you’ve lowered the cost of your quarterbacks (the West Coast offense). If you look for smaller players with plenty of speed, drafting linebackers from strong safety-linebacker tweeners and putting linebackers at defensive end and defensive ends at defensive tackle, you markedly increase your team speed. Further, because you’ve fruitfully used so many tweeners, you’ve cut the cost of your defense (Miami 4-3, notes here and here and here. Coach Hoover talks about it here, defending the flexbone, and I’m pretty sure the Penn State defense, described here, is a derivative of this defense as well).

You can probably formalize the cost of an offense (or defense) by treating the draft as a market and assigning the players on a team their draft value, either by methods we touched on here, or a fit to a Weibull distribution, as shown in figure 1 of this manuscript, or by analogy using AdamJT13’s chart here. To note, the cost of a free agent in this context is zero, since no draft choice was spent purchasing them. I don’t claim ideas like these are original to me. On the site LiveBall Sports – very nice multisport site with a nice analytics bent – they  have a 2 part series (NFC and AFC) evaluating the usage of free agents, and the language of the author, Greg Trippiedi, makes it clear he’s thinking in terms of draft value. How valuable are these no-cost free agents? Please recall that in this article, we quote Bobby Beathard as saying the first Super Bowl team under his watch with the Redskins  had 26 free agents on the roster. But it also had excellent coaches, who could turn sow’s ears into.. well.. Hawgs.
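
To make the bookkeeping concrete, here is a minimal sketch of what "cost of a unit" could look like. Everything in it is an assumption for illustration: the Weibull-style decay curve stands in for whatever draft value chart you prefer, and the SCALE and SHAPE numbers are placeholders I made up, not parameters fitted to the manuscript or to AdamJT13’s chart. The one rule taken from the text is that a free agent costs zero.

```python
import math

# Placeholder curve parameters (hypothetical, not fitted to any real chart).
SCALE = 60.0
SHAPE = 0.8

def pick_value(overall_pick):
    """Relative value of a draft slot: 1.0 for the first pick, decaying after.

    Pass None for an undrafted free agent, who cost no draft pick at all.
    """
    if overall_pick is None:
        return 0.0
    return math.exp(-(((overall_pick - 1) / SCALE) ** SHAPE))

def unit_cost(picks):
    """Total draft value spent on a group of players (list of overall picks)."""
    return sum(pick_value(p) for p in picks)

# Two hypothetical backfields: one built on a mid first round back,
# one built on a 4th rounder plus an undrafted free agent.
expensive = unit_cost([15])
cheap = unit_cost([110, None])
print(f"first round backfield cost: {expensive:.3f}")
print(f"late round backfield cost:  {cheap:.3f}")
```

If the two backfields produce the same running game, the difference between the two totals is the draft value the coaching has saved.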

Since a player  that makes a roster is occupying a slot that others could also occupy, I suspect a true valuation of the cost of a player would also have to include development time. If it takes 5 years for a player to become a starter (or major rotation player), there is the cost of his draft choice and the time cost of his development. Both need to be assessed in terms of his cost. A player that never starts, never plays and occupies space becomes a dead weight cost.

One final issue. Dynasties can’t be constructed with expensive players. Think about it. Dynasties don’t have particularly good draft position. Winning in the early years guarantees that. The average player lasts about four years. So in general, they will have a few elite players with long careers and a large corps of pretty good, inexpensive players. If costs of the team model can’t be lowered adequately, sustained winning can’t be achieved. Replacement players will simply come at an unsustainable cost.

“Adapt”, by Tim Harford, is a book focused on teaching how a corporation can survive “extinction events”, and what kinds of skills create sound and adaptive companies. It’s a worthwhile read on that basis alone, but what kinds of lessons can this book have for, say, the average high school football coach, and perhaps as well, for decision making in the NFL draft?

Chapter One lays the premise of the book. Tim Harford talks about a toaster,  talks about how incredibly difficult it is to make a toaster from scratch, and uses that as an example of how sophisticated the modern world is (think of a modern defense, or football offense, and how specialized these have become), and how interdependent the parts are. He then makes the parallel between corporations, their lifespans, and how even very successful corporations have disappeared over time, and biological evolution. The metaphor, though, is simply a framework for some provocative case studies.

Chapter Two gets into a lot of meaty detail. He talks about the failures of the Iraq War and of Vietnam as well. He points out organizational parallels in both circumstances. I’ll note, on my own, that both Donald Rumsfeld and Robert McNamara were exceptionally smart and capable individuals. But when circumstances changed, they were both unwilling to take input that didn’t confirm their current viewpoints.

A point that Tim Harford hammers home again and again is that many heads are always better than one. Evidently, this is something demonstrable in a formal problem solving study. From page 49 of Tim’s book:

An alternative perspective on the value of an alternative perspective comes from the complexity theorists Lu Hong and Scott Page. Their decision-makers are simple automatons inside a computer, undaunted by social pressure. Yet when Hong and Page run simulations in which their silicon agents are programmed to search for solutions, they find that  the very smartest agents aren’t as successful as a more diverse group of dumber agents. Even though ‘different’ often means ‘wrong’, trying something different has a  value all of its own… Both because of  the conformity effect Asch discovered, and because of the basic usefulness of hearing more ideas, better decisions emerge from a diverse group.

This speaks to the idea of a leader being a good listener as well as someone who knows and inspires. Create a team. Make sure the team is diverse in terms of  its thinking, make sure everyone has a voice. Listen, because you don’t know whose solution will end up succeeding.

Chapter Three gets into the value of the unexpected solution. To me this provides the orthodox reason for mining the later draft choices. Tim Harford gives the example of the Supermarine Spitfire. When the British government decided to fund the development of the Spitfire, the orthodox military theory (see here and here) of the time said that pursuit aircraft, as fighters were then called, were useless. Airplanes such as the Martin B-10 and Boeing B-17 were as fast as pursuit planes and bristled with guns. Waves of hundreds or thousands of these aircraft would obliterate cities and make conventional war obsolete. Things such as Billy Mitchell’s sinking of the Ostfriesland had so captured the imagination of military theorists they couldn’t conceive of a world where bombers could be stopped. But the government, hedging its bets, spent some money on the Spitfire anyway. Later, they were grateful they did.

This point, translated into draft theory, would go something like this: the net value of finding Pro Bowl or starter talent in the later rounds is almost incalculable. So that’s why you look, that’s why you do it. Further, you need to look for players that could start. Drafting players as perpetual backups isn’t  the point of the later rounds.

Tim goes on to develop  the theme of the affordable risk, to talk about the value of decoupling risk factors (a lot of interesting studies of failed oil rigs here). There  is a lot of meat for those in the business world, in the military (the organizational notes are exceptionally worthwhile), and as a foundation for the value of late round draft choices, one for which I’m personally grateful.

Read it sometime. You won’t regret it.

In the first part of this article, we talked about the context in which Don Hutson played, posted some stats, and then said we would “translate” his stats into modern terms. We’re going to do this by calculating his percentage of the team’s catches and yardage per year, along with his TDs per catch, and then “implant” those into the statistics of the average team of 1995, the average team of 1999, the average team of 2010, the 1995 Dallas Cowboys, and two Green Bay teams, the one of 1995 and the Super Bowl winner of 2010.
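
As a rough sketch of that translation, assuming I’ve described the steps fairly: take the receiver’s share of his own team’s catches and yards, keep his touchdowns per catch, and apply those to the target team’s totals. The function names are mine, and the numbers in the example are made-up round figures chosen to keep the arithmetic easy, not Hutson’s real totals or any real team’s.

```python
def season_shares(catches, yards, tds, team_catches, team_yards):
    """Express a receiver's season as shares of his own team's passing game."""
    return {
        "catch_share": catches / team_catches,
        "yard_share": yards / team_yards,
        "td_per_catch": tds / catches,
    }

def implant(shares, target_team_catches, target_team_yards):
    """Apply those shares to another team's season totals."""
    catches = shares["catch_share"] * target_team_catches
    yards = shares["yard_share"] * target_team_yards
    tds = shares["td_per_catch"] * catches
    return round(catches), round(yards), round(tds)

# Hypothetical inputs: 40 catches, 600 yards, 8 TDs on a team
# that completed 100 passes for 1500 yards.
shares = season_shares(40, 600, 8, team_catches=100, team_yards=1500)

# Hypothetical modern team: 300 completions for 3600 yards.
translated = implant(shares, target_team_catches=300, target_team_yards=3600)
print(translated)
```

Because yards come from the target team’s totals, the translated yards per catch inherit the modern team’s shorter passing game, which is the shortening effect discussed further down.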

We’re using the average stat initially to make a point, which is that Don Hutson’s average year translates into a better year than most modern receivers’ best years. This is especially true of his prodigious scoring rate. Deal is, he did play in a pre-modern era where

  • Coaches didn’t throw much behind the 50 yard line.
  • The Packers threw to score.
  • Don Hutson was used as a scoring machine.

To factor out some of these effects, we created a set of modified stats for Don where

  • We reduced the number of catches by 20%. Some possession catches would be given to tight ends, backs, and #2 receivers if Don were to play a modern game.
  • Consequently, we increased his yardage by 20%, since people would be throwing longer passes to Don.
  • On top of the scoring loss caused by the decreased catches, we then reduced his scoring by another 20%, to account for more distributed passing and better defenses in the modern era.
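
The three correctives above can be sketched as follows. This is one reading of them, not a definitive recipe: I’m assuming touchdowns fall in proportion to the lost catches before the extra 20% cut is applied, which puts scoring at 0.8 x 0.8 = 0.64 of the original. The function name and the sample numbers are hypothetical.

```python
def modernize(catches, yards, tds, catch_cut=0.20, yard_boost=0.20, td_cut=0.20):
    """Apply the ad hoc correctives: fewer catches, more yards, fewer scores."""
    adj_catches = catches * (1 - catch_cut)
    adj_yards = yards * (1 + yard_boost)
    # Assumption: scoring first shrinks with the lost catches,
    # then takes the additional cut on top of that.
    adj_tds = tds * (1 - catch_cut) * (1 - td_cut)
    return adj_catches, adj_yards, adj_tds

# Made-up translated season: 100 catches, 1000 yards, 20 TDs.
adj_catches, adj_yards, adj_tds = modernize(100, 1000, 20)
print(round(adj_catches), round(adj_yards), round(adj_tds, 1))
```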

These are ad hoc correctives. Don’t assume I’ve justified these on statistical grounds. Nonetheless, the resulting stats look pretty real, for a typical receiver’s best year of all time.

In this context, and shorn of the crazy throwing rate of 1942, Don Hutson’s best season (also calculated in multiple offensive contexts) doesn’t look all that much better than Don Hutson’s typical season. His best season was partly a product of the team’s extraordinary emphasis on passing that year.

Finally, if you’ll compare Don in the passing context of, say, the 1995 Green Bay Packers to that of, oh, the average team of the 1999 season (ironically the season the 1999 St Louis Rams, The Greatest Show on Turf, won the Super Bowl), then the value of playing for a team with a high powered offense is clear. Jerry Rice clearly benefited from being in the #1 offenses of the San Francisco 49ers.

Using these same techniques and translating every season of Don Hutson’s career into modern terms yields the results above. The  shortening effect of using team stats (team YPC over the years has grown shorter, as passing became possession oriented) and the tendency to use Don as a scorer creates a year, 1935, whose stats aren’t as reasonable as Don’s average stats. To some extent,  you can’t take the 1935 out of 1935 stats and fit them into a 1995 or 2010 context.

Despite any flaws, I’d suggest the above approaches are far better than the typical translation, which multiplies Don Hutson’s 1942 season by 1.6 and then assumes it has accounted for all the differences between 1942 and 2010. It hasn’t. All it does is do one of the greatest touchdown scoring receivers of all time a serious injustice.

Finally, I  think these results suggest that GOAT at receiver is a two man race. While I’d concede that anyone who looks at the length of Jerry Rice’s career and says, “This guy can’t be beat” has a point, it’s my contention that Don Hutson’s performances, especially in the 1940s, are so exceptional relative to his competition that they will be very hard to match.

Getting across to a typical modern football fan how freakish Don Hutson was in his day is difficult. They’ve been told since Day 1 that Jerry Rice is unquestionably the best receiver of all time, and so their brain cells turn off and they don’t question the notion. And yes, in at least one respect, Jerry was the best of all time, in the sense that no one had as long a productive career. The idea that someone could play at such a high level for 18 of his 20 years at a position that demands athletic excellence is the foundation of the respect that the man has gathered.

However, in any discussion of the best of the best at WR, Don Hutson (see also here and here) has to be in the mix. Back when wide receivers were lucky to get 1 pass a game, he was catching 3 and 4. Back when scoring was difficult, he led the league in scoring 8 times. His YPC is decent but hardly extraordinary. What Don Hutson was, was a ball catching freak and a scoring freak.

It’s not entirely noticeable in the stats of the day, compared to modern football, because modern football is a more pass oriented game. It has specialists, guys who play one way, instead of two ways, and in particular, someone who specializes in just throwing the ball. It has a more aerodynamic football (see here and here) than the one those guys used to toss (check out Bill Belichick talking about Sammy Baugh, roughly a contemporary of Hutson’s, in NFL Network’s top 100). Passing was just primitive: the league completion percentage was 33.9% the year Don Hutson entered the league. When he left, it had risen to about 45.6%.

Because passing was primitive, the strategies of the day were not to pass until you reached the 40 yard line. Inside the 20, teams would run perhaps one play and then punt.

But in those days, and by the standards of the times, Green Bay was a passing offense. They featured Johnny McNally, a gifted tailback and receiver who scored 11 touchdowns through the air in 1931. Hutson and McNally did team up effectively in 1935, when the two were clearly the star receivers for the club. But McNally moved on after 1936 and Don stayed put.

1942 is an exceptional year, and the year in which Don put up his best numbers. To note, Green Bay passed 330 times that year, when most clubs were throwing about 220 times. To place Green Bay’s relative passing frequency and success into a modern context, transferring its ratiometric advantages into the year 1995 would create a fictional team that passed 51 times a game and completed 69.8% of its passes. Don would be almost half that passing offense (43% of the catches, 50% of the yards), and he would score almost every fourth time he touched the ball. The resulting numbers would be freakish.
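
The attempts side of that transfer is simple enough to sketch. The 330 and 220 attempts, the 69.8% completion rate and the 43% catch share come from the paragraph above. The 1995 league baseline of 34 attempts a game is my assumption, backed out so that the ratio reproduces the 51 quoted above; it is not a sourced league figure. The 16 game season is the modern schedule.

```python
# From the text: Green Bay's 1942 attempts versus a typical club's.
GB_ATTEMPTS_1942 = 330
TYPICAL_ATTEMPTS_1942 = 220

# Assumption: 1995 league average attempts per team game, inferred from
# the 51 attempts a game quoted above rather than looked up.
LEAGUE_ATTEMPTS_PER_GAME_1995 = 34

ratio = GB_ATTEMPTS_1942 / TYPICAL_ATTEMPTS_1942
fictional_attempts = LEAGUE_ATTEMPTS_PER_GAME_1995 * ratio

# Carry the quoted completion rate and Hutson's catch share through.
completions_per_game = fictional_attempts * 0.698
hutson_catches_per_game = completions_per_game * 0.43
hutson_catches_per_season = hutson_catches_per_game * 16

print(f"passing ratio vs league: {ratio:.2f}")
print(f"fictional attempts per game: {fictional_attempts:.0f}")
print(f"Hutson catches per game: {hutson_catches_per_game:.1f}")
print(f"Hutson catches per 16 game season: {hutson_catches_per_season:.0f}")
```

Carried through naively like this, the catch total lands far beyond anything a real receiver has posted, which is why the straight ratio transfer needs the more careful embedding promised for the following post.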

1995 is a good point of comparison. That’s one of Jerry Rice’s best years. The run to pass ratio that year is about 0.79. Green Bay of 1942, a pretty wide open passing offense, was 1.29. How could we go about embedding the stats of Don Hutson into the year 1995 in such a way that it makes sense? That will be done in a following post.