I ran into it via Google somehow, while searching for ideas on the cost of an offense, then ran into it again, in a much more digestible form, through Benjamin Morris’s blog. Brian Burke has at least 4 articles on the Massey-Thaler study (here, here, here and, *most* importantly, here). Incidentally, the PDF of Massey-Thaler is available through Google Docs.

The surplus value chart of Massey-Thaler

Pro Football Reference talks about Massey-Thaler here, among other places. LiveBall Sports, a new blog I’ve found, talks about it here. So this idea, that you can gain net relative value by trading down, certainly has been discussed and poked and prodded for some time. What I’m going to suggest is that my results on winning and draft picks are entirely consistent with the Massey-Thaler paper. Total draft picks correlate with winning. First round draft picks do not.

One of the points of the Massey-Thaler paper is that psychological factors play into the evaluation of first round picks, that behavioral economics is heavily in play. To quote:

We find that top draft picks are overvalued in a manner that is inconsistent with rational expectations and efficient markets and consistent with psychological research.

I tend to think that’s true. It’s also an open question just how well draft assessment ever gets at career performance (or even whether it should). If draft evaluation is really only a measure of athleticism and not of long-term performance, isn’t that simply encasing the Moneyball error in steel? Because, ultimately, BPA only works the way its advocates claim if the things that draft analysts measure track performance closely enough to disambiguate candidates.

To touch on some of the psychological factors, and, for now, just to show in some fashion the degree of error in draft choices, we’ll look at the approximate value (AV) of the first pick from 1996 to 2006 and then the approximate value of possible alternatives. To note, a version of this study has already been done by Rick Reilly, in his “redraft” article.

| Year | #1 Pick | AV | Alternatives | Their AVs |
|------|---------|----|--------------|-----------|
| 1996 | Keyshawn Johnson | 74 | #26 Ray Lewis | 150 |
| 1997 | Orlando Pace | 101 | #66 Ronde Barber, #73 Jason Taylor | 114, 116 |
| 1998 | Peyton Manning | 156 | #24 Randy Moss | 122 |
| 1999 | Tim Couch | 30 | #4 Edgerrin James | 114 |
| 2000 | Courtney Brown | 28 | #199 Tom Brady | 116 |
| 2001 | Michael Vick | 74 | #5 LaDainian Tomlinson, #30 Reggie Wayne, #33 Drew Brees | 124, 103, 103 |
| 2002 | David Carr | 44 | #2 Julius Peppers, #26 Ed Reed | 95, 92 |
| 2003 | Carson Palmer | 69 | UD Antonio Gates, #9 Kevin Williams | 88, 84 |
| 2004 | Eli Manning | 64 | #126 Jared Allen, #4 Philip Rivers, #11 Ben Roethlisberger | 75, 74, 72 |
| 2005 | Alex Smith | 21 | #11 DeMarcus Ware | 66 |
| 2006 | Mario Williams | 39 | #60 Maurice Jones-Drew, #12 Haloti Ngata | 60, 55 |
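The tallies behind the table can be checked with a short sketch. The AV figures are copied straight from the rows above; the variable names are mine:

```python
# AV of each #1 overall pick (1996-2006) and the AVs of the listed
# alternatives, transcribed from the table above.
first_picks = {
    1996: ("Keyshawn Johnson", 74, [150]),
    1997: ("Orlando Pace", 101, [114, 116]),
    1998: ("Peyton Manning", 156, [122]),
    1999: ("Tim Couch", 30, [114]),
    2000: ("Courtney Brown", 28, [116]),
    2001: ("Michael Vick", 74, [124, 103, 103]),
    2002: ("David Carr", 44, [95, 92]),
    2003: ("Carson Palmer", 69, [88, 84]),
    2004: ("Eli Manning", 64, [75, 74, 72]),
    2005: ("Alex Smith", 21, [66]),
    2006: ("Mario Williams", 39, [60, 55]),
}

# Years where the #1 pick out-produced every listed alternative.
beat_all = [y for y, (_, av, alts) in first_picks.items() if av > max(alts)]

# Years where the #1 pick reached the 100 AV "sure-fire Hall of Famer" bar.
hof_level = [y for y, (_, av, _) in first_picks.items() if av >= 100]

print(beat_all)   # [1998]
print(hof_level)  # [1997, 1998]
```

Which matches the claim below: two picks at 100+ AV, and only one (Peyton Manning) that beat every alternative.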

If drafting were accurate, then the first pick should be the easiest. The team picking first has the most choice, the most information, and the most scrutinized set of candidates. This team has literally everything at its disposal. So why aren’t the first overall picks better performers? Why, across the 11 year period depicted, are there only 2 sure-fire Hall of Famers (100 AV or more) and only 1 pick that was better than every alternative? Why?

My answer is (in part) that one kind of pick, the QB, is prized as a #1 (check out the Benjamin Morris link above for why), and that QB is the hardest position to draft accurately. Further, though teams know and understand that intangibles exist, they’re not reliably good at tapping into them. Finally, drafting at any position in the NFL, not just #1, has a high degree of inaccuracy (here and here).

In the case of Tom Brady, the factors are well discussed here. I’d suggest that decoy effects, as described by Dan Ariely in his book Predictably Irrational (p 15, pp 21-22), affected both Tom Brady (compared to Drew Henson) and Drew Brees (compared to Michael Vick). Further, Vick was so highly valued the year he was drafted that he surely affected the draft position of Quincy Carter, and perhaps Marques Tuiasosopo (i.e. a coattail effect). If I were to estimate the coattail effect for Quincy Carter, it would be about two rounds of draft value.

How to improve the process? Better data and deeper analysis help. There are studies that suggest, for example, that the completion percentage of college quarterbacks is a major predictor of professional success. As analysts dig into factors that more reliably predict future careers, modern in-depth statistics will aid scouting.

Still, 10-year self-studies of draft patterns are beyond the ken of NFL management teams working on 3-5 year plans that must succeed. Feedback to scouting departments is going to have to cycle back much faster than that. For quality control of draft decisions, some metric other than career performance has to be used. Otherwise, a player like Greg Cook would have to be treated as a draft bust.

At some point, the success or failure of a player is no longer in the scout’s hands, but in the coaches’, and the Fates’. Therefore, a scout can only be asked to deliver the kind of player his affiliated coaches are asking for and defining as a model player. It’s in the ever-refined definition of this model (and in how real players can fit this abstract specification) that progress will be made.

Now to note, that’s a kind of progress that’s not visible from outside an NFL team. Fans consistently value draft picks with the tools at hand, career performance, because that’s what they have. In so doing, they confuse draft value with player development and don’t reliably factor the quality of coaching and management out of the process. And while the entanglement issue is a difficult one in the case of quarterbacks and wide receivers, it’s probably impossible to separate scouting from coaching and sheer player-management skill using the kinds of data Joe Fan can get access to.

So, if scouting isn’t looking directly at career performance, yet BPA advocates treat it as if it does, what does that mean for them? It means that most common discussions of BPA theory incorporate a model of value that scouts can’t measure, and therefore expectations don’t match what scouts can actually deliver.

I’ve generally taken the view that BPA is most useful when it’s most obvious. In subtle cases of near-equal value propositions, the value of BPA is lost in the variance of draft evaluation. If that reduces to “use BPA when it’s as plain as the nose on your face, otherwise draft for need”, then yes, that’s what I’m suggesting. Empirical evidence, such as the words of Bobby Beathard, suggests that’s how scouting departments do it anyway. Coded NFL draft simulations explicitly work in that fashion.
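That decision rule can be sketched in a few lines. Everything here is invented for illustration: the grades, the positions, and the noise threshold standing in for the variance of draft evaluation:

```python
# Hypothetical sketch of "obvious BPA, otherwise draft for need".
# Grades and the noise threshold are made-up numbers, not real scouting data.
def choose_pick(board, needs, noise=5.0):
    """Take the best player available only when his grade beats the best
    player at a position of need by more than the evaluation noise;
    otherwise draft for need."""
    bpa = max(board, key=lambda p: p["grade"])
    need_players = [p for p in board if p["pos"] in needs]
    best_need = max(need_players, key=lambda p: p["grade"])
    if bpa["grade"] - best_need["grade"] > noise:
        return bpa       # the gap is plain as the nose on your face
    return best_need     # within the noise: value is a wash, fill the need

board = [
    {"name": "A", "pos": "DE", "grade": 92.0},
    {"name": "B", "pos": "QB", "grade": 84.0},
    {"name": "C", "pos": "WR", "grade": 83.0},
]
print(choose_pick(board, needs={"QB"})["name"])  # gap of 8 > 5, so "A"
```

Widen the noise parameter past the grade gap (say, `noise=10.0`) and the same board yields the need pick instead, which is the whole point of the rule: BPA only governs when the evaluation variance can’t swallow the difference.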

Update 6/17: minor rewrite for clarity.