The recent success of DeMarco Murray has energized the Dallas fan base. Felix Jones is being spoken of as if he’s some kind of leftover (I know, a 5.1 YPC over a career is such a drag), and people are taking Murray’s 6.7 YPA for granted. That wasn’t what got to me in the fan circles, though. What did was that Julius Jones was becoming a whipping boy again, the source of every running back sin there is. So I wanted to build some tools to help analyze Julius’s career and, at the same time, look at Marion Barber III’s numbers, since these two are historically linked.
We’ll start with this database and a bit of SQL, something to let us find running plays. The SQL is:
SELECT down, togo, description
FROM nfl_pbp
WHERE season = 2007
  AND gameid LIKE '%DAL%'
  AND description LIKE '%J.Jones%'
  AND NOT description LIKE '%pass%'
  AND NOT description LIKE '%PENALTY on DAL%'
  AND NOT description LIKE '%kick%'
  AND NOT description LIKE '%sacked%'
It’s not perfect. I’m not picking up plays where a QB is sacked and the RB recovers the ball. A better bit of SQL might help, but that’s a place to start. We embed this SQL in a program that parses each description string for the phrase “for X yards” or, alternatively, “for no gain”, and adds the yardage up. From this, we could calculate yards per carry, but more importantly, we’ll calculate run success, and we’ll also calculate something I’m going to call a failure rate.
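As a rough illustration of that parsing step, here is a minimal Python sketch. The regular expression, the function names, and the sample descriptions are assumptions made for illustration; they are not the program used for this article, and real play-by-play text may need more cases.

```python
import re

# Assumed phrasing: "... for 4 yards", "... for -2 yards", "... for 1 yard",
# or "... for no gain". The first such phrase in the description is used.
PLAY_RE = re.compile(r"\bfor (?:(-?\d+) yards?|no gain)\b")

def parse_yards(description):
    """Return yards gained on the play, or None if neither phrase is found."""
    match = PLAY_RE.search(description)
    if match is None:
        return None
    # Group 1 is only set for the "for X yards" form; "for no gain" means 0.
    return int(match.group(1)) if match.group(1) is not None else 0

def yards_per_carry(descriptions):
    """Average the parsed yardage over every description that could be parsed."""
    gains = [y for y in map(parse_yards, descriptions) if y is not None]
    return sum(gains) / len(gains) if gains else 0.0

# Hypothetical descriptions, written only to exercise the parser.
sample = [
    "J.Jones up the middle for 4 yards.",
    "J.Jones left end for -2 yards.",
    "J.Jones right tackle for no gain.",
]
print([parse_yards(d) for d in sample])  # [4, -2, 0]
```

Rows whose description matches neither phrase come back as `None`, so they can be inspected by hand rather than silently counted as zero.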
For our purposes, a failure rate is the number of running plays that gained 2 yards or less, divided by the total number of running attempts, multiplied by 100. The purpose of the failure rate is to investigate whether Julius, in 2007, became the master of the 1 and 2 yard run. One common fan conception of his style of play in his last year in Dallas is that “he had plenty of long runs but had so many 1 and 2 yard runs as to be useless.” I wish to investigate that.
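The failure rate definition translates almost directly into code. This is a sketch only: the function name is made up, and it assumes that losses and no-gain plays count as failures, since they are also "2 yards or less".

```python
def failure_rate(gains):
    """Percentage of runs that gained 2 yards or less (losses included)."""
    runs = list(gains)
    if not runs:
        raise ValueError("need at least one rushing attempt")
    failures = sum(1 for yards in runs if yards <= 2)
    return 100.0 * failures / len(runs)

# Four hypothetical carries: two of them (1 and 2 yards) count as failures.
print(failure_rate([1, 2, 3, 10]))  # 50.0
```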
In terms of run success, there are three important versions out there. The first is the ad-hoc definition, generally credited to the authors of the Hidden Game of Football. NFL Minds’ introductory discussion of analytics puts it in good context, and it is also described by Brian Burke here and Chase Stuart here. Osama K. Solieman discusses it in a nice PDF, part of a master’s from the University of Arizona.
There are at least two popular ad hoc definitions, and they suffer from a certain arbitrariness. Take the 40% on first down, 50% on second down, all on third definition. This one has the advantage that a running back, on 3 consecutive plays, who runs 4, then 3, then 3 yards has a 100% run success. However, a running back who runs 3, then 4, then 3 yards only has a 67% success for the same number of yards. Why is that? What logic is behind that?
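The 4-3-3 versus 3-4-3 comparison can be checked with a few lines of Python. This sketch assumes "success" means gaining at least the stated fraction of the yards to go, and that the series starts at first and 10 with no first down reached before the last play. The function names are illustrative.

```python
def success_40_50_100(down, togo, gained):
    """Ad-hoc rule: 40% of yards to go on 1st down, 50% on 2nd, all on 3rd/4th."""
    if down == 1:
        return 10 * gained >= 4 * togo   # gained >= 40% of togo, integer math
    if down == 2:
        return 2 * gained >= togo        # gained >= 50% of togo
    return gained >= togo

def series_success(gains, togo=10):
    """Grade consecutive runs in one series, starting on first down."""
    results = []
    for down, gained in enumerate(gains, start=1):
        results.append(success_40_50_100(down, togo, gained))
        togo -= gained                   # yards still needed on the next down
    return results

print(series_success([4, 3, 3]))  # [True, True, True]
print(series_success([3, 4, 3]))  # [False, True, True]
```

Both series gain 10 yards, but the second one fails on first down (3 of 10 is under 40%), which is where the 67% comes from.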
The second important success notion is that of the Football Outsiders, which expanded on the notion of success to create their DVOA stat. And although FO has been able to use this to great effect (they’re good analysts, mind you), the stat’s sophistication means the average blogger doesn’t really have a chance to understand the stat in the first place. Further, the stat has been changing in character over the years. It’s a moving definition.
The third run success notion is that of Brian Burke, who pegs run success to the notion of expected points curves. The definition goes “A successful run is any run that increases expected points”. This definition, like a lot of good notions that Brian has introduced, combines conceptual simplicity with functionality. You don’t worry about how many yards are involved. The definition automatically takes into account the down and distance characteristics, and adjusts accordingly. Further, on introducing this measure, he noted that run success, defined in this fashion, correlated three times more with winning than did rushing YPC. It’s absolutely this last characteristic that has gotten people stoked about run success. Question is, how much of this high correlation actually extends to the old ad-hoc definitions?
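In code, the expected points definition is a one-line comparison once an expected points curve exists. The sketch below deliberately leaves that curve as a caller-supplied function, because no real expected points values are given here. The `(down, togo, yardline)` state layout and the function name are assumptions, and turnovers or changes of possession are not handled.

```python
def is_successful_run(expected_points, before, after):
    """True if the run raised expected points.

    expected_points: caller-supplied function of (down, togo, yardline);
                     fitting a real curve from data is outside this sketch.
    before, after:   (down, togo, yardline) tuples for the same offense.
    """
    return expected_points(*after) > expected_points(*before)
```

Down and distance never appear in the rule itself. They matter only through whatever curve is passed in.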
To note, the Brian Burke link above isn’t the first time he talked about run success notions. There is an interesting discussion here, where he notes that 50% (or is it 55%?) would be better than 40%, and that ending up 3rd and 2 after a first and 10 is better than 3rd and 3. In short, I think it is possible to calculate an ad-hoc definition that better correlates with Brian Burke’s expected points based definition, and I think that should be a design goal for football analytics types. For those who have the time (currently, I don’t), a data set adequate to begin was provided by Brian Burke here.
OK, in view of the issues discussed in Brian Burke’s first down article, I’ve amended the run success metric to work as follows:
- First down success: 1/2 of yards to first down.
- Second down success: 2/3 of yards to first down.
- Third and fourth down success: All of yards to first down.
This I think correlates better with the expected points run success than other common definitions. That said, I’ll be calculating the “40%-50%-100%” definition as well. I’ll also calculate 3rd down success, and 4th down success. Finally, I’ll be calculating failure rates, so that people can get a feel for this stat as well. We’ll need this stat to look at Julius Jones’s final season with Dallas.
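The amended thresholds (1/2 on first down, 2/3 on second, all on third and fourth) can be sketched the same way as any other down-and-distance rule. The function names are illustrative, and treating "at least" the threshold as a success is an assumption.

```python
def amended_success(down, togo, gained):
    """Amended rule: 1/2 of yards to go on 1st, 2/3 on 2nd, all on 3rd and 4th."""
    if down == 1:
        return 2 * gained >= togo        # at least half the distance
    if down == 2:
        return 3 * gained >= 2 * togo    # at least two thirds of the distance
    return gained >= togo                # must convert on 3rd or 4th down

def success_rate(plays, rule=amended_success):
    """Percent of (down, togo, gained) plays graded successful by `rule`."""
    plays = list(plays)
    if not plays:
        raise ValueError("need at least one play")
    return 100.0 * sum(rule(*play) for play in plays) / len(plays)

# Hypothetical plays as (down, togo, gained).
plays = [(1, 10, 5), (1, 10, 4), (2, 6, 4), (3, 2, 1)]
print(success_rate(plays))  # 50.0
```

Filtering `plays` to third-down rows before calling `success_rate` gives the third down success figure, and likewise for fourth down.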
To start, we’ll introduce some numbers from backs who have had some great seasons over the past 5 years. We can use these as examples of what really good backs can do.
Some things to note. In my definition of run success, low 30s is a worrisome number, mid 30s is ordinary, high 30s is good, over 40 is awesome. In failure rates, Steven Jackson’s 42% is by far the lowest I’ve seen. Failure rate tends to range from the mid 40s to perhaps 50-52% at the worst; it turns out this number doesn’t vary much at all. Chris Johnson’s low success rates are a curiosity to me. Like a big home run hitter with a low batting average, it appears that all of his value is in the long runs he provides (that sparkling 5.6 YPC in 2009).
We’ll show some Steven Jackson stats, from 2006 through 2010:
I’m not sure what’s going on with his success rate. Is the player wearing out, or is it just the team getting worse and worse? Let’s continue with a deeper analysis of another runner, in this case Michael Turner:
In Michael Turner’s case, the run success is good, but the third down run success is really good. This raises the question: how much of the value of run success is third down run success?
To finish off the study of good runners, we’ll end with Adrian Peterson.
Peterson’s success rates are so high, he spoils people for ordinary running backs. Again, how many of his stats are affected by a team whose strength is often ordinary?
Let’s now look at Julius Jones. There is something to be learned in his success rates.
Bill Parcells uses running backs in different ways than most coaches, and had Bill Parcells been around in 2007, Julius’s loss of productivity might not have mattered. He wasn’t losing yards especially much or putting the team in bad positions. But he wasn’t getting the team into good positions either, and that trend continued into 2008. At some point run success matters, and although Julius gave good value for two and a half years, his run success rates at the end of his Dallas career show why Dallas eventually let him go. It wasn’t bad runs that doomed him. It was the lack of good ones that did.
Here are Marion Barber’s numbers over the past five years:
Marion Barber had an absurd success rate in 2006. But fans were complaining about his performance as early as 2009, and his 2009 season was about as good as a running back can get. By 2010, there is a precipitous drop in success rate, and that almost 10% rise in failure rate loomed pretty heavily for the Barbarian. Ironically, the man who generally isn’t known for short runs was probably doomed by his short runs, and Julius, the running back credited with being a “short run specialist”, was likely doomed by his loss of explosiveness.
Update: added a crucial Brian Burke link. Also, by user request, the source code I used to generate this article is now available.