Modeling


This is going to be a mixed bag of a post, talking about whatever has caught my eye over the past couple of weeks. The first thing I’ll note is that on the recommendation of Tom Gower (you need his Twitter feed), I’ve read Josh Katzowitz’s book, Sid Gillman: Father of the Passing Game.


As a young man I didn’t know much about Gillman, though the 1963 AFL Championship was part of a greatest-games collection I read through as a teen. The book isn’t a primer on Gillman’s ideas; instead, it’s more a discussion of his life and the issues he faced growing up (it’s clear Sid felt his Judaism affected his marketability as a coach in the college ranks). Not everyone gets the same chances in life, but Sid was a pretty tough guy in his own right, and the passion he felt for the sport clearly drove him to a lot of personal success.

Worth the read. Be sure to read Tom Gower’s review as well, which is excellent.

ESPN is dealing with the football offseason by slowly releasing a list of the “20 Greatest NFL Coaches” (NFL.com is doing its 100 best players for much the same reason). I’m pretty sure neither Gillman nor Don Coryell will be on the list. The problem, of course, lies in the difference between the notions of “greatest” and “most influential”. The influence of both these men is undeniable. However, the greatest success for both coaches has come as part of their respective coaching (and player) trees: Al Davis and Ara Parseghian come to mind when thinking about Gillman, while Don had a direct influence on coaches such as Joe Gibbs and Ernie Zampese. John Madden was a product of both schools, and folks such as Norv Turner and Mike Martz are clear disciples of the Coryell way of doing things. It’s easy to go on and on here.

What’s harder to see is the separation (or fusion) of Gillman’s and Coryell’s respective coaching trees. Don never coached under or played for Gillman. And when I raised the question on Twitter, Josh Katzowitz responded with these tweets:

Josh Katzowitz : @smartfootball @FoodNSnellville From what I gathered, not much of a connection. Some of Don’s staff used to watch Gillman’s practices, tho.

Josh Katzowitz ‏: @FoodNSnellville @smartfootball Coryell was pretty adament that he didn’t take much from Gillman. Tom Bass, who coached for both, agreed.

Coaching clinics were popular then, and from Josh’s bio, Sid Gillman appears to have been a popular clinic speaker. I’m sure the two mixed and heard each other speak. But Coryell had a powerful Southern California connection in USC coach John McKay, and I’m not sure how much Coryell and Gillman truly interacted.

Pro Football Weekly is going away, and Mike Tanier has a great article discussing the causes of the demise. In the middle of the discussion, a reader calling himself Richie took it upon himself to start trashing “The Hidden Game of Football” (which factors in because Bob Carroll, a coauthor of THGF, was also a contributor to PFW). Richie seems to think, among other things, that everything THGF discussed was “obvious” and that Bill James, having invented baseball analytics, thereby invented all of football analytics wholesale. It’s these kinds of assertions I really want to discuss.

I think the claim that baseball analytics encompasses the whole of football analytics is easily dismissed by pointing to the solitary nature of baseball and its stats, their lack of entanglement issues, and the absence of any notion of field position, in the football sense of the term. Since baseball has no such thing, any stat built on a relationship between field position and anything else, or derived from models of such relationships, cannot have been created in a baseball world.

Sad to say, that’s almost any football stat of merit.

On the notion of obvious: THGF was the granddaddy of the scoring model for the average fan. I’d suggest that scoring models are certainly not obvious, or else every article I have with that tag would have been written up and dismissed years ago. What is not so obvious is that scoring models have a dual nature, akin to that of quantum mechanical objects, and the kind of logic needed to best understand them parallels what a chemistry major might encounter in his junior year of university, in a physical chemistry class (physicists run into these issues sooner).

Scoring models have a dual nature. They are both deterministic and statistical/probabilistic at the same time.

They are deterministic in that, for a given down, distance to go, and field position, with a specific play-by-play data set, you can calculate the odds of scoring down to a hundredth of a point. They are statistical in that they represent the sum of dozens or hundreds of unique events, all compressed into a single measurement. When divorced from the parent data set, the logic used to analyze the meaning of the models, and of formulas derived from those models, must take into account the statistical nature of the model involved.
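If that sounds abstract, the deterministic half is really just bookkeeping. Here is a minimal sketch; the play-by-play file and column names are purely hypothetical:

import pandas as pd

# Expected points as bookkeeping: for each down / distance / yard line state,
# average the value of the next score over every historical play that passed
# through that state. One precise-looking number, hundreds of events inside it.
pbp = pd.read_csv("pbp.csv")  # assumed columns: down, to_go, yard_line, next_score
ep = pbp.groupby(["down", "to_go", "yard_line"])["next_score"].mean()
print(ep.loc[(1, 10, 50)])    # e.g. EP for 1st and 10 at midfield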

It’s not easy. Most analysts turn models and formulas into something more concrete than they really are.

And this is just one component of the THGF contribution. I haven’t even mentioned the algebraic breakdown of the NFL passer rating they introduced, which dominates discussion of the rating to this day. It’s so influential that to a first approximation, no one can get past it.

Just tell me: how did you get from the formulas shown here to the THGF formula? And if you didn’t figure it out yourself, then how can you claim it is obvious?
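As context for that challenge, here is the league’s passer rating computation sketched in code. The THGF breakdown mentioned above is an algebraic reworking of this formula; the sketch below is just the standard definition, not their derivation:

def nfl_passer_rating(comp, att, yards, tds, ints):
    # Four components, each clamped to [0, 2.375], scaled so the rating
    # runs from 0 to 158.3.
    def clamp(x):
        return max(0.0, min(x, 2.375))
    a = clamp((comp / att - 0.3) * 5.0)     # completion percentage term
    b = clamp((yards / att - 3.0) * 0.25)   # yards per attempt term
    c = clamp((tds / att) * 20.0)           # touchdown rate term
    d = clamp(2.375 - (ints / att) * 25.0)  # interception rate term
    return (a + b + c + d) / 6.0 * 100.0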

Of all the teams in the NFC playoffs, the San Francisco 49ers had the best strength of schedule, as measured by the simple ranking system. Of all the teams in the AFC playoffs, the Baltimore Ravens had the best. But San Francisco’s SOS is markedly higher than Baltimore’s, to the point that our system favors San Francisco by around 7.5 points.

 

2013 Super Bowl

NFC Team   AFC Team   NFC Win Pct   Est. Point Spread
SF         BAL        0.735         7.5

 

I suspect if Atlanta had won, we would be asking ourselves whether SOS can be fooled. Advanced NFL Stats said, among other things, that Carolina was seriously underrated. If that were true of the whole NFC South, then Atlanta was actually playing better teams than their rankings suggested, and thus should have been rated more highly. But in the end, with 1:18 left to play, 3rd and 4 on the San Francisco 10 yard line, Atlanta was unable to get a first down, and San Francisco won a hard-fought victory by 4 points. Two pivotal plays will markedly affect the narrative of this game.

Note that last year the New York Giants had the best strength of schedule of all the playoff teams, and they also won the Super Bowl. So I have to ask myself: at what point does this “coincidence” actually make it into the narrative of the average sports writer, or will they keep talking about “teams of destiny” and other such vague language? This kind of sports-journalist talk hasn’t gone away in sports where analytics is an ever bigger factor in the game, such as baseball and basketball, so I suspect it won’t disappear here either.

I suspect that, to a first approximation, almost no one other than Baltimore fans, such as Brian Burke, and this blog really believed Baltimore had much of a chance(+). Well, I should mention Aaron Freeman of Falc Fans, who was rooting for Baltimore but still felt Denver would win. Looking now, his article is no longer on the Falcfans site. Pity.

WP graph of Baltimore versus Denver. I tweeted that this graph was going to resemble a seismic chart of an earthquake. Not my work, just a screen shot off the excellent site Advanced NFL Stats.


After a double-overtime victory by 3 points, it’s awfully tempting to say “I predicted this”, and if you look at the teams I’ve favored, to this point* the streak of picks is 6-0. Let me point out, though, that you can make a limiting assumption and from that assumption figure out how accurate I should have been. The limiting assumption is to assume the playoff model is 100% accurate** and see how well it predicted play. If the model were 100% accurate, the real results and the predicted results should converge.

I can tell you without adding anything up that only one of my favored picks had more than a 70% chance, and at least two were around 52-53%. So 6 times 70 percent is 4.2, meaning that even in a perfect world my model should have picked no more than about 4 winners and 2 losers. A perfect model in a probabilistic world, where teams rarely have a 65% chance to win, much less 100%, should be wrong sometimes. Instead, so far it’s on a 6-0 run. That means luck is driving my success so far.
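To put numbers on that, here is a small sketch using the favored-side probabilities from the wild card and divisional round tables later in this post:

# Favored-side win probabilities, read off the wild card and divisional round
# tables below (GB, SEA, HOU, BAL in the wild card round; BAL and SF after).
probs = [0.682, 0.518, 0.642, 0.841, 0.523, 0.700]

expected_wins = sum(probs)   # about 3.9: even a perfect model "should" miss twice
p_sweep = 1.0
for p in probs:
    p_sweep *= p             # chance of the observed 6-0 sweep: about 7%
print(round(expected_wins, 1), round(p_sweep, 2))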

Is it possible, as I have argued, that strength of schedule is an underappreciated playoff stat, a playoff “Moneyball” stat, and that teams that go through tough schedules are better than their offensive and defensive stats suggest? It’s possible at this point. It’s also without question that I’ve been lucky in both the 2012 playoffs and the 2013 playoffs so far.

Potential Championship Scenarios:

 

Conference Championship Possibilities

Home Team   Visiting Team   Home Win Pct   Est. Point Spread
NE          BAL             0.523           0.7
HOU         BAL             0.383          -3.5
ATL         SF              0.306          -6.1
SF          SEA             0.745           7.9

 

My model likes Seattle, which has the second best strength of schedule metric of all the playoff teams, but it absolutely loves San Francisco. It also likes Baltimore, but not enough to say Baltimore has a free run through the playoffs. Like many modelers, I’m predicting that Atlanta versus Seattle will be a close game.

~~~

+ I should also mention that Bryan Broaddus tweeted about a colleague of his who predicted a BAL victory.

* Sunday, January 13, 2013, about 10:00am.

** Such a limiting assumption is similar to assuming the NFL draft is rational: that the customers (NFL teams) have all the information they should, that they understand everything about the product they consume (draft picks), and that their estimates of draft value thus form a normal distribution around the real value of draft picks, with irrational exuberance, trends, and GMs falling in love with players playing no role in the picking. This, it turns out, makes model simulations much easier.
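As a footnote to the footnote, that rationality assumption is easy to state in simulation form; all the numbers below are invented for illustration:

import numpy as np

# Rational-draft assumption in miniature: each team's estimate of a pick's
# value is unbiased normal noise around the true value, so the consensus
# clusters near the truth. Scale and team count are placeholders.
rng = np.random.default_rng(0)
true_value = 100.0
estimates = rng.normal(loc=true_value, scale=15.0, size=32)  # one per team
print(estimates.mean())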

Though the results for the divisional round are embedded in the image of my playoff spreadsheet in my previous article, the table below is certainly easier to read.

 

Divisional Playoff Round

Home Team   Visiting Team   Home Win Pct   Est. Point Spread
DEN         BAL             0.477          -0.7
NE          HOU             0.638           4.2
ATL         SEA             0.462          -1.1
SF          GB              0.700           6.3

 

I suspect other systems will rank Seattle as stronger than mine does, and Baltimore as weaker. That said, the Vegas line as of this Sunday gives Atlanta a 2 point advantage over Seattle, while my system slightly favors Seattle. We can also calculate odds and points via other mechanisms, say median point spreads, the simple ranking system, and Pythagorean expectation, and if we do, what do we get?

 

Atlanta Versus Seattle

Technique                 Home Win Pct   Est. Point Spread
Median Point Spread       0.632           4.0
Simple Ranking System     0.407          -2.8
Pythagorean Expectation   0.486          -0.4
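A note on the Pythagorean row: one common way to turn two teams’ Pythagorean expectations into a head-to-head probability is Bill James’s log5 formula. The sketch below shows that approach; it is not necessarily the exact calculation behind the table, and the inputs are hypothetical:

def log5(p_a, p_b):
    # Bill James's log5: P(team A beats team B) from their expected
    # winning percentages.
    return (p_a - p_a * p_b) / (p_a + p_b - 2.0 * p_a * p_b)

print(log5(0.55, 0.57))  # ~0.48: the slightly stronger visitor is favored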

 

Certainly different systems yield different emphases. For me, the one lasting impression is that the Washington-Seattle game was an almost picture-perfect demonstration that home field advantage is strongest in the first quarter.

Of all the teams playing, my system likes San Francisco the best, and I suspect it likes the 49ers more than other systems do. We’ll learn more as other analytics-oriented folks post their odds for the divisional round.

We can’t work with my playoff model without having a set of week 17 strength of schedule numbers, so we’ll present those first.

[Image: week 17 strength of schedule statistics for the 2012 season]

Between a difficult work schedule this past December and a very welcome vacation (I keep my stats on a stay-at-home machine), I haven’t been giving weekly updates recently. Hopefully some of my various thoughts will begin to make up for that.

Though with the SOS values you could crunch all the playoff numbers yourself, this set of data should help in working out the possibilities:

Odds as calculated by my formula, with home field advantage adjusted to 60%. Point spread calculated with the formula 3.0*logit(win probability)/logit(0.60).

What I find interesting is the difference between Vegas-style lines, my numbers, and the numbers recently posted by Brian Burke on the New York Times Fifth Down blog. My model is very different from Brian’s, but in three of the four wild card games, our odds to win are within 2-3 percent of each other.

Point spreads were estimated as follows: if an effect with a 60% chance of winning is valued at 3 points (playoff home field advantage is about 60%, and home field advantage is usually judged to be worth 3 points), then two effects of that magnitude should be worth 6 points. But it’s only on a logit scale that these effects can be added, so it only makes sense to relate win probabilities through their logits. Since the logit of 0.60 is about 0.405465, an estimated point spread can be had with the formula

point spread = 3.0*logit(win probability)/0.405465

Update (1/9/2012) – even simpler is:

est. point spread = 7.4*logit(win probability)
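In code, the two forms above are the same function; the scale factor 3.0/logit(0.60) is about 7.4. A minimal sketch:

import math

def logit(p):
    return math.log(p / (1.0 - p))

def est_point_spread(win_prob, hfa_prob=0.60, hfa_points=3.0):
    # Scale log-odds so that a 60% effect is worth 3 points.
    return hfa_points * logit(win_prob) / logit(hfa_prob)

# Reproduces the wild card table below: 0.682 -> 5.6, 0.482 -> -0.5,
# 0.642 -> 4.3, 0.841 -> 12.3.
for p in (0.682, 0.482, 0.642, 0.841):
    print(round(est_point_spread(p), 1))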

A simplified table of the wild card games, with percentages and estimated point spreads is:

Wild Card Playoff Round

Home Team   Visiting Team   Home Win Pct   Est. Point Spread
GB          MIN             0.682           5.6
WAS         SEA             0.482          -0.5
HOU         CIN             0.642           4.3
BAL         IND             0.841          12.3

How many successes is a touchdown worth?

We’ve spoken about the potential relationships between success rates, adjusted yards per attempt, and stats like DVOA here, but to make any progress, you need to consider possible relationships between successes and yards. The lower bound of the relationship is known: 3 consecutive successes must yield at least 10 yards (enough for a first down), and 30 consecutive successes must end up scoring a touchdown. In this limit, 1 success is equal to or greater than 3 1/3 yards.

Thus, if the surplus value of a touchdown is 20 yards, that’s 6 successes. If a turnover is worth 45 yards, that’s about 13.5 successes.
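In code form, using the 10/3 lower bound just derived:

# Lower-bound conversion between yards and successes: 3 successes >= 10 yards.
YARDS_PER_SUCCESS = 10.0 / 3.0

def successes_for(yards):
    return yards / YARDS_PER_SUCCESS

print(successes_for(20))  # TD surplus value of 20 yards -> 6.0 successes
print(successes_for(45))  # turnover value of 45 yards -> 13.5 successes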

A smarter way to get at the mean value of this kind of relationship, as opposed to a limiting value, would be to add up the yards of all successful plays in the NFL and divide by the number of those plays. That’s something to pursue later.
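Given play-by-play data, that calculation is a one-liner; the file and the success/yards column names below are assumptions about the data layout:

import pandas as pd

# Mean yards per successful play, per the approach described above.
pbp = pd.read_csv("pbp.csv")  # assumed columns: success (bool), yards
print(pbp.loc[pbp["success"], "yards"].mean())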

OK, this whole article is a kind of speculation on my part. DVOA is generally sold as a generalization of the success rate concept, translated into a percentage above (or below) the norm. Components of DVOA include success rate, turnover adjustments, and scoring adjustments. For now, that’s enough to consider.

Adjusted yards per attempt, as we’ve shown, is derived from scoring models, in particular expected points models, and can be considered a linearization of a decidedly nonlinear EP curve. But if I wanted to, I could call AYA-style stats the generalization of the yardage concept, one in which scoring and turnovers are folded into a single number valued in yards per attempt.
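For reference, the AYA formula in question, whose 20 and 45 yard weights are the same surplus values mentioned above:

AYA = (passing yards + 20*TD - 45*INT)/attempts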

So, if I were to take AYA or its fancier cousin ANYA, replace yards with success rate, and then rescale the turnover and scoring terms appropriately, I would end up with something like the “V” in DVOA. I could then add an SRS-style defensive adjustment, and now I have “DV”. If I then calculate an average and normalize all terms relative to that average, I’d end up with “Homemade DVOA”, wouldn’t I?
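To make the recipe concrete, here is a sketch of that homemade DVOA. The 6.0 and 13.5 weights are the 20 and 45 yard values converted into success units via the 10/3 bound from earlier; like the rest of this article, it is speculation, not Football Outsiders’ actual method:

def homemade_dvoa(successes, tds, turnovers, attempts, opp_adj, league_avg):
    # "Adjusted net success rate per attempt": ANYA with yards swapped
    # for successes, using speculative weights.
    v = (successes + 6.0 * tds - 13.5 * turnovers) / attempts  # the "V"
    dv = v + opp_adj            # SRS-style opponent (defensive) adjustment
    return (dv - league_avg) / league_avg  # percentage above/below the norm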

The point is, AYA and ANYA formulas are not really yardage stats; they are scoring stats whose units are yards. So if DVOA really is ANYA in sheep’s clothing, with yardage replaced by success rate, plus after-the-fact defensive adjustments and normalization away from success rate “units”, then yes, DVOA is a scoring stat, a kind of sophisticated and normalized “adjusted net success rate per attempt”.

Ed Bouchette has a good article in which Steelers defenders talk about Michael Vick. Neil Payne has two interesting pieces (here and here) on how winning early games is correlated with the final record for the season.

Brian Burke has made an interesting attempt to break down EP (expected points) data to the level of individual teams. I’ve contributed to the discussion there. There is a lot to the notion that the slope of the EP curve reflects the ease with which a team can score: the shallower the slope, the easier it is for a team to score.

Note that the defensive contribution to an EP curve will depend on how expected points are actually scored. In a Keith Goldner type Markov chain model (a “raw” EP model), a defense cannot affect its own team’s EP curve; it can only affect an opponent’s curve. In a Romer/Burke type EP formulation, the defensive effect on a team’s EP curve and the opponent’s EP curve is complex. Scoring by the defense has an “equal and opposite” effect on team and opponent EP, the slope being affected by the frequency of that scoring as a function of yard line. Various kinds of stops could affect the slope as well. Since scoring opportunities increase for an offense the closer it gets to the goal line, an equal stop probability per yard line would yield unequal scoring chances, and thus slope changes.
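To illustrate the “equal and opposite” coupling, here is a toy Romer/Burke-style EP model. Every probability in it is an invented placeholder, there purely to show the structure, in which the opponent’s EP enters the fixed-point equation with a minus sign:

import numpy as np

# Toy next-score EP model. States are the offense's yard line
# (1 = own goal line, 99 = opponent goal line). On each possession the
# offense scores a TD, kicks a FG, or hands the ball over at the mirrored
# yard line. All probabilities are invented placeholders.
yards = np.arange(1, 100)
p_td = yards / 400.0                     # scoring chance rises near the goal line
p_fg = np.where(yards > 60, 0.15, 0.0)   # field goal range only
p_to = 1.0 - p_td - p_fg                 # otherwise the opponent takes over

ep = np.zeros(99)
for _ in range(200):                     # value iteration to the fixed point
    opp_ep = ep[::-1]                    # opponent's EP at the mirrored yard line
    ep = p_td * 7.0 + p_fg * 3.0 - p_to * opp_ep  # minus sign: equal and opposite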
