Data


Of all the teams in the NFC playoffs, the San Francisco 49ers had the best strength of schedule, as measured by the simple ranking system. Of all the teams in the AFC playoffs, the Baltimore Ravens had the best strength of schedule by the same measure. But San Francisco’s SOS is markedly higher than Baltimore’s, to the point that our system favors San Francisco by around 7.5 points.

 

2013 Super Bowl
NFC Team   AFC Team   NFC Win Pct   Est. Point Spread
SF         BAL        0.735         7.5

 

I suspect if Atlanta had won, we would be asking ourselves whether SOS can be fooled. Advanced NFL Stats said, among other things, that Carolina was seriously underrated. If that were true of the whole NFC South, then Atlanta was actually playing better teams than their rankings suggested, and thus should have been more highly rated. But in the end, with 1:18 left to play, facing 3rd and 4 on the San Francisco 10 yard line, Atlanta was unable to get a first down, and San Francisco won a hard-fought victory by 4 points. Two pivotal plays will markedly affect the narrative of this game.

Now to note, last year the New York Giants had the best strength of schedule of all the playoff teams, and they also won the Super Bowl. So I have to ask myself, at what point does this “coincidence” actually make it into the narrative of the average sports writer, or do they still keep talking about “teams of destiny” or other such vague language? Well, this kind of “sports journalist talk” hasn’t gone away in sports where analytics is an ever bigger factor in the game, sports like baseball or basketball. I suspect it doesn’t disappear here.

I suspect that, to a first approximation, almost no one other than Baltimore fans, such as Brian Burke, and this blog really believed that Baltimore had much of a chance(+). Well, I should mention Aaron Freeman of Falc Fans, who was rooting for Baltimore but still felt Denver would win. Looking now, his article is no longer on the Falcfans site. Pity.

WP graph of Baltimore versus Denver. I tweeted that this graph was going to resemble a seismic chart of an earthquake. Not my work, just a screen shot off the excellent site Advanced NFL Stats.


After a double overtime victory by 3 points, it’s awfully tempting to say, “I predicted this”, and if you look at the teams I’ve favored, to this point* the streak of picks is 6-0. Let me point out, though, that you can make a limiting assumption and from that assumption figure out how accurate I should have been. The limiting assumption is that the playoff model is 100% accurate**; from there you can see how well it should have predicted play. If the model is 100% accurate, the real results and the predicted results should converge.

I can tell you without adding up anything that only one of my favored picks had more than a 70% chance, and at least two were around 52-53%. So 6 times 70 percent is 4.2, and my model, in a perfect world, should have picked no more than 4 winners and 2 losers. A perfect model in a probabilistic world, where teams rarely have 65% chances to win, much less 100%, should be wrong sometimes. Instead, so far it’s on a 6-0 run. That means that luck is driving my success so far.
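The back-of-the-envelope bound above can be tightened by adding up the actual numbers. The sketch below assumes the six picks are the favored sides in the four wild card games and the two Saturday divisional games, with probabilities read off the tables on this page (for a favored visitor, one minus the home win percentage). The list `favored` is my reconstruction, not something the post spells out.

```python
from math import prod

# Win probability of the favored side in each of the six games.
# Assumption: GB, SEA (1 - 0.482), HOU, BAL over IND, BAL over DEN (1 - 0.477), SF.
favored = [0.682, 0.518, 0.642, 0.841, 0.523, 0.700]

# If the model were perfectly calibrated, the expected number of
# correct picks is just the sum of the individual probabilities.
expected_correct = sum(favored)

# Treating the games as independent, the chance of getting all six
# right is the product of the probabilities.
prob_perfect = prod(favored)

print(f"expected correct picks: {expected_correct:.2f} of {len(favored)}")
print(f"chance of a perfect run: {prob_perfect:.3f}")
```

Under those assumptions, even a perfect model would expect a bit under four correct picks, and a clean sweep would be an uncommon outcome, which is consistent with the point that luck is doing much of the work.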

Is it possible, as I have argued, that strength of schedule is an underappreciated playoff stat, a playoff “Moneyball” stat, and that teams that go through tough times are better than their offensive and defensive stats suggest? It’s possible at this point. It’s also without question that I’ve been lucky in both the 2012 playoffs and the 2013 playoffs so far.

Potential Championship Scenarios:

 

Conference Championship Possibilities
Home Team   Visiting Team   Home Win Pct   Est. Point Spread
NE          BAL             0.523          0.7
HOU         BAL             0.383          -3.5
ATL         SF              0.306          -6.1
SF          SEA             0.745          7.9

 

My model likes Seattle, which has the second best strength of schedule metric of all the playoff teams, but it absolutely loves San Francisco. It also likes Baltimore, but not enough to say it has a free run throughout the playoffs. Like many modelers, I’m predicting that Atlanta and Seattle will be a close game.

~~~

+ I should also mention that Bryan Broaddus tweeted about a colleague of his who predicted a BAL victory.

* Sunday, January 13, 2013, about 10:00am.

** Such a limiting assumption is similar to assuming the NFL draft is rational; that the customers (NFL teams) have all the information they should, that they understand everything about the product they consume (draft picks), and that their estimates of draft value thus form a normal distribution around the real value of draft picks, and that irrational exuberance, or trends, or GMs falling in love with players play no role in picking players. This, it turns out, makes model simulations much easier.

Though the results for the divisional round are embedded in the image of my playoff spreadsheet in my previous article, the table below is certainly easier to read.

 

Divisional Playoff Round
Home Team   Visiting Team   Home Win Pct   Est. Point Spread
DEN         BAL             0.477          -0.7
NE          HOU             0.638          4.2
ATL         SEA             0.462          -1.1
SF          GB              0.700          6.3

 

I suspect other systems will rank Seattle as stronger than mine does, and Baltimore as weaker. That said, the Vegas line as of this Sunday gives Atlanta a 2 point advantage over Seattle, and my system slightly favors Seattle. We can calculate odds and points via other mechanisms, say, Pythagoreans, SRS and median point spreads, and if we do, what do we get?

 

Atlanta Versus Seattle
Technique                 Home Win Pct   Est. Point Spread
Median Point Spread       0.632          4.0
Simple Ranking System     0.407          -2.8
Pythagorean Expectation   0.486          -0.4

 

Certainly different systems yield different emphases. For me, the one lasting impression was that the Washington-Seattle game was an almost picture-perfect demonstration that home field advantage is strongest in the first quarter.

Of all the teams playing, my system likes San Francisco the best. I suspect it likes San Francisco more than other systems do. We’ll learn more as other analytics-oriented folks post their odds for the divisional round.

We can’t work with my playoff model without having a set of week 17 strength of schedule numbers, so we’ll present those first.

2012_stats_week_17

Between a difficult work schedule this last December and a very welcome vacation (I keep my stats on a stay at home machine), I haven’t been giving weekly updates recently. Hopefully some of my various thoughts will begin to make up for that.

Though with SOS values, you could crunch all the playoff numbers yourselves, this set of data should help in working out the possibilities:

Odds as calculated by my formula, with home field advantage adjusted to 60%. Point spread calculated with formula 3.0*logit(win probability)/logit(0.60).

What I find interesting is the difference between Vegas style lines, and my numbers, and the numbers recently posted by Brian Burke on the New York Times Fifth Down blog. My model is very different from Brian’s, but in three of the four wild card games, our percentage odds to win are within 2-3 percent of each other.

Point spreads were estimated as follows: if an effect of 60% were valued at 3 points (i.e. playoff home field advantage is about 60% and home field advantage is usually judged to be worth 3 points), then two effects of that magnitude should be worth 6 points. But it’s only on a logit scale that these effects can be added, so it only makes sense to relate probabilities of winning through their logits. As the logit of 0.60 is about 0.405465, then an estimated point spread can be had with the formula

point spread = 3.0*logit(win probability)/0.405465

Update (1/9/2012) – even simpler is:

est. point spread = 7.4*logit(win probability)
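A minimal sketch of the conversion, for anyone who wants to check the tables by hand. The function names (`logit`, `inv_logit`, `est_point_spread`, `spread_to_win_probability`) are mine, chosen for illustration; only the formula itself comes from the text above.

```python
from math import exp, log

def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return log(p / (1.0 - p))

def inv_logit(x):
    """Inverse of logit: turn log-odds back into a probability."""
    return 1.0 / (1.0 + exp(-x))

# A 60% effect is valued at 3 points, so one unit of logit is worth
# 3.0 / logit(0.60) points, which is roughly 7.4.
POINTS_PER_LOGIT = 3.0 / logit(0.60)

def est_point_spread(win_probability):
    """Estimated point spread for a given win probability."""
    return POINTS_PER_LOGIT * logit(win_probability)

def spread_to_win_probability(point_spread):
    """Reverse direction: win probability implied by a point spread."""
    return inv_logit(point_spread / POINTS_PER_LOGIT)

# Two stacked 60% effects add on the logit scale, not the probability scale.
two_effects = inv_logit(2 * logit(0.60))
print(f"two 60% effects: p = {two_effects:.3f}, "
      f"spread = {est_point_spread(two_effects):.1f}")

# Home win percentages from the wild card table.
for home, away, p in [("GB", "MIN", 0.682), ("WAS", "SEA", 0.482),
                      ("HOU", "CIN", 0.642), ("BAL", "IND", 0.841)]:
    print(f"{home} vs {away}: {est_point_spread(p):+.1f}")
```

Working on the logit scale is the design choice that matters here: two 60% effects combine to about 69%, not 120%, and the spread comes out to 6 points as the argument above requires.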

A simplified table of the wild card games, with percentages and estimated point spreads is:

Wild Card Playoff Round
Home Team   Visiting Team   Home Win Pct   Est. Point Spread
GB          MIN             0.682          5.6
WAS         SEA             0.482          -0.5
HOU         CIN             0.642          4.3
BAL         IND             0.841          12.3

How many successes is a touchdown worth?

We’ve spoken about the potential relationships between success rates, adjusted yards per attempt, and stats like DVOA here, but to make any progress, you need to consider possible relationships between successes and yards. Let me point out the lower bound of the relationship is known, as 3 consecutive successes must yield at least 10 yards, and 30 consecutive successes must end up scoring a touchdown. In this case, the relationship is 1 success is equal to or greater than 3 1/3 yards.

Thus, if the surplus value of a touchdown is 20 yards, that’s 6 successes. If a turnover is worth 45 yards, that’s about 13.5 successes.
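The arithmetic is simple enough to do in your head, but a short sketch makes the direction of the bound explicit. The constant and function names are hypothetical, introduced only for this example.

```python
# Lower bound from the text: 3 successes gain at least 10 yards,
# so one success is worth at least 10/3 yards.
MIN_YARDS_PER_SUCCESS = 10 / 3

def max_successes_for(yards):
    """Most successes a given yardage value can be worth.

    Because 10/3 is a floor on yards per success, dividing by it
    gives a ceiling on the equivalent number of successes.
    """
    return yards / MIN_YARDS_PER_SUCCESS

print(f"touchdown surplus (20 yards): {max_successes_for(20):.1f} successes")
print(f"turnover (45 yards): {max_successes_for(45):.1f} successes")
```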

A smarter way to get at the mean value of this kind of relationship, as opposed to a limiting value, would be to add up the yards of all successful plays in the NFL and divide by the number of those plays. For now, that’s something to be pursued later.
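As a rough sketch of what that calculation would look like, assume each play has already been tagged as a success or not. The play list below is invented purely to exercise the function; it is not NFL data, and the tagging rule is left to whatever success definition you prefer.

```python
def mean_yards_per_success(plays):
    """Average yards gained on plays flagged as successes.

    plays: iterable of (yards_gained, is_success) pairs.
    """
    successful = [yards for yards, is_success in plays if is_success]
    if not successful:
        raise ValueError("no successful plays in the sample")
    return sum(successful) / len(successful)

# Hypothetical sample, for illustration only.
toy_plays = [(5, True), (2, False), (12, True), (4, True), (0, False)]
print(f"{mean_yards_per_success(toy_plays):.1f} yards per success")
```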

Things that are easy to note: the teams with at least 9 wins are either guaranteed a playoff berth, or have, at worst, a 99% chance of making the playoffs. The teams with 8 wins have a very good chance of entering the playoffs. Those teams with 7 wins have at least a 50% chance of making the playoffs. Those with 6 wins have between a 5% and 30% chance of making the playoffs. Let’s say they are hoping to get in.

Data from week 12

2012_stats_week_12

Data from week 13

2012_stats_week_13

The methodology of these stats is discussed in previous posts in this series. If you’re wondering where I’m getting odds to go into the playoffs, see this post. If you’re wondering what chance your team has of winning in the playoffs, see this post on my logistic regression methods, based on studies of playoff games. How would your ranking in the playoffs affect your chances of getting into the Super Bowl? We studied that here.

I am not a proponent of the notion that regular season offensive stats are predictive in the post season. My studies suggest p on the order of 0.15 for offensive stats in the post season, so they aren’t predictive enough for my tastes (p <= 0.05). That hasn’t stopped Football Outsiders from pretending that their proprietary stats are predictive and calculating playoff odds with their tools.

Over some five years, the whole of the Matt Ryan – Mike Smith era, Atlanta has had a habit of outperforming its Pythagoreans:

Atlanta outperforming its Pythagoreans
Year             WL%   Pythag   Delta
2008             69    62       7
2009             56    56       0
2010             81    72       9
2011             63    59       4
2012 (to date)   90    71       19

 

But they’ve never outperformed their Pythagoreans as substantially as they have this year. It can’t be blamed on an early season New Orleans collapse, as their only loss was inflicted by New Orleans. New Orleans has only hindered this process. Is it turnovers that are causing all this? While the 2010 team had a +14 turnover ratio and the 2011 team had a +8 turnover ratio, the 2012 team has only a +5 turnover ratio at this point and the 2008 team had a -3 turnover ratio. No, it’s something else. For now, perhaps noting that this team tends to outperform its Pythagoreans is enough.
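For readers who haven’t met the stat, here is a minimal sketch of a Pythagorean expectation and of the Delta column in the table. The exponent 2.37 is an assumption on my part, a value commonly quoted for the NFL; the Pythag numbers on this blog come from an exponent fitted to each data set, so this function will not reproduce them exactly.

```python
def pythagorean_expectation(points_for, points_against, exponent=2.37):
    """Expected winning fraction from points scored and allowed.

    exponent=2.37 is an assumed, commonly quoted NFL value, not the
    fitted exponent used for the tables in these posts.
    """
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)

# (year, actual win pct, Pythagorean pct), copied from the table.
atlanta = [("2008", 69, 62), ("2009", 56, 56), ("2010", 81, 72),
           ("2011", 63, 59), ("2012", 90, 71)]

# Delta is just actual winning percentage minus Pythagorean percentage.
deltas = {year: actual - pythag for year, actual, pythag in atlanta}
for year, delta in deltas.items():
    print(f"{year}: {delta:+d}")
```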

Week 11 scoring stats:

Chicago’s biggest weakness was on display this Monday night, as Aldon Smith had a career day. Aaron Schatz (@FO_Schatz) has been sent digging into his archives for the biggest DVOA blowouts of all time. The 32-7 demolition of the Bears by the 49ers wasn’t the worst, but it clearly evoked the worst.

The game plan was heavy on traps and wham blocks, and would have warmed the hearts of anyone who ever played NFL Strategy against a blitz heavy opponent.

It does lead to the question of whether Chicago is in the same downward spiral they experienced last year. At this point, however, you would expect Jay Cutler to return and thus slow down the bleeding.

I believed, in the immediate aftermath of the 2011 season, that with Jason Peters at left tackle, the least of Philadelphia’s worries would have been the tackle position. Instead, he was injured in the offseason. In September, Philadelphia center Jason Kelce went down with a season ending injury. In the New Orleans game, Todd Herremans suffered a season ending injury, and going into the Dallas game, starting guard Danny Watkins had been out with a sprained ankle.

Losing Todd Herremans: deal breaker for the Eagles? (Image from Wikimedia).

So, in week 10, the Eagles had one healthy starting-caliber player and 4 backups playing on the offensive line. This loss of talent was profound, even in comparison with Dallas, which had 1 backup on the line, though Dallas RG Mackenzy Bernadeau has been pretty marginal as a starter. Simplified, losing tackles is much worse than losing a guard and a center. Result? A markedly ineffective Vick, a thoroughbred offense reduced to dog-sled pace.

No wonder announcers were hyping this as the “end of a season” for one of these teams. Most any cold-blooded announcer could have figured out what was about to happen. The only question was how best to pitch it so people would actually watch.

Atlanta: I’ve been comparing the 2012 Atlanta Falcons to the 1976 Oakland Raiders, to make the case that Atlanta has a chance. But the 1976 Raiders had made it to three previous Conference Championship games, while the Mike Smith squads have never gone that far. They lack the deep playoff experience of those 1970s Raiders squads.

The fact is, all scoring stats suggest Atlanta has benefited from plenty of luck. I think, because of a better Julio Jones, that this is a better Falcons team than the 2011 team, but the coaching changes in New Orleans markedly benefited this squad. Yes, Atlanta can be beaten.

Week 9 scoring stats:

Week 10 scoring stats:

If we use the median point spread as a measure of how good Atlanta is, and select the teams within 2 points of their value, you end up with a group that includes San Francisco, New England, Minnesota, and the New York Giants. That’s a talented group of teams, but perhaps not as terrifying as Green Bay, Houston, Denver, and Chicago. Pythagoreans point out three elite teams in Houston, Chicago, and San Francisco, while simple rankings prefer the quartet of Houston, Chicago, Denver and San Francisco.

At this point, perhaps the more appropriate past comparison for the Falcons would be the 1973 Oakland Raiders. Atlanta needs to make some noise in the playoffs first.

Should anyone be worried about the Giants’ midseason slide? No. They always do this. The question is, will they fully recover in time to make a playoff run? That’s not something that will be entirely answered until week 17.

Week 8, NFL scoring stats:

To explain the columns above, Median is a median point spread, and can be used to get a feel for how good a team is without overly weighting a blowout win or blowout loss. HS is Brian Burke’s Homemade Sagarin, as implemented in Maggie Xiong’s PDL::Stats. Pred is the predicted Pythagorean expectation. The exponent for this measure is fitted to the data set itself. SOS, SRS, and MOV are the simple ranking components, analyzed via this Perl implementation. MOV is margin of victory, or point spread divided by games played. SOS is strength of schedule. SRS is the simple ranking.
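To make the MOV, SOS, and SRS columns concrete, here is a small Python sketch of the usual simple ranking iteration, in which each team’s rating is its margin of victory plus the average rating of its opponents. It is a stand-in written for this explanation, not the Perl implementation mentioned above, and the three-game schedule is hypothetical.

```python
from collections import defaultdict

def simple_ranking(games, iterations=200):
    """Return (mov, sos, srs) dicts keyed by team.

    games: list of (team_a, points_a, team_b, points_b).
    """
    margins = defaultdict(list)
    opponents = defaultdict(list)
    for a, pts_a, b, pts_b in games:
        margins[a].append(pts_a - pts_b)
        margins[b].append(pts_b - pts_a)
        opponents[a].append(b)
        opponents[b].append(a)

    # MOV: average point margin per game played.
    mov = {t: sum(m) / len(m) for t, m in margins.items()}

    # Start from MOV, then repeatedly apply SRS = MOV + SOS.
    srs = dict(mov)
    for _ in range(iterations):
        sos = {t: sum(srs[o] for o in opponents[t]) / len(opponents[t])
               for t in mov}
        srs = {t: mov[t] + sos[t] for t in mov}
        # Re-center so the league average rating stays at zero.
        mean = sum(srs.values()) / len(srs)
        srs = {t: r - mean for t, r in srs.items()}

    # SOS is whatever part of the rating MOV does not explain.
    sos = {t: srs[t] - mov[t] for t in mov}
    return mov, sos, srs

# Hypothetical round robin: A beats B by 7, B beats C by 3, A beats C by 3.
toy_games = [("A", 27, "B", 20), ("B", 24, "C", 21), ("A", 20, "C", 17)]
mov, sos, srs = simple_ranking(toy_games)
for team in sorted(srs, key=srs.get, reverse=True):
    print(f"{team}: MOV {mov[team]:+.2f}  SOS {sos[team]:+.2f}  SRS {srs[team]:+.2f}")
```

In the toy schedule, the best team ends up with a negative SOS because it only played the two weaker teams, which is exactly the adjustment SRS is meant to make.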

NFC East teams with aspirations of playoff contention, Dallas and Philadelphia, appear to be having them derailed by a single common cause. Neither has an offensive line healthy enough or good enough to let their quarterbacks shine. Further, being 1.5 games behind their wild card competition in the NFC North leaves them with precious few chances. Perhaps they’ll improve, perhaps Philadelphia will find a miracle LT and Dallas will scrape together a center and a right guard, but don’t hold your breath waiting. Washington is dynamic, but needs a few pieces here and there. Early season defensive line injuries did that team little good.

For now, what interests me are things like: who in the AFC can compete with Houston? Baltimore looks as if it’s winning on tradition more than dominance. New England looks great, except when it doesn’t. Denver is a powerful work in progress. In the NFC, it’s entirely possible that the NFC North will field 3 playoff teams. Chicago, Green Bay and Minnesota look good. Atlanta continues to roll up wins, the last perhaps its most impressive so far. The Giants continue to show strength. The San Francisco 49ers may be the best team in football right now, leading in HS, #2 in Simple Ranking, and no more than 0.3% off the top of the Pythagoreans.

Totally off the subject: this is the political season, and some of my favorite bloggers are tweeting some politics these days. One of the most interesting of the lot is @skepticalsports. I don’t share the political sentiments of Benjamin Morris, but polite and political – which he manages to be – is a rare combination, and it actually takes some work to be offended by his tweets.

Week 7, NFL scoring stats:

To explain the columns above, Median is a median point spread, and can be used to get a feel for how good a team is without overly weighting a blowout win or blowout loss. HS is Brian Burke’s Homemade Sagarin, as implemented in Maggie Xiong’s PDL::Stats. Pred is the predicted Pythagorean expectation. The exponent for this measure is fitted to the data set itself. SOS, SRS, and MOV are the simple ranking components, analyzed via this Perl implementation. MOV is margin of victory, or point spread divided by games played. SOS is strength of schedule. SRS is the simple ranking.
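A quick illustration of why the median is less swayed by blowouts than the mean. The margins below are invented for the example and do not belong to any real team.

```python
from statistics import mean, median

# Hypothetical game-by-game point margins, including one 35-point blowout.
toy_margins = [3, 7, -4, 10, 35]

median_spread = median(toy_margins)
mean_spread = mean(toy_margins)

# The single blowout drags the mean upward but leaves the median alone.
print(f"median point spread: {median_spread}")
print(f"mean point spread:   {mean_spread}")
```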

One of the things dogging Atlanta sports talk radio is “just how good are the Atlanta Falcons?” Statistically, they’re in the top 5-10 but not the very top in the various scoring stats. A lot of their success is based on turnover differential, not a good predictor of success over the long term. The Houston Texans, by contrast, shook off their one game blues and are back in the top 2 or 3 once again.

Chicago for now is at the head of the scoring stats, and those of us familiar with Jay Cutler’s ability to have really bad games will be watching to see if he can keep it up. Coming hard are both the Giants and the Packers. Denver is the best looking of the 3-3 teams.

Week 6, NFL scoring stats:

Atlanta leads in no statistical category except their won-loss record.

To explain the columns above, Median is a median point spread, and can be used to get a feel for how good a team is without overly weighting a blowout win or blowout loss. HS is Brian Burke’s Homemade Sagarin, as implemented in Maggie Xiong’s PDL::Stats. Pred is the predicted Pythagorean expectation. The exponent for this measure is fitted to the data set itself. SOS, SRS, and MOV are the simple ranking components, analyzed via this Perl implementation. MOV is margin of victory, or point spread divided by games played. SOS is strength of schedule. SRS is the simple ranking.

Houston lost to Green Bay, and so for now, they’re no longer the statistical darling of the NFL. Chicago is now the top dog. The Bears are a team that can play great offensive games or horrible ones, and it’s anyone’s guess how long their offensive explosion will last. Minnesota appears to be competitive, and Green Bay and Detroit are coming out of their funks, so I expect a tough divisional battle.

That said, the surprise of the NFC is the tough division race in the NFC West. 3 of the 4 teams have real chances this year, and maybe even Saint Louis will be in the race in a year or two. The conference overall seems to be improved, with tough defenses becoming the norm this season.

The New York Giants won a game that impressed the critics, and if both Dallas and Philadelphia remain snakebitten teams that shoot themselves in their own feet, that could manifest in a great set of statistics over the year. More likely though, the Giants will play everyone tough, perhaps even play a great 8 game stretch, and then have 2-3 mystifying losses to teams they are better than. The lack of a running game makes it hard for the Giants to close out games. What they do will be on the backs of a calm, collected QB, their pass rush, and large, gifted wide receivers.
