February | 2012 | Code and Football

February 2012

Monthly Archive

February 28, 2012

Pre Draft Post Combine thoughts – taking the measure of fan desires

Posted by foodnearsnellville under Cleveland Browns, Dallas Cowboys, Draft, Miami Dolphins, Saint Louis Rams, Speculation, Washington Redskins | Tags: David DeCastro, Robert Griffin III |

About as soon as Dallas lost their last game of the season, a veritable consensus formed in fan circles about what Dallas should do: almost any fan mocker worth his salt had Dallas picking up Carl Nicks in free agency and drafting David DeCastro with the 14th pick. That this quinella might be hard to pull off didn’t faze the crowd, and arguing with any of these guys was an amazing waste of time. I felt as if I was looking at the daily barrage of “Patrick Peterson falls to the Boys” exuberance all over again.

As @FO_MTanier has noted on Twitter, this interest in DeCastro spills over into the media as well.

It’s taken perhaps a month, but the fans who claim “insider” connections, and who are generally respected for perhaps actually having them, are now saying that the Boys are looking more for a center in free agency and will let the guards they have develop. Costa is regarded as the weak link, not the collection of talent at guard. Further, Stephen Jones has said that the defense needs work, and media/fan draft interest is beginning to shift to others, people such as Melvin Ingram, Luke Kuechly and Dontari Poe.

Is this man a future Redskin? Image from Ceasarscott of Wikimedia.

The recent combine workout of Robert Griffin III, including a 4.38 40 time, has changed the status of RG3 in the eyes of Redskins fans to something approaching blowtorch heat. It hasn’t been mellowed by Saint Louis openly shopping the second pick. The first pick is pretty much assumed to be Andrew Luck, but a fella with a cannon arm, clear intelligence and Vick-like speed leads people more and more to think that Mr. Griffin will be, at worst, a poor man’s Vick. And if he’s more judicious with his throws than Michael, and learns the game more intimately, well then all the better. There is now the smell of potential top 10 QB around RG3.

Four teams are thought to be interested in Robert Griffin: Cleveland, Washington, Miami, and Seattle. How much of that is real and how much is assumed, I can’t tell presently. Talk radio has Cleveland in the driver’s seat for a trade, as it has two first round picks, the #4 and the #22. The Skins, by contrast, have only the #6 pick.

A #2 pick is worth 2600 points on the JJ (Jimmy Johnson) chart. Cleveland’s #4 is worth 1800, and their #22 pick is worth 780, for a total of 2580, about even in value. That said, Peter King is making comparisons to the trade for Ryan Leaf, which netted 2 firsts, a second, and Eric Metcalf. Already, in Redskins fan circles, people are saying they wouldn’t pay 3 #1s for RG3, but if the Leaf trade sets the benchmark, I’d suggest that the equivalent of three #1 choices is the going rate for a potential top 10 QB.

The Redskins’ first round choice is worth 1600 points. How do they make up the 1000 point difference without giving up at least another #1? Beyond that, what sweetener could they give that would make their trade better than Cleveland’s two #1s?
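The chart arithmetic above can be checked with a few lines of Python. This is only a sketch: the point values are the four quoted in this post, and the helper names (`offer_value`, `shortfall`) are made up for illustration.

```python
# Chart values quoted in the post (pick number -> points).
CHART = {2: 2600, 4: 1800, 6: 1600, 22: 780}

def offer_value(picks):
    """Total chart value of a list of pick numbers."""
    return sum(CHART[p] for p in picks)

def shortfall(target_pick, picks):
    """Points still owed after offering `picks` for `target_pick`."""
    return CHART[target_pick] - offer_value(picks)

cleveland_gap = shortfall(2, [4, 22])   # 2600 - (1800 + 780)
washington_gap = shortfall(2, [6])      # 2600 - 1600
print(cleveland_gap, washington_gap)    # 20 1000
```

Cleveland’s two firsts come within 20 points of the #2 pick, while Washington’s single first leaves a 1000 point hole.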

Useful links:

NFP’s take on Dallas team needs.

NFP’s take on Redskin team needs.

NFP’s take on Eagles team needs.

NFP’s take on Giants’ team needs.

February 9, 2012

The C&F NFL playoff model: what it says about previous playoff experience

Posted by foodnearsnellville under Data, Football, Modeling, Statistics | Tags: NFL playoffs, playoff prediction |

In April of 2011 I published a playoff model, one that described the odds of winning in terms of home field advantage, strength of schedule, and previous playoff experience. In the work I did then, I fixed the span of previous playoff experience at 2 years. I was still working with (and changing) my experimental design at the time: the “y” variable was initially playoff winning percentage, which turned out to be a relatively insensitive parameter. Once I had a result with two independent variables, I called it a day and published.

Once the 2011 season rolled into the playoffs, while thinking about the upcoming game between Atlanta and New York, I realized I had never tested the span of time over which playoff experience mattered. I then proposed that New York could be considered to have playoff experience, since it had played in 2007, and if so, there would be marked changes to the odds associated with the New York Giants. This was a reasonable proposition at the time, because no testing had been done on my end to prove or disprove the idea.

Using this notion, the formula we published then racked up a 9-2 record for predicting games, or more cautiously, 7-2-2. The results obtained for the San Francisco-New Orleans game (50-50 odds) and the NYG-Green Bay game (results yielding possible wins for both teams at the same time) really didn’t lend confidence to betting on either side of those two games.

Once the playoffs were over, I uploaded the new 2011 playoff games and did logistic regressions of these data. I amended the program I used for my analysis to allow for playoff experience to be judged over 1, 2, 3, or 4 year intervals. I also allowed the program to vary the range of years to be fitted. Please note that only a small number of playoff games are played in any particular year (11), and I’ve seen sources that claim you can really only resolve one parameter per 50 data points. If we cut the data set too short, we’re playing with fire in terms of resolving our data. But to explain the experimental protocol, I ran the data from 2001 to 2009, 2001 to 2010, and 2001 to 2011 through fits where playoff experience was judged over a 1, a 2, a 3, and a 4 year span. The results are given below, in a table.

Table Explanation:

These are data derived from logistic regression fits, using Maggie Xiong’s PDL::Stats, to NFL playoff data. The data were taken from NFL.com. Start year is the first year of playoff data, end year is the last year considered. HFA is the magnitude of the home field advantage. SOS is the strength of schedule metric, as derived from the simple ranking system algorithm. Playoff experience was determined by examining the data and seeing if the team played a playoff game within “playoff span” years of the year in question. Assignment was either a 1 or 0 value, depending on whether the answer was true or false. D_m/(n-p) is the model deviance divided by the number of degrees of freedom of the data set. As explained here, this parameter should tend to the value of 1. The p values of the parameters above measure the statistical significance of the various fit values. It is better when p is small, and p < 0.05 is desired. P values greater than 0.05 are highlighted in blue.
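To make the playoff experience flag concrete, here is a minimal Python sketch. It is not the Perl code used for the fits. The function name, the data layout, and the reading of “within playoff span years” as “any of the previous `span` seasons” are all assumptions, and “Team A” is a made-up team.

```python
def playoff_experience(team, year, span, appearances):
    """Return 1 if `team` played a playoff game in any of the
    `span` seasons before `year`, else 0 (assumed definition)."""
    seasons = appearances.get(team, set())
    return int(any((year - k) in seasons for k in range(1, span + 1)))

# Hypothetical data: "Team A" last reached the playoffs in the 2007 season.
appearances = {"Team A": {2005, 2007}}

flags = [playoff_experience("Team A", 2011, span, appearances)
         for span in (1, 2, 3, 4)]
print(flags)  # [0, 0, 0, 1]
```

Under this reading, a team whose last playoff game was four seasons back counts as experienced only when the span is widened to 4 years.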

Note that the best fits are found when the playoff experience span is the smallest. There, the confidence limits on the playoff parameter are the smallest, the model deviance is the smallest, and the confidence limit of the model deviance is the smallest. The best models result from the narrowest possible definition of “playoff experience”, and this result is consistent across the three yearly spans we tested.

So where does this place the idea that the New York Giants were a playoff experienced team in 2011? It places it in the land of the educated guess, the gut call, a notion coming from the same portion of the brain that drew a snake swallowing its own tail in the dreams of August Kekule. Sometimes intuition counts. But in the land of curve fitting, you have to publish your best model, not the one you happen to like for the sake of liking it. The best model I have to date would be the one for the 2001 to 2011 data set, with a playoff experience band defined in terms of a single year. It yields the following logistic formula:

logit P = 0.668 + 0.348*(delta SOS) + 0.434*(delta Playoff Experience)

Compared to the previous formula, the probability resulting from a one unit difference in SOS now becomes 0.58 instead of 0.57 (see the Wolfram Alpha article for an easy way to transform logits into probabilities), but the value of playoff experience now becomes 0.606, instead of 0.68.
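For readers who would rather not go through Wolfram Alpha, the logit to probability transformation is a one-liner. The sketch below uses the coefficients from the formula above. The function names are mine, and treating the constant 0.668 as the home field term with each “delta” meaning home team minus visitor is an assumption on my part.

```python
import math

def logistic(x):
    """Convert a logit (log odds) into a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Coefficients from the 2001-2011, one year span fit.
HFA, SOS_COEF, EXP_COEF = 0.668, 0.348, 0.434

def win_probability(delta_sos, delta_exp):
    """Win probability for the home team (assumed sign convention)."""
    return logistic(HFA + SOS_COEF * delta_sos + EXP_COEF * delta_exp)

print(round(logistic(SOS_COEF), 3))  # 0.586, one unit of SOS alone
print(round(logistic(EXP_COEF), 3))  # 0.607, playoff experience alone
```

The two single-term values agree with the 0.58 and 0.606 quoted above to within rounding.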

If there were one area I’d like to work on with regard to this formula, it would be to find a way to calculate the (dis)advantage of having a true rookie quarterback. I suspect this kind of analysis could be best done with counting. I don’t think a curve fit is necessary in this instance. I suspect a rookie quarterback adjustment would have allowed this formula to more accurately determine the potential winner in the Houston Texans – Cincinnati Bengals game. After all, 10-1 is better than 9-2.

February 6, 2012

NFL Playoffs: How home field advantage affects playoff odds.

Posted by foodnearsnellville under Data, Football, Statistics | Tags: home field advantage, NFL playoffs, odds |

The playoffs are a funny bit of business, where people tend to assume the #1 seed has a really good chance of making it to the Super Bowl. That is, unfortunately, not even close to the truth. If you ignore home field advantage, every game is a coin flip. The #1 and #2 seeds have to win three games to win the Super Bowl, which is 1 chance in 8 (0.125), whereas seeds 3-6 have to win four, a 1 in 16 chance (0.0625). But since there is a home field advantage in the playoffs (at least until you reach the Super Bowl), the actual odds from Seeds 1 to 6 vary quite dramatically.

For now, we’re going to assume a home field advantage of 0.60. From 2001 to 2010, 100 non-Super Bowl playoff games were played, and the home team won 60 of them. This year, the home team won every time, unless the visitor was named the New York Giants, leading to a record of 8-2. So, I guess, the running total now, from 2001 to 2011, has to be 68/110, or 61.8% or so.

That said, I’m still going to use 60% in my calculations below.

For the sake of making it easier to turn any calculations into code, we’ll assign the home field advantage to the variable U (for “upper”), and to 1 – U, we will assign the variable L (for “lower”). Given these assignments, we now have:

Temporary variables:

LL = L*L
T₂₃ = U*L + L*U
T₄₅ = LL*U + (1. – LL)*L

Calculations of playoff odds

Seed 1 = U*U*0.50
Seed 2 = U*T₂₃*0.50
Seed 3 = U*L*T₂₃*0.50
Seed 4 = U*L*T₄₅*0.50
Seed 5 = L*L*T₄₅*0.50
Seed 6 = L*L*L*0.50

T₂₃ is necessary to calculate the second game of Seed 2 or the third game of Seed 3. In this game, these two teams could face Seed 1, Seed 4, Seed 5, or Seed 6. Critically, they will either face Seed 1, for which they would be the visiting team, or one of the others, for which they would be the home team. The odds therefore become (odds of Seed 1 winning)(visitor’s odds) + (1 – odds of Seed 1 winning)(home team odds).

T₄₅ is necessary to calculate the third game of Seed 4 or 5. In this game, these two teams could face Seed 1, Seed 2, Seed 3, or Seed 6. As Seed 6 is the only team for which Seeds 4 and 5 would be the home team, it is easiest to calculate the odds of Seed 6 making it to the third game, and then subtract those odds for the probability of playing as the visitors. Since the odds of Seed 6 arriving at game 3 are L*L, you end up with the formula given above.

Choosing a value of 0.60 for the home field advantage, we end up with:

Seed 1 : 0.18
Seed 2 : 0.144
Seed 3 : 0.0576
Seed 4 : 0.05184
Seed 5 : 0.03456
Seed 6 : 0.032

The range, from 18% to about 3%, is considerably more broad than the naive 1/8 to 1/16 values. Home field has a marked effect on the ability of teams to reach and win the Super Bowl. But the sheer number of teams involved, 12, and the arrangement of the playoffs, means that a #1 seed has, with a HFA of 60%, about a 36% chance of making it to the Bowl, and an 18% chance of winning.
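As a quick check on the numbers, the formulas translate almost line for line into Python. This is a sketch of my own, separate from the linked version. The function name `seed_odds` is an assumption, and the Super Bowl is treated as a 50-50 game, as in the formulas.

```python
def seed_odds(u):
    """Super Bowl win odds for Seeds 1-6, given the home team
    win probability u (the post's U); l is the post's L."""
    l = 1.0 - u
    ll = l * l                       # odds Seed 6 reaches game 3
    t23 = u * l + l * u              # title game odds, Seeds 2 and 3
    t45 = ll * u + (1.0 - ll) * l    # title game odds, Seeds 4 and 5
    return {
        1: u * u * 0.50,
        2: u * t23 * 0.50,
        3: u * l * t23 * 0.50,
        4: u * l * t45 * 0.50,
        5: l * l * t45 * 0.50,
        6: l * l * l * 0.50,
    }

odds = seed_odds(0.60)
for seed, p in odds.items():
    print(f"Seed {seed}: {p:.5f}")
```

With u = 0.60 the six values add up to 0.5, which is what one conference’s share of the Super Bowl should be. Setting u = 0.50 recovers the naive 1/8 and 1/16 odds.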

Note: this link has a coded version of the calculations above.

February 4, 2012

Playoffs aren’t the regular season

Posted by foodnearsnellville under Data, Football, New England Patriots, New York Giants, Statistics | Tags: NFL playoffs, strength of schedule |

When you try to think of the NFL playoffs as simply an extension of the regular season, you screw up. Advantages that reliably yield wins under regular season conditions – think of the dominance of the San Francisco 49ers defense, at times, in the NFC Championship game two weeks ago – aren’t consistent enough in the post season. A lot of games are decided by, well, small effects, perhaps intangibles, at this time of year.

Part of the reason is that the gap in the classical offensive and defensive metrics is much narrower in the post season; you’re looking at such small differences in net offensive potential that other elements come into play. The other component, as far as I can tell, is that traditional analysts, focused on the analysis of the regular season, are loath to abandon tools that worked so well on the 16 regular season games. If it’s 66-75% accurate during the regular season, isn’t that enough in the post season?

In my opinion, the answer is no. Regular tools fail because the playoff system has already selected for teams that are good at scoring and preventing scoring. Those teams are, to a first approximation, already well matched. You can’t use regular season tools reliably. You have to analyze for playoff specific causes of wins and losses.

This is the only reason I can come up with for the recent analyses of the strength of schedule metric. Analysts have noted (see here and here) that it is negatively correlated with winning. This year has particularly potent effects, using Football Outsiders’ definition of the SOS metric. Jim Glass, in the FO article, nails the effect when he states:

The fact that stronger teams play easier schedules and weaker teams play tougher ones results trivially from the fact that teams cannot play themselves. As teams cannot play themselves, in lieu of doing so the strongest teams must play the weaker and the weakest the stronger.
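The quoted argument is easy to see in a toy league. The sketch below assumes a pure round robin (which the NFL schedule is not) and uses made-up team ratings. A team’s strength of schedule is simply the average rating of everyone but itself, so it falls as the team’s own rating rises.

```python
def round_robin_sos(ratings):
    """Strength of schedule per team: mean rating of all other teams."""
    total = sum(ratings.values())
    n = len(ratings)
    return {team: (total - r) / (n - 1) for team, r in ratings.items()}

# Hypothetical ratings, strongest to weakest.
ratings = {"A": 6.0, "B": 2.0, "C": -2.0, "D": -6.0}
sos = round_robin_sos(ratings)
for team in ratings:
    print(team, ratings[team], round(sos[team], 3))
```

The strongest team ends up with the easiest schedule and the weakest team with the hardest, purely as an artifact of not playing oneself.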

This, of course, raises the question that my playoff results pose: if strength of schedule correlates with losing, then why do playoff teams with advantages in the strength of schedule metric win? In my measurements, the statistical confidence in this effect is higher than the confidence in playoff experience. Given the right experimental design, this is pretty much a given.

Back in the early 1990s, I used to call this the “NFC East effect” and it seemed as obvious to me as the nose on my face. The NFC East was the toughest division in football. Whatever team won the NFC East was bound to win the Super Bowl because they had faced such incredibly hard competition, that anyone else was a patsy by comparison (with the possible exception of the San Francisco 49ers). And whether any division could again gain such dominance, I don’t know. The salary cap has made it hard to hold such powerful teams together.

I’m posting now because the 2007 (and now 2011) New York Giants are a poster child for this phenomenon. My formula gave the New York Giants a 61% chance of winning the 2007 Super Bowl. It favors the Giants in this Super Bowl as well, at 66%. By traditional metrics, the 2011 Giants shouldn’t have survived so much as their first playoff game. They managed, this year, to win three. The largest measurable advantage they had in this year’s playoffs is their exceptional strength of schedule.

So, win or lose, the question is still out there. If regular season stats are so important, why are the Giants winning? And if you’re using a “regular season” model to predict playoffs, perhaps you need to step back and start analyzing the playoffs on their own, without preconception.
