logistic regression | Code and Football

December 5, 2017

Statistically, does a veteran QB matter in the playoffs?

Posted by foodnearsnellville under Code, Data, Football, Modeling, Statistics | Tags: logistic regression, NFL, NFL playoffs, playoff formulas |

This question came up when I was looking up the most recent playoff appearance for each of seven probable NFC playoff teams. Both New Orleans and Philadelphia last played in the playoffs four years ago, in 2013. And then the thought came up in my head, “But Drew Brees is a veteran QB.” That seems intuitive. But coming up with a workable definition of a veteran QB, and then testing it with a logistic regression: there is the rub.

There are any number of QBs a fan can point to and see that the QB mattered. Roger Staubach seemed a veteran in this context back in the 1970s, Joe Montana in the 1980s, Ben Roethlisberger in the 21st century, Eli Manning in 2011, and Aaron Rodgers last year. But questions abound. If a veteran QB is an independent variable whose presence or absence changes the odds of winning a playoff game, what tools do we use to define such a person? What tools would we use to eliminate entanglement, in this case between the team’s overall offensive strength and the QB himself?

The difference between a good metric and a bad metric can be seen when looking at the effect of the running game on winning. The correlation between rushing yards per carry and winning is pretty small. The correlation between run success rate and winning is larger. In short, being able to reliably make it on 3rd and 1 contributes more to success than running 5 yards a carry as opposed to 4.
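To make the two rushing metrics concrete, here is a minimal Python sketch. The carries are made up for illustration, and the success thresholds (40% of the yards to go on 1st down, 60% on 2nd, all of them on 3rd or 4th) are an assumption borrowed from a commonly used definition of success rate, not something defined in this post.

```python
# Hypothetical carries: (down, yards_to_go, yards_gained). Made-up numbers.
carries = [
    (1, 10, 4),
    (2, 6, 2),
    (3, 1, 1),
    (1, 10, 25),
    (2, 8, 1),
    (3, 2, 0),
]

# Assumed thresholds: percent of the yards to go that a run must gain
# to count as a success, keyed by down.
NEEDED_PERCENT = {1: 40, 2: 60, 3: 100, 4: 100}


def yards_per_carry(plays):
    """Average yards gained per rushing attempt."""
    return sum(gained for _, _, gained in plays) / len(plays)


def run_success_rate(plays):
    """Fraction of carries that met the assumed threshold for their down."""
    successes = sum(
        1
        for down, to_go, gained in plays
        if gained * 100 >= NEEDED_PERCENT[down] * to_go
    )
    return successes / len(plays)


print("YPC:", yards_per_carry(carries))
print("Success rate:", run_success_rate(carries))
```

In this toy sample a single 25 yard run drags the yards per carry up, while the success rate still records that half the carries, including a failed 3rd and 2, did not do their job.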

At this point I’m just discussing the idea. With a definition in mind, we can run logistic regression tests with a single independent variable. Then, with a big enough data set (15 years of playoff data should be enough), we can start testing logistic models with three independent variables (QB + SOS + PPX).
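As a rough sketch of what a one-variable test could look like, the Python below fits a logistic model with a single 0/1 predictor by Newton-Raphson. The data are invented purely for illustration (20 games with a "veteran QB" flag set, 20 without), and the function name `fit_logistic_1d` is hypothetical. This is not the PDL::Stats routine used elsewhere on this blog.

```python
import math


def fit_logistic_1d(xs, ys, iterations=25):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0          # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0  # entries of the 2x2 information matrix
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Made-up sample: 1 = team had a "veteran QB" under some chosen definition.
has_veteran_qb = [1] * 20 + [0] * 20
# Made-up outcomes: 12 wins in 20 with the flag set, 9 wins in 20 without.
won = [1] * 12 + [0] * 8 + [1] * 9 + [0] * 11

b0, b1 = fit_logistic_1d(has_veteran_qb, won)
print("intercept (log odds without the flag):", b0)
print("slope (change in log odds with the flag):", b1)
print("odds ratio:", math.exp(b1))
```

With a single 0/1 predictor the fitted slope is just the difference in log odds between the two groups, which makes the result easy to check by hand. A three-variable model needs a general solver, which is where a statistics package earns its keep.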

May 1, 2011

Welcome Statheads; draft and playoff wins considered.

Posted by foodnearsnellville under Atlanta Falcons, Blogging, Data, Football, Statistics | Tags: draft, logistic regression, logits, NFL, playoff home field advantage, playoff wins, previous playoff experience, strength of schedule |

I’ll note that the Sports Reference blog Statheads now has this blog on their sidebar, something I’m quite grateful for. Not that you don’t have to fight for readership as an amateur in football blogging, because you do. But to be spoken of in the same breath as sites like Football Outsiders or Advanced Football Stats is, well, heady stuff. So to Neil Paine, thank you.

To readers who are largely team fans (e.g. Dallas, Atlanta, Green Bay, Pittsburgh, Chicago): you’ll get the best bang for your buck by looking at the tag cloud on the right of the blog and clicking the tags that interest you. If you’re one of the football coaches who have drifted here because of Coach Hoover’s recommendations, I’d suggest your best use of this site would be to follow the tags “46 defense” and “defensive fronts”. For those for whom algebra isn’t an issue, just read the general flow of this blog. I’m going to try to keep the look visual and fold most of the numerical results behind “more” tags. The guy who wants to see Rob Ryan or Dom Capers get after the quarterback doesn’t need screen shots of program output, and the guy who does can click on the “more”.

The big splash of the draft, from my point of view, was the trade from 27th to 6th by the Falcons. They netted Julio Jones with the trade, giving up two firsts, one second, and two fourth-round choices in the process. This is because the Falcons felt they weren’t explosive enough, and so had to improve on an offense ranked in the top ten in the league. That their defense was mediocre, and that the salient feature of the last Super Bowl was a meeting of the #1 and #2 ranked defenses, seemed to be bypassed in the quest for game-changing explosiveness. I suspect trying to become a 21st century Air Coryell certainly has fan and box office appeal, but is it wise over the longer term? I’m reminded of the nursery rhyme:

For want of a nail the shoe was lost.
For want of a shoe the horse was lost.
For want of a horse the rider was lost.
For want of a rider the battle was lost.
For want of a battle the kingdom was lost.
And all for the want of a horseshoe nail.

I’m not convinced there was that much more air-based explosiveness to get out of the Falcons offense. Perhaps there was more to get on the ground, where some rest for Michael Turner might save him from regression to the mean.

JMO, but the focus of the Falcons is classic Parcells style ball control, where yards per carry are far less important than time of possession (this, incidentally, is why Curtis Martin will always be underrated by YPC-heads: Parcells just never cared about his ball carriers’ YPC). In such an offense, the most important component is first downs, not pretty stats. Given how light the Falcons defensive line tends to be, keeping them off the field as much as possible has to be a serious design consideration for the whole team.


April 25, 2011

Predicting playoff wins in the NFL – what factors are important?

Posted by foodnearsnellville under Code, Data, Statistics | Tags: Benjamin Morris, logistic regression, logits, PDL, PDL::Stats, Perl, playoff wins, statistical factors |

I was, to some extent, inspired by the article by Benjamin Morris on his blog Skeptical Sports, where he suggests that to win playoff games in the NBA, three factors are most important: winning percentage, previous playoff experience, and pace – a measure of possessions. Pace translated into the NFL would be a measure that would count elements such as turnovers and punts. In the NBA, a number of elements such as rebounds + turnovers + steals would factor in.

I’ve recently captured a set of NFL playoff data from 2001 to 2010, which I analyzed by converting those games into a number. If the home team won, the game was assigned a 1. If the visiting team won, the game was assigned a 0. Because of the way the data were organized, the winner of the Super Bowl was always treated as the home team.
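The encoding described above can be sketched in a few lines of Python. The function name, the dictionary keys, and the scores are hypothetical and only illustrate the convention. They are not the actual data set or the Perl code used for the analysis.

```python
def encode_game(home_score, visitor_score, is_super_bowl=False):
    """Return 1 if the team treated as 'home' won, else 0."""
    if is_super_bowl:
        # Convention from the text: the Super Bowl winner is always
        # treated as the home team, so the game is always coded 1.
        return 1
    return 1 if home_score > visitor_score else 0


# Hypothetical scores, for illustration only.
games = [
    {"home": 27, "visitor": 20, "super_bowl": False},
    {"home": 10, "visitor": 17, "super_bowl": False},
    {"home": 31, "visitor": 25, "super_bowl": True},
]

outcomes = [
    encode_game(g["home"], g["visitor"], g["super_bowl"]) for g in games
]
print(outcomes)
```

One side effect worth keeping in mind: since every Super Bowl is coded 1 under this convention, those games can only push a fitted home-field term upward.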

I tested a variety of pairs of regular season statistical elements to see which ones correlated best with playoff winning percentage. The test of significance was a logistic regression (see also here), as implemented in the Perl module PDL::Stats.

Two factors emerge rapidly from this kind of analysis. The first is that playoff experience is important. By this we mean that a team has played any kind of playoff game in the previous two seasons. Playoff wins were not significant in my testing, by the way, only the experience of actually being in the playoffs. The second significant parameter was the SRS variable strength of schedule. Differences in SRS were not significant in my testing, but differences in SOS were. Playing tougher competition evidently increases the odds of winning playoff games.
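For readers who want the model spelled out, one way to write a two-factor logistic model of this kind is shown below. The notation is an assumption made for illustration: each factor is taken as a home-minus-visitor difference, PPX stands for a 0/1 previous playoff experience flag, and the coefficients are left symbolic because no fitted values are quoted here.

```latex
\[
\Pr(\text{home win}) = \frac{1}{1 + e^{-z}},
\qquad
z = \beta_0 + \beta_1\,\Delta\mathrm{PPX} + \beta_2\,\Delta\mathrm{SOS}
\]
```

In this form, $\beta_0$ is the log odds of a home win when the two teams are even on both factors, and each of $\beta_1$ and $\beta_2$ is the change in log odds per unit difference in its factor.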


April 13, 2011

Was sabermetrics, now analytics.

Posted by foodnearsnellville under Blogging, Football, Statistics | Tags: analytics, Benjamin Morris, Brian Burke, Homemade Sagarin, logistic regression, PDL, PDL::Stats, Perl, sabermetrics |

We’ll start on a small, pretty blog called “Sabermetrics Research” and this article, which encapsulates nicely what’s happening. Back when sabermetrics was a “gosh, wow!” phenomenon and mostly the kind of thing that drove aficionados to their campus computing facility, the word “sabermetrics” was okay. Now that this kind of analysis is going in-house (a group of speakers, including Mark Cuban, are quoted here as saying that perhaps two-thirds of all basketball teams now have a team of analysts), it’s being called “analytics”. QM types, and even the older analysts, need a more dignified word to describe what they do.

The tools are different. The phrase “logistic regression” is all over the place (such as here and here). I’ve been trying to rebuild a toolset quickly. I can code stuff in from “Numerical Recipes” as needed, and if I need a heavyweight algorithm, I recall that NL2SOL (John Dennis was a Rice prof, I’ve met him) is available as part of the R language. Hrm. Evidently, NL2SOL is also available here. PDL, as a place to start, has been fantastic. It has hooks to tons of things, as well as its own built-ins.

Logistic regression isn’t a part of PDL, but it is a part of PDL::Stats, a freely available add-on package, available through CPAN. So once I’ve gnawed on the techniques enough, I’d like to see if Benjamin Morris’s result carries over to football. He combined winning percentage and average point spread (which, omg, is now called MOV, for margin of victory) and showed that in basketball the combination is a better predictor of winning than either one alone.

I suspect, given that Brian Burke would as soon do a logistic regression as tie his shoes, that it’s been done.

To show what PDL::Stats can do, I’ve implemented Brian Burke’s “Homemade Sagarin” rankings into a bit of code I published previously. The result? This simple technique had Green Bay ranked #1 at the end of the 2010 season.

There are some issues with this technique. I’ll be talking about that in another article.
