In April of 2011 I published a playoff model that described the odds of winning in terms of home field advantage, strength of schedule, and previous playoff experience. In that work I fixed the span of previous playoff experience at 2 years. I was still changing my experimental design as I went (the “y” variable was initially playoff winning percentage, which turned out to be a relatively insensitive parameter), so once I had a result with two independent variables, I called it a day and published.
Once the 2011 season rolled into the playoffs, while thinking about the upcoming game between Atlanta and New York, I realized I had never tested the span of time over which playoff experience mattered. I then proposed that New York could be considered to have playoff experience, since it had played in 2007, and if so, there would be marked changes to the odds associated with the New York Giants. This was a reasonable proposition at the time, because no testing had been done on my end to prove or disprove the idea.
Using this notion, the formula we published then racked up a 9-2 record for predicting games. More cautiously, the record was 7-2-2, because the results obtained for the San Francisco-New Orleans game (50-50 odds) and the NYG-Green Bay game (results yielding possible wins for both teams at the same time) really didn’t lend confidence to betting on either side of those two games.
Once the playoffs were over, I uploaded the new 2011 playoff games and did logistic regressions of these data. I amended the program I used for my analysis to allow playoff experience to be judged over 1, 2, 3, or 4 year intervals. I also allowed the program to vary the range of years to be fitted. Please note that only a very small number of playoff games are played in any particular year (11), and I’ve seen sources that claim you can really only resolve one parameter per 50 data points. If we cut the data set too short, we’re playing with fire in terms of resolving our parameters. To explain the experimental protocol: I ran the data from 2001 to 2009, 2001 to 2010, and 2001 to 2011 through fits where playoff experience was judged over a 1, a 2, a 3, and a 4 year span. The results are given below, in a table.
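The protocol above amounts to a small grid of fits, and the arithmetic behind the “one parameter per 50 data points” worry is easy to check. What follows is only a sketch of that bookkeeping in Python, not the program I actually used; the names (`games_in_range`, `fit_grid`, and the constants) are made up for illustration.

```python
from itertools import product

GAMES_PER_SEASON = 11       # playoff games per year, as stated above
POINTS_PER_PARAMETER = 50   # the rule of thumb quoted above

START_YEAR = 2001
END_YEARS = (2009, 2010, 2011)
SPANS = (1, 2, 3, 4)        # playoff experience spans, in years


def games_in_range(start_year, end_year):
    """Number of playoff games in an inclusive range of seasons."""
    return (end_year - start_year + 1) * GAMES_PER_SEASON


# Every (end year, playoff span) combination that gets its own fit.
fit_grid = list(product(END_YEARS, SPANS))

for end_year in END_YEARS:
    n = games_in_range(START_YEAR, end_year)
    # Games in the data set, and games divided by the rule-of-thumb 50.
    print(START_YEAR, end_year, n, round(n / POINTS_PER_PARAMETER, 2))
```

By this count there are twelve fits in all, and the three data sets hold 99, 110, and 121 games, so the rule of thumb supports roughly two parameters. That is why shortening the data set any further is risky.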
These are data derived from logistic regression fits, using Maggie Xiong’s PDL::Stats, to NFL playoff data. The data were taken from NFL.com. Start year is the first year of playoff data, end year is the last year considered. HFA is the magnitude of the home field advantage. SOS is the strength of schedule metric, as derived from the simple ranking system algorithm. Playoff experience was determined by examining the data and seeing if the team played a playoff game within “playoff span” years of the year in question. Assignment was either a 1 or 0 value, depending on whether the answer was true or false. Dm/(n-p) is the model deviance divided by the number of degrees of freedom of the data set. As explained here, this parameter should tend to the value of 1. The p values listed with the parameters above measure the statistical significance of the various fit values. It is better when p is small, and the desired value is p < 0.05. P values greater than 0.05 are highlighted in blue.
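The playoff experience assignment described above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the table. It assumes that “within playoff span years” means an appearance in one of the `span` seasons immediately before the season in question, and the function name and sample data are hypothetical.

```python
def has_playoff_experience(playoff_years, season, span):
    """Return 1 if the team played a playoff game in any of the `span`
    seasons immediately before `season`, otherwise 0.

    Assumption: the season in question does not count toward its own
    experience; only earlier seasons do.
    """
    earliest = season - span
    return int(any(earliest <= year < season for year in playoff_years))


# Hypothetical team whose most recent playoff appearance was in 2007.
appearances = [2005, 2007]

for span in (1, 2, 3, 4):
    print(span, has_playoff_experience(appearances, 2011, span))
```

Under this assumption, a team whose last appearance was in 2007 only counts as playoff experienced in 2011 when the span is stretched to 4 years.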
Note that the best fits are found when the playoff experience span is the smallest. In those fits the confidence limits on the playoff parameter are the smallest, the model deviance is the smallest, and the confidence limit of the model deviance is the smallest. The best models result from the narrowest possible definition of “playoff experience”, and this result is consistent across the three ranges of years we tested.
So where does this place the idea that the New York Giants were a playoff experienced team in 2011? It places it in the land of the educated guess, the gut call, a notion coming from the same portion of the brain that drew a snake swallowing its own tail in the dreams of August Kekule. Sometimes intuition counts. But in the land of curve fitting, you have to publish your best model, not the one you happen to like for the sake of liking it. The best model I have to date would be the one for the 2001 to 2011 data set, with a playoff experience band defined in terms of a single year. It yields the following logistic formula:
logit P = 0.668 + 0.348*(delta SOS) + 0.434*(delta Playoff Experience)
Compared to the previous formula, the probability corresponding to a one unit difference in SOS, taken by itself, now becomes 0.58 instead of 0.57 (see the Wolfram Alpha article for an easy way to transform logits into probabilities), while the probability corresponding to an edge in playoff experience now becomes 0.606 instead of 0.68.
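For anyone who would rather not go through Wolfram Alpha, the logit to probability conversion is a one line function. Here is a small Python sketch using the coefficients from the formula above. The function names are mine, and treating each “delta” as the home team’s value minus the visitor’s is an assumption made for the example.

```python
import math

# Coefficients from the 2001 to 2011, one year span formula above.
HFA = 0.668
SOS_COEF = 0.348
PLAYOFF_COEF = 0.434


def logistic(logit):
    """Convert a logit (log odds) into a probability."""
    return 1.0 / (1.0 + math.exp(-logit))


def home_win_probability(delta_sos, delta_playoff_exp):
    """Probability from the full formula.

    Assumption: each delta is the home team's value minus the visitor's.
    """
    logit = HFA + SOS_COEF * delta_sos + PLAYOFF_COEF * delta_playoff_exp
    return logistic(logit)


# Each coefficient taken by itself, as in the comparison above.
print(round(logistic(SOS_COEF), 3))
print(round(logistic(PLAYOFF_COEF), 3))

# Home team with no SOS or playoff experience edge: home field alone.
print(round(home_win_probability(0.0, 0.0), 3))
```

The first two values come out at about 0.586 and 0.607, in line with the 0.58 and 0.606 quoted above. Home field advantage by itself works out to roughly 0.66.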
If there were one area I’d like to work on with regard to this formula, it would be to find a way to calculate the (dis)advantage of having a true rookie quarterback. I suspect this kind of analysis could best be done with counting. I don’t think a curve fit is necessary in this instance. I suspect a rookie quarterback adjustment would have allowed this formula to more accurately determine the potential winner in the Houston Texans – Cincinnati Bengals game. After all, 10-1 is better than 9-2.