January 27, 2012
Posted by foodnearsnellville under Data
| Tags: Wolfram alpha
Wolfram Alpha is a web site and also a $3.00 Android app. I’ve been working with the app for a couple of days now, and with the web page for some of yesterday. It’s a ‘scientific knowledge base’, and also a powerful calculator and equation solving tool. It graphs, it integrates, it differentiates, it knows what a logit is, and it can convert between probabilities and logits in either direction.
Wolfram Alpha can save calculated results as PDF files.
Wolfram Alpha can solve equations analytically.
It is also useful as a knowledge base for subjects not covered in high school.
It knows what osmium tetroxide is, gives you the Lewis structure and the toxicology of the compound. It can tell you the mass of Jupiter in earth units, and can map, say, Betelgeuse or Rigel into a Hertzsprung-Russell diagram. It’s not suited to answering many programming questions, though its answer to ‘shell sort’ is interesting enough. Right now, in conjunction with Stats Inc, they’ve loaded 5 years of NFL data into the engine, so the question ‘Tony Romo passer rating’ returns a result. No such luck for ‘Babe Ruth on base percentage’. And don’t ask for ‘Jim Brown yards per carry’ either. It’s not in the engine currently.
It has a nice understanding of genealogy, and it can convert from any numerical base you like (so it is useful as a ‘programmer’s calculator’). The bases to be converted aren’t fixed, though base conversions of any kind give you the number in binary, octal, and hex as a by-product. In theory, you could do Mayan (base 20) or Babylonian (base 60) math.
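For readers who want to try the same kind of base conversion without the app, here is a minimal Python sketch. The function name `to_base` is my own for illustration and has nothing to do with how Wolfram Alpha does it internally. It returns a list of digit values rather than a string, since bases like 20 and 60 have no standard digit symbols.

```python
def to_base(n, base):
    """Return the digits of a non-negative integer n in the given base,
    most significant digit first. Hypothetical helper, for illustration."""
    if base < 2:
        raise ValueError("base must be at least 2")
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)  # peel off the lowest digit
        digits.append(remainder)
    return digits[::-1]


print(to_base(2012, 20))   # Mayan-style base 20
print(to_base(2012, 60))   # Babylonian-style base 60
# Python's built-ins cover the usual programmer's bases directly.
print(bin(2012), oct(2012), hex(2012))
```

Each entry in the returned list is one "digit" in the target base, so a base 60 result reads like hours, minutes, and seconds.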
January 25, 2012
I have a cousin who owns a pretzel shop, and not so long ago, while coming up with a variety of NFL themed pretzels, my cousin and her husband came up with this one, a tebowing pretzel.
A tebowing pretzel
For a while, it was pretty crazy their way.
I’ve been looking for a calculator on my Kindle Fire I could program to do logits, and found out that Wolfram Alpha can do that, and probably more. The two pictures below should explain why, if you’re into logistic regressions and logistic formulas, Wolfram Alpha is for you.
Calculating a logit from a probability
Calculating a probability from a logit
Pretty cool, that.
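The two conversions are short enough to write out. A minimal sketch in Python, using the standard definitions of the logit (log odds) and its inverse; the function names are mine:

```python
import math


def logit(p):
    """Convert a probability (strictly between 0 and 1) to log odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Convert log odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


print(logit(0.60))             # log odds of a 60% chance
print(inv_logit(logit(0.60)))  # round trip back to the probability
```

A probability of 0.5 maps to a logit of zero, and anything above 0.5 gives a positive logit, which is why logistic regression coefficients add up so conveniently on this scale.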
January 15, 2012
After the Giants victory over the Packers, I finally got up the nerve to say what my system has been saying from the start, that my predictive system markedly favors the Giants throughout the entire playoffs.
Going all the way?
The deal, of course, is that a heavily favored team can lose. A team seeded 1 or 2 and favored by 70% in every game only has a 34% chance of making it through 3 games. The nature of the playoffs makes it difficult for any team, even a really good team, to win it all.
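The 34% figure is just the per-game probability multiplied out over three games, assuming the games are independent. A quick sketch (the function name is mine):

```python
def chance_of_winning_out(p_per_game, games):
    """Probability of winning every one of `games` independent games,
    each with the same win probability."""
    return p_per_game ** games


# A #1 or #2 seed has a bye, so it needs three wins.
print(round(chance_of_winning_out(0.70, 3), 3))  # 0.343
# A team playing in the wild card round needs four.
print(round(chance_of_winning_out(0.70, 4), 3))
```

The same arithmetic shows why teams that start in the wild card round have it even harder, since they need a fourth win.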
That said, the Giants are favored by 75% over the San Francisco 49ers. The only advantage the 49ers hold is home field advantage. The Giants have to be considered a playoff-experienced team, and they have a massive strength of schedule advantage, the same advantage that will give them precedence over either New England or Baltimore. If you choose to treat the Giants as having no playoff experience, that lowers their odds to win to a mere 58%.
Favored in the Conference Championship Round:
Giants over 49ers: 75%
NE over Ravens: 59%
Favored in the Super Bowl:
Giants over NE: 66%
Giants over Ravens: 64%
NE over 49ers: 64%
Ravens over 49ers: 65%
Odds of winning the Super Bowl:
For contrast, we’ll calculate the Pythagorean odds for these teams as well, ignoring the effects of strength of schedule, and playoff experience.
49ers over Giants: 86%
NE over Ravens: 61%
49ers over NE: 61%
And the 49ers are favored to win the Super Bowl, via Pythagoreans, by 52%.
Of course, if you’re taking these kinds of offensive metrics seriously, please note the odds of the Giants having made it this far were only 7.4% (originally calculated as 5.4%). Consider those odds, please, before writing my little predictive system off.
January 8, 2012
Both the Giants and Denver have won today, eliminating all wild cards and leading to two #4 seeds playing at the #1 seeds. In the case of the Giants, using my formula, we have the question of whether they truly have playoff experience. If they do not, then Green Bay is favored, on average, by 56%, though the relative error of the strength of schedule effect allows for anything from Green Bay being favored by as much as 73% to the Giants being favored by 63%. If the Giants are treated as if they have playoff experience, then there is a wide range of results, from Green Bay being favored by 55% to the Giants being favored by 78%, with the average result being the Giants favored by 63%. Note that home field plus Pythagoreans would favor Green Bay by 83%.
In the case of Denver versus New England, New England has playoff experience and home field in their favor, and Denver played a tougher schedule. New England is favored by my scheme by 69%. Home field plus Pythagoreans would favor New England by 88%.
January 7, 2012
The wins by Houston and New Orleans ensure that the #3 NFC and AFC seeds will be playing the #2 seeds, and that the #1 seeds will be playing the winner of the #4-#5 game. For now we’ll simply ask: if a team has playoff experience, but a rookie quarterback, does the rookie negate that experience advantage? Houston certainly looked good in their game.
In San Francisco-New Orleans, the Saints have the advantage of playoff experience, but San Francisco has home field and a tough schedule. My code suggests the odds in this game are 50-50. In Baltimore-Houston, Baltimore has all three advantages, and is favored to the tune of an 81% chance to win.
January 2, 2012
Playoff experience is a potent effect, enough to overcome Denver’s advantages in home field and tougher schedule.
Steelers: Super Bowl last year, Away, SOS = -0.84, Pythagorean = 71.8%
Broncos: Last in playoffs 2005, Home, SOS = -0.23, Pythagorean = 35.3%
Typically in playoff games, you don’t see huge differences in offensive stats, because the teams that make it in the modern NFL tend to be good offensive teams. But Denver is nearly as bad this year as Seattle was last year (Seattle actually was worse, with a Pythagorean of 32.7%). Treating this as a regular season game instead of a playoff game would give PIT a 76% edge. Instead, using the playoff formula, PIT would be favored by 54%.
January 2, 2012
This playoff game is one of the more unusual battles, as Houston is lacking playoff experience, as well as Matt Schaub. But with this post we’re going to introduce mods to the code we presented in our previous article, to allow us to set relative bounds on strength of schedule. The SOS effect has a large relative error, about 80%, so what happens to the odds in this matchup when we do exactly that?
Bengals: Playoff Exp 2 yrs ago, Away, SOS = -0.85, Pythagorean = 54.1%
Texans: No Playoffs ever, Home, SOS = -1.90, Pythagorean = 69.5%
Plugging these numbers into my formula, you get CIN favored by 66%. On the low end of the SOS relative error, you get CIN favored by 60%, and at the high end, CIN favored by 71%. Given that 68% is what you get from playoff experience proper, the effect of Houston’s better record (and thus HFA) is roughly cancelled out by CIN’s better SOS.
So why isn’t Houston favored more, given their powerful offense? As stated previously, offensive metrics aren’t predictive to p = 0.05, more like p = 0.15 or so. Further, Houston had the easiest schedule in all of football. Cincinnati also had an easy one, but not the easiest one.
January 2, 2012
Much as in the previous series, we’re going to analyze the playoff prospects of New Orleans and Detroit. We’re also going to post the code (very hacky) that I’ve been using to study playoff teams. The code (2 pics required) is as follows:
Now one thing about this code: because it’s using Getopt::Long, numbers have to be positive, or else the code will think that the number is an option. The simple fix is to find the most negative SOS and add a positive number equal to its magnitude to both SOS values. As the only important value is the difference, this is a valid form of data entry.
Ok, the significant factors, plus Pythagoreans:
Detroit: No playoff exp, Away, SOS = 0.63, Pythagorean 62.9%
New Orleans: Won Super Bowl 2 years ago, Home, SOS = -1.60, Pythagorean 77.7%
Because NO’s SOS is negative, just let it equal zero and add 1.60 to the SOS of Detroit, yielding 2.23. That’s the info you would pump into the calculator above. And it gives you the following results:
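The shift described above is easy to get wrong by hand, so here is a small Python sketch of it. This is not the Perl code from the pictures, just an illustration of the data entry trick, and the function name is mine.

```python
def shift_sos(sos_a, sos_b):
    """Shift both SOS values by the same amount so that neither is negative.
    The difference between the two, which is all that matters, is unchanged."""
    offset = max(0.0, -min(sos_a, sos_b))  # magnitude of the most negative value
    return sos_a + offset, sos_b + offset


detroit, new_orleans = shift_sos(0.63, -1.60)
print(round(detroit, 2), round(new_orleans, 2))  # 2.23 0.0
```

If both values are already non-negative the offset is zero and the numbers pass through untouched.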
New Orleans’ advantage due to playoff experience alone gives NO a 68% chance of winning.
Adding in home field advantage gives New Orleans a 76% chance of winning.
Adding in strength of schedule reduces New Orleans’ chances to 69%. New Orleans is heavily favored.
By comparison, after all is said and done, had Atlanta been slotted into this game, the playoff calculator gives Atlanta a 51% chance of winning. Atlanta has a slightly better SOS than Detroit, and it also has recent playoff experience.
Given how powerful the New Orleans offense is, should Atlanta have sought out a team with a weaker offense, such as New York? That’s one of the counterintuitive points of my previous playoff analysis. Offensive metrics tend to yield a p of 0.15, not 0.05. They’re suggestive, not etched in stone advantages. New Orleans’ powerful offense may come into play, but then again, it may not.
January 2, 2012
Way back in 2011 I did a study of factors that were statistically significant in determining playoff wins. There were three: home field advantage, playoff experience, and strength of schedule. And because playoff experience was such an obvious factor in those logistic regression studies, one question I didn’t ask was how far back you need to go with regard to playoff experience to judge whether a team “has it”. Two years? Three years? Four years? This is important in the case of the New York Giants, because they won a Super Bowl in 2007 and lost in the divisional round in 2008. So, they have deep playoff experience from 4 years ago and also some from 3 years ago. They barely missed the playoffs two years ago and, a year ago, were not a playoff factor.
My study however fixed the range of playoff experience at two years, so my formulas are valid for that period. So, to look at the important factors with regard to these two teams, plus Pythagoreans:
New York Giants: Playoff Exp 3 yrs ago, Home, SOS = 1.96, Pythagorean=49.0%
Atlanta Falcons: Playoff Exp last year, Away, SOS = 0.28, Pythagorean= 58.9%
So, if we calculate odds using Pythagoreans and the 60% HFA that playoff teams have had over the past 10 years, you get that the ATL-NYG game is even. If you instead use my strict formula, and deny the NYG any playoff experience advantage, then the New York Giants would be favored by 53%. If you grant that the NYG have playoff experience, and recalculate these odds, then the probability of the Giants winning rises to 71%, one of the highest in this round of play. This is in part due to home field, but also due to the Giants having played the hardest schedule of any playoff team.
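The post doesn’t spell out how the Pythagoreans and the 60% HFA get combined, so treat the following as an assumption on my part: one common approach is to work in odds, dividing the home team’s Pythagorean odds by the visitor’s (a log5 style comparison) and then multiplying by the home field odds. A minimal Python sketch, with function names of my own choosing:

```python
def odds(p):
    """Convert a probability to odds in favor."""
    return p / (1.0 - p)


def matchup_probability(p_home, p_away, hfa=0.60):
    """Home team's win probability from the two Pythagoreans plus a home
    field advantage, combined as odds ratios. This combination rule is an
    assumption, not necessarily the formula used for the numbers quoted."""
    o = odds(p_home) / odds(p_away) * odds(hfa)
    return o / (1.0 + o)


# Giants (49.0%) at home against Atlanta (58.9%).
print(round(matchup_probability(0.490, 0.589), 2))
```

With those inputs the result lands very close to a coin flip, which agrees with the "even" game described above. Multiplying odds is the same as adding logits, so this is the simplest possible logistic model with no fitted coefficients.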
To repudiate another notion, that the Giants and Dallas are simply two peas in a pod, that you could roll the dice and choose one over the other, look at the stats of the New York Giants, Dallas, and say, the Atlanta Falcons against teams with a record of 0.500 or more. Against 0.500 or better teams, Dallas was 1-8. It scored 172 points against those teams and gave up 246 points, for a Pythagorean of 0.282. That Pythagorean should have been good for 2.5 wins, which means the team underperformed its own Pythagorean against winning teams. While a lot of Dallas fans will point fingers at the defense (Pro Football Reference had Dallas ranked as the 27th best in pass coverage, prior to week 17), overall consistency also needs to be looked at and addressed.
By contrast, the Giants were 6-4 against 0.500 or better teams, scored 270 points against 256 given up, and had a 0.535 Pythagorean against good teams. This is a team whose performance improves when facing good teams, and who outperformed their Pythagorean against good teams.
The Falcons’ record against teams at 0.500 or better is 3-5. They scored 156 points against these teams versus giving up 207 points. The team Pythagorean against good teams is 0.324, which works out to 2.6 victories against better teams over 8 games. They performed roughly as expected.
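For anyone who wants to check the Pythagoreans for these three teams, here is a minimal Python sketch. The exponent used for the figures above isn’t stated here; a value of about 2.6 comes close to reproducing them, so that default is my assumption and not a quoted number.

```python
def pythagorean(points_for, points_against, exponent=2.6):
    """Pythagorean expectation: expected winning fraction from points
    scored and allowed. The default exponent is an assumed value."""
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)


# (points scored, points allowed, games) against 0.500 or better teams
teams = {
    "Dallas": (172, 246, 9),
    "Giants": (270, 256, 10),
    "Falcons": (156, 207, 8),
}

for name, (scored, allowed, games) in teams.items():
    expectation = pythagorean(scored, allowed)
    # Expected wins are the expectation times the number of games played.
    print(name, round(expectation, 3), round(expectation * games, 1))
```

Comparing the expected wins with the actual records is what tells you whether a team overperformed or underperformed its point totals.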
Update (since I don’t know where else to put it): before the NYG-DAL game, Cool Standings was projecting a 62% chance of a Dallas victory (and thus playoff prospects). That prediction only made sense if Cool Standings were ignoring home field advantage in their analysis. It’s something to think about.
January 2, 2012
I was on vacation the week of the 16th, and my job has me weighed down with electronics (being a leveraged asset comes with some debits). Thus, I didn’t take anything with me to calculate the week 16 stats. They are included below.
And of course, here are the week 17 stats, so we can do some serious playoff discussions, largely following my logistic regressions of playoff metrics from the previous year.
To explain the columns above, Median is a median point spread, and can be used to get a feel for how good a team is without overly weighting a blowout win or blowout loss. HS is Brian Burke’s Homemade Sagarin, as implemented in Maggie Xiong’s PDL::Stats. Pred is the predicted Pythagorean expectation. The exponent for this measure is fitted to the data set itself. SOS, SRS, and MOV are the simple ranking components, analyzed via this Perl implementation. MOV is margin of victory, or point spread divided by games played. SOS is strength of schedule. SRS is the simple ranking.
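To make a few of those columns concrete, here is a small Python sketch of the median point spread, MOV, and the usual simple ranking relation SRS = MOV + SOS. The game margins and the SOS value below are made up for illustration and do not belong to any real team, and the function names are mine.

```python
from statistics import median


def margin_of_victory(margins):
    """MOV: total point spread divided by games played."""
    return sum(margins) / len(margins)


def simple_ranking(mov, sos):
    """SRS is margin of victory adjusted by strength of schedule."""
    return mov + sos


# Hypothetical per-game point margins (positive = win), including one blowout.
margins = [3, -7, 21, 10, -3, 35, 6, -14]
hypothetical_sos = 1.0

mov = margin_of_victory(margins)
print(median(margins))   # the median is barely moved by the 35 point blowout
print(mov)               # the mean (MOV) is pulled upward by it
print(simple_ranking(mov, hypothetical_sos))
```

The gap between the median and the MOV in this made-up season is the reason the median is worth a column of its own.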