Football


The final game really didn’t help Atlanta’s SOS much, but I’ll note that the numbers are slowly beginning to look more normal. SRS isn’t a good stat at 3 games, and may not be a good stat at 4. As the season goes on it will get better, and by the end of the season SOS is one component in a formula that predicts postseason success.

Global Statistics:
Games  Home Wins Winning_Score Losing_Score Margin
48         26        27.85         17.83     10.02

Calculated Pythagorean Exponent:  3.30

Rank  Team    Median  GP   W   L   T  Pct   Pred   SRS    MOV   SOS
------------------------------------------------------------------------
1     PHI     19.0     3   3   0   0 100.0  98.3  19.99  21.67 -1.67
2     DEN     12.0     3   3   0   0 100.0  78.3  10.49   9.00  1.49
3     MIN      9.0     3   3   0   0 100.0  82.5   9.77   8.00  1.77
4     NE       7.0     3   3   0   0 100.0  87.5  17.34  12.00  5.34
5     BAL      5.0     3   3   0   0 100.0  70.2   3.68   4.33 -0.65
6     PIT      8.0     3   2   1   0  66.7  48.7   1.43  -0.33  1.77
7     ATL      7.0     3   2   1   0  66.7  60.9  -8.27   4.33 -12.60
8     HOU      7.0     3   2   1   0  66.7  31.7   3.07  -3.67  6.74
9     KC       6.0     3   2   1   0  66.7  75.6   8.96   6.67  2.29
10    LA       5.0     3   2   1   0  66.7  26.1 -11.22  -5.67 -5.56
11    DAL      4.0     3   2   1   0  66.7  69.5  -3.31   5.67 -8.97
12    GB       4.0     3   2   1   0  66.7  59.2   3.35   2.67  0.68
13    SEA      2.0     3   2   1   0  66.7  75.5   1.64   5.00 -3.36
14    OAK      1.0     3   2   1   0  66.7  51.0  -9.07   0.33 -9.40
15    NYG      1.0     3   2   1   0  66.7  52.7  -9.16   0.67 -9.83
16    DET     -1.0     3   1   2   0  33.3  46.0  -2.00  -1.33 -0.66
17    CAR     -1.0     3   1   2   0  33.3  56.8   7.40   2.00  5.40
18    NYJ     -1.0     3   1   2   0  33.3  31.9  -1.97  -5.33  3.37
19    ARI     -2.0     3   1   2   0  33.3  67.9   7.75   5.33  2.42
20    MIA     -2.0     3   1   2   0  33.3  46.2   5.20  -1.00  6.20
21    SD      -4.0     3   1   2   0  33.3  64.1   5.77   4.67  1.11
22    IND     -4.0     3   1   2   0  33.3  37.1   0.09  -4.67  4.76
23    WAS     -4.0     3   1   2   0  33.3  26.9 -11.68  -8.00 -3.68
24    TB      -5.0     3   1   2   0  33.3  22.9 -14.25 -10.33 -3.91
25    BUF     -6.0     3   1   2   0  33.3  53.6   4.16   1.00  3.16
26    TEN     -7.0     3   1   2   0  33.3  26.7  -5.43  -5.00 -0.43
27    CIN     -8.0     3   1   2   0  33.3  27.6  -3.01  -6.33  3.32
28    SF     -19.0     3   1   2   0  33.3  39.6  -4.06  -3.33 -0.73
29    NO      -3.0     3   0   3   0   0.0  34.4 -14.50  -5.67 -8.83
30    JAX     -4.0     3   0   3   0   0.0  18.8  -5.73 -10.00  4.27
31    CLE     -6.0     3   0   3   0   0.0  18.8  -0.37 -10.00  9.63
32    CHI    -14.0     3   0   3   0   0.0  11.7  -6.08 -12.67  6.59

OK, all the games for week 3 except the Atlanta – New Orleans game have been played. It’s a little early to post data from the simple ranking system, as the SOS stat hasn’t stabilized yet, but hey, I can do this set today and add an update with the Atlanta stats in a day or two.

Global Statistics:
Games  Home Wins Winning_Score Losing_Score Margin
47         26        27.49         17.53      9.96

Calculated Pythagorean Exponent:  3.21


Rank  Team    Median  GP   W   L   T  Pct   Pred   SRS    MOV   SOS
------------------------------------------------------------------------
1     PHI     19.0     3   3   0   0 100.0  98.1  20.56  21.67 -1.11
2     DEN     12.0     3   3   0   0 100.0  77.6  10.41   9.00  1.41
3     MIN      9.0     3   3   0   0 100.0  81.9   9.56   8.00  1.56
4     NE       7.0     3   3   0   0 100.0  86.8  16.86  12.00  4.86
5     BAL      5.0     3   3   0   0 100.0  69.6   3.46   4.33 -0.87
6     PIT      8.0     3   2   1   0  66.7  48.8   2.33  -0.33  2.67
7     HOU      7.0     3   2   1   0  66.7  32.2   3.19  -3.67  6.86
8     KC       6.0     3   2   1   0  66.7  75.0   8.94   6.67  2.28
9     LA       5.0     3   2   1   0  66.7  26.7 -12.60  -5.67 -6.94
10    DAL      4.0     3   2   1   0  66.7  69.0  -1.44   5.67 -7.10
11    GB       4.0     3   2   1   0  66.7  58.9   3.19   2.67  0.52
12    SEA      2.0     3   2   1   0  66.7  74.9   0.72   5.00 -4.28
13    OAK      1.0     3   2   1   0  66.7  51.0  -8.97   0.33 -9.30
14    NYG      1.0     3   2   1   0  66.7  52.6  -6.29   0.67 -6.95
15    ATL      0.0     2   1   1   0  50.0  50.0 -12.77   0.00 -12.77
16    DET     -1.0     3   1   2   0  33.3  46.1  -2.11  -1.33 -0.77
17    CAR     -1.0     3   1   2   0  33.3  56.6   7.00   2.00  5.00
18    NYJ     -1.0     3   1   2   0  33.3  32.4  -2.04  -5.33  3.29
19    ARI     -2.0     3   1   2   0  33.3  67.4   6.66   5.33  1.33
20    MIA     -2.0     3   1   2   0  33.3  46.3   4.72  -1.00  5.72
21    SD      -4.0     3   1   2   0  33.3  63.7   5.68   4.67  1.02
22    IND     -4.0     3   1   2   0  33.3  37.5  -0.00  -4.67  4.66
23    WAS     -4.0     3   1   2   0  33.3  27.5  -9.80  -8.00 -1.80
24    TB      -5.0     3   1   2   0  33.3  23.6 -16.57 -10.33 -6.24
25    BUF     -6.0     3   1   2   0  33.3  53.5   3.69   1.00  2.69
26    TEN     -7.0     3   1   2   0  33.3  27.3  -5.50  -5.00 -0.50
27    CIN     -8.0     3   1   2   0  33.3  28.2  -2.77  -6.33  3.57
28    SF     -19.0     3   1   2   0  33.3  39.8  -4.96  -3.33 -1.63
29    NO      -2.0     2   0   2   0   0.0  43.5  -9.63  -2.00 -7.63
30    JAX     -4.0     3   0   3   0   0.0  19.5  -5.89 -10.00  4.11
31    CLE     -6.0     3   0   3   0   0.0  19.5  -0.42 -10.00  9.58
32    CHI    -14.0     3   0   3   0   0.0  12.3  -5.23 -12.67  7.44

I think Atlanta suffers the most here. The SOS close to -13 will almost certainly stabilize after the game tomorrow. That said, I’m really impressed by the Eagles so far this season and for now, they’re the top ranked team on this table, via a variety of metrics.

I’ve been curious, since I took on a new job and a new primary language at work, to what extent I could begin to add Python to the set of tools that I use for football analytics. For one, the scientific area where the analyst needs the most help from experts is optimization theory and algorithms, and at this point in time the developments in Python are more extensive than those in Perl.

To start, you have the scipy and numpy packages, with scipy.optimize having diverse tools for minimization and least squares fitting. Logistic regressions in Python are discussed here, and lmfit provides some enhancements to the fitting routines in scipy. But first we need to be able to read and write existing data, and from that write the SRS routines. The initial routines are based on my original SRS Perl code, so don’t be surprised if code components look very familiar.

This code will use an ORM layer, SQLAlchemy, to get to my existing databases. To create the class used to fetch the data, we used a Python command-line tool named sqlacodegen. We set up sqlacodegen in a virtual environment and tried it out. The output was:

# coding: utf-8
from sqlalchemy import Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()
metadata = Base.metadata

class Game(Base):
    __tablename__ = 'games'

    id = Column(Integer, primary_key=True)
    week = Column(Integer, nullable=False)
    visitor = Column(String(80))
    visit_score = Column(Integer, nullable=False)
    home = Column(String(80))
    home_score = Column(Integer, nullable=False)

Which, with slight mods, can be used to read my data. The whole test program is here:

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from pprint import pprint

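def srs_correction(tptr, num_teams = None):
    # Shift every rating so the league-average SRS is zero.
    if num_teams is None:
        num_teams = len(tptr)
    total = 0.0
    for k in tptr:
        total += tptr[k]['srs']
    mean = total/num_teams
    for k in tptr:
        tptr[k]['srs'] -= mean
        tptr[k]['sos'] -= mean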

def simple_ranking(tptr = {}, correct = True, debug = False):
    for k in tptr:
        tptr[k]['mov'] = tptr[k]['point_spread']/float(tptr[k]['games_played'])
        tptr[k]['srs'] = tptr[k]['mov']
        tptr[k]['oldsrs'] = tptr[k]['srs']
        tptr[k]['sos'] = 0.0
    delta = 10.0
    iters = 0
    while ( delta > 0.001 ):
        iters += 1
        if iters > 10000:
            return True
        delta = 0.0
        for k in tptr:
            sos = 0.0
            for g in tptr[k]['played']:
                sos += tptr[g]['srs']
            sos = sos/tptr[k]['games_played']
            tptr[k]['srs'] = tptr[k]['mov'] + sos
            newdelta = abs( sos - tptr[k]['sos'] )
            tptr[k]['sos'] = sos
            delta = max( delta, newdelta )
        for k in tptr:
            tptr[k]['oldsrs'] = tptr[k]['srs']
    if correct:
        srs_correction( tptr )
    if debug:
        print("iters = {0:d}".format(iters))
    return True     

year = "2012"
userpass = "username:password"

nfl = "mysql+pymysql://" + userpass + "@localhost/nfl_" + year
engine = create_engine(nfl)

# The session below is bound to the engine, so the base class does not need it.
Base = declarative_base()
metadata = Base.metadata

class Game(Base):
    __tablename__ = 'games'
    id = Column(Integer, primary_key=True)
    week = Column(Integer, nullable=False)
    visitor = Column(String(80))
    visit_score = Column(Integer, nullable=False)
    home = Column(String(80))
    home_score = Column(Integer, nullable=False)

Session = sessionmaker(bind=engine)
session = Session()
res = session.query(Game).order_by(Game.week).order_by(Game.home)

tptr = {}
for g in res:
#    print("{0:d} {1:s} {2:d} {3:s} {4:d}".format( g.week, g.home, g.home_score, g.visitor, g.visit_score ))
    # Record the game once from each side, so a team is initialized
    # whether it first appears as the home team or as the visitor.
    sides = ( ( g.home, g.visitor, g.home_score - g.visit_score ),
              ( g.visitor, g.home, g.visit_score - g.home_score ) )
    for team, opponent, margin in sides:
        if team not in tptr:
            tptr[team] = { 'games_played': 0, 'point_spread': 0, 'played': [] }
        tptr[team]['games_played'] += 1
        tptr[team]['point_spread'] += margin
        tptr[team]['played'].append( opponent )

simple_ranking( tptr )
for k in tptr:
    print("{0:10s} {1:6.2f} {2:6.2f} {3:6.2f}".format( k, tptr[k]['srs'],tptr[k]['mov'], tptr[k]['sos']))

The output was limited to two digits past the decimal, and to that precision my results are the same as those from my Perl code. The routines should look much the same. The only real issue is that you have to float one of the numbers when you calculate margin of victory, as the two inputs are integers (dividing them directly truncates under Python 2). Python isn’t as promiscuous in type conversion as Perl is.

Last note. Although we included pprint, at this point we’re not using it. That’s because with the kind of old fashioned debugging skills I have, I use pprint the way a Perl programmer might use Data::Dumper, to look at data structures while developing a program.
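
For readers without the MySQL database, the same iteration can be tried on a few hand-typed games. This is only a minimal sketch: the function name toy_srs, the tuple layout, and the three-team schedule are all made up for illustration, and the stopping tolerance is tighter than the 0.001 used in the program above.

```python
def toy_srs(games, tol=1e-9, max_iters=10000):
    """Return {team: (srs, mov, sos)} from (home, home_pts, visitor, visitor_pts) tuples."""
    margin = {}     # total point differential per team
    opponents = {}  # opponents faced, one entry per game
    for home, home_pts, visitor, visitor_pts in games:
        for team, opp, diff in ((home, visitor, home_pts - visitor_pts),
                                (visitor, home, visitor_pts - home_pts)):
            margin[team] = margin.get(team, 0) + diff
            opponents.setdefault(team, []).append(opp)

    # Margin of victory per game; float() guards against integer division.
    mov = {t: margin[t] / float(len(opponents[t])) for t in margin}
    srs = dict(mov)  # first guess: rating equals margin of victory

    for _ in range(max_iters):
        biggest_change = 0.0
        for t in srs:
            # Strength of schedule is the average current rating of the opponents.
            sos = sum(srs[o] for o in opponents[t]) / len(opponents[t])
            new_rating = mov[t] + sos
            biggest_change = max(biggest_change, abs(new_rating - srs[t]))
            srs[t] = new_rating  # updated in place, as in the program above
        if biggest_change < tol:
            break

    # Recenter so the average rating is zero; SOS is then SRS minus MOV.
    mean = sum(srs.values()) / len(srs)
    srs = {t: r - mean for t, r in srs.items()}
    return {t: (srs[t], mov[t], srs[t] - mov[t]) for t in srs}


# Hypothetical three-team round robin, not real results.
games = [("A", 24, "B", 17), ("B", 20, "C", 17), ("A", 27, "C", 17)]
for team, (srs, mov, sos) in sorted(toy_srs(games).items()):
    print("{0:10s} {1:6.2f} {2:6.2f} {3:6.2f}".format(team, srs, mov, sos))
```

In a full round robin like this one every team faces the same opponents, so SRS simply compresses the margins of victory; the ratings get more interesting once schedules differ.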

Update: the original Doug Drinen post about the Simple Ranking System has a new URL. You can now find it here.

It is going to be near-impossible for me to be objective about Dak Prescott. He is a Dallas Cowboy and he graduated from the same high school I did. He’s probably the biggest sports star that part of Louisiana has had since Joe Delaney.

In recent years, Chris Brown of Smart Football has been talking plenty about package plays, and after Dak’s performance in the first preseason game of the year, he analyzed one play from the game. It’s good enough that I recommend it. Please read it; it’s worth your time.

Odds for the 2015 NFL playoff final, presented from the AFC team’s point of view:

SuperBowl Playoff Odds
Prediction Method         AFC Team        NFC Team           Score Diff  Win Prob  Est. Point Spread
----------------------------------------------------------------------------------------------------
C&F Playoff Model         Denver Broncos  Carolina Panthers       2.097     0.891               15.5
Pythagorean Expectations  Denver Broncos  Carolina Panthers      -0.173     0.295               -6.4
Simple Ranking            Denver Broncos  Carolina Panthers        -2.3     0.423               -2.3
Median Point Spread       Denver Broncos  Carolina Panthers        -5.0     0.337               -5.0

Last week the system went 1-1, for a total record of 6-4. The system favors Denver more than any other team, and does not like Carolina at all. Understand, when a team makes it to the Super Bowl easily, and a predictive system gave them about a 3% chance to get there in the first place, it’s reasonable to assume that in that instance, the system really isn’t working.

So we’re going to modify our table a little bit and give some other predictions and predictive methods. The first is the good old Pythagorean formula. We best fit the Pythagorean exponent to the data for the year, so there is good reason to believe that it is more accurate than the old 2.37. It favors Carolina by a little more than six points. SRS directly gives point spread, which can be back calculated into a 57.7% chance of Carolina winning. Likewise, using median point spreads to predict the Denver-Carolina game gives Carolina a 66.3% chance of winning.
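
For reference, the Pythagorean formula estimates a team's winning fraction from points scored and points allowed, each raised to the same exponent. The sketch below is only an illustration: the function name, the season totals, and the fitted_exponent value are hypothetical, since the post does not state the exponent it fitted for this season. Only the default of 2.37 comes from the text.

```python
def pythagorean_expectation(points_for, points_against, exponent=2.37):
    """Estimated winning fraction: PF^x / (PF^x + PA^x)."""
    pf = float(points_for) ** exponent
    pa = float(points_against) ** exponent
    return pf / (pf + pa)


# Hypothetical season totals, not taken from any team discussed here.
points_for, points_against = 400, 320

# Placeholder value; the fitted exponent for the year is not given in the post.
fitted_exponent = 3.0

print("{0:.3f}".format(pythagorean_expectation(points_for, points_against)))
print("{0:.3f}".format(pythagorean_expectation(points_for, points_against, fitted_exponent)))
```

A larger exponent pushes the estimate further from .500 for the same scoring ratio, which is why the choice of exponent matters when comparing two teams.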

Note that none of these systems predicted the outcome of the Carolina – Arizona game. Arizona played a tougher schedule and was more of a regular season statistical powerhouse than Carolina. Arizona, however, began to lose poise as it worked its way through the playoffs. And it lost a lot of poise in the NFC championship game.

Odds for the third week of the 2015 playoffs, presented from the home team’s point of view:

Conference Championship Playoff Odds
Home Team          Visiting Team         Score Diff  Win Prob  Est. Point Spread
--------------------------------------------------------------------------------
Carolina Panthers  Arizona Cardinals          -1.40     0.198              -10.4
Denver Broncos     New England Patriots       1.972     0.879               14.6

Last week the system went 2-2, for a total record of 5-3. The system favors Arizona markedly, and Denver by an even larger margin. That said, the teams my system does not like have already won one game. There have been years when a team my system didn’t like much won anyway. That was the case in 2009, when my system favored the Colts over the Saints. The system isn’t perfect, and the system is static. It does not take into account critical injuries, morale, better coaching, etc.

Odds for the second week of the 2015 playoffs, presented from the home team’s point of view:

Second Round Playoff Odds
Home Team             Visiting Team        Score Diff  Win Prob  Est. Point Spread
----------------------------------------------------------------------------------
Carolina Panthers     Seattle Seahawks         -1.713     0.153              -12.7
Arizona Cardinals     Green Bay Packers        -0.001     0.500                0.0
Denver Broncos        Pittsburgh Steelers       0.437     0.608                3.2
New England Patriots  Kansas City Chiefs       -0.563     0.363               -4.2

Last week the system went 3-1, and perhaps would have gone 4-0 if, after the Burfict interception, Cincinnati had just killed three plays and kicked a field goal.

The system currently gives Seattle a massive advantage in the playoffs. It says that Green Bay/Arizona is effectively an even matchup, and that both the AFC games are pretty close. It favors Denver in their matchup, and the Chiefs in theirs.
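
The three numeric columns in the table above look consistent with a logistic model: Win Prob matches the logistic function of Score Diff, and Est. Point Spread matches Score Diff multiplied by roughly 7.4. That scale is inferred from the rows, not stated in the post, so treat this sketch (and its function names) as an assumption about how the columns relate rather than as the actual model code.

```python
import math

# Assumption: points per unit of Score Diff, inferred from the table rows.
POINTS_PER_LOGIT = 7.4


def win_prob_from_score_diff(score_diff):
    """Logistic function: turn a Score Diff (log-odds) into a win probability."""
    return 1.0 / (1.0 + math.exp(-score_diff))


def spread_from_score_diff(score_diff, scale=POINTS_PER_LOGIT):
    """Convert Score Diff into an estimated point spread."""
    return scale * score_diff


def win_prob_from_spread(spread, scale=POINTS_PER_LOGIT):
    """Back-calculate a win probability from a point spread."""
    return win_prob_from_score_diff(spread / scale)


# Score Diff values copied from the second round table, home team first.
rows = [("Carolina", -1.713), ("Arizona", -0.001),
        ("Denver", 0.437), ("New England", -0.563)]
for home, diff in rows:
    print("{0:12s} {1:6.3f} {2:6.1f}".format(
        home, win_prob_from_score_diff(diff), spread_from_score_diff(diff)))
```

The same relation run in reverse, win_prob_from_spread, is one way a rating difference expressed in points could be turned back into a probability.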

One last comment about last week’s games. The Cincinnati-Pitt game was the most depressing playoff game I’ve seen in a long time, both for the dirty play on both sides of the ball, and the end being decided by stupid play on Cincinnati’s part.  It took away from the good parts of the game, the tough defense when people weren’t pushing the edges of the rules, and the gritty play on the part of McCarron and Roethlisberger. There was some heroic play on both their parts, in pouring rain.

But for me, watching Ryan Shazier leading with the crown of his helmet and then listening to officials explain away what is obvious on video more or less took the cake. If in any way shape or form, this kind of hit is legal, then the NFL rules system is busted.
