Blogging


This is a follow-up to my three-part article, “Drawing a good diagram of a football field”. After trying for some time to automate arrow drawing, I’ve come to the conclusion that GIMP, an arrow-drawing plug-in, and GIMP’s path tool (a way to draw both straight lines and curves) are adequate to handle this problem.

Needed:

1. Diagram Drawing Tools (see my three part series here, here, and here).
2. A copy of GIMP. It’s free and available on Windows, Mac, and Linux.
3. Some experience with layers in GIMP. Start by searching “How to use layers in GIMP”. There are some nice YouTube videos that can get you started here.
4. This article is also really good: “How to Draw Arrows in Gimp”. Get the arrow plugin, install it, and read the instructions.
5. Look further at GIMP pathing with the search: “using path tool in gimp”.
6. For dashed and dotted lines, search using this phrase: “dashed lines in gimp”.

Some notes about drawing diagrams: if you’re drawing from the Windows shell, you’ll have no issues with Image::Magick. If you want to use the Perl interface to draw, switch to Graphics::Magick, a fork of Image::Magick. Image::Magick has bugs when used with Perl.

When drawing arrows on a 640×480 diagram scaled in the way I’ve been scaling them, a length of wings setting of 15 and a brush width of 2 works well. This can be paired with path lengths of 5.0 pixels and will do nicely.
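For anyone who still wants to try automating the arrowheads, here is a minimal sketch of the geometry involved. It is not the plug-in's code. The function name and the 25 degree wing angle are assumptions for illustration; only the wing length of 15 comes from the settings above.

```python
import math

def arrow_wings(start, tip, wing_length=15.0, wing_angle_deg=25.0):
    """Return the two wing endpoints for an arrowhead drawn at `tip`.

    `start` and `tip` are (x, y) pixel coordinates of the shaft.
    wing_angle_deg is an assumed value, not taken from the plug-in.
    """
    # Angle of the direction pointing from the tip back along the shaft.
    back = math.atan2(start[1] - tip[1], start[0] - tip[0])
    spread = math.radians(wing_angle_deg)
    wings = []
    for sign in (1, -1):
        angle = back + sign * spread
        wings.append((tip[0] + wing_length * math.cos(angle),
                      tip[1] + wing_length * math.sin(angle)))
    return wings

# Example: a horizontal arrow from (0, 0) to (100, 0).
wings = arrow_wings((0, 0), (100, 0))
```

Each wing is a short line from the tip to one of the returned points; stroking both with a brush width of 2 gives the arrowhead.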

Tampa under front, Tampa 2 zone defense. Modeled on the diagram in Matt Bowen’s Tampa 2 article.

Tom Landry’s 4-3 Inside, showing a 1960s era strong side rotating zone, an early Cover 3. SAM and left cornerback jam before falling into zone.

Drawing a zone drop.

Load your diagram. Add a new transparent layer. Make sure you’re drawing on the transparent layer. Use the rectangle select tool (letter “R”). Choose a region to highlight as a zone. At this point select the bucket fill tool (shift B) and then select the color for the bucket fill. Use a foreground fill and set the opacity to about 50%. If your field is light green, use a dark green to fill the zone.

If you have more than one zone to add, add them all now.
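To get a feel for what a 50% fill will look like before trying it, the blend can be worked out by hand. This is a rough sketch of ordinary alpha blending, not a description of GIMP's internals; the two RGB colors are hypothetical stand-ins for "light green" and "dark green".

```python
def blend(fill, field, opacity):
    """Blend a fill color over a field color at the given opacity (0.0 to 1.0)."""
    return tuple(round(opacity * f + (1 - opacity) * b)
                 for f, b in zip(fill, field))

dark_green = (0, 100, 0)       # hypothetical fill color
light_green = (144, 238, 144)  # hypothetical field color
zone_color = blend(dark_green, light_green, 0.5)
```

With these two colors the zone comes out at (72, 169, 72), visibly darker than the field but still letting the yard lines show through.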

Once complete, select all (important) and add another layer. On this new layer, add a path from the middle of the zone to the player. Select Tools -> Arrow (you added the arrow plugin, didn’t you?). Adjust arrow length of wings and brush width and draw the arrow. Repeat as needed.

If the path isn’t straight or you need a bar to create a “jam”, just use the path tool as needed.

If you want to add text or label the diagram, I’d suggest adding another transparent layer and putting the text above the main, zones, and arrows.

Once done, save the file as a GIMP native file and then again as a JPEG file. The second save will cause an export, crunching all the layers down to one image.

Miami 43, shade front, man plus cover 1 by the free safety.

Drawing man to man coverage

The trick here is to use the path tool to make a stippled (dotted) image. Choose a path from the defender to the player to be defended. When you choose stroke path, choose a line type (there are many). I like the stippled pattern, as it’s unlikely to be mistaken for a solid line.

This is going to be a mixed bag of a post, talking about anything that has caught my eye over the past couple weeks. The first thing I’ll note is that on the recommendation of Tom Gower (you need his Twitter feed), I’ve read Josh Katzowitz’s book: Sid Gillman: Father of the Passing Game.

I didn’t know much about Gillman as a young man, though the 1963 AFL Championship was part of a greatest games collection I read through as a teen. The book isn’t a primer on Gillman’s ideas. Instead, it is more a discussion of his life and the issues he faced growing up (it’s clear Sid felt his Judaism affected his marketability as a coach in the college ranks). Not everyone gets the same chances in life, but Sid was a pretty tough guy in his own right, and clearly the passion he felt for the sport drove him to a lot of personal success.

Worth the read. Be sure to read Tom Gower’s review as well, which is excellent.

ESPN is dealing with the football off season by slowly releasing a list of the “20 Greatest NFL Coaches” (NFL.com does its 100 best players, for much the same reason). I’m pretty sure neither Gillman nor Don Coryell will be on the list. The problem, of course, lies in the difference between the notions of “greatest” and “most influential”. The influence of both these men is undeniable. However, the greatest success for both these coaches has come as part of their respective coaching (and player) trees: Al Davis and Ara Parseghian come to mind when thinking about Gillman, with Don having a direct influence on coaches such as Joe Gibbs and Ernie Zampese. John Madden was a product of both schools, and folks such as Norv Turner and Mike Martz are clear disciples of the Coryell way of doing things. It’s easy to go on and on here.

What’s harder to see is the separation (or fusion) of Gillman’s and Coryell’s respective coaching trees. Don never coached under or played for Gillman. And when I raised the question on Twitter, Josh Katzowitz responded with these tweets:

Josh Katzowitz: @smartfootball @FoodNSnellville From what I gathered, not much of a connection. Some of Don’s staff used to watch Gillman’s practices, tho.

Josh Katzowitz: @FoodNSnellville @smartfootball Coryell was pretty adament that he didn’t take much from Gillman. Tom Bass, who coached for both, agreed.

Coaching clinics were popular then, and Sid Gillman appeared from Josh’s bio to be a popular clinic speaker. I’m sure these two mixed and heard each other speak. But Coryell had a powerful Southern California connection in Coach John McKay of USC, and I’m not sure how much Coryell and Gillman truly interacted.

Pro Football Weekly is going away, and Mike Tanier has a great article discussing the causes of the demise. In the middle of the discussion, a reader who called himself Richie took it upon himself to start trashing “The Hidden Game of Football” (which factors in because Bob Carroll, a coauthor of THGF, was also a contributor to PFW). Richie seems to think, among other things, that everything THGF discussed was “obvious” and that Bill James invented all of football analytics wholesale by inventing baseball analytics. It’s these kinds of assertions I really want to discuss.

I think the issue of baseball analytics encompassing the whole of football analytics can easily be dismissed by pointing out the solitary nature of baseball and its stats, their lack of entanglement issues, and the lack of a notion of field position, in the football sense of the term. Since baseball doesn’t have any such thing, any stat featuring any kind of relationship of field position to anything, or any stat derived from models of relationships of field position to anything, cannot have been created in a baseball world.

Sad to say, that’s almost any football stat of merit.

On the notion of obvious, THGF was the granddaddy of the scoring model for the average fan. I’d suggest that scoring models are certainly not obvious, or else every article I have with that tag would have been written up and dismissed years ago. What is not so obvious is that scoring models have a dual nature, akin to that of quantum mechanical objects, and the kind of logic one needs to best understand scoring models parallels what a chemistry major might encounter in the junior year of university, in a physical chemistry class (physicists might run into these issues sooner).

Scoring models have a dual nature. They are both deterministic and statistical/probabilistic at the same time.

They are deterministic in that for a given down and distance to go, and with a specific play-by-play data set, you can calculate the expected points down to a hundredth of a point. They are statistical in that they represent the sum of dozens or hundreds of unique events, all compressed into a single measurement. When divorced from the parent data set, the kinds of logic you must use to analyze the meanings of the models, and formulas derived from those models, must take into account the statistical nature of the model involved.
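Both halves of that dual nature show up in even a toy calculation. The sketch below is an illustration only: the plays are made up, and scoring each play by the value of the next score is just one assumed convention among several that a real model might use.

```python
from collections import defaultdict

# Hypothetical plays: (down, to_go, yards_to_goal, points_of_next_score).
# A negative value means the opponent scored next.
plays = [
    (1, 10, 80, 7), (1, 10, 80, 0), (1, 10, 80, -3), (1, 10, 80, 3),
    (3, 2, 35, 3), (3, 2, 35, 7),
]

def expected_points(plays):
    """Average the next-score value for every (down, to_go, yards_to_goal) state."""
    totals = defaultdict(lambda: [0.0, 0])
    for down, to_go, yards_to_goal, points in plays:
        bucket = totals[(down, to_go, yards_to_goal)]
        bucket[0] += points
        bucket[1] += 1
    # Keep the sample size alongside each average.
    return {state: (total / count, count)
            for state, (total, count) in totals.items()}

ep = expected_points(plays)
```

Given the same data set, the same state always yields the same number, which is the deterministic side. The sample count carried alongside each average is a reminder of the statistical side: a value built from four plays deserves far less trust than one built from four hundred.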

It’s not easy. Most analysts turn models and formulas into something more concrete than they really are.

And this is just one component of the THGF contribution. I haven’t even mentioned the algebraic breakdown of the NFL passer rating they introduced, which dominates discussion of the rating to this day. It’s so influential that to a first approximation, no one can get past it.

Just tell me: how did you get from the formulas shown here to the THGF formula? And if you didn’t figure it out yourself, then how can you claim it is obvious?
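For readers who want to follow the algebra, here is a sketch, offered as an illustration rather than as THGF’s exact presentation. It assumes the starting point is the standard four-component NFL passer rating, and that none of the components hits its floor of 0 or cap of 2.375. Under those assumptions the four pieces collapse into a single yards-per-attempt style expression. The function names are mine.

```python
def clamp(x, lo=0.0, hi=2.375):
    """Each rating component is limited to the range 0 to 2.375."""
    return max(lo, min(hi, x))

def passer_rating(att, comp, yds, td, ints):
    """Standard NFL passer rating, with the component limits applied."""
    a = clamp((comp / att - 0.3) * 5)
    b = clamp((yds / att - 3) * 0.25)
    c = clamp(td / att * 20)
    d = clamp(2.375 - ints / att * 25)
    return (a + b + c + d) / 6 * 100

def linear_rating(att, comp, yds, td, ints):
    """The same formula with the limits dropped and the terms collected."""
    adjusted_yards = 20 * comp + yds + 80 * td - 100 * ints
    return (adjusted_yards / att + 0.5) * 100 / 24

# A made-up stat line where no component is clipped.
line = (300, 190, 2200, 15, 8)
standard = passer_rating(*line)
collapsed = linear_rating(*line)
```

Read this way, the rating treats a completion as worth 20 yards, a touchdown as worth 80, and an interception as costing 100. The two functions agree only while no component is clipped, so the collapsed form is an approximation at the extremes.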

The three sites we noted last year: Cool Standings, Football Outsiders, and NFL Forecast, are at it again, providing predictions of who is going to be in the playoffs.

 

Cool Standings uses Pythagoreans to do their predictions (and for some reason in 2011, ignored home field advantage), FO uses their proprietary DVOA stats, and NFL Forecast uses Brian Burke’s predictive model.

Blogging the Beast has a terrific article on “the play”. If you watched any Dallas-Philadelphia games in 2011, you’ll know exactly what I mean: the way LeSean McCoy, with a simple counter trap, treated the Cowboys line as if it were Swiss cheese.

Most important new link, perhaps, is a new Grantland article by Chris Brown of Smart Football. This article on Chip Kelly is really good. Not only is the writing good, but I love the photos:

Not my photo. This is from Chris Brown’s Chip Kelly article (see link in text).

as an example. Have you ever seen a better photo of the gap assignments of a defense?

After watching one or another controversy break out during the 2011 season, I’ve become convinced that the average “analytics guy” needs a source of play-by-play data on a weekly basis. I’m at a loss at the moment to recommend a perfect solution. I can see the play-by-play data on NFL.com, but I can’t download it. Worst case, you would think you could save the page and get to the data, but that doesn’t work. I suspect the use of AJAX or an equivalent technique that loads the data into the page after the initial HTML has been delivered. Good for business, I’m sure, but not good for Joe Analytics Guy.

One possible source is Pro Football Reference (PFR), which now has play-by-play data in their box scores, and has tended to present their data in an AJAX-free, user-friendly fashion. Whether Joe Analytics Guy can do more than use those data personally, I doubt. PFR is purchasing their raw data from another source. And whatever restrictions the supplier puts on PFR’s data legally trickle down to us.

Further, PFR is now calculating expected points (EP) along with the play-by-play data. Thing is, what expected point model is Pro Football Reference actually using? Unlike win probabilities, which have one interpretation per data set, EP models are a class of related models which can be quite different in value (discussed here, here, here). If you need independent verification, please note that Keith Goldner has now published 4 separate EP models (here and here): his old Markov Chain model, the new Markov Chain model, a response function model, and a model based on piecewise fits.

That’s question number one. Questions that have to be answered to answer question one are things like:

  • How is PFR scoring drives?
  • What is their value for a touchdown?
  • If PFR were to eliminate down and distance as variables, what curve do they end up with?

This last would define how well Pro Football Reference’s own EP model supports their own AYA formula. After all, that’s what an AYA formula is: a linearized approximation of an EP model where down and to go distance are ignored, with yards to score as the only independent variable.

Representative Pro Football Reference EP Values

Down   EP (1 yard to go)   EP (99 yards to go)
1      6.97                -0.38
2      5.91                -0.78
3      5.17                -1.42
4      3.55                -2.49
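To make the idea of a linearized EP model concrete, here is a deliberately crude sketch that draws a straight line through just the two first-down values in the table above. A real fit would use every yard line, and nothing here is claimed to be how PFR derives its numbers. The variable and function names are assumptions.

```python
# First-down EP values from the table: yards from the goal line -> EP.
near_yards, near_ep = 1, 6.97
far_yards, far_ep = 99, -0.38

# Slope and intercept of the straight line through the two points.
slope = (far_ep - near_ep) / (far_yards - near_yards)
intercept = near_ep - slope * near_yards

def linear_ep(yards_to_score):
    """Straight-line EP estimate with yards to score as the only variable."""
    return intercept + slope * yards_to_score
```

With these two endpoints the slope works out to about -0.075 points per yard, with roughly 7.05 points implied at the goal line. Comparing numbers like these against the coefficients in an AYA-style formula is the sort of check that published EP curves would make possible.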

 

My recommendation is that PFR clearly delineate their assumptions in the same glossary where they define their version of AYA. Make it a single click lookup, so Joe Analytics Guy knows what the darned formula actually means. Barring that, I’ve suggested to Neil Paine that they publish their EP model data separately from their play by play data. A blog post with 1st and ten, 2nd and ten, 3rd and ten curves would give those of us in the wild a fighting chance to figure out how PFR actually came by their numbers.

Update: the chart that features 99 yards to go clearly isn’t 1st and 99, 2nd and 99. Those are 1st and 10 values, 2nd and 10, etc at the team’s 1 yard line. The only 4th down value of 2011, 99 yards away, is a 4th and 13 play, so that’s what is reported above.

The fans were all nestled, all snug in their beds, while visions of clutch quarterbacks all danced in their heads.

Tim Tebow has managed to capture the imaginations of many announcers, fans, and analysts, including one Benjamin Morris. Ben posits, among other things, that Tebow is being held back by his own conservatism: that an inability to take passing risks in the first three quarters of the game is tossed aside in the fourth, and some truer representation of his passing skill emerges.

This isn’t the first time that Ben has speculated on the nature of young quarterbacks and interceptions (this link is the most important, but also see here and here). One counterintuitive notion that has come out of his analyses is that a lot of interceptions early in a quarterback’s career tend to be a good thing. It suggests a quarterback with exceptional skills testing those skills out. The idea that a talented cook has to get burned by his own grease to learn his chops spills over into the quarterbacking world.

A related question, important to NFC East fans: is Eli Manning clutch? This question was raised this year by Eli Manning’s exceptionally high ESPN QBR ratings relative to other metrics. People really got upset and claimed that the ESPN QBR was “busted”. But perhaps the ‘clutch’ factor actually saw something in Eli.

It’s almost a theme with the Giants that they fall behind and Eli either scores a couple late to win the game, or scores late to tie the game and then (win/lose) in overtime, or he puts on this furious rally that almost wins the game. They beat teams they shouldn’t, based on their Pythagoreans, and then lose to football patzers.

What to make of it? My unchecked gut feeling is yes, Eli is clutch, but his team is another question altogether. It’s difficult to know with fans, because emotions get the best of them: Donovan McNabb becomes Donovan McFlabb, good analysts try to prove that Jon Kitna is a better quarterback than Tony Romo, etc.

Thinking without benefit of numbers a bit further, Eli just doesn’t get ruffled. His play doesn’t suffer any effects of pressure. And that means, no matter how inadequate the team around him becomes, he’s still dangerous.

~~~

Kindle notes: just bought a Kindle Fire, and like it a great deal. It’s a better email platform than many web based email services, so it is useful to forward mail from those services to this device. I wish I could plug my camera into the Kindle and upload photos, but that will probably have to wait until Android 4 becomes a common base OS for these kinds of portable devices.

~~~~

Twitter notes: Chris Brown of Smart Football tweets well, and his is a useful feed if you’re at all interested. Trent Dilfer does quite a bit of good analysis via tweets. Surprisingly good is Doug Farrar, whose player analyses I tend to respect. I haven’t read much of Doug’s blog, Shutdown Corner, but given the character of his tweets, it might be worth a gander.

There are three interesting sites doing the dirty job of forecasting playoff probabilities. The first is Cool Standings, which uses Pythagorean expectations to calculate the odds of successive wins and losses, and thus the likelihood of a team making it to the playoffs. The second is a page on the Football Outsiders site named DVOA Playoff Odds Report, which uses their signature DVOA stat (a “success” stat) to generate the probability of a team making it to the playoffs. Then there is the site NFL Forecast, which has a page that predicts playoff winners using Brian Burke’s predictive model.
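For readers who haven’t met a Pythagorean expectation before, a minimal sketch follows. The exponent of 2.37 is a commonly quoted value for the NFL and is an assumption here; I don’t know which exponent Cool Standings actually uses, and the function names are mine.

```python
def pythagorean_expectation(points_for, points_against, exponent=2.37):
    """Estimated winning percentage from points scored and points allowed."""
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)

def expected_wins(points_for, points_against, games=16, exponent=2.37):
    """Scale the estimated winning percentage to a season of `games` games."""
    return games * pythagorean_expectation(points_for, points_against, exponent)

# A made-up team that has scored 400 points and allowed 300.
win_pct = pythagorean_expectation(400, 300)
```

A team that scores exactly as much as it allows comes out at .500, and the estimate climbs as the point differential grows. A forecasting site then has to turn numbers like this into game-by-game odds, which is where the approaches start to differ.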

Of the three, Cool Standings is the most reliable in terms of updates. Whose model is actually most accurate is something any individual reader should try and take into consideration. Pythagoreans, in my opinion, are an underrated predictive stat. DVOA will tend to emphasize consistency and has large turnover penalties. BB’s metrics have tended to emphasize explosiveness, and now recently, running consistency, as determined by Brian’s version of the run success stat.

I’ve found these sites to be more reliable than local media (in particular Atlanta sports radio) in analyzing playoff possibilities. For a couple weeks now it’s been clear, for example, that Dallas pretty much has to win its division to have any playoff chances at all, while the Atlanta airwaves have been talking about how Atlanta’s wild card chances run through (among other teams) Dallas. Uh, no they don’t. These sites, my radio friends, are more clued in than you.

The Stathead blog is now defunct and so, evidently, is the Pro Football Reference blog. I’m not too sure what “business decision” led to that action, but it does mean one of the more neutral and popular meeting grounds for football analytics folks is now gone. It also means that Joe Reader has even less of a chance of understanding any particular change in PFR. Chase Stuart of PFR is now posting on Chris Brown’s blog, Smart Football.

The author of the Armchair Analysis blog, Jeff Cross, has tweeted me telling me that a new play by play data set is available, which he says is larger than that of Brian Burke.

Early T formations, or not?

Currently Wikipedia is claiming that Bernie Bierman of the University of Minnesota was a T formation aficionado.

U Minnesota ran the T in the 1930s? Really?

I’ve been doing my best to confirm or deny that. I ordered a couple of books.

No mention of Bernie's T in this book.

I've skimmed this book, and haven't seen any diagrams with the T or any long discussion of the T formation. There are a lot of unbalanced single wing diagrams, though.

I also wrote Coach Hugh Wyatt, who sent me two nice letters, both of which state that Coach Bierman was a true blue single wing guy. In Bierman’s book, “Winning Football”, I have yet to find any mention of the T, and in Rick Moore’s “University of Minnesota Football Vault”, there is no mention of Bernie’s T either.

I suspect an overzealous Wikipedia editor had a hand in that one. Given that Bud Wilkinson was one of Bernie’s players, a biography of Bud Wilkinson could be checked to see if the T formation was really the University of Minnesota’s major weapon.
