Hitting, Shot Differentials and Variance

A couple of things running through my head between Flames games today. The first topic has to do with hitting and winning versus shooting and winning. The second topic is on how advanced analysis in hockey is gaining prominence (but is still obviously misunderstood).

Outshoot rather than outhit

The NW division has become oddly enamored with tough guys and pugilists recently. The Flames traded for Brian McGrattan, the Canucks claimed Tom Sestito and the Oilers acquired Mike Brown. This grinder parade was foreshadowed by Don Cherry a couple of weeks ago when he publicly praised the Leafs for deploying Colton Orr against the Rangers (the Leafs lost the game in question and were outshot handily, by the way). Nevertheless, Toronto has gone on a nice percentages-based run in the last few weeks and now other decision makers (particularly of other mediocre teams) seem convinced that adding grit might be a quick fix for what ails them.

This has sparked a debate about the value of hits, particularly versus the value of possession, since most of the guys in question get outshot at the best of times, even when they are limited to 4th line minutes.

Of course, there is a way to investigate such claims. Tyler Dellow looked at team hit rates and win rates in games from last season. The conclusion: 

The data is, of course, hilarious. As a whole, teams did far better when they got outhit than when they outhit the other side. I suspect that there are two main reasons for this: first, there probably is a great deal of truth to the argument that teams without the puck hit more, which doesn’t facilitate scoring. Second, there’s probably an element of teams that are behind deciding to focus on laying the body to try and turn the momentum – “Send out the energy line!” I suspect that what shows up here contains some score effects although, we know that trailing teams tend to possess the puck more, which would seem to give them less opportunity to hit.

Another stats-inclined Oiler fan, Michael Parkatti, took a look at the correlation between hitting and things like shots for, shots against and points here. Again, nothing there:

not one r-square correlation was above 0.04, while most were below 0.01… meaning that less than 1% of the variability in shots or points could be explained by the variability in hits. The relationship is statistically random — some high hit teams are good, some are bad, and vice versa. There must be something else to explain why good teams are good?

Michael went the next step and did a regression analysis on shot differential and points. His findings in this case were the opposite:

I can say with 99.9999999999985% confidence that there is a statistically significant positive relationship between shot differential and standings points.

Furthermore, the analysis allowed Michael to develop a regression equation for the relationship between points and shots, allowing us to forecast standings points for each team based on shots. The equation is Standings Points = 91.68 + Shot Differential * 0.0275, where "shot differential" is a team’s shots for minus shots against, prorated over a full 82-game schedule. For the Flames, that comes out to 91.68 + (-7.8 * 0.0275) = 91.5, or about 92 points over a full season, right in the 7-10 area where they have settled in the last few years. By the way, for this truncated year that comes out to about 53 points – two points out of the assumed cut off of 55.
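For anyone who wants to plug in their own team, the arithmetic can be sketched in a few lines of Python. This is only a minimal sketch using the intercept and slope quoted above; the function names are my own, and the 48-game length used to prorate the shortened season is an assumption on my part rather than something taken from Michael's analysis.

```python
# Coefficients quoted in the post for the points-vs-shot-differential regression.
INTERCEPT = 91.68
SLOPE = 0.0275
FULL_SEASON_GAMES = 82


def projected_points(shot_differential: float) -> float:
    """Full-season standings points implied by a full-season shot differential."""
    return INTERCEPT + SLOPE * shot_differential


def prorate(points: float, games: int) -> float:
    """Scale a full-season points total down to a shorter schedule."""
    return points * games / FULL_SEASON_GAMES


# Flames: shot differential of -7.8 once prorated to a full season.
flames_full = projected_points(-7.8)

# Assumption: the shortened schedule is 48 games long.
flames_short = prorate(flames_full, 48)

print(f"Full-season projection: {flames_full:.1f} points")
print(f"Shortened-season projection: {flames_short:.1f} points")
```

The slope is small enough that a team has to outshoot its opponents by roughly 36 shots over a season to move the projection by a single point, which is why a near-even club like the Flames lands almost exactly on the intercept.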

The point here isn’t to discount hitting or physicality altogether. Hockey is a tough game and contact can have its uses in certain circumstances, including separating a man from the puck, intimidating the opponent, etc. Of course, hitting can also have little-to-no effect in most neutral situations and can actually be detrimental if a guy takes himself out of the play in order to hammer someone (a consistent Dion Phaneuf failing during his time here).

The problem for NHL decision makers seems to be the weighting of hitting and toughness beyond their utility. To be blunt, if you’re trading puck possession/outshooting for hitting/toughness, you’re almost certainly making your team worse, not better.

NHL Teams Interested in Stats! (but are still suspicious of them for the wrong reasons)

Yesterday, Elliotte Friedman published an interesting post on the recent Sloan conference (where NHLN contributor Eric T. presented his findings on zone entries). Friedman notes a handful of clubs attended the conference this year whereas it was just the Vancouver Canucks four years ago (no, the Flames were not amongst them).

There are a few bits on how teams are still secretive and proprietary with their own stats/models but still aren’t really sure if they have found anything paradigm-shifting or truly useful. What stuck out to me, though, were some of the perceived strikes against statistical analysis presented in the piece.


One of the biggest issues with analytics is that its most hardcore followers tend to discount things like "heart" or "clutch performance" because they are not quantifiable.

Well, no, that’s not quite accurate. The issue with "heart" isn’t that it’s not quantifiable, strictly. We all know that humans vary in their degree of motivation, volition, passion, work ethic. Typically the reason I discount tales of "heart" and similarly fuzzy concepts is because they are too often marshaled in NHL analysis incorrectly: as the easy, go-to narrative when an outcome is extraordinary or unexpected.

This tendency comes from years and years of sports reporters and decision makers mistaking natural variance in the game as true performance signals and, absent any other explanation, assigning blame or credit to some plausible personal psychological attributes of the team/players. I say this because ever since I began to track and understand percentages and regression to the mean in the NHL I’ve noticed that approximately 9/10 stories about a club’s lack of heart or incredible passion leading to losses or wins can be explained by low or high percentages. Whenever the percentages regress to the mean, wiping out the extraordinary results, so do the pop-psych articles in the MSM.

As an NHL coach or GM, I would certainly be cognizant of a given player’s commitment level to the team and to winning. However, I’d make sure to temper those considerations with the knowledge that I may mistake my liking a guy for his having a "good character" (subjectivity trap) and that our perception of a player’s performance and even underlying personality traits can be greatly influenced by results; results that are non-trivially dependent on things beyond that guy’s control.

As for clutch performance, it’s discounted because it is quantifiable and most have found that it doesn’t truly exist as a skill, i.e. something that can be reliably replicated or predicted. The perception of clutch is more or less the combination of some players simply being better than others + the natural ebb and flow of luck.

The issue in analytics isn’t merely assigning grades or blame after the fact – that’s the easy part. It’s finding metrics and models that help you predict outcomes with a certain degree of accuracy. As Gabriel Desjardins so eloquently put it, "when people say that a team lacks the ‘intangibles’ to win, they are *making sh*t up* after the fact to match the results." The human tendency to see patterns in noise, to offer post-hoc rationalizations that neatly and easily explain events and to mistake hindsight for foresight means even honest attempts at analysis are beset by difficulties and pitfalls. That’s why doing this sort of work is an odd mixture of humility before the facts and endless skepticism.


Oakland A’s general manager Billy Beane, the hero of Moneyball, blamed his team’s playoff failures on "luck." That’s a cop-out.

People who object to the term "luck" as used here don’t seem to understand what it means. The word comes with an unfortunate connotation of "not deserving" or "completely random". Outcomes in professional sports are weighted probabilities, not destinies, so it’s entirely possible for the better team to lose on any given night or even over a brief sample of games, like a best of seven series, for no reason other than variance. There are also other influences beyond the control of the players, coaches and GMs of course: the officiating, injuries to key players, etc. Sports are interesting not only because of the action, competition and violence, but because they are a boiling cauldron of uncertainty. Sometimes the underdog wins. And sometimes it’s not because of any particular failing of the favorite.
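To put a rough number on how often variance alone can sink the better team, here is a minimal sketch. It assumes every game is an independent trial with a fixed win probability for the favorite (a simplification that ignores home ice, injuries and matchups), and the 55% figure is purely illustrative, not an estimate for any real team.

```python
from math import comb


def series_win_probability(p: float, wins_needed: int = 4) -> float:
    """Chance that a team with single-game win probability p takes a
    best-of-(2 * wins_needed - 1) series, assuming independent games."""
    games = 2 * wins_needed - 1
    # Winning the series is equivalent to winning at least `wins_needed`
    # games if all `games` were played out, so sum the binomial tail.
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins_needed, games + 1)
    )


favorite = 0.55  # hypothetical single-game win probability for the better team
upset_chance = 1 - series_win_probability(favorite)
print(f"Favorite loses the best-of-seven about {upset_chance:.0%} of the time")
```

Under those assumptions a team that is a 55% favorite every night still drops the series roughly 39% of the time, without any failing of heart or character required.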

I understand the reluctance amongst coaches and players to deploy "luck" as an explanation for wins or losses since it is completely unsatisfying to our sensibilities (and might be used as an excuse when things legitimately go wrong). However, the insistence that variance doesn’t really exist in the NHL and every outcome can be 100% explained by controllable factors causes people to fabricate stories and too often leads decision makers down the wrong path.

  • Bean-counting cowboy

    Kent, I hope that eventually when the powers that be come to use and fully appreciate advanced analytics (matter of time IMO), that you and other earlier pioneers eventually get put on the pedestal you deserve (maybe I will see your name in a textbook someday!)

    • Appreciate that. Honestly I’m not looking for any of that stuff. To tell you the truth, I’m mostly a student of the brighter minds doing this work and a willful disseminator of the findings and discussion.

      If we continue to move along the path and understand the game on a deeper level, I’m happy.

      • beloch

        re: luck – I think that coaches do use luck as an explanation sometimes. I think of the “well we think we deserved a better fate tonight…” cliche, or the “bounces didn’t go our way” kind of thing.

        P.S. great article, but I’m still hoping for the Bayesian analysis ;)

  • Bean-counting cowboy

    Kent, your point regarding luck is probably the most important single point in the entire analytics vs. (you name it – scouting/eye-test/haven’t played the game) debate. It is nearly impossible for a large segment of people to understand that the better team doesn’t always win, the result isn’t always fair, shit happens. Luck is the correct word to use, but since the general usage of the word implies that it is in lieu of skill, it’s a word too polluted to ever use in this type of discussion.

  • jeremywilhelm

    Funny thing in this whole McGrattan Sestito acquisition thing, is we know Vancouver uses advanced stats, and usually quite effectively, and yet, they still went and got a pure pugilist.

  • RexLibris

    Great article Kent. Thoroughly enjoyed it, even the bits with numbers and stuff. =)

    You may have seen this already, but for anyone who hasn’t here is a clip of Mark Cuban taking commentator Skip Bayless down a notch for using “vague generalities” to describe the Mavericks vs the Heat.

    Very good discussion and something that any fan of analytics will enjoy.

  • RexLibris

    Good article Kent,

    Quick clarification on that regression equation: it’s expressed over an 82 game schedule, so you have to prorate the totals to that.

    For the flames, if you prorate their shot deficit of -2 over the entire season, it’d be -7.8.

    y = 91.68 + (-7.8 * 0.0275) = 91.5, so essentially the exact same thing, since the Flames are so close to even anyways.

    They’re on a 86 point pace right now, so pretty close to the equation. This is telling us that they’re just not hitting the averages right now, likely due to Kipper’s injury.

  • I think one of the best examples of luck comes from this season.

    The hawks “luck” in combination with the horseshoe up emerys butt got them the win in a game where the flames dominated them.

    By extension, the hawks luck and subsequent “streak” has extended out to have become a league wide phenomenon that has actually captured a small percentage of the cognizance of the big sports media *cough* espn *cough* .

    And thats a good thing. So in effect the flames NOT getting the win in that game has benefitted the league overall. Ironically.

    Big marquee team with solid players, tremendous fans, big city.. so the luck gods made the flames a small sacrifice in a bigger picture.

    I think Im fine with it.

  • Purple Hazze

    Great article!

    Interesting note about teams being secretive and proprietary with their own stats/models. Read an interesting blog post the other day about the Sloan conference and how most of the good basketball analytic guys had been snatched up by the various teams in the NBA and because of all the secrecy there’s not many people left in the public forum to push the analytics further.

  • beloch

    I’d be really interested if you guys somehow managed to interview Chris Snow. Specifically, I wonder what he’s telling Feaster and Co. about the Flames’s chances down the road and whether he thinks the team should be buying or selling. I’m sure they have him gagged and under lock and key though.

  • beloch

    I’d also really love to see someone ask Feaster this:

    “Since the season of 2006/2007, the Flames have averaged 93.7 points/season. The average 9th place team in the same time-frame has had 91.2 points. For 6 years the Flames have been a bubble team and, under your watch, this has not changed. What’s your plan for the future? Stay in the bubble and hope to get lucky?”

  • beloch

    Any stats on the increase in energy the Flames seemed to play with after McGrattan had his hit shift against the Canucks or even how much more life there was in the building? How about the change it seemed to make to Vancouver’s game plan as now they were worried about what he might do next?

    • beloch

      There have been studies done on the effect of fighting on the outcome of games. The study I’m familiar with used GVT, and found that (ignoring the effect fighting has on the other team, which is usually exactly the same as on the first team) 1 fight is worth something like 1/10 of a goal. Meaning, in order to be worth a single point in the standings (in other words, to have any effect on the season at all), a fighter would need to contribute 30 fights (30 fights = 3 goals = 1/2 wins). In order to be worth a single win, it would require 60 fights.

      And, as mentioned, this is ignoring the effect the fight has on the other team, which is usually exactly the same as the effect it has on your team.

  • Fighting will never be statistically significant. We know this. But, there is a reason that fighters are in the game. The same reason as why the flames have won the last 2 with McGrattan in the lineup. Fighters or tough guys bring confidence for the team. Can you statistically defend this? No. Fighting, toughness, confidence and energy can not be statistically defended. But they are invaluable to the players on the team. I’ve got into this argument numerous times on this site, but the belief here is stats are the be all, end all. Its funny to me. Moneyball has its merits, but there is reason the oakland a’s have not won the world series. I read Jamie McLennan’s book, and there he stated the 04 flames season changed when sutter challenged the whole team to be tougher, meaner and have more passion. The result, stanley cup finals. I’d be interested to see the “stats” and “corsi” of the teams of that season and how they would compare to the flames.. My guess detroit should have owned them. Toughness has a part in the game, like it or not.

    • It is hard to explain some of the unique pieces to this game with some people fixated on data. Some things simply are not defensible in a data driven situation. Yet, a more talented Vancouver team lost to the physicality of a Bruin team who played the body at every turn. That is not supposed to win games according to the data. McGrattan has already won us a game imo but the data does not support the difference he made that night so is not valid. I don’t discount data but can data not be manipulated? I bet Hartley has his own data that supports his decisions. I know Babcock does. BTW, the Wings hit the Oilers every time the Oil had the puck tonight. Not the bone crunching kind but they removed players from the puck all the time. Yet, many of these were not even registered as hits on the stats. Could the data collection be flawed?

      • SmellOfVictory

        The data don’t describe -how- a team outshoots another team, only that it does. Nowhere does it say “you can’t win against another team by physically removing them from the puck”; it’s just saying that if you’re hitting more, it means you have the puck less (because otherwise you’d be taking an assload of minor penalties) and are therefore likely losing the puck possession battle. And hits absolutely are a flawed measurement. Even shots aren’t perfectly counted, but hits are incredibly subjective and not at all reliable.

        Similarly, it’s not saying it isn’t useful having a guy like Lucic kicking around and possibly scaring the pants off of people. However, a) that factor cannot be measured, and b) teams should not seek out an intimidation/physical element at the expense of other elements (skating, shooting, etc).

  • SmellOfVictory

    Hockey is a combination of skill and strength.

    I believe Grit, toughness and particularly size is a difference maker especially in the playoffs.. as well as “heart”, tenacity, sacrifice.. all those things.

    Stats tell one side of the story, but no matter how many shots you fire at the net, or where you are on the ice… if you can’t dominate another team or intimidate them in any way, chances are you won’t be successful. Most teams that have won a cup have proven this.