September 03 2014 12:00PM
(In part 1 I discussed the current state of advanced stats in the NHL with a view to defining an "ideal state" for NHL clubs in their efforts to establish modern analytics departments. In part 2, we look at where this form of analysis came from and where it may be headed in the future.)
“I’ve never said, never thought, that it was better to be an outsider than it was to be an insider, that my view of the game was better than anyone else’s. It’s different; better in some ways, worse in some ways. What I have said is, since we are outsiders…let us use our position as outsiders to what advantage we can. Let us back off from the trees, look at the forest as a whole, and see what we can learn from that.”
- Bill James
As an early adopter and advocate of possession-based analysis, I found that perhaps the most common complaint over the years was this: if Corsi was so valuable, why was it not actively employed by those who make their living inside the game? If the virtues of this analysis are so clear, why didn't the experts come up with it? How could a bunch of no-name amateurs create something that could be of value to experienced, lifelong hockey men?
It's a reasonable question. The counter-intuitive truth of the matter is that hobbyists and outsiders enjoy distinct advantages that make such advances more probable outside of the NHL's official sphere. I've written previously about the potential value of the outsider perspective, as well as the structural obstacles that dissuade this sort of internal innovation, but a fuller explication is perhaps in order.
Top Down Vs Bottom Up
Although teams are adding all sorts of adjunct layers to their front offices these days, most teams still operate as a top-down hierarchy, with a rigid, fixed power structure. The league is also notoriously insular, with the vast majority of executives coming from the ranks of ex-players, family members of past decision makers and lawyers/player agents. In addition, the number of open positions in a system with a given number of teams is also more or less fixed. The result is that the NHL as a business is less likely to experience new, disruptive models that challenge established ways of thinking.
As a result, there is a distinct lack of intellectual diversity. In fact, there are disincentives to stepping too far outside of the box. Just about everyone who rises to power in the league has been steeped in the same "hockey culture" for decades. There is an implicit antipathy towards things that don't conform to NHL norms, and therefore an inherent risk to being "too different" for those whose career aspirations lie within league walls. Although there is significant attrition and relatively low career security in NHL positions from coaches all the way up to the GM chair, there is also significant churn and inter-team recycling within the confines of the NHL. This means one can fail, but fail successfully (i.e., retain legitimate employment options), by not colouring too far outside the lines. It's one thing to lose by perfectly conventional means. It's quite another to fail while being stigmatized as odd by the rest of the league.
There is also a distinct difference in information flow between NHL teams and the sort of de facto, crowd-sourced peer review that produced Corsi stats. NHL clubs are separate and disconnected, particularly when it comes to chasing strategic insight that will confer competitive advantages. As such, any particular insights that are gleaned from work inside individual franchises are hoarded and protected as state secrets. In effect, NHL teams are the proverbial collection of blind men trying to describe the elephant by feeling a single portion of the animal: they each have bits of information that are only portions of the whole.
In contrast, it's much easier for amateurs to work on big picture innovation outside of the hierarchies and incentive structures within the league. Possession theory grew not only from the key insights of a handful of pioneers, but also from the rapid sharing and subsequent iterations engendered by digital self-publishing and social media. Blogs and the like allowed an ever-growing collection of amateurs to bring their widely diverse range of skills and abilities to bear on the issue of better understanding hockey. What's more, they were never subject to the pressures that exist in the league, such as the threat of being fired or ostracized. The outsiders could focus on the big picture problems season after season and share their research freely, unencumbered by the tethers that would traditionally bind would-be innovators inside the NHL.
The Times Are Changing
The issue of new digital self-publishing raises the point that the technology, tools and, indeed, the game itself, have undergone fairly rapid changes over the last decade or so. The vast majority of the NHL's decision makers grew up in an environment drastically different from the one we've seen emerge in the cap era: a league of increasing parity, complex budgeting demands and huge amounts of readily available statistical data. The intellectual demands placed on coaches and GMs, as well as the skills and tools required to meet those demands, have likely evolved faster than the men in authority.
Furthermore, it's a fair inference that the general mistrust towards stats one sees from many front offices isn't merely a knee-jerk reaction to the unfamiliar. For the vast majority of the league's existence, scouts, coaches and general managers learned to be deeply skeptical of the predictive power of numbers - at least, of the numbers that were available, such as goals, assists and plus/minus.
Though many GMs still obviously weight certain numbers heavily in their decision making (like goal scoring, for instance), they also deeply respect players with well established resumes (ostensibly limiting the risk of not "knowing" what a guy is) and give considerable regard to intangible factors like personality, attitude, role etc.
Possession-based theory has also confirmed that there are significant gaps between things like conventional counting stats and a player's true value on the ice. In the absence of this knowledge, however, these gaps significantly increase the risk of failure in player evaluation and prediction. Thus, the entirely adaptive heuristic that "numbers don't tell you everything" evolved.
Of course, decision makers attempted to fill those gaps with the data and mechanisms they had at the time, including scouting, interviews and background checks: a swath of primarily qualitative information aimed at divining a player's true "core" self and whether he had the sort of game style and personality traits that would ultimately translate to success in the NHL. The official formula is to mix those things together to varying degrees, weight a player's recent performance heavily, and then add a dash of gut feeling and executive intuition to build rosters.
And most of them did (and do) this quite well (with a few obvious exceptions). The question isn't whether traditional hockey men are bad at their jobs, but whether there's room to do them better.
More and More Complexity
The enduring suspicion of stats gives rise to the other most common charge against advanced stats in hockey: that the game is too dynamic and too complex to effectively capture with numbers. In fact, the exact opposite is actually true. The complexity of the game increases the need for systematic data collection and analysis rather than invalidating it.
With the differences between teams and players narrowing due to parity, it is becoming ever more difficult to tease apart the signal from the noise in the NHL - even for highly trained, highly experienced professionals. Like two equally talented poker players facing each other, the random drop of the cards becomes more influential to the outcome than it would between two unequal opponents. As parity increases, so too does the sway of chance.
The human brain tends to be great at detecting patterns, but lousy at understanding which of them are products of chance. We are, by nature, practically innumerate and blind to things like base rates and natural variance in big data sets. We mistake illusory correlations for causation and find narratives appealing and explanatory, even though they tend to be reductive and dependent on a heaping dollop of hindsight bias. In the face of increasing complexity of information, people tend to retreat to comforting cognitive paradigms such as convention, ritual and even superstition to re-establish a perception of control.
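To put rough numbers on the poker analogy, here is a minimal Python sketch. It rests on a deliberately crude assumption that is not drawn from real NHL data: every game is an independent event in which the stronger team wins with a fixed probability p, and the better team "proves" itself by winning a best-of-seven series. The function name series_win_prob and the sample values of p are purely illustrative.

```python
from math import comb


def series_win_prob(p, games=7):
    """Chance the stronger team wins a best-of-`games` series.

    Assumes (as a simplification) that each game is independent and the
    stronger team wins any single game with probability p. Winning the
    series is equivalent to winning a majority of `games` played out in full.
    """
    needed = games // 2 + 1  # 4 wins in a best-of-seven
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(needed, games + 1)
    )


if __name__ == "__main__":
    # Hypothetical single-game edges, from a lopsided matchup down to parity.
    for p in (0.70, 0.60, 0.55, 0.50):
        print(f"single-game edge {p:.2f} -> series win chance {series_win_prob(p):.3f}")
```

Under these assumptions, the series result tracks the better team less and less reliably as p slides toward 0.50, at which point the series is a pure coin flip. That is the sense in which parity hands more of the outcome to chance.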
The idea that hockey is dynamic and complex is a recognition of the difficulty of data gathering, but nevertheless an argument in favour of disciplined, systematic statistical analysis, not against it. When used correctly, "big data" can be harnessed to tease skill from circumstance and chance in ways that may be next to impossible through intuition or qualitative means alone.
The Way Forward
Corsi stats have evolved to the degree that they can likely improve NHL decision making in some general, broad-brush ways right away. They fill some of those aforementioned conventional stat gaps and provide a mechanism for identifying and understanding the role of variance in the league.
As mentioned in the previous article, the next step is marrying possession theory and statistical modeling with existing experience and knowledge at the NHL level. Specifically, teams have the opportunity and the wherewithal to leverage the work that has been done in the wild so far and then apply their internal resources, scouts and video tools to yield new insights.
Deep data mining and granular analysis combined with video and observation can teach clubs about the skills and tactics that influence possession at both the individual and team levels. By discovering what further moderates possession, teams could begin to develop robust models that predict a player's future possession and goal differential based on his age, results and usage. They could also adjust their on-ice tactics to combat opponents' strategies and build rosters. Most importantly, they can build internal processes to test their own hypotheses and assumptions while combating some of the counter-productive cognitive short-cuts that currently tend to skew rational decision making.
The infiltration of new stats in the NHL has begun. What happens from here depends on how seriously teams take these new tools and how well they manage to build their departments and integrate them into the decision making process at all levels of the organization.