Nation World HQ
September 01 2014 12:00PM
This guide is an overview of the NHL's advanced (or "fancy") stats for the media and newcomers. It includes definitions of the key advanced stats concepts, plus an FAQ that addresses some of the most common questions about these measures. It is not meant to be completely comprehensive, only a useful introduction to possession-based analysis.
Corsi numbers are proxies for offensive zone puck possession. That means the higher the number, the more time a player or team spends in the attacking end of the ice.
Corsi - Total shots at the net for and against at even strength, including missed shots, blocks, goals and saved shots. Can be expressed as a differential (+/-) or a ratio (%).
Named after goalie coach Jim Corsi, who initially developed the measure to track goalie performance. Differentials are often converted into a rate stat to correct for ice time (corsi differential/60 minutes of ice time).
Corsi For (CF) - All the shot attempts at the net for a given player or team at even strength. The "offensive" half of typical corsi differential.
Corsi Against (CA) - All the shot attempts at the net against for a given player or team at even strength. The "defensive" half of typical corsi differential.
Relative corsi (corsi rel) - The difference between a team's corsi rate when a player is on and off the ice at even strength. For example, if a team generates a corsi of +2 corsi per 60 minutes with Joe Hockey on the ice and that drops to -3 corsi per 60 minutes when he isn't, his relative corsi is +5 corsi per 60.
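The definitions above reduce to a few lines of arithmetic. What follows is a minimal sketch, not an official formula from any stats site: the function names and the season totals in the example are hypothetical, chosen only to illustrate the differential, ratio, per-60 rate and relative forms.

```python
def corsi_differential(cf: int, ca: int) -> int:
    """Shot attempts for minus shot attempts against."""
    return cf - ca


def corsi_percentage(cf: int, ca: int) -> float:
    """Share of all shot attempts that went in the player's or team's favour."""
    total = cf + ca
    if total == 0:
        raise ValueError("no shot attempts recorded")
    return 100.0 * cf / total


def corsi_per_60(cf: int, ca: int, toi_minutes: float) -> float:
    """Corsi differential scaled to 60 minutes of even strength ice time."""
    if toi_minutes <= 0:
        raise ValueError("ice time must be positive")
    return (cf - ca) * 60.0 / toi_minutes


def relative_corsi(on_ice_per_60: float, off_ice_per_60: float) -> float:
    """Team corsi rate with the player on the ice minus the rate without him."""
    return on_ice_per_60 - off_ice_per_60


if __name__ == "__main__":
    # Hypothetical totals: 550 attempts for, 450 against, 1000 minutes played.
    print(corsi_differential(550, 450))
    print(corsi_percentage(550, 450))
    print(corsi_per_60(550, 450, 1000.0))
    # The Joe Hockey example: +2 per 60 on the ice, -3 per 60 off the ice.
    print(relative_corsi(2.0, -3.0))
```

With the Joe Hockey numbers, relative_corsi returns +5, matching the example in the definition.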
Fenwick (FF) - Total shots at the net for and against at even strength except for blocked shots.
Named after hockey blogger Matt Fenwick, who hypothesized removing blocked shots from corsi would result in better correlation with scoring chances.
Fenwick relative (FF rel) - same as relative corsi, except without blocked shots.
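The only difference between corsi and fenwick is whether blocked shots are counted. Here is a small sketch of that distinction. The ShotAttempts class and the single-game counts are hypothetical and are used purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShotAttempts:
    """Even strength shot attempts in one direction (for or against)."""
    goals: int
    saves: int
    misses: int
    blocks: int

    def corsi(self) -> int:
        # Corsi counts every attempt: goals, saved shots, misses and blocks.
        return self.goals + self.saves + self.misses + self.blocks

    def fenwick(self) -> int:
        # Fenwick is the same count with blocked shots left out.
        return self.goals + self.saves + self.misses


if __name__ == "__main__":
    # Hypothetical single-game totals.
    attempts_for = ShotAttempts(goals=3, saves=27, misses=12, blocks=14)
    attempts_against = ShotAttempts(goals=2, saves=25, misses=10, blocks=11)
    print(attempts_for.corsi() - attempts_against.corsi())
    print(attempts_for.fenwick() - attempts_against.fenwick())
```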
Corsi/Fenwick close - A team or player's corsi or fenwick rate when the game is within a goal. Created to correct for playing to score effects (see below).
Corsi/Fenwick tied - A team or player's corsi or fenwick rate when the game is tied. Created to correct for playing to score effects (see below).
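In practice, "close" and "tied" are filters applied to the list of shot attempts before the ratio is calculated. The sketch below assumes a hypothetical event format, where each event is a pair of (was the attempt for the team, score differential at the time). Real play-by-play data will be structured differently.

```python
from typing import List, Tuple

# (is_for, score_diff): is_for is True for an attempt by the team being
# measured; score_diff is that team's goals minus the opponent's goals
# at the moment of the attempt. This layout is an assumption.
Event = Tuple[bool, int]


def corsi_pct_by_score_state(events: List[Event], max_abs_diff: int) -> float:
    """Corsi ratio using only attempts taken while |score_diff| <= max_abs_diff."""
    kept = [is_for for is_for, diff in events if abs(diff) <= max_abs_diff]
    if not kept:
        raise ValueError("no shot attempts in the requested score state")
    return 100.0 * sum(kept) / len(kept)


if __name__ == "__main__":
    # Hypothetical game: the team sits back once it leads by two or three.
    events = [
        (True, 0), (True, 0), (False, 0),
        (True, 1), (False, 1), (True, -1),
        (False, 2), (False, 2), (False, 3),
    ]
    print(round(corsi_pct_by_score_state(events, 99), 1))  # all score states
    print(round(corsi_pct_by_score_state(events, 1), 1))   # close
    print(round(corsi_pct_by_score_state(events, 0), 1))   # tied
```

In this made-up game the overall ratio is dragged down by attempts conceded while protecting a multi-goal lead, which is exactly what the close and tied filters are meant to strip out.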
Goal scoring is controlled by two primary processes in the NHL: volume (possession) and frequency (percentages). These stats measure the rate at which the puck goes in the net (frequency) with a player on the ice. They tend to heavily regress to the mean over time, so they are considered proxies for luck or variance.
PDO - The sum of on-ice save percentage and on-ice shooting percentage at even strength. League average PDO is 100. Sums considerably higher or lower (+/-2.5) tend to regress towards 100 over large samples for both teams and players. Named after the internet alias of the man who conceived the stat. Could be considered an acronym for "Percentage Determined Outcomes".
On-ice Save Percentage (SV%ON) - The save rate of a team's goaltenders while an individual player is on the ice at even strength. Skaters have no discernible effect on this number. League average is around .920 (92.0%).
On-ice Shooting Percentage (SH%ON) - The rate at which a player's team scores on its shots while he is on the ice at even strength. The quality of a player and his linemates does seem to have some effect on this number, such that we would expect a modest spread around the league mean of 0.08 (8%) due to skill effects. However, it takes many thousands of shots to differentiate skill from random variance.
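PDO is simply the two on-ice percentages added together. The following sketch assumes "shots" means shots on goal (goals plus saves), which is the usual convention but is not spelled out above. The function names and the sample totals are hypothetical, and the 2.5-point flag is just the rule of thumb quoted in the PDO definition.

```python
def on_ice_shooting_pct(goals_for: int, shots_for: int) -> float:
    """Percentage of the team's shots on goal that go in with the player on the ice."""
    return 100.0 * goals_for / shots_for


def on_ice_save_pct(goals_against: int, shots_against: int) -> float:
    """Percentage of opposition shots on goal that are stopped with the player on the ice."""
    return 100.0 * (shots_against - goals_against) / shots_against


def pdo(goals_for: int, shots_for: int, goals_against: int, shots_against: int) -> float:
    """Sum of on-ice shooting and save percentage; league average is 100."""
    return (on_ice_shooting_pct(goals_for, shots_for)
            + on_ice_save_pct(goals_against, shots_against))


def likely_to_regress(pdo_value: float, threshold: float = 2.5) -> bool:
    """Flag a PDO more than `threshold` points away from 100."""
    return abs(pdo_value - 100.0) > threshold


if __name__ == "__main__":
    # Hypothetical skater: 50 goals on 500 shots for, 30 goals on 500 shots against.
    value = pdo(goals_for=50, shots_for=500, goals_against=30, shots_against=500)
    print(value, likely_to_regress(value))
```

A league-average line (8% shooting, 92% saves) comes out to 100, while the hypothetical skater above lands well outside the 2.5-point band.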
Aside from a player's skill level, there are a number of circumstances that influence corsi results, including tactics, quality of linemates, quality of competition and where the player tends to start his shifts. The following stats attempt to account for many of these factors.
Zone Starts (ZS) - The ratio of offensive zone faceoffs to defensive zone faceoffs for a player at even strength. Usually expressed as a percentage. A rule of thumb is that each extra zone start is worth about (+/-) 0.3 corsi. For example, if a player sees 300 more offensive zone faceoffs than defensive zone faceoffs over a season, his corsi will be inflated by approximately +90 net corsi (300 x 0.3).
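The zone start rule of thumb can be applied as a simple correction to a raw corsi total. This is a rough sketch under that 0.3-per-faceoff assumption. The function names, the faceoff counts and the raw corsi figure are hypothetical.

```python
# Rule of thumb from the definition above: each extra offensive zone
# start is worth roughly 0.3 corsi.
CORSI_PER_EXTRA_ZONE_START = 0.3


def zone_start_pct(oz_faceoffs: int, dz_faceoffs: int) -> float:
    """Offensive zone faceoffs as a percentage of offensive plus defensive."""
    return 100.0 * oz_faceoffs / (oz_faceoffs + dz_faceoffs)


def zone_start_inflation(oz_faceoffs: int, dz_faceoffs: int) -> float:
    """Estimated corsi gained (or lost, if negative) from deployment alone."""
    return CORSI_PER_EXTRA_ZONE_START * (oz_faceoffs - dz_faceoffs)


def zone_adjusted_corsi(raw_corsi: float, oz_faceoffs: int, dz_faceoffs: int) -> float:
    """Raw corsi differential with the zone start inflation removed."""
    return raw_corsi - zone_start_inflation(oz_faceoffs, dz_faceoffs)


if __name__ == "__main__":
    # Hypothetical skater: 600 offensive and 300 defensive zone faceoffs, +120 raw corsi.
    print(round(zone_start_pct(600, 300), 1))
    print(round(zone_start_inflation(600, 300), 1))
    print(round(zone_adjusted_corsi(120, 600, 300), 1))
```

Here the 300 extra offensive zone starts account for roughly 90 of the skater's +120, leaving about +30 once deployment is taken out.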
Quality of Competition (QoC) - The aggregate quality of competition a player faces at even strength. Calculated in a number of ways:
1.) Total Ice (TOICE) - The combined, averaged percentage of even strength ice time per game of a player's opponents. For example, PK Subban averaged 19:17 at even strength in 2013-14, which is roughly 43% (19.28/45) of Montreal's available even strength ice time per game. A number of 30% or over usually indicates a high quality of competition. A number below 25% usually indicates a very low quality of competition.
2.) Corsi Quality of Competition (Corsi QoC) - The combined, averaged corsi rate/60 of a player's opponents.
3.) Relative Corsi Quality of Competition (Rel QoC) - The combined, averaged relative corsi* of a player's opponents. This number tends to give a more accurate quality of competition ranking than regular corsi QoC.
Note - Corsi quality of competition metrics are best used to rank players within a certain team, rather than compare players across teams.
*(See relative corsi above for a definition)
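The ice time version of quality of competition can be sketched as follows. The 45 minutes of available even strength time per game comes from the Subban example. The plain average across opponents is an assumption made for simplicity (published versions may weight opponents by shared ice time), and the opponent shares in the example are hypothetical.

```python
from typing import Sequence


def toi_to_minutes(mm_ss: str) -> float:
    """Convert a 'minutes:seconds' string such as '19:17' to decimal minutes."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) + int(seconds) / 60.0


def toi_share(player_toi: str, team_es_minutes: float = 45.0) -> float:
    """A player's even strength ice time as a percentage of what is available per game."""
    return 100.0 * toi_to_minutes(player_toi) / team_es_minutes


def toi_qoc(opponent_shares: Sequence[float]) -> float:
    """Simple (unweighted) average of opponents' ice time shares. Assumption: no weighting."""
    return sum(opponent_shares) / len(opponent_shares)


if __name__ == "__main__":
    # The PK Subban example: 19:17 per game out of roughly 45 minutes.
    print(round(toi_share("19:17"), 1))
    # Hypothetical opponent ice time shares faced by one skater.
    print(round(toi_qoc([30.0, 32.0, 28.0]), 1))
```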
Quality of Teammates - The aggregate quality of teammates for a player over time. Calculated in the same manner(s) as quality of competition above. Because players see a lot more time with regular linemates than they do opposition players, quality of teammate is hypothesized to have greater influence on corsi than quality of competition.
Playing to Score Effect - The persistent tendency for teams who are leading to cede possession to teams who are trailing. This effect tends to accelerate the higher the goal differential in a game. For example, a team leading by three goals tends to give up more possession than a team leading by one or two goals (and vice versa). Can "wash out" over time, but can be very pronounced in small samples, such as a single game or a brief series of games. Corrected for by using corsi/fenwick close or tied (see above for definitions).
With or Without You (WOWY) - A form of analysis that tries to determine an individual's contribution to corsi by looking at his effect on frequent linemates. This is done by looking at each linemate's corsi with the player at even strength and then without him. For example, with Joe Hockey, Johnny Blueline has a corsi ratio of 54%. Without Joe Hockey, Johnny Blueline's corsi ratio drops to 47%. This process is repeated across Joe Hockey's linemates to see if there is a persistent pattern of improvement or decline.
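A WOWY table is just the same corsi ratio calculated twice for each linemate. In this minimal sketch the shot attempt counts are hypothetical: Johnny Blueline's are chosen to reproduce the 54% and 47% from the example, and "Linemate B" is invented to show the repetition across linemates.

```python
from typing import Dict, Tuple

# (cf_with, ca_with, cf_without, ca_without) for each linemate.
Splits = Tuple[int, int, int, int]


def corsi_pct(cf: int, ca: int) -> float:
    """Corsi ratio as a percentage."""
    return 100.0 * cf / (cf + ca)


def wowy_deltas(linemates: Dict[str, Splits]) -> Dict[str, float]:
    """Each linemate's corsi ratio with the player minus his ratio without the player."""
    return {
        name: corsi_pct(cf_with, ca_with) - corsi_pct(cf_without, ca_without)
        for name, (cf_with, ca_with, cf_without, ca_without) in linemates.items()
    }


if __name__ == "__main__":
    joe_hockey_linemates = {
        "Johnny Blueline": (540, 460, 470, 530),  # 54% with, 47% without
        "Linemate B": (300, 300, 240, 260),       # hypothetical: 50% with, 48% without
    }
    for name, delta in wowy_deltas(joe_hockey_linemates).items():
        print(name, round(delta, 1))
```

Consistently positive deltas across linemates would suggest Joe Hockey is driving possession rather than riding along with it.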
Zone Entries - A stat that measures how frequently skaters enter the offensive zone with control of the puck. Usually broken down into component parts (passing, carrying, dump-in, give-away) to determine a ratio or differential of controlled entries (passing, carrying) to uncontrolled ones (dump-in, give-away).
Zone Exits - Similar to zone entries, except looking at how skaters exit the defensive zone rather than how they enter the offensive zone.
Zone entry and exit stats are relatively new. They were developed by Eric Tulsky, whose landmark study showed that neutral zone play likely has a strong influence on possession rates. The database of zone entry and exit stats collected by Corey Sznajder for the 2013-14 season will help make further inroads in this area of study.
1.) Why corsi? Why don't we just use goals and shots?
Corsi is useful because of statistical power: the larger the sample of data, the higher the power of the analysis. In hockey, there are far more corsi events (all shots at the net) than there are goals or even shots on goal. Over the course of a single season, for example, this increases our sample size from 100 or fewer (goals) or several hundred (shots) to several thousand (corsi events) for each team and player.
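One rough way to see why sample size matters is the standard error of a proportion, which shrinks with the square root of the number of events. The sketch below treats every event as an independent coin flip. That simplification is not part of this guide and is used only for illustration, and the sample sizes are stand-ins for "100 or fewer", "several hundred" and "several thousand".

```python
import math


def proportion_standard_error(p: float, n: int) -> float:
    """Standard error of an observed share p based on n independent events."""
    return math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    # Illustrative sample sizes for goals, shots on goal and corsi events.
    for label, n in [("goals", 100), ("shots", 500), ("corsi events", 4000)]:
        se = proportion_standard_error(0.5, n)
        print(f"{label}: n={n}, standard error = {100 * se:.1f} percentage points")
```

Under this simple model, a 50% share measured on 100 events is uncertain by about five percentage points, while the same share measured on 4000 events is uncertain by less than one.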
2.) What is the main take away of all these new stats?
The primary relationship to understand with hockey's advanced stats is the interplay of volume (corsi) and frequency (percentages). The latter (percentages) has far more influence on outcomes (like goal differential and wins), but tends to
a.) vary around the mean erratically in small samples and
b.) regress towards the mean over large samples.
The former (corsi) has a less obvious effect on scoring and wins in the short term, but tends to be more repeatable and predictable at both the skater and team level, especially when accounting for various moderating effects like quality of linemates, opposition, score effects, zone starts, etc.
A metaphor to understand the relationship between corsi and percentages when it comes to roster building: PDO is to goals as the temperature is to the weather. Although it tends to settle into a generally predictable average over the course of a particular season, it nevertheless will vary wildly around that mean in short outbursts. In Calgary, Alberta, for example, it can be 30 degrees Celsius during the day and then 12 degrees that night in the summer. Or it can change from -25 degrees one day to +10 degrees the next during the winter.
Building a strong possession team is like building a robust shelter that can weather the changes in PDO. A club that depends on percentage driven outcomes to succeed is a straw hut erected on the beach - trouble when the weather turns. On the other hand, a team that controls the play at even strength is like a brick house - far less likely to experience catastrophic failure when the percentages fall.
3.) Are there limitations to corsi analysis?
Of course, since no one kind of analysis or data can paint a complete picture of the game. Here are a few of the caveats to keep in mind:
a.) Corsi is almost exclusively concerned with even strength play. This excludes special teams outcomes, which tend to be about 25% of the game.
b.) Tearing apart individual contributions is difficult. While we have developed some tools to help factor in the influence of moderating variables like linemates, opposition, tactics and usage, the nature of hockey means assigning outcomes to one particular skater of the 10 that are on the ice at any one time is challenging. This is also why considering a player's corsi in context of his team and circumstances is vital.
c.) PDO isn't entirely "luck". While we consider the fluctuations in on-ice SH% and SV% to be a proxy for random variance, the reality is there is some influence of skill and usage in these measures. The problem is it is almost impossible to separate the "skill" from the "luck" in small samples (fewer than several thousand shots).
d.) We haven't yet determined with certainty the particular player skills or coaching tactics that directly influence corsi. Although we can identify good and bad possession players and teams with some confidence, the particular collection of skills or strategies that causes these outcomes is still not entirely clear. Parsing these relationships would involve careful observation and video scouting and is likely the "next step" in the evolution of advanced stats.
4.) What is an example of a "good" or "bad" corsi rate?
The starting point for any possession rate is zero (0 corsi/60 or 50% if it's expressed as a ratio). This indicates the skater is more or less spending an equal amount of time at both ends of the ice.
All things being equal, a player with a corsi rate in the positive double digits (+10/60) or with a ratio of 55%+ is probably an elite possession player. On the other hand, anyone in the negative double digits (-10/60) or below 45% is probably a lousy possession player. Similarly, any team at 55% or above is likely elite. Any team at 45% or less is probably in the draft lottery (unless it has an elite goalie).
The important caveat here is "all things being equal". It's important to always consider a player's corsi numbers relative to his team and his circumstances. When referencing corsi, it's essential to look at things like zone start ratio, quality of competition and relative corsi (and relative ranking) within the context of his team. Be sure to note the number of games played by a skater as well: results can be skewed by only playing a limited number of contests (30 is a good minimum cut off point).
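The rough benchmarks above can be written down as a first-pass filter. This is only a sketch of those rules of thumb, not a substitute for looking at context. The function name and the "middle of the pack" label are hypothetical, and the cut-offs use the 55%, 45% and 30-game figures quoted in this answer.

```python
MIN_GAMES = 30     # suggested minimum games played before trusting the number
ELITE_PCT = 55.0   # corsi ratio at or above this is probably elite
POOR_PCT = 45.0    # corsi ratio at or below this is probably poor


def classify_possession(corsi_pct: float, games_played: int) -> str:
    """First-pass label for a corsi ratio, ignoring zone starts, QoC and teammates."""
    if games_played < MIN_GAMES:
        return "insufficient sample"
    if corsi_pct >= ELITE_PCT:
        return "probably elite"
    if corsi_pct <= POOR_PCT:
        return "probably poor"
    # Label for everything in between is not from the guide; it is a placeholder.
    return "middle of the pack"


if __name__ == "__main__":
    print(classify_possession(56.2, 82))
    print(classify_possession(44.0, 60))
    print(classify_possession(58.0, 12))
```

Any label this produces still needs to be checked against the player's zone starts, quality of competition and relative corsi before drawing a conclusion.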
Case Study: Sheldon Brookbank
Here's an example of how to use some of the concepts and tools noted above.
Sheldon Brookbank played 48 games for the Chicago Blackhawks in 2013-14. His basic corsi rate was +3.28/60, which considered alone seems pretty decent. However, here are his circumstantial factors:
1.) A zone start ratio of 61% (meaning he started more often in the offensive zone)
2.) A relative corsi QoC of -0.86 (3rd easiest on the team)
3.) A relative corsi rate of -10.1/60 (3rd lowest on the team)
So now we can see that Brookbank played lesser competition, started more often in the offensive zone and yet the team's ability to direct pucks at the net/possess the puck in the offensive zone dropped more than 10 shots per hour when he was on the ice (versus when he was on the bench).
Conclusion: despite his seemingly respectable possession rate, Brookbank was no doubt buoyed by the quality of his team and the relative ease of his circumstances. He's probably not a very good possession player.
This excellent look at the world of hockey analytics was written by Kent Wilson. For those with more questions about how to understand, find, or use advanced stats, feel free to contact Kent via email (firstname.lastname@example.org) or on Twitter (@Kent_Wilson).