Jim Corsi and His Statistic

Jim Corsi, who spent his entire NHL career with the Edmonton Oilers, is a rather interesting guy. He took an unusual road to the NHL, through Canadian university hockey. He spent three years with the Canadian Olympic soccer team during that time as well, before leaving soccer to focus solely on hockey.

Corsi played two seasons with the Quebec Nordiques before the World Hockey Association folded at the end of 1978-79. The next year, Corsi went 8-14-3 with a 3.65 GAA for the 1979-80 Edmonton Oilers, a team that featured 18-year-olds Mark Messier and Wayne Gretzky, along with a 20-year-old Kevin Lowe. They lost in the first round that season, although Corsi had already moved on, having been dealt to Minnesota for future considerations. He spent the next decade in Italy and represented that country at the world championships eight times.

Corsi holds a graduate degree in engineering, and speaks four different languages. He’s been the goaltending coach for the Buffalo Sabres since the 1997-98 season, and during his tenure the Sabres have been a reliable producer of NHL goaltenders (Ryan Miller, Martin Biron, Mika Noronen).

Keith Loria of NHL.com interviewed several NHL goalie coaches, including Corsi, for a January 29th piece, and he talked a little bit about pregame preparation and also what he does with the Sabres’ minor-league goaltenders:

“On the road, you give your goalie an understanding of the surroundings, how the boards work, how the glass works, video and stats of the opposing players. I have to coordinate with the coaching staff and give a general idea on how the opposition prepares itself in the offensive side of the game and power plays.”

“One of my duties is to work with our Portland team and we have various technical ways of following our goaltenders and draft picks playing in college. We’ll look at videos and watch tape. We stay in contact with them as best we can. It’s not always easy because you can see how the player is playing in a game but you don’t always know the emotional part, so you try to be hands on with all of them.”

I thought there were some interesting points in those quotes. Jim Corsi is best known for the statistic that bears his name, the Corsi number, and as a result he’s frequently mislabeled as someone who only uses numbers in his work. Such a label is obviously wrong: the statistics don’t tell you how pucks bounce off the glass or the boards. He also talks about the difficulties of managing players spread out across the country, saying that it’s difficult to coach when “you don’t always know the emotional part”.

Yet, from his quote we can also get an idea of the technical difficulties of following just the goaltenders already drafted by an organization. There are hundreds of players in the NHL, thousands more playing in Europe, the AHL and the ECHL. Then there are the players in various junior leagues around North America, and going the college route; even with a large scouting staff, it is difficult to keep tabs on everyone. The Sabres have resorted to doing the vast majority of their scouting via video, a decision that may end up making some sense – is it better to see a player a few times live or many times on video?

This is one of the real advantages of statistics over watching a player live: sample size. The short-hand for this is what I call the Brian Boucher effect. Boucher is an NHL journeyman; he’s currently the backup goaltender for the San Jose Sharks. He’s had some ups and downs in his career, occasionally taking over the starting job and sometimes getting stashed in the minors.

On December 31, 2003, Boucher, then with the Phoenix Coyotes, posted a shutout over the Los Angeles Kings. He posted another in his next game against the Dallas Stars. Over the next week, the Coyotes would beat the Carolina Hurricanes, Washington Capitals, and Minnesota Wild, with Boucher posting shutouts in all five games.

Now imagine, for a moment, that Boucher were a little-used junior goaltender, and the scout watching him caught just that five-game segment. Without statistical context, what assumption would likely be made?

That is the first difficulty. The second lies in trying to gauge all 18 skaters playing in a single game. It’s a difficulty that Gare Joyce ran into when he tagged along with Columbus scouts for his book Future Greats and Heartbreaks. Joyce attempted to do scouting reports on all of the players, and found his notes a confused and nearly useless mess. A long-time scout advised him to focus on just one player for an entire shift, following him exclusively, because that’s how he isolated players for his reports. Joyce did so, with superior results.

I’ve used that technique myself. I’m hardly a scout; I imagine that there are so many things picked up over the course of a lifetime in the game that a fan like me doesn’t start to comprehend. Still, by choosing one player and following him, you can get an excellent idea of his quality. On the other hand, such focus comes at a price; you miss much of what else is happening on the ice.

The combination of these two problems is where statistics become useful. Of course, they’re limited by some of the same problems: let’s say Player X spends 90% of his ice-time with offensive stars, and 10% with the rest of the team. In the 10% segment, his line scores 3 goals and allows 3. In the 90% segment, he puts up very good numbers playing with good players. But is he a good player, or is he being carried by his line-mates? It’s very difficult to tell, based on the numbers.

This is where the Corsi number comes in handy. It’s not a common statistic, so I’ll give a brief explanation. The NHL tracks shots on net, missed shots and blocked shots. The Corsi number counts every shot directed at net (including misses and blocks), both for and against, while a player is on the ice. To make it more accurate, this is often measured only at even strength. Because the vast majority of shots come from the offensive zone, this statistic is a fairly good measure of who is spending a lot of time in the offensive zone, and who is getting stuck in the defensive zone.

Suddenly, that 10% segment is much bigger. Instead of 3 goals for and 3 goals against, we can expand it to 21 shots for, and 40 shots against. If we toss in missed shots and blocked shots, we could be looking at a Corsi of +60/-100, which tells us that our Player X was likely spending too much time in the wrong end of the rink. Over the course of an entire season, a first-line player is on ice for about 2000 shot attempts in one direction or the other, and even a fourth-line player generally sees around 500. Thus, this statistic gives us a big-picture view of which direction the play is going when any given player is on the ice.
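
For anyone who wants to see the tally spelled out, here is a minimal sketch in Python. It assumes each play-by-play event has already been reduced to a kind, the team that attempted the shot, the manpower situation, and the skaters on the ice. The `Event` fields, the team codes and the player names are all made up for illustration and are not the NHL's actual feed format.

```python
from dataclasses import dataclass

# Event kinds that count as shot attempts: shots on net, misses and blocks.
ATTEMPT_KINDS = {"shot", "miss", "block"}


@dataclass(frozen=True)
class Event:
    kind: str          # "shot", "miss", "block"; anything else is ignored
    team: str          # team that attempted the shot (assumed convention)
    strength: str      # "EV" for even strength, otherwise e.g. "PP" or "SH"
    on_ice: frozenset  # names of the skaters on the ice for the event


def corsi(events, player, player_team, even_strength_only=True):
    """Return (attempts_for, attempts_against) while `player` is on the ice."""
    attempts_for = attempts_against = 0
    for ev in events:
        if ev.kind not in ATTEMPT_KINDS:
            continue
        if even_strength_only and ev.strength != "EV":
            continue
        if player not in ev.on_ice:
            continue
        if ev.team == player_team:
            attempts_for += 1
        else:
            attempts_against += 1
    return attempts_for, attempts_against


# Hypothetical sample data, purely for illustration.
ice = frozenset({"Player X", "Linemate"})
events = [
    Event("shot", "EDM", "EV", ice),
    Event("miss", "EDM", "EV", ice),
    Event("block", "DET", "EV", ice),
    Event("shot", "DET", "EV", ice),
    Event("miss", "DET", "EV", ice),
    Event("shot", "DET", "PP", ice),                   # not even strength: skipped
    Event("shot", "DET", "EV", frozenset({"Other"})),  # Player X not on ice: skipped
    Event("faceoff", "EDM", "EV", ice),                # not a shot attempt: skipped
]

cf, ca = corsi(events, "Player X", "EDM")
print(cf, ca, cf - ca)  # prints: 2 3 -1
```

Keeping the for and against counts separate, rather than returning only the difference, preserves the sample size, which is the whole point of counting attempts instead of goals.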

This isn’t to advocate turning control of a hockey team over to a shot-counting computer (although it probably wouldn’t do any worse than Doug MacLean or Mike Milbury). Jim Corsi, who created the statistic, talks about the importance of the emotional state of a player, and different visual variables like how pucks react along the boards of a given arena, and the same holds true for scouting. As one example, the numbers tell us that Ales Kotalik is an elite powerplay performer; but only by watching the games or video do we realize that he’s scoring goals by playing the left point. How a player gets his points is certainly a major consideration when acquiring him, and it’s only one of the things that the statistics don’t show well.

On the other hand, what the statistics do show is overall effectiveness. They don’t do a good job of showing the process, but they do an excellent job of showing the results. Sticking to the Kotalik example, by using various advanced statistics we know that at even-strength he’s been playing a third or fourth line role for years (and starting in the offensive zone more often than not), and that he really hasn’t produced much offensively relative to his ice-time 5-on-5. This is something that Oilers’ management either didn’t pick up on or chose to ignore; they put Kotalik on the top line despite a proven track record of not being a difference maker 5-on-5, and the results were predictable.

The other thing statistics do is catch stuff that has been missed, much in the way video does. Don Cherry was coaching when Roger Neilson started the push towards video, and he said something to the effect that he saw the game well enough from behind the bench and didn’t need to review tape. Now, every team in the league does – because coaches are human, and have human limitations. The scope of the game is too broad to catch everything in one go. As one quick example – is the Strudwick pairing deployed more frequently in the offensive or defensive zone? Without the numbers, I’d say defensive; Strudwick doesn’t put up points and has a solid reputation as a tough, physical, stay-at-home guy. In point of fact, no defenseman on the team is deployed more in the offensive zone (relative to ice-time) than Jason Strudwick – and that’s an important thing to know when evaluating his performance this season.

In short, statistics are a useful tool for talent evaluation; they don’t predict the future, but they show where and how a player has been used, and what his results have been in different circumstances – against different players, with different players, in different zones, with lots of icetime, with little icetime. They do it for the entire league, providing a broad picture for every player to spend significant time in the NHL. They can tell you who to watch and what to watch for; if a player is putting up negative results, video can be examined for the reason. They aren’t meant to replace visual observation; they’re meant to augment it, focus it and refine it.

As a final point, statistical analysis is frequently mocked by journalists and other fans as something that bloggers came up with and use because they don’t have access to the team, or as something that nobody who ever played the game would use. In reality, statistics have been developed by experienced NHL personnel, and are used at that level. We’re just trying to catch up to the things done by people like Jim Corsi, Ron Wilson, Roger Neilson, and the like – observing and copying a trend, not starting one.

  • Jonathan Willis wrote:

    Ender wrote:

    I mean, hell, the ’sphere would get a lot further with the people you seem to be arguing against by posting youtube clips of plays and then backing it up further with stats.
    The other day I put up all those videos of Ales Kotalik scoring from the point, and I still have people arguing with me that I’m out to lunch on him.
    Besides which, scouting by youtube video is the worst possible kind of sample contamination; it’s a fraction of the game, generally without context, and a tiny sample of a player’s season. Even 5 or 6 clips fail to represent even a drop in the bucket.

    I'm talking about illustrating an example here. Obviously you can't show a whole game, but I think more people have an easier time getting behind Coach's Cornerish views of certain plays and how players fucked up on certain plays than EVPTS/60. Like you said, you can make a really bad assumption based on youtube vids or highlights. I'd just argue that it's because on those youtube vids, people are intentionally stripping away the context. I'd also argue that it's not better or worse than most of the math that goes on around here as far as presenting something as holistic.

  • Jonathan Willis wrote:

    Mike wrote:

    It’s a rough indicator of zone time, which the NHL does not track.
    Yes – if the NHL did track it, Corsi would be largely superfluous.

    If you're using it as an indicator of zone time, and you have the stats to count shots and (presumably) the time that they're taken, why wouldn't you do something like this instead?

    Data:

    Player on team A takes a shot at 10:52
    Player on team A takes a shot at 10:54
    Player on team A takes a shot at 10:56
    Player on team B takes a shot at 11:00
    Player on team A takes a shot at 11:12

    Method:

    Time between player on team A takes a shot (or attempted shot) and player on team B takes a shot (or attempted shot). You eliminate the Detroit effect, and have a decent approximation of when teams are inside the zones. Set an arbitrary number for the neutral zone, in that if it's longer than 30 seconds between shot from A and shot from B, assume they were in the neutral zone for that time.

    I mean, it's obviously not perfect, and the neutral zone number would need to be tweaked, but if you're talking about that vs Corsi, at first glance I'd expect it to be more realistic as far as zone possession goes. Plus it uses the same Corsi data, so it shouldn't be too hard to work out algorithmically.

    I've said it before and I'll likely say it again, but at some point people need to decide what they want the stats to show, and look to see if there's a better way of showing that. As it sits, I look at the "advanced stats" bandied about and while I can figure out what they're trying to get across most of the time, I rarely think they actually are the best method for what the writer is trying to measure.
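
The timestamp method proposed in the comment above leaves some details open, so the following Python sketch is only one possible reading of it, not the commenter's own implementation. It assumes that each interval between consecutive shot attempts is credited to the team that took the earlier attempt, and that any interval longer than the neutral-zone cutoff (30 seconds here, the commenter's arbitrary number) is counted as neutral-zone time instead. The function names are made up for illustration, and clock values are assumed to run upward within a single period.

```python
def to_seconds(clock):
    """Convert an "MM:SS" game clock string to seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


def estimate_zone_time(attempts, neutral_gap=30):
    """Estimate seconds of attacking-zone time per team.

    attempts: chronological list of (team, "MM:SS") shot attempts.
    Each gap between consecutive attempts goes to the team that shot
    first, unless it exceeds `neutral_gap`, in which case it is treated
    as neutral-zone time (an assumed reading of the proposal).
    """
    totals = {"neutral": 0}
    for (team, start), (_, end) in zip(attempts, attempts[1:]):
        gap = to_seconds(end) - to_seconds(start)
        key = team if gap <= neutral_gap else "neutral"
        totals[key] = totals.get(key, 0) + gap
    return totals


# The five attempts listed in the comment.
attempts = [
    ("A", "10:52"),
    ("A", "10:54"),
    ("A", "10:56"),
    ("B", "11:00"),
    ("A", "11:12"),
]

print(estimate_zone_time(attempts))  # {'neutral': 0, 'A': 8, 'B': 12}
```

On the sample data this credits team A with 8 seconds and team B with 12, even though A took four of the five attempts, which shows how differently this measure can read from a straight shot count.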

  • Ender wrote:

    I mean, hell, the ’sphere would get a lot further with the people you seem to be arguing against by posting youtube clips of plays and then backing it up further with stats.

    The other day I put up all those videos of Ales Kotalik scoring from the point, and I still have people arguing with me that I'm out to lunch on him.

    Besides which, scouting by youtube video is the worst possible kind of sample contamination; it's a fraction of the game, generally without context, and a tiny sample of a player's season. Even 5 or 6 clips fail to represent even a drop in the bucket.

  • Ender wrote:

    When your “journalists and other fans” are mocking stats as something that bloggers came up with to compensate for not being in the game, you’re IMO missing the point of those journalists and other fans. Stats on the ’sphere, as a general rule, are used *instead* of video. That removes context, and really anything can be proven by presenting the right numbers in the right order.

    In a vacuum, perhaps. But the folks who follow the Oilers in the 'sphere do watch the games, as do the people who read the blogs. The video is an influence; we've all formed our opinions through a combination of watching and statistical analysis, whether we admit it or not – at least in reference to our own team.

    As for those "other fans", would you agree with me that there is a wide disparity in a fan's ability to judge a game? Look at David Staples' player ratings as an example – there's very rarely a consensus when they're discussed. Not that long ago, there was a fellow on here arguing that Ales Hemsky is a perimeter player – something that's clearly untrue to a moderately competent observer. Mileage is always going to vary on personal observation, whereas statistics are universal. A 3 is a 3 is a 3, regardless of whether John Smith thought it was more of an 8. Statistics help serve as a common point of reference between fans – you can argue with me that, say, Ales Hemsky had a good or bad game, but we can both agree that he was a -2 last night.

  • Ender wrote:

    Sometimes teams get outshot because goalies give up a lot of rebounds. Sometimes they get outshot because the opposing team is taking a bunch of garbage shots from the outside.

    To your first point, I'd suggest that adding in missed shots and blocked shots adds some clarity; it's fairly rare that rebounds result in either of those two numbers.

    To your second point, I completely agree. I've made the mistake in the past of using straight Corsi from team to team, and that's wrong; Detroit for example has always been a shot-happy outfit.

    On the other hand, these factors should be consistent throughout the roster, and when faceoffs are added in Corsi still gives us a fairly accurate – in broad strokes, mind you – picture of who is spending time in which zone.

  • Ender wrote:

    Sometimes they get outshot because the opposing team is taking a bunch of garbage shots from the outside. Sometimes they’re being outshot because the opposing team is playing catch-up. You say Corsi shows results, but what results exactly?

    As per Willis, that would tell us "that our Player X was likely spending too much time in the wrong end of the rink."

    It's a rough indicator of zone time, which the NHL does not track. Sure, once in a while you could get fewer shots in 12 minutes of offensive time than the Red Wings get in 8 minutes in your own end, but those will be the outliers, and over the course of a season will smooth out to be less of a blip.

  • Ender wrote:

    Sidenote: My history might be bad, but I thought Vic actually developed the Corsi stat based on an offhand quote from Mr. Corsi himself.

    Behind the Net credits Jim Corsi, and IIRC Lindy Ruff discussed this in an interview somewhere. If Vic did in fact develop it, I'm sure he'll chime in and let me know.

  • Sidenote: My history might be bad, but I thought Vic actually developed the Corsi stat based on an offhand quote from Mr. Corsi himself.

    More to the point, I'm still not sold on the Corsi concept. Sometimes teams get outshot because goalies give up a lot of rebounds. Sometimes they get outshot because the opposing team is taking a bunch of garbage shots from the outside. Sometimes they're being outshot because the opposing team is playing catch-up. You say Corsi shows results, but what results exactly? How many shots for/against?

    Now, you go on to say that stats are to refine video, which is something I'm personally fine with. When your "journalists and other fans" are mocking stats as something that bloggers came up with to compensate for not being in the game, you're IMO missing the point of those journalists and other fans. Stats on the 'sphere, as a general rule, are used *instead* of video. That removes context, and really anything can be proven by presenting the right numbers in the right order.

    I mean, hell, the 'sphere would get a lot further with the people you seem to be arguing against by posting youtube clips of plays and then backing it up further with stats.

    At the end of the day, it's the old Math vs Physics debate. Math is always correct, but it doesn't necessarily represent the real world. Physics isn't ever entirely correct, but it is a pretty good approximation of the real world. The issue comes in when the people working Math present it as though it's Physics.

  • Great post Willis. I get the feeling that current management regime often "goes with their gut" as opposed to taking a more measured approach when it comes to personnel/lineup decisions. Lowe and MacTavish have been able to draw on their years of first-hand experience, and undoubtedly have a high level of insight… So their GUT calls have often been very sound… I wonder, however, as the years pass, and the game continues to evolve, and the players change: are MacT and Lowe beginning to lose their touch? They have made some pretty bizarre decisions recently… They seem to have a habit of blatantly disregarding good data, making moves that haven't worked out at all. We have both repeatedly cited the Kotalik example…

  • Thanks for the excellent write up Jon.

    Great piece by you. Education at work.

    Someone cue "The More You Know" pic from those commercials.

    I'm trying not to hate statistics & math so much anymore.