Corsi, PDO and You

We’re firmly nestled in the four-day mid-season break that is the All-Star Break! While that may not provide us with anything meaningful to discuss, as the All-Star Break is mostly an excuse for the NHL to hang out with sponsors and have a fun weekend in a random NHL market, it does provide us with a break from in-game action to take a deep breath and see what’s what.

So let’s discuss Calgary’s possession game.

The Calgary Flames have really improved, standings-wise, from last season. After 47 games in 2014-15, they’re 25-19-3. After the same number of games last season, they were 16-25-6. So, it’s a big turnaround. Unfortunately, a lot of it is still driven by bounces and luck.

[Chart: Flames Corsi percentage, rolling five-game basis]

This is the Flames’ Corsi percentage on a rolling five-game basis. Aside from that brief spike, the Flames haven’t been amazing. Year-long, they’re a 44.4% Corsi team, indicating that they get out-shot (and out-attempted) quite a bit. (Though you’d think that the return of Mikael “Corsi King” Backlund would’ve resulted in a pronounced Corsi recovery, so far it hasn’t – the team’s at 43.9% since he’s been back. But sample size and all that…)
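For anyone who wants to reproduce a chart like this, here is a minimal sketch of a rolling five-game Corsi percentage. It assumes Corsi% is shot attempts for divided by all shot attempts, pooled across the window; the chart above may instead average the per-game percentages. The function names and the attempt counts are made up for illustration and are not real Flames data.

```python
def corsi_pct(cf, ca):
    """Corsi-for percentage: our share of all shot attempts, in percent."""
    total = cf + ca
    if total == 0:
        raise ValueError("no shot attempts recorded")
    return 100.0 * cf / total


def rolling_corsi_pct(games, window=5):
    """Rolling Corsi% over `window` games, pooling attempts within each window.

    `games` is a list of (corsi_for, corsi_against) tuples in schedule order.
    Returns one value per full window, so the first `window - 1` games
    produce no output.
    """
    out = []
    for end in range(window, len(games) + 1):
        chunk = games[end - window:end]
        cf = sum(g[0] for g in chunk)
        ca = sum(g[1] for g in chunk)
        out.append(corsi_pct(cf, ca))
    return out


# Hypothetical attempt counts for seven games (illustration only).
sample_games = [(40, 60), (45, 55), (50, 50), (38, 62),
                (47, 53), (60, 40), (42, 58)]
rolling = rolling_corsi_pct(sample_games)
print([round(x, 1) for x in rolling])
```

Pooling attempts weights high-event games more heavily than averaging per-game percentages would, so the two approaches can differ slightly over short windows.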

Weirdly, that hasn’t mattered.

Looking at Calgary’s ten best and worst games this season for overall Corsi – not just 5-on-5, but including special teams, too – you notice something peculiar.

They win a lot of games where they get out-shot.

Corsi%   Opponent      Result
25.6%    Chicago       1-0 Win
33.0%    Los Angeles   2-1 Win
34.2%    Edmonton      5-2 Win
36.1%    Chicago       2-1 Loss
36.2%    Nashville     4-3 Win
37.4%    Vancouver     1-0 Win
37.9%    San Jose      2-0 Win
39.2%    Florida       6-4 Win
40.7%    Arizona       5-2 Win
40.7%    Winnipeg      4-1 Win

And bizarrely, the exact opposite is almost true: the team is 3-6-1 in their 10 best Corsi games. This is how (and why) hockey is a strange, strange game. There’s a bunch of guys doing a bunch of things on the ice, and often bounces make the difference.

Here’s the Flames’ rolling five-game PDO. Guess where the wins and losses are.

[Chart: Flames PDO, rolling five-game basis]

We said similar things about 20-ish games ago, and shockingly, the wild PDO ride the Flames have been on has continued.
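The post never spells out PDO. As the stat is commonly defined, it is a team's shooting percentage plus its save percentage, which is why it tends to get read as a rough proxy for bounces. Here is a minimal sketch of that arithmetic. The function name and the sample numbers are assumptions for illustration (not Flames data), and published PDO figures are often restricted to 5-on-5 play.

```python
def pdo(goals_for, shots_for, goals_against, shots_against):
    """PDO on a 100-point scale: shooting % plus save %.

    Shots here mean shots on goal. Across both teams in any set of games
    the two PDO values sum to exactly 200, so the average is 100.
    """
    if shots_for == 0 or shots_against == 0:
        raise ValueError("need at least one shot on goal each way")
    shooting_pct = 100.0 * goals_for / shots_for
    save_pct = 100.0 * (shots_against - goals_against) / shots_against
    return shooting_pct + save_pct


# Hypothetical five-game stretch: 10 goals on 100 shots for,
# 8 goals allowed on 100 shots against (illustration only).
hot_stretch = pdo(10, 100, 8, 100)
print(f"PDO over the stretch: {hot_stretch:.1f}")

# Same single game seen from each bench: the two values mirror each other.
home = pdo(3, 30, 2, 25)
away = pdo(2, 25, 3, 30)
print(f"home {home:.1f} + away {away:.1f} = {home + away:.1f}")
```

Because one team's goal is the other team's goal against, a PDO well above 100 for one side always means a PDO well below 100 for the other, which is why long stretches far from 100 are usually expected to drift back.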

This isn’t to rain on the parade. The Flames have persevered through the peaks and valleys of their wild PDO ride to date. And, to be honest, I’m very curious if Bob Hartley utilizing Mikael Backlund and Sean Monahan more strategically down the stretch may help out in terms of the team’s possession game (and as such, their Corsi MAY go up). I mean, having your best possession player out of commission for 29 games probably doesn’t help things much.

That said, the Flames are very dependent on their goaltending. They’ve been lucky to this point that they’ve had three goalies go on really strong stretches, and their depth in this position has meant – so far – that the club hasn’t absolutely fallen into the abyss. But the team’s also been lucky that they haven’t gone from a cold goalie to another cold goalie.

After 47 games, what do the underlying numbers say about the Calgary Flames? Well, they’re not the world’s best team, but they’ve managed to hang around the playoff picture anyway. And the fact that the season has progressed to this point in a fairly consistent – albeit consistently uneven – manner suggests that we’ll see more of the same over the remaining 35 regular season games.

In other words: it should be a wild ride to the season’s finish.

  • Purple Hazze

    Ryan,
    I would have thought that by this point in the season, with the “strange” results to date, you would have moved beyond the simplistic Corsi ‘it’s all luck’ analysis to dig into the data and see what is really happening. Many of your statements are truly wrong (e.g. goaltending, which has been barely average…), and your failure to note that the Flames lead the league in several categories (e.g. shots blocked) which impact Corsi significantly renders your analyses about useless. Why don’t you do us all a favour and use some analytical skills to provide a clearer picture of what is really happening?

  • Lordmork

    Can we pinpoint what caused the December possession spike? Can luck impact possession?

    Also, I still get the idea a lot of people don’t understand how luck works.

  • RedMan

    if CORSI is just a more complicated path to the conclusion of “luck”, then i think the cycle has come full circle.

    But hey, if CORSI can tell us WHICH guys are lucky, we can get these guys to buy our lottery tickets and stuff like that, so then CORSI is very valuable.

  • beloch

    The problem with goals, as data, is that they’re really chunky. A game can have over a hundred corsi events and just one goal. This isn’t basketball folks! The problem is that which team wins a single game is depressingly random.

    The thing about statistics is that they start to tell us useful things only when the sample size is large enough. If you look at all the games played by all teams this season, you’ll find a strong correlation between CF% and GF% and, hence, wins. It’s not perfect, but strong. This means we can say teams that dominate corsi in a given game have better odds of scoring more goals than their opponent. However, if you look at the data of any single team at this point in the season, the correlation will be pretty weak (the Flames are not alone in this!). Because of how chunky goal data is, half a season of data for a single team isn’t enough for correlations to become strong.

    If the last paragraph wasn’t clear to you, let me try something else:

    Imagine flipping a coin. The outcome is heads or tails. It seems like there should be a 50% probability of either outcome. Say you flip the coin ten times and it comes up 5 heads and 5 tails. Hey! This is making sense! You flip it 10 more times and get 8 heads and 2 tails. Is this coin biased? i.e. Will it come up heads more often than tails if you flip it a thousand times? The answer is that you don’t have enough data yet. Based on your knowledge of other coins it’s pretty safe to assume the coin is roughly fair, so you should predict that a thousand tosses will come up roughly half heads. In truth, a real coin is not perfectly fair and will have a slight bias, but it will take thousands of tosses to determine what that bias is.

    While advanced stats do a decent job of predicting outcomes across the league over a whole season, looking at a single game or even all games played by a single team is like looking at a handful of coin tosses. Some teams are playing with a coin biased towards “Win” and other teams are playing with a coin biased towards “Lose”, but both kinds of teams can put together long strings of “Win Win Win Win Win…”, purely based on chance. Advanced stats give us insights into what makes a team more likely to win or lose, and those insights are valid when looking at every game played by every team in an entire season. However, that doesn’t mean every team’s record will perfectly reflect the bias of their “coin”. NHL seasons are far too short for this. It’s almost certain that some teams will seem to “defy advanced stats” in a given year, simply due to random chance and small sample sizes.

    So, for those wondering why the Flames seem to be defying advanced stats, the answer is that they aren’t. What advanced stats suggest is that, if you froze this version of the Flames in time and then had them play a couple thousand games, they should lose more than they win. However, the reality is that a season is just 82 games and teams are changing and evolving constantly. Random chance (or luck if you prefer) is a significant factor in the NHL even for a single team’s entire season. The next time you’re watching a game that the Flames are dominating in all stats except goals, remember the coin flipping analogy. The Flames might be putting some special sauce on their coin in that game but, ultimately, “luck” still reigns supreme over a sample of a single toss.

    • Purple Hazze

      Beloch, in statistics you are absolutely correct in saying that sample size is a key factor, and that the larger the sample size, the better, or more consistent, the result. The problem is that the thing you are sampling, i.e. Corsi, has to be inherently meaningful, versus muddy, so that once you have a large enough data set you have something to hang your hat on. The problem is that Corsi is not that thing. Because Corsi is tallying shot attempts (missed shots, SOG and blocked shots), even if you have a large sample size you still don’t know what you have. You would be much further along looking at something related to goals (which actually make a difference in winning) or SOG or scoring chances, because at the end of the day you’d have a much clearer picture of reality.

      Just looking at Corsi, your explanation of any deviation from what you expect is “luck”, and you believe that with a large enough sample size every team not meeting your preconceived fit of high Corsi = high wins will eventually regress to the norm. If you actually dug deeper and identified the real contributing factors, perhaps you’d see that the Flames are actually pretty spot on to expectations given what they are doing on the ice, and that the resultant expectation is that if they are able to keep it up, they should have similar results.

      • beloch

        You need a clearer definition of meaningful vs muddy, but I think I can guess what you’re asking, and it’s a very good question.

        First, stats don’t lie about the large scale. However, to what extent could other factors determine the fine structure of game outcomes? Finding these factors is the quest those who came up with Corsi, Fenwick, etc. are on, and the “holy grail” is to be able to perfectly predict the outcomes of single games based on who’s playing and whatever other data is available. There could be many things that bring us closer to the grail. For example, scoring chances seem likely to be more predictive of goals than shots, corsi, or fenwick. However, scoring chance data is currently spotty and is not yet available in sufficient quantity to say much of anything.

        The next obvious question is, does the “Holy Grail” exist? How “random” are the outcomes of hockey games really? Let’s go back to the coin flip for a moment. In fact, the outcome of a coin toss is completely predictable. If you have precise measurements of the coin, the force applied to toss it, and the environment in which it is tossed you can predict the outcome every time. Indeed, people have built machines capable of tossing coins (in a controlled environment) in such a way that the outcomes are predictable. Coin tosses aren’t random. It’s just a non-trivial problem to measure the system and compute the outcome.

        So, is hockey inherently predictable, but we just lack the ability to measure the state of the system? Well, a big part of that system is a collection of human brains. How predictable is human behavior really? Going back to the coin, could we predict how a human will toss the coin before the event begins, right down to how he or she holds the coin and how his or her muscles contract? This is not a question we currently know the answer to. Some believe that the human brain is a classical molecular computer and, if we could measure its current state as well as the future inputs, we could run a simulation to determine what the output will be. However, it’s possible that human cognition is dependent on quantum processes, and some quantum outcomes, unlike a coin toss, truly are random in theory. If so, then even with perfect measurements of a hockey game’s system and an infinitely powerful quantum computer, we could never hope to predict the outcome with perfect accuracy.

        In short, “luck” and “random chance” are loaded terms, to say the least.

        • beloch

          beloch,
          Good reply.

          One of the things that really bugs me is that rather than falling back onto the most predictive stats (e.g. Fenwick instead of Corsi, shot quality versus quantity) we seem to always have writers throw out the simplest stat (Corsi). Statistically that could be supported because it has the highest volume of incidences versus others, but I fear that it happens because the writers can’t be bothered to dig deeper or really don’t believe in it themselves….

    • Purple Hazze

      Using the coin flip analogy, then, doesn’t it lend more to the fact that advanced stats aren’t very good at predicting hockey results? As you said, sure, if this version of the Flames frozen in time played 1,000 games the stats would have some predictive power, but since they’re only playing 82, what do we do with the stats?

      Also throw in the fact that individuals on the team are still maturing and learning game to game, so their odds on the coin flip wouldn’t be fixed at any one value but rather always changing. It’s like trying to predict a coin toss with the odds changing at every toss.

      • beloch

        You need to be precise about the results you refer to. Are stats good at predicting the outcome of a single game? No. Are they good at predicting the outcome of a large collection of games? Yes, within limits. How teams evolve is one of those limits.

        My entire point is that people should not expect advanced stats to have strong predictive power on the scale of a single game or even a single team’s season. Some people get mad because they think advanced stats are saying one thing but game outcomes say another, but that’s not the case at all.

        What good are advanced stats then? Well, weak predictive power is better than nothing. Many are on the quest to improve that power, but there is a long way to go and the end is shrouded in mystery.

      • ChinookArchYYC

        You’ve done an excellent job defending the value of advanced stats in hockey. I would add that I’ve noticed a lot of fans tend to misunderstand their proper usage. As a result, many readers get frustrated with these stats, because they may not always (and in the case of the Flames, hardly ever) reflect the on-ice results. Corsi and Fenwick are a general reflection of which player or team had the puck, NOT who scores or wins more.
        Great Corsi players, like Mikael Backlund, don’t automatically score more; instead they tend to help a team’s cause to ‘keep’ the puck. Strong Corsi teams like Chicago don’t win because of possession, they win because when they possess the puck, they are very dangerous. The Hawks have elite level scorers mixed with players that get them the puck. They win because they make the most of their many opportunities.

        The other problem is that readers and writers alike cherry-pick a single advanced stat like Corsi to explain everything. Out-possessing your opponent is not enough to win. What does possession matter if your goalie has a .88 SV% (that’s 3 or 4 goals allowed per game!)? Stroll over to ON to witness the same daily comment, which reflects this. To paraphrase, ‘Oilers win at Corsi, and lose again’. My point is that more context is needed.

        Last comment/request. I’d rather see Fenwick used more often when talking about the Flames, given shot-blocking is a big part of their team strategy.

  • beloch

    Ortio’s return to the AHL assures ADK snapping their 4 game losing streak. Wolf with two goals tonight. 4 game goal streak. Why isn’t he replacing Byron/Bollig yet?

  • beloch

    Ortio, Wotherspoon and Granlund are back in Adirondack, and Adirondack snaps their losing streak, beating the Rochester Americans 4-1.

    3 stars of the game: 1st star David Wolf, 2g 0a +1 5 shots on goal. 2nd star Joni Ortio 35 saves on 36 shots. 3rd star Markus Granlund 1g 1a +1 3 shots on goal.

  • SoCalFlamesFan

    Ohhh dam that Big Bad Wolf! !

    Love this!! I absolutely can not wait for him to crash the party! What a tear and I can gladly say he is proving each “expert” wrong! He’s a beaut and been a fan favorite in Hamburg now Glens Falls and next stop Cowtown.

    Will be ordering a big bad Wolf jersey soon as he cracks the roster.

  • Burnward

    For those who are interested, here’s a link to the Adirondack – Rochester game highlights on YouTube: https://www.youtube.com/watch?v=V-vBF77caLU (the same video clip can be viewed on the Adirondack Flames website, along with the coach’s post-game interview). Wolf almost managed to score a third goal off a pretty cross-ice pass from Granlund, and the Poirier-Granlund-Wolf line had several nice plays. Potter had a strong game with 2 helpers, Shore got an assist on the power play, Hathaway scored and Sven almost scored on the power play. Wolf now has 7 goals and 2 assists in his last 6 games.

  • ChinookArchYYC

    I don’t see anything surprising in the stats you’ve quoted. Because a team that is ahead has a tendency to defend and a team that is trailing has a tendency to push to get back in the game, a team that is behind will often have more shots attempted than the team that has outplayed them.

    Score effects are well known. imo CF% is not helpful in considering whether a team has played well. I haven’t considered how effective it might be if limited to situations where games are tied or close, though those stats are also available.

    imo CF% can be useful in comparing teammates if quality of opponents, quality of teammates and zone starts are considered.
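The coin-flip argument beloch makes earlier in the comments can be put into numbers. What follows is a rough sketch, not a model of the Flames: it assumes every game is an independent toss with the same win probability, ignores overtime and shootout points, and uses a win probability of 0.45 that is picked purely for illustration. The function name is likewise made up.

```python
from math import exp, lgamma, log


def prob_winning_record(p_win, n_games):
    """Probability of winning more than half of `n_games`.

    Assumes independent games, each won with probability `p_win`
    (strictly between 0 and 1). The binomial terms are summed in log
    space so that large `n_games` values do not underflow.
    """
    if not 0.0 < p_win < 1.0:
        raise ValueError("p_win must be strictly between 0 and 1")
    total = 0.0
    for wins in range(n_games // 2 + 1, n_games + 1):
        log_term = (lgamma(n_games + 1) - lgamma(wins + 1)
                    - lgamma(n_games - wins + 1)
                    + wins * log(p_win)
                    + (n_games - wins) * log(1.0 - p_win))
        total += exp(log_term)
    return total


# Hypothetical "losing coin": a team that truly wins 45% of its games.
one_season = prob_winning_record(0.45, 82)
frozen_in_time = prob_winning_record(0.45, 2000)
print(f"82 games:   {one_season:.4f}")
print(f"2000 games: {frozen_in_time:.8f}")
```

Under those assumptions, the same below-average team has a real chance of finishing one 82-game season with more wins than losses, and essentially no chance of doing so over a couple thousand games. That is the whole sample-size point in one comparison.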