Hockey’s Real Future

A while ago, Vic Ferrari posted something on his site about Mike Smith and his column at the Hockey News. Since Vic pointed it out I read it all the time; it’s an interesting viewpoint from a guy from the Roger Neilson school of hockey.

This month’s column is a dandy, and I think it tells us the Oilogosphere is not only on the right track but is also very advanced. I don’t think we thank them enough, and would like to use this post to do just that.

Thanks. :-) The game is more interesting to me than it was a decade ago.

26 Responses to "Hockey’s Real Future"

  1. PDO says:

    Fantastic read, though I’m not sure I’m a big fan of the situation stuff.

    I mean, I noticed a guy like Lupul seemed to only EVER score when the game was out of reach…

    But I tend to think a lot of that is dice clanging and all that.

    I’ll throw out a guess that Kovalchuk is the unnamed player.

  2. godot10 says:

    Mike, NOT Neil.

  3. Lowetide says:

    godot10: Thanks! I knew it too, but always get them mixed up.

  4. godot10 says:

    It would be interesting to know if they test their analytics for statistical significance. Or have enough information to evaluate the statistical significance of their analytics.

    Statistical significance basically tells you how large the standard deviation or “error bars” are, or what the chances are of the analysis being significant (or meaningless).

  5. Lowetide says:

    I’m encouraged that they have a few years of data. It would be interesting to see how many players are consistently in the top tier (while making allowances for injury, etc).

  6. godot10 says:

    For example, on almost all the analytics being provided, no one is yet putting an error bar on the numbers or giving a percentage confidence level in the number, i.e. saying Crosby’s EV/60 has a one sigma (67%) chance of being above X% or a two sigma (97%) chance of being above Y%.

    Has baseball sabermetrics (I gave up on baseball during the strike) even advanced that far?
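
    A minimal sketch of the kind of error bar described here, assuming per-game points and ice time are available for a player. It uses a percentile bootstrap, which is one common way to put an interval on a rate stat and is not something the column says Smith does. The game logs below are made up for illustration.

```python
import random

def bootstrap_rate_ci(points, minutes, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for points per 60 minutes.

    Resamples whole games with replacement, so each resample keeps
    a game's points and ice time together.
    """
    rng = random.Random(seed)
    games = list(zip(points, minutes))
    rates = []
    for _ in range(n_boot):
        sample = [rng.choice(games) for _ in games]
        total_points = sum(p for p, _ in sample)
        total_minutes = sum(m for _, m in sample)
        rates.append(60.0 * total_points / total_minutes)
    rates.sort()
    low_index = int((1 - level) / 2 * n_boot)
    high_index = int((1 + level) / 2 * n_boot) - 1
    return rates[low_index], rates[high_index]

# Hypothetical game logs (not real player data).
points = [0, 1, 0, 2, 0, 0, 1, 1, 0, 3, 0, 1]
minutes = [14.5, 15.0, 13.2, 16.1, 14.0, 15.5, 14.8, 15.2, 13.9, 16.4, 14.1, 15.0]

observed = 60.0 * sum(points) / sum(minutes)
low, high = bootstrap_rate_ci(points, minutes)
print(f"observed {observed:.2f} per 60, 95% interval {low:.2f} to {high:.2f}")
```

    With only a dozen games the interval comes out wide, which is the point: a single number without a range hides how little data may sit behind it.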

  7. Shawn says:

    I think this kind of information certainly complements normal hockey management. It’s got to be a mix of all things, but I can’t see why any team would completely ignore it.

  8. Oilmaniac says:

    Good read… Thanks for that link, LT

    I’m wondering how long it would take the local number crunchers to decipher the mystery softie…

  9. Lowetide says:

    Well, the language of the quote is misleading–Jagr might be the culprit but the vast majority of his career would have been through the roof.

    When did Yashin play last?

  10. Jonathan Willis says:

    And here my guess was Lecavalier.

  11. Dennis says:

    I don’t have any worries about Lupul.

    He’s the new Mike Bossy!

  12. St George says:

    Is that kind of detailed transactional data available anywhere? I would like to try putting it in a multi-dimensional database …

  13. Lowetide says:

    St. George: I’m way over my head with this question, but wouldn’t the first step be to standardize all of the measures? When we’re talking about zones and areas and close games, don’t we need to cement the criteria?

  14. The Forechecker says:

    Man, I wish I didn’t have a day job getting in the way of this stuff. Assuming his DB is built off of the same real-time data the NHL records (and there’s little reason to believe it isn’t), this could all be replicated off of the published game files.

  15. Lowetide says:

    Well if this was a religion we’d buy a Sunday morning tv/radio show and ask for money.

  16. mc79hockey says:

    I’m going to write about this but I doubt that what Smith is selling has much value, unless he’s clever enough to bullshit in his Hockey News column about what it is he’s selling.

    You can probably get a good idea on who’s using his stuff from that bit about making the playoffs seventeen times out of twenty though. Assuming he’s maintained the same clientele throughout, the possibilities are limited.

  17. bookie says:

    godot10 – everything you mention about statistical significance only applies when you are working with a sample (usually, you are looking at 1000 people out of 100,000 or something like that). In this case, you could look at 1/10 of the games, but they do not do that.

    It does not apply to this data because you have all instances of the data. So there is no error bar: if Bill D. Player scored 22 game-winning goals on Wednesdays with a full moon, that is exactly what he did – no error – no statistical significance.

    The only ‘error’ that applies is its effectiveness as a predictor of future situations, and there are some statistical tests that compare recent data (let’s say the last 10 games) with the longer trend.
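
    One simple version of such a test, sketched under stated assumptions: treat each shot as an independent trial at the player’s long-run shooting percentage and ask how likely the recent stretch would be. The exact binomial tail is a stand-in chosen for illustration, and the career rate and recent totals below are hypothetical.

```python
from math import comb

def binom_tail(k, n, p):
    """Probability of at least k successes in n independent trials at rate p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical numbers: a 10% career shooter who scored 6 goals
# on 30 shots over his last 10 games.
career_rate = 0.10
recent_shots = 30
recent_goals = 6

p_value = binom_tail(recent_goals, recent_shots, career_rate)
print(f"chance of {recent_goals}+ goals on {recent_shots} shots "
      f"at the career rate: {p_value:.3f}")
```

    A small probability suggests the recent stretch is unusual for that player; a large one suggests it is the sort of run the longer trend produces anyway.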

  18. YKOil says:

    I would imagine Smith would have more clients if his stuff wasn’t intangibles heavy (i.e. I’m with PDO on this one).

    In terms of the mystery player – starts from the Thornton year 1997 as he was the first rookie to get big bucks via bonus BUT he didn’t make them year 1 (I think)…

    … and neither did Lecavalier or Kovalchuk or Spezza

    Heatley maybe?

    And that is all the time I will spend on that worthless pursuit.

    However – to Vic, mc et al


  19. Alice says:

    Let’s see, it’s supply,
    “It’s pretty much hush-hush. We limit the number of franchises because we believe the information is too valuable to let every team have it.”
    OR… it’s demand,
    “when we first started I called 27 GMs”

    My guess is it’s demand, so he adjusts the spin to match the take-up rate.

    And where does qual-comp factor into it – if you get the heavy lifting and own-zone work when it counts… and you’re delivering 20th percentile offence… maybe that’s not what you’re being asked to deliver??
    Great call turning it into a business, though.

  20. Shannon says:

    Well, based on the article comments, the Oilers did not buy it last year and have not bought the coaching package in the past 3 years … is anyone surprised?

  21. linnaeus says:

    There are so many things wrong with this “analytics” approach it is hard to know where to start.

    1. Smith is using vaguely defined categories and criteria. For example, what constitutes a close game? Seems like a simple question, doesn’t it? However, it is anything but simple.

    Let’s say team B is down 5-0 to Team A coming into the third period. For the sake of argument, let’s say B comes back and ties it up 5-5 and then loses 6-5 in overtime.

    Now exactly what scorer gets credit for scoring in a blowout? How many of the first five goals by team A are blowout goals using this analytics approach? Or do they all retroactively become “close game” goals? When does the game become close and where is the scoring window open?

    I seem to remember from my playing days that it was pretty damn tough to predict the outcome of a hockey game while it was happening. In other words, this guy is making a fundamental mistake. For all the first guy who scored knew, he was scoring the game winner. Thus surely the first goal would always be a “close game” goal. So would the second guy’s goal for team A. Few games end up 1-0, so the likelihood is that the second goal is huge. So it is a “close game” goal. And so on.

    My point is there are two kinds of possible tests for close games. One, do the players feel it is close and how the hell would we determine that? Two, is it statistically close and what is the test of that? Isn’t it the final outcome that matters rather than slices of artificially defined time?
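
    To make the definitional problem concrete, here is a minimal sketch that labels each goal in the 6-5 game above as “close” or not using one possible rule: the margin before the goal is at most one. The threshold and the function are assumptions for illustration, not Smith’s criteria, which the column does not spell out.

```python
def label_close_goals(scoring_order, teams=("A", "B"), max_margin=1):
    """Label each goal True if the score margin just before it was <= max_margin."""
    score = {team: 0 for team in teams}
    labels = []
    for scorer in scoring_order:
        margin_before = abs(score[teams[0]] - score[teams[1]])
        labels.append(margin_before <= max_margin)
        score[scorer] += 1
    return labels

# The game from the example: A goes up 5-0, B ties it 5-5, A wins 6-5.
game = ["A"] * 5 + ["B"] * 5 + ["A"]
labels = label_close_goals(game)

for number, (scorer, is_close) in enumerate(zip(game, labels), start=1):
    print(number, scorer, "close" if is_close else "not close")
```

    Under this rule only goals 1, 2, 10 and 11 count as close-game goals, and the rally igniter (goal 6) does not. Change `max_margin` and the list changes, which is exactly why the criteria need to be pinned down before the numbers mean anything.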

    Then there are the obvious difficulties arising from similarly trying to figure out which goal is more “valuable”. Is goal 6, the rally igniter, not an important goal? If you accept that it is (after all, it changes the course of this game), then Lupul would have been a hero on a better team, one where, when he scored with the game out of reach, his teammates followed up and brought it back within reach.

    I mean, we could go further and ask, is a 6-5 game winning goal, as important as a 1-0 game winning goal? Does the quality of the goalie scored against matter? How about when in the season it happens?

    Then as others have pointed out here there is the issue of quality of competition and quality of team. All of these are possible mistakes of category and criteria.

    2. Misuse of analytical tools.

    The most obvious mistake here is that he is using tools designed to analyze populations. First, the analysis is being applied to subjects and objects that don’t belong together. It is like he is rounding up every bird in Edmonton and then commenting on how often each bird shits on the High Level Bridge.

    The problem is very few spruce grouse are ever going to get close to the High Level. Similarly, a checking centre isn’t likely to be out much when his team is down 5-0 but much more likely to be out when it is 5-5. In other words some players simply will see more ice time when the game is close than others, especially the later in the game and the more important the game as coaches shorten their benches.

    Second, he is applying these tools rather carelessly across time. He is saying he knows what chunk of time is most important for his game-within-a-game analysis. I grew up watching a handful of great teams, Islanders, Flyers, Canadiens and Oilers, when they won multiple Stanley Cups. They all won because they ground you down. They just kept coming in waves. When you say Glenn Anderson was a great clutch scorer (which was recently debunked, though Smith’s analytics would come to a different conclusion), you miss how much of an impact Kevin McClelland and Marty McSorley might be having pressing defencemen into the glass all night, and you completely miss how worn down the opposition goalie was after facing down thirty or more quality scoring chances earlier in the game. In other words, the problem here is there is no allowance for factors not stemming from the individual.

    Tonight Dan Cleary scored a very big goal for Detroit. Not the first of his career. Had he stayed in Edmonton how often would he have gotten a chance to score such a huge goal? Yet leaving Edmonton wasn’t his choice.

    3. Most importantly, where is the proof that this approach Smith is touting actually works? He talks a lot about teams making the playoffs if they buy his system. However, that raises far more questions than it answers.

    First, we would need to know how many of his clients made the playoffs the year before they were his clients. If he has lost clients how have they fared without his packages?

    Perhaps more important: how far have they been going in the playoffs while they were his clients versus before and after? Perhaps his “analytics” have more predictive value for the regular season than the playoffs.

    Another important question would be how much are his “analytics” actually being used? How are they performing in the real world? Do coaches stand behind the bench consulting his cheat sheets before they send out their next line? Do they mix and match players using his factoids? Do they match the player to the situation during the game?

    If you assembled a team using his analytics would they win every game?

    The proof of Bill James and Pete Palmer’s genius is that their sabermetrics actually can be demonstrated to have predictive power in baseball. You can analyze each player in two teams’ lineups and work out which team is likely to win an individual game, a series, a season.

    No tool yet proposed for measuring individual NHL players or teams has that kind of predictive power. If this system did what Smith claims he could be making a lot more money gambling on the outcome of games than by selling his services to the teams.

  22. St George says:

    LT – absolutely, but the step before that is even understanding what’s publicly available.

    The Forechecker suggests that there are “game files” available. Are these available from the NHL?

  23. Bank Shot says:

    I’m not sure why any team would sign up for this package.

    If a GM were honestly interested in game statistics he could hire his own stats guy and have that man calculate any and every possible permutation and leave everyone else out of the loop.

  24. Vic Ferrari says:


    I generally agree with what you are saying. Keep in mind, though, that Smith isn’t doing God’s work, he’s trying to sell a product.

    It’s a product that’s built from data that the teams already have access to and use, so he has to find a niche. Is it a bad choice (assuming he really is focussing on ‘clutchness’ and not just misdirecting competitors)?

    From a ‘will it work?’ point of view, damn unlikely.

    From a ‘will it sell?’ point of view, maybe. Granted he’s only charging $50k per annum iirc, lunch money for an NHL team, and he’s not having much success attracting or maintaining clients by the sounds of it.

    Smith will blame that on the market’s lack of sophistication. The market will blame it on the product’s lack of efficacy. And we can take a best guess based on the snippets of information available. And I’m leaning towards the latter, clearly you are as well.

  25. Vic Ferrari says:

    Just to add, linnaeus, you are wildly overvaluing Runs Created. Like linear weights and base runs, these global valuators of conventional stats elements ignore context in the game state, and by their very design eliminate the context of the individual games themselves.

    So adding ‘caught stealing’ to linear weights makes no difference at all to winning and losing, in fact only the tiniest negative effect.

    This is of course because by bringing in ‘caught stealing’ you are bringing in ‘stealing’ as well. Can’t divorce the two. The other factors are far more common and universal and are largely independent of strategy.

    The fact is, if managers had the right balance of strategy, you’d expect CS to have a slightly positive effect on runs created and by multiplicative and linear methods. And null effect through the first five innings. This suggests that managers don’t steal quite enough, though teams with lighter bats probably skew that. And of course, as with sacrifice bunts, the decision revolves principally around the bullpens on both teams, and also on the quality of the starter, and whether he has good stuff on the day, and of course the ballpark. And it’s near as dammit anyways. Plus solving for least squares error is unlikely to yield the best model, though I don’t know the alternative.

    The early linear weights models (first world war era) are near as dammit anyways.

    The economics of the game is where it gets iffy. If a coach wants a puck moving defenceman, the GM will usually try to find him one. The coach doesn’t care how much the player is going to cost. It’s a small market, just a few coaches have to feel this need and convince their GMs … and the moderate increase in demand causes a huge increase in price.

    And that affects the market for other players (all of a sudden some veteran forwards who have a history of being able to outchance, but don’t have much finish … they can’t find work at what they think is reasonable coin. The money has flowed elsewhere.)

    Two years later the situation is reversed. Or maybe something completely different.

    The Oilers are looking for size on the forward ranks, guys who can win puck battles and drive the puck into the offensive zone and keep it there. Good for them.

    If only three or four other GMs have come to the same conclusion for their teams, then that doesn’t bode well for Oiler fans, because the market for those guys will swing too far, and in terms of salary or trade demands the price will be too high. And IF that happens, big IF, then we’ll see if Tambellini is stuck on dogma or if he understands the vagaries of this quirky little market.

  26. Ben says:

    In regards to the unknown player, I don’t think rookie bonuses ever allowed players to make more than 5 million every year. I think Kovalchuk made the most money on his entry level deal and I think he topped out at 4.4 million after bonuses.

    If you ignore rookie contracts, there are only 4 players other than Crosby and Ovechkin who made over a 5 million average on the second contract onwards: Kovalchuk, Nash, Vanek, and Richards. Since Smith seems to be calling out a more established player than Vanek or Richards, I would guess Kovalchuk.
