Chasing Shadows, Moonlight Mystery

by Lowetide

I’m a big believer in risk-averse drafting, but the key is getting the right definition. People often say (I do it all the time) “just print off the Bob McKenzie list and take the best number available” but the truth is a little more nuanced. Shades of grey abound. (Cover photo: Rob Ferguson).

THE ATHLETIC!

Give The Athletic as a gift or get it yourself and join the fun! Offer is here, less than $4 a month! I find myself reading both the hockey (Willis, Dellow, Pronman, et cetera) and the baseball coverage a lot, it’s a pure pleasure to visit. We’ll sell you the whole seat, but you’ll only need the edge.

The Vesel and Marino posts are just the beginning, both men doing good things this past winter. If you haven’t checked in with either player for a time, there has been real progress.

SPITTIN’ CHICLETS

  • Milan Lucic: For me it’s just mentally having fun going to the rink again and mentally looking forward to the challenges we face as a team and as an athlete every single day where I think my mindset got very negative last year. So I was almost my own worst enemy where this year I’m going in with a happy healthier mindset and I think that’ll help me get back to the player I am and I think when you’re playing with the best player in the world it gives you a vote of confidence to try and step up and not let him down. I think we’re all feeling that and we all had fun in the 2016/17 season so we want to get back to being that team and winning on a nightly basis.

This is via Spittin’ Chiclets podcast via Original Pouzar via Beer League Heroes. In the interview (OP posted it at the end of yesterday’s post comments section) Lucic mentioned the darkness of winter is a difficult thing to overcome.

I have some experience with this, Mrs. Lowetide has had some issues (mostly in the past) surrounding early sunsets. There’s a name for it, can never remember, but the first winter we went through it was Regina, maybe 1984. I tried everything to get her out of the funk and it didn’t take. I bought her jewellery, we took a trip to Mexico, she finally started running after work (before the sun went down) and that seemed to help. It’s a thing, kind of like depression.

As for Lucic, it sounds like he’s in a good space and looking forward to the season. In a way, he’s a rookie again, with a lot to prove. I’ll be cheering like hell for him.

DRAFTING THE MCKENZIE LIST, 2011

This is the 2011 entry draft, I used it to show a specific example of what might have changed for the Oilers at the draft table. These numbers are correct save for the “HM” beside Travis Ewanyk, I am sure he was an honorable mention but it isn’t in my archive so there’s some question there.

The Oilers drafted in tune with Red Line Report, Samu Perhonen the only reach on their list. The McKenzie list shows Musil as a reach, and the rest of the group through No. 114 to be anything from a minor to major reach. The current Oilers are drafting in a more risk-averse manner, but does the team have the formula right? Let’s first have a look at this year’s drafting.

The first three selections are risk averse, the only dissenting vote coming from Corey Pronman on the Bouchard selection. My question is this, and I ask it in the hope this looks foolish five years from now: Are the Edmonton Oilers punishing lack of speed enough at the draft?

In the summer of 2016, in an item called A turn north at the draft table? I listed my priorities for a successful draft:

  • Value skill above all other things.
  • Let math do the work. Travis Ewanyk was a long shot the moment he was selected, the Oilers have been better in the last two drafts in this area. I also thought 2013 lined up pretty well with math, but the 2014 effort has me wondering if the organization is capable of a repeat performance.
  • Don’t walkabout in the top 100. Edmonton has been better in recent seasons (2014 aside).
  • In an unusual draft like 2014 (or 2003) make better use of those late picks.
  • Print off the Bob McKenzie list and compare it to your own list. If a player is ranked on the McKenzie list, and not on the Oilers list, why? There should be a very good reason and it can’t be ‘saw him bad’ or ‘he never looks good when I see him’ and that’s for sure. Have a good long look at the Pronman list, too.
  • If there is a shy offensive player on the McKenzie list, move that name down. Every time.

We also talk a lot about speed, how every season sees veterans get left behind. The David Musil pick (and the Griffin Reinhart trade) was part of a market correction that involves all kinds of big defensive defensemen (Dylan McIlrath, Luke Schenn). Maybe speed will increase again and again and maybe we reach a point where a player judged to have sufficient speed becomes deficient seemingly overnight because the league keeps getting faster. Are the Oilers anticipating speed increases or the current limit?

SPEED DEMONS

Since 2015, how many speedsters have been drafted by the Oilers? Based on visual evidence and trusted scouting reports, I count Connor McDavid, Caleb Jones, John Marino, Jesse Puljujarvi, Kailer Yamamoto, Skyler Brind’Amour, Ryan McLeod. That’s seven.

It doesn’t mean Edmonton is drafting a bunch of slow trains, but should speed be placed on the same level as skill? I think it’s an interesting question.

EVAN BOUCHARD

Since the draft I’ve been getting a few emails each week in regard to Evan Bouchard’s foot speed. It reminds me of the conversation I had with Steve Serdachny after the Nuge was drafted in 2011. I asked about speed, Serdachny talked about Nuge’s great edges and suggested he would be effective immediately with his stops and starts. Nuge has had some issues as an NHL player (injury, 5-on-5 offense) but skating isn’t one of them.

  • HockeyProspect.com:  A transitional defenseman with good overall speed, his skating stride is awkward but he still generates power and manages to shift-gears quickly, allowing him to cut-wide on defenders and create additional offensive-chances.

I don’t have your answer, we’ll have to find out together. I do know Evan Bouchard is the most substantial offensive player who plays the position this team has drafted since Paul Coffey. It’s going to be a different experience finding out about him, Edmonton simply doesn’t invest in this player type often.

LOWDOWN WITH LOWETIDE

A busy day on the Lowdown as we motorvate toward the weekend. At 10 this morning, TSN1260, scheduled to appear:

  • Steve Lansky, BigMouthSports. Holidays, Blue Jays, final turn at Glen Abbey.
  • Aaron Kasinitz, Penn Live. The Baltimore Ravens have a quarterback situation.
  • Matthew Iwanyk, TSN1260. When will the Eskimos have a good start to a football game?

10-1260 text, @Lowetide on twitter. See you on the radio!

186 Comments
rickithebear

15-16 oilers
(TOI/GM) EVGF60
Fwd:
Maroon (13:46) 4.36
Mcdavid (15:06) 3.62
Hall (16:19) 2.91
Drai (15:26) 2.86
Pouliot (13:14) 2.72
Eberle (15:09) 2.70
RNH (15:28) 2.61
Purcell (14:11) 2.49
Yak (12:28) 2.01
Khaira (10:17) 1.94
Kassian (12:01) 1.39
Korpikoski (11:06) 1.37
Pakarinen (9:13) 1.34
Letestu (11:18) 1.29
Hendricks (10:52) 1.22
Lander (9:19)

How many forwards could outscore the defensive triangle’s EVGA60 + .31?

1 – Maroon
Schultz (16:50) 3.25; 3.56
2 – Maroon, Mcdavid
Nurse (17:57) 3.05; 3.36
2 – Maroon, Mcdavid
Sekera (17:51) 2.82; 3.13
2 – Maroon, Mcdavid
Klefbom (17:20) 2.77, 3.08
2 – Maroon, Mcdavid
Reinhart (16:13) 2.68; 2.99
2 – Maroon, Mcdavid
Fayne (15:02) 2.66, 2.97
2 – Maroon, Mcdavid
Oesterle (18:21) 2.3; 2.62
6 – Maroon, Mcdavid, Hall, Drai, Pouliot, Eberle
Gryba (15:32) 2.11; 2.42
8 – Maroon, Mcdavid, Hall, Drai, Pouliot, Eberle, RNH, Purcell
Clendening (14:31) 2.06; 2.37
8 – Maroon, Mcdavid, Hall, Drai, Pouliot, Eberle, RNH, Purcell
Davidson (16:05) 1.90; 2.21
8 – Maroon, Mcdavid, Hall, Drai, Pouliot, Eberle, RNH, Purcell

Blender is not really an option for fwd lines.
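The counts under each defenseman above can be reproduced mechanically. What follows is a minimal Python sketch, assuming the second figure on each defenseman's line is the bar a forward has to clear (EVGA60 plus .31) and that a forward "outscores" it when his EVGF60 is strictly higher. The numbers are copied from the lists above; the dictionary and function names are made up for illustration.

```python
# EVGF60 for the 15-16 forwards, as listed above (Lander has no value listed, so he is left out).
FORWARD_EVGF60 = {
    "Maroon": 4.36, "McDavid": 3.62, "Hall": 2.91, "Drai": 2.86,
    "Pouliot": 2.72, "Eberle": 2.70, "RNH": 2.61, "Purcell": 2.49,
    "Yak": 2.01, "Khaira": 1.94, "Kassian": 1.39, "Korpikoski": 1.37,
    "Pakarinen": 1.34, "Letestu": 1.29, "Hendricks": 1.22,
}

# Bar to clear for each defenseman: his EVGA60 + .31, using the second figure as listed.
DEFENSE_THRESHOLD = {
    "Schultz": 3.56, "Nurse": 3.36, "Sekera": 3.13, "Klefbom": 3.08,
    "Reinhart": 2.99, "Fayne": 2.97, "Oesterle": 2.62, "Gryba": 2.42,
    "Clendening": 2.37, "Davidson": 2.21,
}


def forwards_who_outscore(threshold, forwards=FORWARD_EVGF60):
    """Return the forwards whose EVGF60 is strictly above the threshold, best first."""
    ranked = sorted(forwards.items(), key=lambda item: -item[1])
    return [name for name, rate in ranked if rate > threshold]


for dman, threshold in DEFENSE_THRESHOLD.items():
    names = forwards_who_outscore(threshold)
    print(f"{dman}: {len(names)} - {', '.join(names)}")
```

Run as written, this gives the same 2 / 6 / 8 pattern shown in the list: only Maroon and McDavid clear the bar for the top six defensemen, six forwards clear Oesterle's, and eight clear the bottom three.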

rickithebear

Woodguy:

Thar she blows was posted by me weeks ago.
A full 4 season look at every unit GF and GA.

Not a team rank look but year to year gf & GA results.

But rule 1.

Cf-CA is dictated by
Quick transition passes from d to forwards.
No rover skating puck up
Avoiding NZ trap from Opp forwards.
Achieving HD penetration.
Avoiding 0% Corsi
Limited Off dmen down low out of position.
Forward in position for NZ trap
These are all decisions based on high ratio negative effects from failing.

Cf-ca are the failings of
(3 fwds – 2D – 1G)
(3fwd – 1rover – 1D)
(4 fwd – 1D)
Are the
3fwds or 3fwds & a rover or 4 fwds

Most on here are not fan of Dmen.
They are fan of rovers.

Looking at one of my past posts on HF boards.
I looked at past differentials from last wild card position.
With all teams even at special teams
forwards needed to outscore
Def triangles (2D-G) evga60 by .30.
I use .31 as a rough safe.

Reason I hated Rovers is HD abandonment that yields high HD shot density.
Requiring high EVGF/60 3 fwd units.
That is why CA established by Fwd rover failure is important.
It is the baseline dmen can reduce.
HD density is per CA measure.
HD dmen are really low Corsi success % location per CA dmen.

These are all things I explained pre 2011 on Lowetide.
But repeated
2011 on at HF boards.
3-5 years after Lowetide.

13-14 which is part of the +.31 evgf60 wild card success determination..
Now that can be reduced by special team goal diff.
Cup core
1. HD sys coach
Run a system that emphasizes Low EVGA.
1G outscores 0G
2G outscores 1 G
3 G outscores 2 G
4g outscores 3G
2. Top 10 HD goalie.
Is really one of 2 save% performance measure.
Above HD save% avg established.
Above open shot save% avg established.
3. 3+ Top 60 HD Dmen
Usually means 2 low Save% avg Dpair units.
Avoid a bottom 30 3rd pair.
Prefer both to be 1st comp capable.
4. Top 8 team top 125 fwd depth
Deep even forward depth to outscore ga rates.
5. +ve special teams.
Less pressure on even performance.

Georgexs

VOR,

There’s really quite a lot in this paper.

Take a look at the non-zero beta coefficients for goals. The vast majority of these players are forwards. Most defensemen have their coefficients shrunk to zero, meaning that, after controlling for other players, they have no effect on winning the goal battle. This is exactly the point I was trying to make to WG. Forwards have a bigger effect on winning and losing than defensemen.

It’s weird that you found this work and yet you insist on presenting it as irrelevant and uninformative. Where do you even get that the correlation with salary is so strong? Did you read the part where they showed that for negative effect players, their goals metric has no predictive power when it comes to salary. Meaning your smart GMs pay whatever they feel like to players who contribute more to losing than winning. And, yet, you side with the GMs every time…?

As for why the sabermetrician community hasn’t run with this, I don’t know who those people are. I don’t know what they do or how they go about things. I read some Yost; I read Hohl; I read Tulsky. I stopped reading. They’re working from a very limited base and they’re not particularly good at what they do. And they didn’t get better over the years. So when you post a link to their work and say here’s proof of something or other, it’s not a meaningful data point for me. I would pay more attention to something you yourself have done.

As you probably know and appreciate, it’s important to think for oneself and to test ideas before accepting them. I can understand what the authors of this paper have tried to do and what they’ve done. It’s just good, straightforward data science. It uncovers much. But it can stand improvement and it deserves further iterations. I want to thank you again for pointing me to it.

rickithebear

OriginalPouzar:
I saw Sunil’s tweet earlier this morning and it lines up with my thought that Ryan Strome at 1RW/2RW is something that should be explored by the coaching staff. He has great metrics with both McDavid and Drai (although the sample size with Connor is maybe too small to put any stock into) and, if I remember correctly, even Drai and McDavid had better metrics with Strome than without.

I’m starting to think that Strome is an under-rated player.

Of course, in order to make that work, if the plan is still to keep Nuge on the wing, someone needs to step up and play 3C.

Well, after the trade deadline last year where the Blues traded Stastny, Brodziak moved up the lineup and had 11 points (8 primary) at 5 on 5 in 16 games and produced at a team leading 2.74 P/60 (5 on 5).

I don’t propose to play him full time at 3C but there is something to explore there.

When I first started looking at WOWY with Excel.
One of the files deleted @ work.

I looked at Upper tier of situational groups.
Then with Desjardins site.
Established quick situational baselines for skaters.
evg60; Evp60, GF60, GA60, Goal diff.

A lot of manual work and video review.

Then was made aware of stats hockey analysis.
Which allowed me to look for trended standards for
Different core skills.
Was the perfect inventory of Bowman’s forward pairs.

We discussed line combinations.
I said from day 1
Look for the highest true data goal diff forwards pairs.
Create a list of all viable pairs presented in best goal diff order
Or
By high EVG60; EVP60;

Now NAT STAT provides a simplified reference.
With additions at free agency.
I did a WOWY visit of our players the best pairs
Looked for unit weaknesses.
But in the process had the same observation as you
And
Professor Q’s.

Try Strome at 2RW?
And
Khaira @ C.

crazy Coaches PG Khaira narrative had me on board day.
Son of immigrant Red Neck (Hard Working)(Cement truck driver)
Smaller than all his peers who finally got dad’s size.

Right after his 18yr old WCHA (12-13) season I went back on previous peers.
To this point I separated peers into 3 groups by game bias.

Sub 5’10” forwards nhl reference screwed by gm height bias.
Now with high PPG numbers and speed of game.
They are getting their chance.

5’10” to 6’1” Forwards develop skate acceleration and body mass faster than 6’2”+ forwards.
It is why I always listed lb/in for each player.

Here is the original list.
6’2” + fwds
12-13
Kerdiles 2” 1.03
Nieves 3” .73
Khaira 3” .68
10-11
Coyle 3” .70
Bjugstad 6” .69
K. Hayes 5” .45
09-10
Kreider 3” .61
Sheehan 3” .46
D. Shore 3” .46
08-9
Colborne 5” .78
07-8
JVR 3” 1.10
06-7
Toews 2” 1.35
Galiardi 2” .94
03-4
Stafford 3” .89
02-3
Vanek 2” 1.38
Kesler 2” .78
00-01
Umberger 2” 1.16
Steckel 6” 1.06

I see him as a great Center option.
That should play 2/3.

hunter1909

Professor Q: dominate

You say you’re a professor?

rickithebear

Important here!
A difference in philosophy.
I prefer true data compared as a +/- performance versus the mean
rather
than a flattening of data by spreading the group and presenting it as a non-true data % which can make a multitude of differentials appear at the same location on a %-based S curve.
Yet those same %-based diffs are 2-10 completely different goal diffs because of different base GA. The true data appears on different parts of a true data S curve.
This creates a false impression of regression to mean.

As long as you use % you are not accurately placing your data correctly on data charts relative to performance.

I posted a list of players from 17-18 a few blogs back with same % with completely different goal diffs.
That would all have different standard deviation locations.

I figure this is why all of you are not even close to getting it.
You are fooling yourself!
Poor method.
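The percentage-versus-differential point above can be shown with a little arithmetic. The sketch below uses two hypothetical players with invented on-ice goal totals over the same ice time; neither the names nor the numbers come from any real dataset.

```python
def goal_share(goals_for, goals_against):
    """Goals-for percentage: share of all on-ice goals scored by the player's team."""
    return 100.0 * goals_for / (goals_for + goals_against)


def goal_diff(goals_for, goals_against):
    """Raw on-ice goal differential."""
    return goals_for - goals_against


# Hypothetical (goals for, goals against) totals over identical ice time.
players = {"Player A": (22, 18), "Player B": (33, 27)}

for name, (gf, ga) in players.items():
    print(f"{name}: GF% = {goal_share(gf, ga):.1f}, goal diff = {goal_diff(gf, ga):+d}")
# Player A: GF% = 55.0, goal diff = +4
# Player B: GF% = 55.0, goal diff = +6
```

Both players sit at the same 55.0 GF%, but the one playing in the higher-event environment is +6 rather than +4, which is the gap the comment says a percentage hides.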

VOR

Georgexs: 1. The paper looks at assessing a player’s contribution to winning. It controls for the contribution for the players he plays with and for situation (even strength, PP, PK). The Rel numbers you see on analytics sites don’t do this. The RelTGF% stat that WG uses doesn’t really do this either. It’s an ad hoc approach and it has trouble dealing with the multicollinearity that pops up when players play together a lot. The technique the authors are using is standard stats: determine the influence of one variable while controlling for the influence of other variables.

2. They use penalized logistic regression because the input matrix is sparse and highly collinear. I don’t know what you mean by hockey analytics has moved on from regressions. I don’t know the community that well. But I know how to assess whether a person has some training in statistical techniques. And I also know that penalized regression models are used extensively in machine learning. They’re effective, they’re interpretable, they’re easy to train. When combined with ensemble methods, they’re quite powerful. Linear models are simply a useful part of the machine learning toolkit.

3. You say they add data not knowledge. But you keep talking about just the players listed in the paper. Have you downloaded their repository from git and looked at their full dataset? There’s more there than Forsberg and Dom are really good. They’ve done quite a lot of work and provided a good foundation for anyone who wants to build on their ideas. Then, because they’re academics with real research interests, they moved on. I don’t know why you’re so harsh after calling the paper brilliant. But you’ve called a whole bunch of guys brilliant, so maybe that word means something different to you.

4. They discuss random forests but they were working on a way to retrospectively rank players by their individual contribution to winning. They weren’t working on a supervised learning problem. They couldn’t have used random forests for the problem they were working on because they didn’t frame the problem with prediction in mind.

5. You seem to feel it’s important to add understanding, not data. But you also keep going on about the limits of linear models and false positives. Random forests may produce better predictions. So may convolutional neural networks. But these models are basically opaque. Researchers are working on extracting the patterns embedded in their structure. But it’s complex work. If you want to increase understanding, you want simple, interpretable linear models.

6. Smart GMs or false positive. Are those the only two choices? When you say false positive, do you mean the model has incorrectly identified a bad player as a good player or that it has identified a good player as a bad player? And are you saying in situations where the predicted player value differs from the salary awarded by GMs, you pick GMs because they’re smart? Before I get defensive and snarky, I just want to make sure this is your thinking here.

7. As for that last paragraph, it’s my experience that while very, very few people are very, very good at many different things, lots of people feel confident in their understanding of many, many different things. It’s a conundrum.

My smart GMs comment was specific to the authors finding a relationship between a metric they just created and salaries NHL players are paid. The correlation is so strong it is like GMs and Agents already used this metric in salary negotiations. Did they just intuit it? Or maybe it is classic false positive. And since false positives pop up frequently in the use of linear regressions it seems like a safe bet.

I said quite clearly that I thought it was a false positive because otherwise you’d have to assume that GMs had intuitively reached the same conclusion without benefit of the math. I don’t think that is likely. There are other explanations for their metric and salaries aligning. But we could go on and on.

I called this paper brilliant for some of the same reasons you do. It is an extremely well thought out and executed attempt to isolate the impact of an individual player. I think this is a highly sought after goal.

Typically papers I refer to as brilliant have tremendous visualizations or beautiful math. Here as in The Least Best Shooter Hypothesis the math is explained in detail, with a clarity that makes it replicable, and it is an elegant solution to a complex problem. Every paper in analytics should be this well done.

I admire the time, effort, skill, and presentation. I accept the results without hesitation. I just find them trivial. By which I mean not useful in the real world. The authors themselves point out the results aren’t replicated from year to year. Thus they aren’t predictive. They are purely descriptive.

Ask yourself a simple question. How would you use these results to make smarter hockey decisions? This isn’t pure research. It is meant to have real world application. Show me that application.

And just so I am sure I understand – are you really saying it is better to make worse but transparent predictions than to make better but opaque predictions. If so I chose to disagree. I kind of like predictions to be predictive. Would it be wonderful if they were transparent as well? Without a doubt. But if I have to choose one or the other I pick predictive power.

This, in a nutshell is what is wrong with hockey analytics. The results however intriguing aren’t useful and their predictive power is poor.

If more opaque math leads to useful tools and heightened predictive power we should all learn the new math.

I don’t mean to trivialize the effort or skill you or other amateur sabermetricians bring to hockey analytics or to doubt your passion. I get that many people think they will stumble on a better way, a hockey version of OBP if you like. I am right in that race with you.

But here is a brilliant paper by outstanding statisticians taking a direct stab at what should be a game changer. I have long been a fan of Tom Awad and adjusted plus minus. I have talked about his work and that metric quite often here. This paper builds on that work and does everything right. It should be revolutionary gold standard stuff.

I will leave you with a final question. Why didn’t this approach come to be prominent among sabermetricians?

I stand by my assessment. The problem is reduction without synthesis.

Georgexs

VOR: I think they failed because there is nothing in this paper (beyond the method itself and the relationship with salaries both of which I will return to in a moment) that you can’t know with the naked eye or some very simple stats. I and millions of other people could tell you Joe Thornton had some good years, that Dom Hasek even in his twilight years was the Dominator, that Pavel Datsyuk was a serious out scorer etc.

Ask yourself, does this paper enrich your understanding of hockey? It just muddies the waters. The authors don’t even attempt to explain why using Corsi and Goals gives two totally different lists. They also gloss over the fact that what they call PMP and PFP give different results, choosing once again to not explain why.

There is nothing here of any use to hockey professionals either. It doesn’t help fans, the analytics community, or hockey pros and it certainly doesn’t illuminate the game.

This leaves us with the possibility the method itself will help someone else do something useful in the future. The method is explained in great detail and in my opinion considerable clarity. So it has value as a tool for future researchers. But that isn’t the stated goal of the paper. We are supposed to be seeing the creation of a new stat. We aren’t.

This in part is because it is a bit of work to prepare full rankings and the rankings don’t add value as I have said above. They add data not knowledge.

But the entire section where they discuss decision trees and other non-regression tools tells you why it hasn’t been adopted: hockey analytics has moved on from regressions. They produce far too many false positive relationships.

The relationship of their metric to salaries is almost certainly an example. GMs intuit this metric they created or it’s a false positive. Smart GMs or false positive. I know which I pick.

I would suggest you see how often this paper appears in SCI. For those of you following at home, the Science Citation Index is how scientists keep score.

Nothing arty about it. Not that, despite your defensive snarkiness, there would be anything wrong with that. Nope, years in Socratic graduate seminars dedicated to reading scientific papers.

1. The paper looks at assessing a player’s contribution to winning. It controls for the contribution for the players he plays with and for situation (even strength, PP, PK). The Rel numbers you see on analytics sites don’t do this. The RelTGF% stat that WG uses doesn’t really do this either. It’s an ad hoc approach and it has trouble dealing with the multicollinearity that pops up when players play together a lot. The technique the authors are using is standard stats: determine the influence of one variable while controlling for the influence of other variables.

2. They use penalized logistic regression because the input matrix is sparse and highly collinear. I don’t know what you mean by hockey analytics has moved on from regressions. I don’t know the community that well. But I know how to assess whether a person has some training in statistical techniques. And I also know that penalized regression models are used extensively in machine learning. They’re effective, they’re interpretable, they’re easy to train. When combined with ensemble methods, they’re quite powerful. Linear models are simply a useful part of the machine learning toolkit.

3. You say they add data not knowledge. But you keep talking about just the players listed in the paper. Have you downloaded their repository from git and looked at their full dataset? There’s more there than Forsberg and Dom are really good. They’ve done quite a lot of work and provided a good foundation for anyone who wants to build on their ideas. Then, because they’re academics with real research interests, they moved on. I don’t know why you’re so harsh after calling the paper brilliant. But you’ve called a whole bunch of guys brilliant, so maybe that word means something different to you.

4. They discuss random forests but they were working on a way to retrospectively rank players by their individual contribution to winning. They weren’t working on a supervised learning problem. They couldn’t have used random forests for the problem they were working on because they didn’t frame the problem with prediction in mind.

5. You seem to feel it’s important to add understanding, not data. But you also keep going on about the limits of linear models and false positives. Random forests may produce better predictions. So may convolutional neural networks. But these models are basically opaque. Researchers are working on extracting the patterns embedded in their structure. But it’s complex work. If you want to increase understanding, you want simple, interpretable linear models.

6. Smart GMs or false positive. Are those the only two choices? When you say false positive, do you mean the model has incorrectly identified a bad player as a good player or that it has identified a good player as a bad player? And are you saying in situations where the predicted player value differs from the salary awarded by GMs, you pick GMs because they’re smart? Before I get defensive and snarky, I just want to make sure this is your thinking here.

7. As for that last paragraph, it’s my experience that while very, very few people are very, very good at many different things, lots of people feel confident in their understanding of many, many different things. It’s a conundrum.
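For anyone who wants to see what "coefficients shrunk to zero" looks like in practice, here is a toy sketch of L1-penalized logistic regression in plain Python. It is not the paper's code or data. The encoding (+1 when a player is on the ice for one side, -1 for the other, label 1 when the first side scores) is an assumption about the general setup, the four "goals" and the player names are invented, and the solver is a simple proximal gradient loop chosen for readability rather than whatever software the authors used.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything inside the band becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(rows, labels, penalty, step=0.5, iterations=2000):
    """Proximal gradient descent for L1-penalized logistic regression.

    rows:   one dict per goal, mapping player -> +1 (on ice, side A) or -1 (on ice, side B)
    labels: 1 if side A scored the goal, 0 if side B did
    """
    players = sorted({name for row in rows for name in row})
    beta = {name: 0.0 for name in players}
    n = len(rows)
    for _ in range(iterations):
        grad = {name: 0.0 for name in players}
        for row, scored in zip(rows, labels):
            score = sum(beta[name] * sign for name, sign in row.items())
            error = sigmoid(score) - scored
            for name, sign in row.items():
                grad[name] += error * sign / n
        for name in players:
            # Gradient step on the log loss, then the L1 shrinkage step.
            beta[name] = soft_threshold(beta[name] - step * grad[name], step * penalty)
    return beta


# Invented toy data: "Star" is always on the scoring side, "Plug" always on the
# side scored against, and the two "Avg" players are split evenly.
GOALS = [
    ({"Star": 1, "Avg1": 1, "Plug": -1, "Avg2": -1}, 1),
    ({"Star": 1, "Avg2": 1, "Plug": -1, "Avg1": -1}, 1),
    ({"Star": -1, "Avg1": -1, "Plug": 1, "Avg2": 1}, 0),
    ({"Star": -1, "Avg2": -1, "Plug": 1, "Avg1": 1}, 0),
]
rows = [row for row, _ in GOALS]
labels = [scored for _, scored in GOALS]

coefficients = fit_l1_logistic(rows, labels, penalty=0.05)
for name, value in coefficients.items():
    print(f"{name}: {value:+.3f}")
```

On this toy set the fit leaves Star with a positive coefficient, Plug with a negative one, and both Avg players at exactly zero, which is the behaviour described in point 2. Raising the penalty far enough (1.0 here) zeroes out everyone, so the penalty is the dial that decides how many players register at all.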

Wilde

Woodguy v2.0: Did they switch from 16/17 to 17/18?

Yes

VOR

Georgexs: I notice you use “brilliant” rather liberally. I’ve seen you use it for work done by folks who clearly lacked a background in statistical analysis. This time, however, the authors are the real deal.

Is this how they acknowledge their work was for nought?

“Regardless of these and other possible complex extensions, we argue strongly that our simple L1 penalized logistic regression has much to recommend it. The model is very simple to interpret and relies upon minimal restrictive assumptions on the process of a hockey game. Our measures are also much faster to compute than any of the alternatives. These qualities make sophisticated real-time analysis of player effects possible as games and seasons progress.”

I want to thank you for the link though. This is what I was telling Ryan I would try if I ever get around to parsing the JSON files off of nhl.com, right down to the actual techniques they employed. The only thing they didn’t do that I would try is to use the model in a predictive capacity rather than just use it retrospectively. They decided to test their metric by correlation with salary (market value). Very academic choice to make.

But the authors are legit. Not sure why you think they failed or why you think they think they failed. Maybe it’s the artist in you. Your criticism is not at all technical, more Sorrows of Young Werther.

I think they failed because there is nothing in this paper (beyond the method itself and the relationship with salaries both of which I will return to in a moment) that you can’t know with the naked eye or some very simple stats. I and millions of other people could tell you Joe Thornton had some good years, that Dom Hasek even in his twilight years was the Dominator, that Pavel Datsyuk was a serious out scorer etc.

Ask yourself, does this paper enrich your understanding of hockey? It just muddies the waters. The authors don’t even attempt to explain why using Corsi and Goals gives two totally different lists. They also gloss over the fact that what they call PMP and PFP give different results, choosing once again to not explain why.

There is nothing here of any use to hockey professionals either. It doesn’t help fans, the analytics community, or hockey pros and it certainly doesn’t illuminate the game.

This leaves us with the possibility the method itself will help someone else do something useful in the future. The method is explained in great detail and in my opinion considerable clarity. So it has value as a tool for future researchers. But that isn’t the stated goal of the paper. We are supposed to be seeing the creation of a new stat. We aren’t.

This in part is because it is a bit of work to prepare full rankings and the rankings don’t add value as I have said above. They add data not knowledge.

But the entire section where they discuss decision trees and other non-regression tools tells you why it hasn’t been adopted: hockey analytics has moved on from regressions. They produce far too many false positive relationships.

The relationship of their metric to salaries is almost certainly an example. GMs intuit this metric they created or it’s a false positive. Smart GMs or false positive. I know which I pick.

I would suggest you see how often this paper appears in SCI. For those of you following at home, the Science Citation Index is how scientists keep score.

Nothing arty about it. Not that, despite your defensive snarkiness, there would be anything wrong with that. Nope, years in Socratic graduate seminars dedicated to reading scientific papers.

Woodguy v2.0

Georgexs: Maybe. Demonstrate away.

It’ll be good to see your criteria for evaluating defensemen.

I go with how many minutes they play, what’s the score in the minutes they play, what’s the score in the minutes they don’t play. Then I compare them to similar others based on where they’re at in their career.

Teams may not have enough defensemen to cover all of their TOI spots. It’s bound to happen. I guess. That’s sort of a buy in.

I’d like to see who you don’t like. Because I think you like Demers, Hamonic, Hamilton, Carlo, Hjalmarsson, etc. Which is, let’s see…

I’ll try to put something together this weekend. Not sure if I’ll have enough time, but I’ll start.

Might put it on my blog.
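The on-ice versus off-ice comparison described in the quoted comment (the score in the minutes a defenseman plays against the score in the minutes he doesn't) comes down to a couple of subtractions. A minimal sketch, with a hypothetical function name and made-up goal totals:

```python
def on_off_split(gf_on, ga_on, gf_team, ga_team):
    """Return (on-ice goal share, off-ice goal share, difference) for one player.

    gf_on / ga_on:     goals for and against while the player is on the ice
    gf_team / ga_team: the team's totals over the same games
    """
    gf_off = gf_team - gf_on
    ga_off = ga_team - ga_on
    on_share = gf_on / (gf_on + ga_on)
    off_share = gf_off / (gf_off + ga_off)
    return on_share, off_share, on_share - off_share


# Made-up example: 50 GF / 40 GA with the player on, team totals of 150 GF / 160 GA.
on_share, off_share, relative = on_off_split(50, 40, 150, 160)
print(f"on: {on_share:.3f}  off: {off_share:.3f}  rel: {relative:+.3f}")
```

This only covers the score part of the criteria; comparing against similar players by ice time and career stage would sit on top of it.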

Woodguy v2.0

Wilde: It’s also pointing to the switch in own-zone structure.

You can’t play man with an excess in slow-stepping forwards unless you teach it really, really well.

They didn’t so they couldn’t.

Did they switch from 16/17 to 17/18?

Wilde

Georgexs: http://shiftchart.com/

gracias for this and your other efforts this thread

OriginalPouzar

Scungilli Slushy: And Eberle was barely a PPG scorer in junior. Bouchard as a D was a far better point producer, and also not on a strong team. I would bet if not for the ‘Boys’ wrong headedness because of Eberle’s WJC heroics Ebs would have dropped to 2nd round like McLeod did.

I don’t think Eberle played in the World Juniors (U-20s) until after he was drafted. He played in the U-18s but not the main U-20s. Not many on the Canadian team are undrafted.

Georgexs

Wilde: I was talking about a visualisation

http://shiftchart.com/

Wilde

Georgexs: ?

nhl.com

here’s an example:

http://www.nhl.com/stats/rest/shiftcharts?cayenneExp=gameId=2011020001

I was talking about a visualisation

Georgexs

Scungilli Slushy,

“Trying to find the global parameters that seem to have relevance in macro detail.”

If this is what WG is doing, I’m glad it makes sense to one of us.

Georgexs

VOR: None of the metrics I mentioned are directly linked to Corsi never mind Corsi for percentage. They are all about transitioning the puck. Corsi is based on shots. But you know that.

You probably also know synergy is rapidly replacing reduction in sports analytics. This is because the usefulness of highly reduced concepts and isolationist regressions for either coaches or GMs is nearly zero. I can actually demonstrate this is true for your idea of difference makers.

The following brilliant paper comes at the same idea but goes even further isolating the player from teammates completely. It is like what you are proposing but on steroids.

https://arxiv.org/pdf/1510.02172.pdf

From all that tremendous math they learned Peter Forsberg, Joe Thornton, Pavel Datsyuk, Sidney Crosby and Alex Ovechkin can really impact a season but so can Mark Streit and Lubomir Visnovsky and Dom Hasek as measured by real +/- (isolated GF% if you like). But when they did the same analysis with Corsi For % they got totally different results. Oh they also learned Jack Johnson is terrible. A genuine negative difference maker.

And Ryan Nugent Hopkins was once value for money.

They also learned that their own metrics didn’t give the same result. Oh and the results weren’t stable from year to year. But at least salary follows real isolated +/-. Which of course means GMs gain no advantage from knowing this since it is already baked in.

So let’s see, the most sophisticated attempt ever undertaken to establish the true difference makers led to diddly squat of value. Even the authors start discussing synergistic approaches as they acknowledge all their work was for nought.

I have dozens of papers I can cite that show dividing hockey up into smaller chunks adds little to no new knowledge. Eventually you and the other reducers will realize the problem isn’t the choice of chunk or of statistical tool. Nope, the problem is that, much as we wish it so, hockey doesn’t reduce.

I notice you use “brilliant” rather liberally. I’ve seen you use it for work done by folks who clearly lacked a background in statistical analysis. This time, however, the authors are the real deal.

Is this how they acknowledge their work was for nought?

“Regardless of these and other possible complex extensions, we argue strongly that our simple L1 penalized logistic regression has much to recommend it. The model is very simple to interpret and relies upon minimal restrictive assumptions on the process of a hockey game. Our measures are also much faster to compute than any of the alternatives. These qualities make sophisticated real-time analysis of player effects possible as games and seasons progress.”

I want to thank you for the link though. This is what I was telling Ryan I would try if I ever get around to parsing the JSON files off of nhl.com, right down to the actual techniques they employed. The only thing they didn’t do that I would try is to use the model in a predictive capacity rather than just use it retrospectively. They decided to test their metric by correlation with salary (market value). Very academic choice to make.

But the authors are legit. Not sure why you think they failed or why you think they think they failed. Maybe it’s the artist in you. Your criticism is not at all technical, more Sorrows of Young Werther.

digger50

deardylan:
Re: SAD and also JetLAG. Body rhythm is out of whack.

I mentioned to my coach that sometimes I feel sluggish even a week after a long flight: my body and time just aren’t aligned.

He said “have you put your feet into the soil and walked a few minutes on the earth?”

I said no, I live in an apartment and wear shoes with rubber soles.

He said “your body needs grounding and connection to Mother Earth”

So I would walk barefoot in a local park and sit on the grass for a bit. Every time I did this, it did the trick.

Not sure if SAD could also be connected to our lack of grounding and connection to the earth?

In dark winter in Canada maybe even harder. Wonder if my coach would recommend I jump into a hot tub and then jump out of it to roll in the snow for a few minutes? Or making a snowman? Throwing snowballs? Snow angels?

Anyone have other ideas of how you ground/connect to nature in the wintertime?

In northern Alberta, snowmobiling is fun.

Go as fast as you can until you see God, then grab for the brake.

That will get you going.

Georgexs

Wilde:
By the way, do we know any sites that have shift charts with actual time stamps?

?

nhl.com

here’s an example:

http://www.nhl.com/stats/rest/shiftcharts?cayenneExp=gameId=2011020001

BONE207

dustrock,

I felt I wasn’t as productive at work, was drinking more than usual.

I found this to be the case when I went for liquid lunches. Summer time Fridays this should be mandatory.

VOR

Scungilli Slushy:
WILDE
GEORGEXS

Love your passion for analyzing.

We’ve been down this road before over years, several times. Perhaps you were involved under different pseudonyms.

The reason teams and those involved push back on blog stats is that currently there is simply a lack of access to data, they know that. What those bloggers with the math background can do is necessarily limited. They (teams) have more data. Many don’t know how to use it perhaps, but many do.

It’s not at a point where there is enough data to be definitive for us. At least without a massive effort by bloggers to track, which some are doing, but we’re talking thousands of hours league wide. It isn’t happening.

So what WG is doing makes sense to me. Trying to find the global parameters that seem to have relevance in macro detail.

There currently isn’t micro detail that is informed by enough data to know if it’s whole. So the debates go on back and forth, no success in nailing down a metric that stands yet without significant limitations.

I enjoy reading what you come up with, it’s important and interesting pushing the envelope, just my 2c.

I would second your esteem for Georgexs and Woodguy’s passion for analysis.

My problem is I think they need to spend some time on integrative models and synergistic metrics.

This is true of hockey analytics as a discipline. It is highly slanted towards the reductionist. Not to mention there has been a near total dependence on linear regression as the analytic tool of choice despite its well understood weaknesses.

I think your argument that more data would make it possible to reduce hockey and produce useful stats is exactly backward. Data smog is a very real phenomenon. Even if the signal to noise ratio remained constant (and it wouldn’t) the total noise of big data sets can make analysis nearly impossible (see paper I just linked). As the amount of noise relative to signal rises it truly becomes impossible.

VOR

Georgexs:
VOR,

So working together is CF% then?

None of the metrics I mentioned are directly linked to Corsi never mind Corsi for percentage. They are all about transitioning the puck. Corsi is based on shots. But you know that.

You probably also know synergy is rapidly replacing reduction in sports analytics. This is because the usefulness of highly reduced concepts and isolationist regressions for either coaches or GMs is nearly zero. I can actually demonstrate this is true for your idea of difference makers.

The following brilliant paper comes at the same idea but goes even further isolating the player from teammates completely. It is like what you are proposing but on steroids.

https://arxiv.org/pdf/1510.02172.pdf

From all that tremendous math they learned Peter Forsberg, Joe Thornton, Pavel Datsyuk, Sidney Crosby and Alex Ovechkin can really impact a season but so can Mark Streit and Lubomir Visnovsky and Dom Hasek as measured by real +/- (isolated GF% if you like). But when they did the same analysis with Corsi For % they got totally different results. Oh they also learned Jack Johnson is terrible. A genuine negative difference maker.

And Ryan Nugent Hopkins was once value for money.

They also learned that their own metrics didn’t give the same result. Oh and the results weren’t stable from year to year. But at least salary follows real isolated +/-. Which of course means GMs gain no advantage from knowing this since it is already baked in.

So let’s see, the most sophisticated attempt ever undertaken to establish the true difference makers led to diddly squat of value. Even the authors start discussing synergistic approaches as they acknowledge all their work was for nought.

I have dozens of papers I can cite that show dividing hockey up into smaller chunks adds little to no new knowledge. Eventually you and the other reducers will realize the problem isn’t the choice of chunk or of statistical tool. Nope, the problem is that, much as we wish it so, hockey doesn’t reduce.

Scungilli Slushy

ashley:
There was much concern about Eberle’s foot speed in his years after draft. It was partly why he dropped to where he was drafted. It’s hard to tell until we see them play a game in the NHL. Time will tell with Bouchard.

And Eberle was barely a PPG scorer in junior. Bouchard as a D was a far better point producer, and also not on a strong team. I would bet if not for the ‘Boys’ wrong headedness because of Eberle’s WJC heroics Ebs would have dropped to 2nd round like McLeod did.

Wilde

By the way, do we know any sites that have shift charts with actual time stamps?

Wilde

Georgexs: I agree. Auvitu is worth the bet no GM appears willing to make.

It’s half Auvitu being pretty damn good, and half that there’s an /abundance/ of defencemen who are not quite reliable enough in coverage to play in the top four, but give you so much more elsewhere that they can play 3rd pairing, yet good, successful teams would rather play guys that literally do none of those things like Yannick Weber and Dan Girardi and Dion Phaneuf.

Scungilli Slushy

WILDE
GEORGEXS

Love your passion for analyzing.

We’ve been down this road before over years, several times. Perhaps you were involved under different pseudonyms.

The reason teams and those involved push back on blog stats is that currently there is simply a lack of access to data, they know that. What those bloggers with the math background can do is necessarily limited. They (teams) have more data. Many don’t know how to use it perhaps, but many do.

It’s not at a point where there is enough data to be definitive for us. At least without a massive effort by bloggers to track, which some are doing, but we’re talking thousands of hours league wide. It isn’t happening.

So what WG is doing makes sense to me. Trying to find the global parameters that seem to have relevance in macro detail.

There currently isn’t micro detail that is informed by enough data to know if it’s whole. So the debates go on back and forth, no success in nailing down a metric that stands yet without significant limitations.

I enjoy reading what you come up with, it’s important and interesting pushing the envelope, just my 2c.

ashley

There was much concern about Eberle’s foot speed in his years after draft. It was partly why he dropped to where he was drafted. It’s hard to tell until we see them play a game in the NHL. Time will tell with Bouchard.

Wilde

To be clear, I think playing man at 5v5 and a hyper-aggressive wedge+1 4v5 is the ideal way to play hockey in the current NHL.

You have to have the personnel to execute it and the teachers to teach it, though, and maybe most importantly you cannot be benching players for mistakes as a disciplinary measure and absolving ownership of the team’s failures to the media. There has to be zero fear and an ultimate level of trust in your teammates, including your goaltender.

Georgexs

VOR,

So working together is CF% then?

VOR

Georgexs: How would you measure work the best together?

Hockey is very much a team game. Not sure where either of us has missed that point.

I’m arguing forwards are more important than defensemen for team success. WG is arguing the other side. Different teams could pursue different strategies. It seems a stretch to say we’re unaware that teams do things that run against each of our positions.

As for in game and in season adjustments, not sure how much room there is for that. Bets have been placed. You’d have to replace the gamblers to see different bets.

I’d use an efficiency measure to determine best together. We use them all the time in basketball. The one Hudson proposes above is a good place to start. As are controlled exits from your own end and controlled entries at the other end.

I’d predict there is an algorithm that links the three events that would give a very good proxy of unit efficiency.

Then I’d see if that correlated with winning hockey games. Hudson is right: Sportlogiq is probably capable of knocking out a database like he proposes with ease, and I know they already collect zone exit and entry data by player and unit.

That is of course just off the top of my head. I’d need to think about it.

Georgexs

Wilde:
The single largest driver of the 5v5 GA problem was not defensemen underperforming or injury.

It was the tactical decisions and implementation thereof by the coaching staff.

Preach.

When the coaches do their job so badly, it’s very hard to get a read on players. I’ve only witnessed this with the Oilers. I assume it happens all over. So much luck involved for some players to catch on.

Georgexs

VOR: I would make a counter bet. That neither of you can prove that you are right because the value of defencemen and forwards is intrinsically tied to their ability to work together.

In case you miss my point. I am accusing you both of the logical flaw of reductio ad absurdum. You are both trying to divide the indivisible.

Don’t get me wrong it is great fun to read. So please continue.

Really not sure what’s going on.

In looking at the data, I’ve observed that a difference making forward is much more likely than a difference making defenseman. Results, as measured by goals, follow forwards. Over many seasons. (Because goals need lots of data.) They are much less likely to follow defensemen.

The work together thing is like pointing out hey guys you do know hockey is played with sticks, right?

Wilde

The single largest driver of the 5v5 GA problem was not defensemen underperforming or injury.

It was the tactical decisions and implementation thereof by the coaching staff.

Georgexs

Wilde:
Speaking of NHL defensemen, I’d like to take this moment to utter into the ether that Yohann Auvitu is a better 3LD than at least 20 NHL teams will dress at that position in the 2018-19 season.

I agree. Auvitu is worth the bet no GM appears willing to make.

Georgexs

Wilde:
I think defensemen seem very, very thresholdy.

The most potent, poisonous action you can take to a group of 4 NHL players is complete the quintet with a non-NHL defenseman.

The frequency and danger of the chances given up is unreal and has no remedy. If the ‘NHL’ hoop isn’t jumped through in the adage “NHL defencemen don’t affect on-ice save percentage” then the floor becomes false and everything sets on fire.

Take Ethan Bear last year. When Bear was on the ice every forward trio placed against him had the results of an uber-first line. Connor McDavid had the highest rate of NaturalStatTrick’s claimed high danger chances for, his rate rounds up to 16 per hour from not far away.

When Bear was on the ice, his unit gave up 20 and a half. Three and a half actual goals. Pushing a goal against every 15 minutes.

(I like Ethan Bear as a prospect, just using him as an example)

Great way of putting it.

Every young player trying to break in has to meet a threshold of performance. Otherwise they’re easy marks for the vets.

Once a defenseman meets the threshold, they have a development period in which they can improve and take on the physical and mental demands of more minutes, that is, reach the next threshold. A good enough defense has all players playing at their level of competence. As WG says, they mess with the opposition sorties and get the puck to the forwards.

As WG also points out, coaches make the mistake of playing a defenseman beyond his proven threshold. They usually pay for this mistake in goals and losses. So, unless they have no other options, they adjust too.

Wilde

Woodguy v2.0:
Goals Against/60 McDavid Off
16/17 2.05
17/18 2.59
Almost an identical drop as McDavid. (0.56 & 0.54)
This is all pointing to “Dcorps & Goalie” driving the GA/60 up.

It’s also pointing to the switch in own-zone structure.

You can’t play man with an excess in slow-stepping forwards unless you teach it really, really well.

They didn’t so they couldn’t.
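For anyone checking the arithmetic in the quoted numbers, a small sketch: the two GA/60 values come from the quote above, while the helper and the 600-minute figure are hypothetical, included only to show what a change in rate means in actual goals.

```python
# McDavid-off GA/60 values from the quoted comment.
ga60_off_1617 = 2.05
ga60_off_1718 = 2.59

# Year-over-year change in goals against per 60 minutes.
change = round(ga60_off_1718 - ga60_off_1617, 2)
print(change)  # 0.54


def extra_goals(change_per_60, minutes):
    """Convert a change in a per-60 rate into goals over a given number of minutes."""
    return change_per_60 * minutes / 60.0


# Hypothetical: 600 minutes at the higher rate.
print(extra_goals(change, 600))
```

The change says nothing on its own about whether the cause was personnel or structure, which is the point in dispute.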

VOR

Georgexs: How would you measure work the best together?

Hockey is very much a team game. Not sure where either of us has missed that point.

I’m arguing forwards are more important than defensemen for team success. WG is arguing the other side. Different teams could pursue different strategies. It seems a stretch to say we’re unaware that teams do things that run against each of our positions.

As for in game and in season adjustments, not sure how much room there is for that. Bets have been placed. You’d have to replace the gamblers to see different bets.

I would make a counter bet. That neither of you can prove that you are right because the value of defencemen and forwards is intrinsically tied to their ability to work together.

In case you miss my point. I am accusing you both of the logical flaw of reductio ad absurdum. You are both trying to divide the indivisible.

Don’t get me wrong it is great fun to read. So please continue.

Georgexs

Ryan: If I were conducting a retrospective study using data from EMRs in which I was looking for risk factors for osteoarthritis of the knee and I did an analysis using the categorical variables of normal BMI, overweight and obese, would this be somewhat analogous to what woodguy’s doing? Granted, I would probably use binomial logistic regression to see if these differences were statistically significant.

The “binning of competition” data that some math nerds have frowned upon is simply a way of creating categorical data. In my analogy, it’s analogous to BMI.

The DFF and quality of comp data Woodguy uses are independent data sets. In this instance the DFF only differs from a diagnosis in that it’s a continuous rather than categorical variable.

I did this using aggregate data, not event data. I’ve been meaning to fully parse the nhl play by play and shift chart files. Haven’t gotten around to it. But, yes, I’d probably try ridge regression with shots as the records and goals as the binomial response and players as the categorical explanatory variables. I’d expect the coefficients for nearly all defensemen to be insignificant, based on what I’ve seen so far. Because the on/off results for defensemen rarely reach the standard of significance. I won’t speak for WG as far as what he’s doing.
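A rough sketch of what that setup could look like, assuming a ridge (L2) penalty on a logistic model fitted by plain gradient descent: each record is a shot, the response is goal or no goal, and each player on the ice is an indicator variable. The function names, penalty strength and the eight toy shots are all hypothetical, and nothing here has been run on real play by play data.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_ridge_logistic(shots, goals, n_players, lam=1.0, lr=0.1, steps=2000):
    """Minimise mean log-loss + (lam / 2n) * sum(w^2) by gradient descent.

    shots: list of lists, each holding the indices of players on the ice for a shot.
    goals: list of 0/1 outcomes, one per shot.
    Returns (intercept, per-player coefficients). The intercept is not penalised.
    """
    n = len(shots)
    b0 = 0.0
    w = [0.0] * n_players
    for _ in range(steps):
        grad_b0 = 0.0
        grad_w = [lam * wj / n for wj in w]  # gradient of the ridge penalty
        for on_ice, y in zip(shots, goals):
            p = sigmoid(b0 + sum(w[j] for j in on_ice))
            err = (p - y) / n
            grad_b0 += err
            for j in on_ice:
                grad_w[j] += err
        b0 -= lr * grad_b0
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
    return b0, w


# Toy data, invented for illustration: three players, eight shots.
shots = [[0, 1], [0, 2], [0, 1], [1, 2], [1, 2], [0, 2], [1, 2], [0, 1]]
goals = [1, 1, 0, 0, 0, 1, 0, 0]
intercept, coefs = fit_ridge_logistic(shots, goals, n_players=3)
print(intercept, coefs)
```

The penalty shrinks every player coefficient toward zero, so with little data most players end up near the neutral background, which is the outcome predicted above for defensemen. Significance testing would be a separate step.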

Georgexs

VOR:
Georgexs and Woodguy v 2.0,

Isn’t the correct question which team’s D and forwards work the best together?

And does that correlate with winning hockey games?

While your argument is fascinating intellectually, it misses the irrefutable fact that hockey is a team game.

You also seem to be unaware of the possibility that the answer to which is more important Fs or D could be different for different teams. The answer could also change depending on game state, regular season or playoffs and for that matter under the influence of score effects.

How would you measure work the best together?

Hockey is very much a team game. Not sure where either of us has missed that point.

I’m arguing forwards are more important than defensemen for team success. WG is arguing the other side. Different teams could pursue different strategies. It seems a stretch to say we’re unaware that teams do things that run against each of our positions.

As for in game and in season adjustments, not sure how much room there is for that. Bets have been placed. You’d have to replace the gamblers to see different bets.

Scungilli Slushy

Team systems and buy in have to be factored in as well to understand individual performance.

This is why vet teams with stable rosters repeat performance yearly. And why Vegas performed above expectations and are primed for regression to their talent level. There are also behind-closed-doors aspects to this.

IMO it’s also why coaches seem to do unsmrt things. There is a lot to factor in. The venerable Vic Ferrari said he didn’t think teams were stupid, and I don’t think that was said to get hired.

OriginalPouzar

Tom Wilson – 6 X $5.156M.

Yes, that Tom Wilson.

Wilde

Speaking of NHL defensemen, I’d like to take this moment to utter into the ether that Yohann Auvitu is a better 3LD than at least 20 NHL teams will dress at that position in the 2018-19 season.

Ryan

Georgexs: It’s really important you revisit your thinking here. Because, the way you’ve explained it, this is not where you really started to understand. You took a set of data, broke it into smaller sets on a variable, observed the variance, and failed to ask if the variance could be attributed to the variable.

If I were conducting a retrospective study using data from EMRs in which I was looking for risk factors for osteoarthritis of the knee and I did an analysis using the categorical variables of normal BMI, overweight and obese, would this be somewhat analogous to what woodguy’s doing? Granted, I would probably use binomial logistic regression to see if these differences were statistically significant.

The “binning of competition” data that some math nerds have frowned upon is simply a way of creating categorical data. In my analogy, it’s analogous to BMI.

The DFF and quality of comp data Woodguy uses are independent data sets. In this instance the DFF only differs from a diagnosis in that it’s a continuous rather than categorical variable.
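To make the analogy concrete, here is a small sketch of binning a continuous measure into categories and then comparing an outcome rate across the bins. The cut points of 25 and 30 are the commonly used BMI thresholds (the underweight band is ignored to match the three categories above); the helper names and the six records are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def bin_value(value, cutpoints, labels):
    """Map a continuous value to a category label using sorted cut points."""
    return labels[bisect_right(cutpoints, value)]


BMI_CUTPOINTS = [25.0, 30.0]
BMI_LABELS = ["normal", "overweight", "obese"]


def outcome_rate_by_bin(records, cutpoints, labels):
    """records: (continuous value, 0/1 outcome) pairs. Returns outcome rate per bin."""
    counts = defaultdict(lambda: [0, 0])  # label -> [outcomes, records]
    for value, outcome in records:
        label = bin_value(value, cutpoints, labels)
        counts[label][0] += outcome
        counts[label][1] += 1
    return {label: hits / n for label, (hits, n) in counts.items()}


# Invented records for illustration: (BMI, diagnosed with knee osteoarthritis?).
records = [(22.0, 0), (24.5, 0), (27.0, 1), (28.0, 0), (31.0, 1), (35.0, 1)]
rates = outcome_rate_by_bin(records, BMI_CUTPOINTS, BMI_LABELS)
print(rates)
```

Swapping BMI for a quality of competition score and the diagnosis for a continuous result like DFF gives the hockey version, with a mean per bin instead of a rate.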

Wilde

I think defensemen seem very, very thresholdy.

The most potent, poisonous action you can take to a group of 4 NHL players is complete the quintet with a non-NHL defenseman.

The frequency and danger of the chances given up is unreal and has no remedy. If the ‘NHL’ hoop isn’t jumped through in the adage “NHL defencemen don’t affect on-ice save percentage” then the floor becomes false and everything sets on fire.

Take Ethan Bear last year. When Bear was on the ice every forward trio placed against him had the results of an uber-first line. Connor McDavid had the highest rate of NaturalStatTrick’s claimed high danger chances for, his rate rounds up to 16 per hour from not far away.

When Bear was on the ice, his unit gave up 20 and a half. Three and a half actual goals. Pushing a goal against every 15 minutes.

(I like Ethan Bear as a prospect, just using him as an example)
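A quick sketch of the rate conversions behind those numbers, reading both Bear figures as per-60 rates like the McDavid one (that reading is an assumption, as is the helper name).

```python
def minutes_per_event(rate_per_60):
    """Average minutes between events for a given per-60 rate."""
    return 60.0 / rate_per_60


bear_hdca_per_60 = 20.5  # high danger chances against, from the comment above
bear_ga_per_60 = 3.5     # goals against, from the comment above

minutes_per_chance = minutes_per_event(bear_hdca_per_60)
minutes_per_goal = minutes_per_event(bear_ga_per_60)
goals_per_chance = bear_ga_per_60 / bear_hdca_per_60

print(round(minutes_per_chance, 1), round(minutes_per_goal, 1), round(goals_per_chance, 3))
```

At 3.5 per 60 that works out to a goal against roughly every 17 minutes, in the neighbourhood of the 15 quoted.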

Pescador

LadiesloveSmid: Thank you WG

Someone commented on it a few weeks ago. I told the story of me meeting Smid last summer outside a bar in Calgary. Super nice guy, I imagine the ladies loved him.

He’s married now,
LadyloveSmid

Professor Q

King George 10.5, adding to the confusion and intellectual hockey discourse and debate with Woodguy v2.0.

This is a great Friday night!

So, defencemen are both important and unimportant at the same time, influential and inconsequential to Goals and TOI, in both of your views.

Schrödinger’s Blueliner.

Georgexs

Woodguy v2.0:
Georgexs,

The variance in the on/off performance of defensemen is just not that big. They provide a mostly neutral background against which forwards decide outcomes.

I think I understand where you are coming from now.

When I really started to understand how important Dmen were to a team is when I started to look at their results with each line (usually looking at a C as a proxy for the line)

Since Dmen play with each line, and each line achieves different CF% and GF% the fact that they play with each line washes out their overall rels to “meh, doesn’t matter much”

It’s really important you revisit your thinking here. Because, the way you’ve explained it, this is not where you really started to understand. You took a set of data, broke it into smaller sets on a variable, observed the variance, and failed to ask if the variance could be attributed to the variable.

Georgexs

Woodguy v2.0:
Georgexs,

In contrast, there are many difference making forwards, both good and bad. In all the years we had a not NHL capable defense and suspect goaltending, Taylor Hall still won his minutes. And a whole bunch of forwards who are no longer in the league lost their minutes.

Hall is in a class of forwards that has a population of ~20

I’ll agree that the very elite among the forwards make a huge difference when they are on the ice.

A couple teams have 2.

Most teams have 1.

Some teams have none.

All teams have 4-5 Dmen who play more 5v5 TOI/gm than their elite forward(s).

Is this how you get to defensemen are more important? Comparing TOI of forwards and defensemen?

VOR

Georgexs and Woodguy v 2.0,

Isn’t the correct question which team’s D and forwards work the best together?

And does that correlate with winning hockey games?

While your argument is fascinating intellectually, it misses the irrefutable fact that hockey is a team game.

You also seem to be unaware of the possibility that the answer to which is more important Fs or D could be different for different teams. The answer could also change depending on game state, regular season or playoffs and for that matter under the influence of score effects.