Back in February I presented the first post on adding valued context to existing hockey possession metrics. The goal of enhancing these stats was to provide additional (missing, but meaningful) context to the existing advanced analytics.
Since February I’ve been working on a series of analytics models that provide enhanced resolution and greater context for today’s standard possession metrics. The focus of the initial effort was to normalize all game metrics and make them comparable throughout the season. This will help identify false peaks and valleys, gauge quality of play regardless of game results, and pinpoint strengths and weaknesses in the Capitals as a team.
The purpose of this post is to provide a very brief update on the progress of modeling and goals for the upcoming 2023-24 season.
But first, a brief recap. Our first step in the model development was to factor in the strength of opposition for each game played. In short, an expected goals for percentage (xGF%) against the Anaheim Ducks is not comparable to an xGF% against the Boston Bruins. In other words, how well did the Capitals perform considering the strength of opposition?
The first version of the model simply multiplied the Capitals’ expected goals for percentage for a game by the opposition’s pre-game winning percentage. In addition, it was determined that an average game score was an xGF% of 50.0% against a .500 team, which equates to a constant value of 25 (50.0 x .500). Thus, 25 is subtracted from the product of (xGF% x OppWin%) in order to derive a differential.
[xGF%(game) x OppWin%] – 25
Pretty straightforward, but now we can more accurately compare any game against any other game because we have normalized the standard game values.
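As a minimal sketch of the version 1 calculation (not the actual model code), the formula could be written in Python as follows. The function and parameter names are assumptions for illustration. The inputs follow the convention implied by the constant of 25: xGF% in percentage points (50.0) and the opponent's winning percentage as a decimal (.500). The sample values are made up and are not real game data.

```python
def game_score_v1(xgf_pct: float, opp_win_pct: float) -> float:
    """Version 1 game score: [xGF%(game) x OppWin%] - 25.

    xgf_pct     -- expected goals for percentage, in percentage points (e.g. 50.0)
    opp_win_pct -- opponent's pre-game winning percentage as a decimal (e.g. 0.500)
    """
    baseline = 25.0  # 50.0 xGF% against a .500 team: 50.0 * 0.500
    return xgf_pct * opp_win_pct - baseline


# Hypothetical inputs, for illustration only.
print(game_score_v1(50.0, 0.500))  # average game against a .500 team -> 0.0
print(game_score_v1(55.0, 0.600))  # same xGF%, stronger opponent: positive score
print(game_score_v1(55.0, 0.400))  # same xGF%, weaker opponent: negative score
```

Note how the same 55.0 xGF% lands on either side of zero depending on the opponent, which is the normalization effect described above.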
The following graphic is from the initial post, which summarizes the first 53 games of the Capitals season.
In taking a quick glance at the resultant “game scores”, the Capitals’ overall performance over time begins to stand out. You can clearly see the slow start to the season but improvement in performance in November, the positive spike in performance in December and the early signs of a wobble and decline in January.
We can also ascertain, to a certain degree, the significance (weight) of the trend. Rather than seeing games as over or under the 50% plateau, we can also see weighted scoring for each game, with the strength of opposition factored in. Game score values over stretches of games can be summed to find a higher level of understanding.
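One simple way to sum game scores over stretches of games is a rolling total. The sketch below is an assumption about how that could be done, not a description of the model itself; the function name, the window size and the sample scores are all hypothetical.

```python
def stretch_totals(scores: list[float], window: int) -> list[float]:
    """Sum game scores over every consecutive stretch of `window` games."""
    if window <= 0:
        raise ValueError("window must be a positive number of games")
    # One total per stretch: games 1..window, 2..window+1, and so on.
    return [sum(scores[i:i + window]) for i in range(len(scores) - window + 1)]


# Made-up game scores, for illustration only (not real Capitals data).
sample_scores = [-4.0, -2.5, 1.0, 3.5, 6.0, -1.0]
print(stretch_totals(sample_scores, 3))  # [-5.5, 2.0, 10.5, 8.5]
```

A rising run of totals suggests a sustained upswing rather than a single good night, while a falling run points the other way.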
Like all initial iterations of an analytics model, anomalies surface that require assessment, modification and fine tuning. Towards the end of the 2022-23 season we began to modify the initial model to provide additional resolution to the quantitative performance measures.
ENHANCED PERFORMANCE INDEX
There is no question that teams win and lose games they shouldn’t have won or lost. All of the stats, eye-tests and supporting data say the team outplayed the opposition, but because of all sorts of outside factors, including puck luck, penalties, injuries in the game, etc., the final results didn’t agree.
In the first release of the model, it was noticed that there were three or four games where the performance score did not completely agree with the overall performance of the team. As a result, it was determined that by adding the goal differential (GF - GA) and the expected goals differential (xGF - xGA), the final game performance score was more accurately represented when compared to the ground truth.
[[xGF%(game) x OppWin%] – 25] + (GF - GA) + (xGF - xGA)
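The second-generation formula can be sketched the same way. This is a minimal illustration under stated assumptions, not the model's actual code: the function and parameter names are hypothetical, xGF% is taken in percentage points and the opponent's winning percentage as a decimal, and the sample game is invented.

```python
def game_score_v2(xgf_pct: float, opp_win_pct: float,
                  gf: float, ga: float,
                  xgf: float, xga: float) -> float:
    """Enhanced game score: [[xGF% x OppWin%] - 25] + (GF - GA) + (xGF - xGA)."""
    base = xgf_pct * opp_win_pct - 25.0  # version 1 score
    goal_diff = gf - ga                  # actual goals for minus goals against
    xgoal_diff = xgf - xga               # expected goals for minus against
    return base + goal_diff + xgoal_diff


# Invented game: 60.0 xGF% against a .500 team, lost 2-3,
# with 3.0 expected goals for and 2.0 expected goals against.
print(game_score_v2(60.0, 0.500, gf=2, ga=3, xgf=3.0, xga=2.0))  # 5.0
```

In this made-up example the one-goal loss and the one-goal edge in expected goals cancel out, so the enhanced score equals the version 1 score of 5.0.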
The following screenshot from the second-generation model reflects those changes for all games since the return from the 2023 All-Star break, and provides new enhanced game scores. It also provides a color coding for each game to assist in identifying the trends of the team.
You may recall the first three games back from the All-Star break: the Capitals dominated the Boston Bruins, followed that with a dud of a performance against the San Jose Sharks, and then turned in a solid performance against the Carolina Hurricanes. You can also clearly identify the slow decay over the rest of the season that follows.
VERSION 3 AND NEXT STEPS
Up to this point I’ve focused on the performance of the team as a whole, but in order to gain greater insight, we also need to apply advanced performance measures to logical components of the team, including forward lines, defensive pairs and individual performance. Next steps for version 3 of the model will begin to account for those factors, as well as begin to consider:
- injuries and other lineup variations against strength of opposition,
- line configuration and performances vs. strength of opposition,
- odds and betting lines.
The aforementioned fortification of the expected goals stat is only one additional brush stroke to the overall painting, but as we’ve often stated, the more brush strokes the better. In follow-up posts we will look to build on these quantitative performance measures as well as explore other areas for enhancing the meaning of other existing advanced metrics.
[The statistics used in this post are courtesy of Natural Stat Trick and the NoVa Caps Advanced Analytics Model (NCAAM). If you’d like to learn more about the statistical terms used in this post, please check out our NHL Analytics Glossary]
Much more to come this summer and fall.
By Jon Sorensen
Interesting, unique look at the games/season’s biorhythms.
intriguing to mention biorhythms even though here it was more used figuratively and not literally. But, individually, all of us are subject to rhythms in which our performance levels rise and fall and it may be possible to map an individual’s performance on the ice in such a way that will determine on what days in the cycle they can be expected to be at or near their peak and when they may be in a bit of a valley.
That’s a very interesting topic, Yogi. Now that all players are wearing sensors that collect biometrics, the data is there; I just don’t think it will ever become public, because of HIPAA laws (health data privacy).
For example, teams know how fast a skater’s speed degrades in a game, which may be why he sits more in the third period. We (the public) would never know, and would incorrectly surmise that he was benched for a blunder.
There is a data divide that is occurring (teams vs. public), which I have another piece on coming up.
Interesting note on HIPAA laws. Agree, other than small snippets, the public will probably never see much of the player tracking data, unless the NHLPA approves.
wow, very cool. Never thought about that with the skating but it does make sense. And my guess is that the troughs and peaks are not permanent but cycle through even within the course of a game.
One intangible that seems to be omitted from most metrics is how well a team is performing at the time. Are they themselves in a peak or valley? That would be good to see.
Absolutely, a valued add for sure. I was pondering the development of a “last 10 games” multiplier, but haven’t landed on anything that fits very well. But I will continue to work on it.
You know you’ve met the right Girl when she wants to hear about Standard Deviation, skew, kurtosis, 4-sigma, R-squared, expected value, central tendency and Cauchy Distribution
Lol. You got that right! Very rare in the wild.
impressive stuff, Jon. It would be interesting to see if over the three years of the HCPL era, the shape of the Capitals’ performance curve was generally the same. Even before HCPL, I have the vague impression the Caps have been at their best in mid November through early January.
Thanks Yogi, and a very good idea re: HCPL. I will give it some thought on how best to pull/assess the data.
one wonders — especially with our aging players — if we can also map rhythms over time in which given players are more susceptible to injury. If we know that most injuries to Oshie, for instance, occur in days 8-11 in a 21 day cycle (assuming such a cycle can be found, of course, which is not a given), then maybe we sit him on those days.
Definitely! I personally would love to see that, but like Jon mentioned above, HIPAA laws may prevent release of player health/body metrics. HOWEVER, I bet teams are looking at that data after every game. Probably won’t be long before a player is waived because of health data and a lawsuit emerges from the NHLPA.
can definitely imagine that would happen.
I like this idea, weighting all stats to strength of competition.