It’s no secret that the NHL is in the process of making a generational leap in data sourcing, aggregation and analysis that will redefine the way the game is analyzed. Hockey “big data” is essentially making the leap from 2D to 3D. The next-gen data will provide far greater resolution of game events, players’ play and overall game characteristics. That’s certainly a positive for all of those involved. So when should we expect to see the data? Not so fast.
I’ve touched on the basics of what the NHL’s next-gen data ecosystem is likely to look like in a number of previous posts (here, here, here, and here). To simplify for the purpose of this post, players and pucks will now be emitting data in real time, hundreds or even thousands of data points per second, which will paint a far more detailed picture of the game being played. Think video versus photograph.
General Access Hierarchy
So back to the “big data” produced and aggregated from these new real-time data-generating devices. Who will get to see it?
Simply put, we (fans, analysts and the general public) are likely to see a tiered structure for data access.
Think of it as an upside-down pyramid, with teams, players and the league having the greatest access to the largest cross-section of the data generated.
The next primary access tier will likely be the third-party vendors and analysts that have an agreement or official partnership with the league.
Next will likely be some sort of à la carte data store provided by the league and/or a third-party vendor to the general public (fewer data points/types, but at a cost). At the bottom will likely be the general public and armchair analysts.
Data Life Cycle and Accessibility Tiers
The data emitted by the players, pucks and officials will likely go to a master (secured) data warehouse, located on- or off-site. The raw data will likely be controlled by the league, with teams and players having a big say in who sees that data. (Data related to player biometrics and other health-related matters is likely subject to the same kind of privacy protections that apply to a doctor-patient relationship.) Data privacy issues are likely being worked out as we speak.
The raw data dump will be reviewed (automatically, through front-end data processors) for issues and errors and scrubbed before residing in a master database, still accessible only by the league. Certain data types will be made immediately accessible to teams for in-game analysis and adjustments. A somewhat reduced data set will be made available to certain media and the formal partners mentioned above, for real-time use in television broadcasts and gambling revenue streams (which will be fueled by the next-gen data).
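To make the "review and scrub" step a little more concrete, here is a minimal sketch of what an automated front-end check might look like. Everything in it is an assumption for illustration: the `TrackingSample` record, its field names and the two validation rules are hypothetical and are not drawn from any actual NHL system. The only hard fact used is the standard NHL rink size of 200 by 85 feet.

```python
from dataclasses import dataclass

# Standard NHL rink dimensions, in feet.
RINK_LENGTH_FT = 200.0
RINK_WIDTH_FT = 85.0


@dataclass(frozen=True)
class TrackingSample:
    """Hypothetical raw sample emitted by a puck or player sensor."""
    entity_id: str  # e.g. "puck" or a player identifier (assumed format)
    t: float        # seconds since the start of the game
    x: float        # feet along the length of the rink
    y: float        # feet across the width of the rink


def scrub(samples):
    """Keep only samples that are on the ice and in time order.

    A sample is dropped if its position falls outside the rink or if its
    timestamp is not strictly later than the last accepted sample from
    the same entity.
    """
    last_t = {}
    clean = []
    for s in samples:
        in_bounds = 0.0 <= s.x <= RINK_LENGTH_FT and 0.0 <= s.y <= RINK_WIDTH_FT
        in_order = s.t > last_t.get(s.entity_id, float("-inf"))
        if in_bounds and in_order:
            clean.append(s)
            last_t[s.entity_id] = s.t
    return clean


raw = [
    TrackingSample("puck", 0.00, 100.0, 42.0),
    TrackingSample("puck", 0.01, 250.0, 42.0),  # off the rink: dropped
    TrackingSample("puck", 0.00, 101.0, 42.0),  # repeated timestamp: dropped
    TrackingSample("puck", 0.02, 102.0, 43.0),
]
cleaned = scrub(raw)
```

A real pipeline would presumably do far more (sensor calibration, interpolation of gaps, cross-checks against video), but the shape is the same: raw samples in, a smaller trusted set out, before anything reaches the master database.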
The next step down will likely be a data set that is further pruned and accessible through the NHL data warehouse for official partners and third-party data vendors. This will likely be the initial access point to the new data for analysts and fans. This is also where the best publicly accessible analytics will be found, but you will likely need to pay for the best.
The next tier in the new data accessibility hierarchy will be a fairly stripped-down data set accessible to third parties for exploration and application development. It may be first privately provided to academia or used for in-house contests and other “crowd-sourcing events” to see what the “wisdom of the crowd” can do with the data sets, what apps they can develop, etc. These folks will possibly also have a leg up on the general public as far as resolution of analytics, but it’s likely any such access will come with a hefty non-disclosure agreement (NDA) limiting the value to the general public.
The final tier will be the data set that is accessible by all. Think of the NHL’s current stats site, and the accessibility of the data it currently provides, but with many more new data types available to download and use. This will also be fairly pruned down and be significantly less informative than the tiers outlined above.
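The tiers described above can be sketched as an ordered list of access levels, with each data field tagged by the lowest tier allowed to see it. This is only an illustration of the idea: the tier names, the field names and the field-to-tier assignments below are all my own assumptions, since what will actually be released at each level is not yet known.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Hypothetical access tiers; a lower number means broader access."""
    LEAGUE = 0             # master database
    TEAM = 1               # in-game analysis and adjustments
    BROADCAST_PARTNER = 2  # media and gambling partners
    LICENSED_VENDOR = 3    # official partners and paid data vendors
    RESEARCH = 4           # academia and contests, under NDA
    PUBLIC = 5             # open stats site


# Assumed mapping: the lowest-access tier that may still see each field.
FIELD_VISIBLE_DOWN_TO = {
    "heart_rate": Tier.TEAM,              # health data stays near the top
    "puck_xy": Tier.BROADCAST_PARTNER,
    "skating_speed": Tier.LICENSED_VENDOR,
    "shot_distance": Tier.RESEARCH,
    "shots": Tier.PUBLIC,
}


def prune(record, tier):
    """Return only the fields of `record` that `tier` is allowed to see.

    Fields missing from the mapping default to league-only access.
    """
    return {
        field: value
        for field, value in record.items()
        if tier <= FIELD_VISIBLE_DOWN_TO.get(field, Tier.LEAGUE)
    }


record = {
    "heart_rate": 171,
    "puck_xy": (88.5, 40.2),
    "skating_speed": 21.4,
    "shot_distance": 32.0,
    "shots": 3,
}
public_view = prune(record, Tier.PUBLIC)
partner_view = prune(record, Tier.BROADCAST_PARTNER)
```

Under these assumptions the public view keeps only the shot count, while a broadcast partner sees everything except the health-related field. Defaulting unknown fields to league-only is a deliberate choice in the sketch: new data types stay private until someone decides otherwise.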
What data types will be made available at each tier is still a huge unknown. It will likely start fairly stripped down and build over time, once legal concerns and data rights issues are finalized.
Effects of the New (Data) World
So the big question is, what effects will the aforementioned new data access structure have on the analytics community? It’s pretty clear the analysis community is about to get split into several pieces.
It’s likely we will see primary media outlets (television partners) with elevated/top-level access rights to the largest data sets. We will see them begin to hire a pool of hockey analytics people (if they haven’t already). The primary media outlets will likely be on the leading edge of next-gen analytics at the start. If they are smart, they already have a business plan in place to monetize that elevated access.
Gambling partners will also see similarly elevated access. Public (armchair) analysts will have to recognize that better models are being developed in the upper tiers (teams, partners, media), fueled by higher-resolution data sets than have been available in the past.
One final thought: How will the tiered access affect operations and management for teams? For example, contracts and contract negotiations? Teams will likely have access to a lot more data and information that is not readily accessible to others (player agents). If a trade is going to occur, will teams trade player data sets that are not openly available to other teams? Will players be comfortable with that? This will probably be addressed in the next Collective Bargaining Agreement (CBA). It’s likely players will eventually have access to almost all data generated.
That’s it for now. We will have much more on the new data systems and the emerging third-party industries (such as gambling) in the coming weeks and months.
By Jon Sorensen