We have previously defined notable terms in our Artificial Intelligence (AI) Dictionary, considered the importance of data quantity and quality as inputs along with the injection of contextual factors, and, most recently, discussed analytics transparency with Zone7.
In this next post on AI in sports science, we turn our attention to the system outputs. Specifically, we are going to look at the delivery portion of the data pipeline. However, before discussing how practitioners can integrate the system into their daily workflow, it is worth becoming more familiar with the information the user receives.
When it comes to system output design, there are a multitude of decisions that need to be made with great care and intention by the provider. Thanks to Zone7 (specifically Tal, Eyal, Rich, and Ben) for sharing insight into their system outputs and how they make these design decisions.
Why are the outputs important?
AI outputs have the potential to bamboozle human operators. IBM’s AI-driven Deep Blue software famously beat world chess champion Garry Kasparov in 1997. One move was particularly unusual and threw Kasparov off his game; he could not comprehend why the system would output this move. In reality, according to one of its designers, it had chosen a random move!
“One of Deep Blue’s designers has said that when a glitch prevented the computer from selecting one of the moves it had analysed, it instead made a random move that Kasparov misinterpreted as a deeper strategy… The world champion was supposedly so shaken by what he saw as the machine’s superior intelligence that he was unable to recover his composure and played too cautiously from then on… He over-thought some of the machine’s moves and became unnecessarily anxious about its abilities, making errors that ultimately led to his defeat.”
Mark Robert Anderson, THE CONVERSATION
Of course, in this example the AI and the operator were competitors. In the context of injury risk analysis and load management design, the operator and “system” are teammates. Yet, this story illustrates the potential negative, even unsettling impact outputs can have when they are not understood.
As such, any team or company creating AI systems has a great responsibility when designing these outputs. Model results need to be transformed into outputs that provide the right level of detail, offer an appropriate rationale, and are actionable. The choices behind this transformation are rarely easy. Hence, it is hugely important for providers to engage with experts in the applied environment and use their domain knowledge when developing output design.
As is often the case in data science, this requires a delicate balance: between usability and information overload, and between useful simplification and oversimplification.
Injury risk as the output
Injury risk forecasting has arguably become one of the most prevalent uses of AI in sports science. It is worth noting the emphasis on injury risk, rather than injury per se. The system is not predicting injury occurrences in terms of their exact timing, severity or mechanism, but forecasting risk for a predefined time frame.
We are required to think probabilistically, much like we do with the weather forecast!
When I canvassed industry questions on AI, many practitioners wanted to know why the focus was on injury risk, rather than other variables, most notably, performance. There was concern that we may be limiting resilience or even performance itself, if our decision making is overly focussed on injury risk. This is an important discussion, given the negative connotation around sports science promoting a risk-averse approach to performance.
I had this same concern, but as I’ve become more familiar with AI in sports science, I understand the reasoning. Data science solutions benefit from an objective outcome. Injuries are (mostly) binary events that are documented objectively and consistently. Yes, in actuality, they exist on a continuum and there are many grey areas, but we do strive to objectively define and quantify them (Fuller et al., 2016).
As Eyal Eliakim, Co-founder and CTO at Zone7, told me:
“One of the key things in building strong performing machine learning algorithms is clearly defining the event or trend we want the algorithms to learn. That is one of the reasons we feel injury risk forecasting is a very powerful use case - an injury is a very well-defined event.”
Meanwhile performance, particularly in the team sport setting, is complex. It is not purely about who runs the furthest or fastest. Therefore, performance is not easily defined universally, and game outcome may not even reflect performance!
Performance can differ based on positions, players, and opponents to name only a few contextual factors, making it difficult to model across a large set of leagues, teams, games, and players.
And yet, perhaps it is not best to think of performance and injury risk as opposite ends of a spectrum. Availability is correlated with performance and competition outcomes across many sports. It simply makes sense that having your players available for more games gives your team a better chance of winning.
Yes, some could take the approach too far and wrap players in cotton wool. But generally practitioners seem to be united in the need to maximise availability while promoting physical capacities and load exposure in the quest for high performance.
I’m interested in tools such as Zone7’s micro-cycle simulator (see below), for instance, that enable future workload projection up to seven days ahead of time. Injury risk may provide the foundation for this system, but it can be applied and interpreted with maximising load and performance in mind. Ultimately, this comes down to how the human operator acts on the information provided.
Let’s dive deeper into the specifics of the outputs in relation to Zone7. What do they look like and why? And, what does that mean in practical terms?
So, handing over to Zone7 now: what is your approach to output design?
The most important thing Zone7 has to consider when presenting its AI-derived outputs is determining and fully appreciating the needs of end users within the context of their specific environment. This is done to ensure the end user can apply the insights as they deem appropriate, thereby making our outputs more impactful. By translating the numbers, percentages and probabilities produced by mathematical models, Zone7 provides the human end user with information that would otherwise be unobtainable with such efficiency. Translation of our model results into our outputs is designed to make complex deep learning processes interpretable, digestible and ultimately actionable. Initially, this centres on converting numbers into more meaningful states, such as bucketing athletes into broad risk categories like Low, Medium or High Risk.
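To make the bucketing idea concrete, here is a minimal sketch of how a continuous risk score could be mapped onto the three categories. It is an illustration only, not Zone7's implementation: the 0 to 1 score scale, the cut-off values in `THRESHOLDS`, the function name `risk_category` and the player scores are all assumptions made for the example.

```python
from bisect import bisect_right

# Hypothetical cut-offs on an assumed 0-1 risk score.
# Real thresholds would be calibrated for each environment.
THRESHOLDS = (0.10, 0.25)
LABELS = ("Low", "Medium", "High")


def risk_category(score: float) -> str:
    """Map a continuous risk score onto a broad risk category."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    # bisect_right counts how many thresholds the score has reached,
    # which is also the index of the matching label.
    return LABELS[bisect_right(THRESHOLDS, score)]


# Made-up scores for three players.
squad = {"Player A": 0.04, "Player B": 0.18, "Player C": 0.31}
categories = {name: risk_category(score) for name, score in squad.items()}
print(categories)
```

Note how much detail the labels hide: under these assumed cut-offs, scores of 0.11 and 0.24 would both read "Medium", which is exactly the kind of context loss that careful threshold calibration has to manage.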
Further outputs are added to increase usability and trust for the practitioner applying Zone7’s AI system:
Risk Factors
Specific parameters are highlighted as the greatest contributing risk factors to the injury risk forecast being presented. This important output allows practitioners to understand “Why” an athlete is identified as being at risk (e.g., is risk driven by overload patterns in X or underloading patterns in Y).
Risk Management
Other usable outputs provided to Zone7 clients are suggestions on potential workload modifications using historical reference training sessions, which if applied can help to mitigate the identified injury risk.
This information is presented as optimal value ranges for the parameters contributing to the risk alert, such as X amount of absolute sprinting distance or Y amount of high intensity decelerations.
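As a rough illustration of how a practitioner might use such value ranges, the sketch below compares a planned session against suggested ranges for two parameters. Everything here is hypothetical: the `TargetRange` class, the parameter names and every number are invented for the example and do not come from Zone7.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetRange:
    """A suggested value range for one workload parameter."""
    low: float
    high: float

    def assess(self, planned: float) -> str:
        """Say where a planned value sits relative to the range."""
        if planned < self.low:
            return "below range"
        if planned > self.high:
            return "above range"
        return "within range"


# Hypothetical suggested ranges for the parameters driving a risk alert.
suggested = {
    "sprint_distance_m": TargetRange(150.0, 300.0),
    "high_intensity_decels": TargetRange(20.0, 35.0),
}

# Hypothetical planned session for the flagged player.
planned_session = {"sprint_distance_m": 420.0, "high_intensity_decels": 28.0}

report = {
    parameter: suggested[parameter].assess(value)
    for parameter, value in planned_session.items()
}
print(report)
```

In this made-up case the planned sprint distance sits above the suggested range while the decelerations are within it, so only one element of the session would need revisiting.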
Risk Simulation
As mentioned earlier, we also provide injury risk forecast simulations based on future projected workloads and parameters, as predetermined by practitioners. Whilst this is primarily intended to function with injury risk in mind, outputs from this tool could be utilised for optimising performance potential.
Discretising continuous data for risk forecasting purposes is, of course, not without limitations, such as the danger of losing context when calibration thresholds are broad. To determine suitable thresholds, the calibration process is based on each specific environment, is conducted collaboratively with practitioners, is kept transparent to foster trust, and should be flexible enough to allow different ‘configurations’ to be implemented effectively.
Ben Mackenzie, a Data Research Analyst at Zone7, recently discussed with Leaders in Performance why giving each team the power to calibrate the outputs according to its philosophy is important:
“For each client, Zone7 can set individual thresholds for risk alerts which can be increased accordingly as individual management in a team setting is important. Some environments desire a greater focus on physical development, while other departments want maximum availability for players and will therefore put less emphasis on that physical development to ensure availability. That is something Zone7 is able to cater for through calibrating attitude to risk – much like when making a financial investment.”
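The effect of calibrating attitude to risk can be sketched in a few lines. The example below applies two hypothetical alert thresholds to the same set of risk scores; the configuration names, threshold values, scores and the `alerts` helper are all assumptions for illustration rather than a description of how Zone7 configures its system.

```python
def alerts(scores: dict[str, float], threshold: float) -> list[str]:
    """Return the players whose risk score meets or exceeds the threshold."""
    return sorted(name for name, score in scores.items() if score >= threshold)


# The same made-up risk scores are used for both configurations.
scores = {"Player A": 0.12, "Player B": 0.22, "Player C": 0.35}

# Hypothetical configurations: a higher threshold tolerates more risk
# in exchange for more loading; a lower one flags players earlier.
configs = {
    "development_focused": 0.30,
    "availability_focused": 0.15,
}

flagged = {label: alerts(scores, threshold) for label, threshold in configs.items()}
print(flagged)
```

With these invented numbers, the development-focused configuration flags only Player C, while the availability-focused one also flags Player B. The underlying forecast is identical; only the team's chosen tolerance changes what is surfaced as an alert.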
What Next?
As with any dashboard, the outputs should be chosen in a very deliberate manner. In sports science, we strive to reduce data abundance into simple, interpretable numbers; yet, reduce too much and essential detail may be lost.
When it comes to AI, this responsibility is arguably greater, as the results and their presentation guide the human operator who is consuming them. It is important that practitioners challenge providers on the ‘hows’ and ‘whys’ behind their output design.
Now that we have considered the outputs, we will move on to what we practitioners can do with the information. This is what this series has been building up to: the application of the data within real-life settings. So, keep a lookout for that next post!
This article is supported by Zone7. For more information about their technology, visit their website.