One of the greatest challenges in managing training load is finding a suitable balance in the quantity of metrics to analyse, interpret, and present. I’ve previously described this on Sportsmith as the Goldilocks Strategy, as we strive to include neither too many nor too few metrics in our analysis, but a quantity that is “just right”. It is widely acknowledged that a multivariate approach is necessary given the complexity of the training process (Weaving et al., 2017). However, reducing training load to too few metrics can omit valuable information.
Conversely, it is important that we do not overwhelm key stakeholders – or ourselves – when managing such data. The purpose of monitoring training load is to transform data into objective information that can inform decision making in support of the optimal management of athletes. If we are swimming in metrics and calculations, we will not be able to see the wood for the trees. We must strive to be as simple, yet as informative, as possible.
Here we focus on external load measures: those that “measure the work completed” (Halson, 2014), most commonly captured via tracking technologies such as Global or Local Positioning Systems, Optical Tracking, and Inertial Measurement Units. This attention is due to the wide array of metrics such technology typically provides. However, I acknowledge the importance of integrating such measures within a holistic, multivariate system that also includes measures of internal load and response. For now, let's dive into three steps to reduce training load overwhelm.
Step 1: Think in levels
When considering the myriad tracking metrics provided by a system from a zoomed-out perspective, we can start by grouping them into a simple classification system utilised by Andrew Gray and described by Martin Buchheit and Ben Simpson. The levels are as follows:
Level 1: distances covered in different velocity zones
Level 2: changes in velocity, acceleration, and direction
Level 3: events derived from inertial sensors
Some metrics present a hybrid of different levels, such as Metabolic Power, which combines data from levels 1 and 2, or PlayerLoad per metre, which combines level 1 and level 3 metrics. Regardless, this simple framework provides a first step to reducing overwhelm by classifying external load measures.
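As a minimal sketch, this classification could be encoded as a simple lookup so that reports and exports can be filtered by level. The metric names and level assignments below are illustrative only, not taken from any specific tracking system:

```python
# Illustrative mapping of external load metrics to the three levels.
# Metric names are hypothetical examples, not tied to any vendor's export.
METRIC_LEVELS = {
    "total_distance": 1,        # Level 1: distance in velocity zones
    "high_speed_distance": 1,
    "sprint_distance": 1,
    "accelerations": 2,         # Level 2: changes in velocity/direction
    "decelerations": 2,
    "changes_of_direction": 2,
    "player_load": 3,           # Level 3: inertial-sensor-derived events
    "impacts": 3,
}

def metrics_at_level(level):
    """Return the metrics classified at a given level, alphabetically."""
    return sorted(m for m, lvl in METRIC_LEVELS.items() if lvl == level)
```

A report template could then pull only the level relevant to its purpose (e.g. level 1 metrics for season-long activity profiling).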
In the NSCA's Essentials of Sport Science textbook, Andrew Murray and I presented these levels with a brief explanation of how each can be used to inform different parts of a monitoring system, as follows:
Level 1: to quantify activity profile and track changes across sessions and the season
Level 2: to inform on game demands and subsequently assist with drill design
Level 3: to track fatigue in individuals
While this is a simplified approach, it highlights how different types of metrics can be stratified and used to serve different purposes within an athlete monitoring system.
Step 2: Focus on those most valuable to others
Presented data should be limited to the metrics of most importance: those that answer the questions coaches and athletes have actually asked and that can have an impact on the programme (Buchheit, 2017). Yet gaps exist between the importance and impact practitioners perceive GPS data to have, and the importance and application of that data within a training programme as perceived by coaches (Nosek et al., 2020).
Perhaps we need to pull our noses out of our datasets and start with, or revisit, the coaches’ questions. Coaches have identified measures of “work rate/intensity” and “high-intensity actions” as most valuable to them (Nosek et al., 2020). This should be our starting point for coach feedback. Much like the “minimal effective dose” in the weight room, perhaps we can endeavour to present the “minimal effective metrics” on training load reports.
These findings come from a survey of English football (soccer) coaches, so they may differ across sports and environments. The same terms can also carry different meanings within a single setting: the “high-intensity actions” of an offensive lineman differ from those of a defensive back in American football, just as those of a goalkeeper differ from those of a striker. Thus, we must seek to translate research and practice in the way most suited to each specific context.
In addition, the metrics deemed most valuable may diverge between practitioners. A large difference was found between performance staff and coaches in the relative value placed on squad or position average workload (54% vs 21%; proportion ratio 0.4). Similarly, moderate differences were found in the perceived importance of analysing individual drills, individual player workload, and fatigue responses, with performance practitioners seeing more value in each application. This does not question the absolute value of such applications, but reflects the different roles and needs of different staff members. Therefore, when filtering training load metrics, we must stratify by practitioner and appreciate the value to each role.
Step 3: Reduce datasets with statistical approaches
Anyone working with tracking technology will recognise similarities across many metrics. If an athlete covers relatively greater distance above a “high-speed” threshold, it is likely they will also cover a relatively greater distance above a “sprint” threshold. The same can be said across multiple dimensions of load: distances, speeds, accelerations, decelerations, and accelerometry-derived vector magnitudes. In fact, velocity (m/s) and acceleration (m/s/s) are the first and second time-derivatives of distance and are therefore highly correlated, yet they are commonly reported as separate variables (Weaving et al., 2019).
This represents multicollinearity: when an independent variable is highly correlated with one or more of the other independent variables (Allen et al., 1997). If we understand the multicollinearity within our training load data, we can avoid reporting measures that capture the same phenomenon, such as PlayerLoad over 7 days and Total Distance over 7 days, as reported by Weaving and colleagues (2019).
Figure 1 is an example correlation matrix produced from a load dataset using the corrplot library in R. Shades of red represent a positive relationship, shades of blue a negative relationship, and the darker the shade, the stronger the correlation. Given the particularly strong correlations between variables D, E, F, and G, for example, we may not need to present all of these metrics.
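The matrix above was produced with R's corrplot; the same idea can be sketched in Python with NumPy by computing the correlation matrix and flagging strongly correlated pairs. The data below are synthetic (sprint distance is deliberately generated as a noisy fraction of high-speed distance) and the 0.8 redundancy threshold is an arbitrary choice for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic session loads: sprint distance is generated as a noisy
# fraction of high-speed distance, so those two should correlate strongly,
# while total distance is independent of both.
n_sessions = 60
total_distance = rng.normal(6000, 800, n_sessions)
high_speed_distance = rng.normal(450, 90, n_sessions)
sprint_distance = 0.3 * high_speed_distance + rng.normal(0, 10, n_sessions)

data = np.vstack([total_distance, high_speed_distance, sprint_distance])
corr = np.corrcoef(data)  # 3x3 matrix; rows are treated as variables

# Flag metric pairs whose correlation exceeds a (hypothetical) threshold
names = ["total_distance", "high_speed_distance", "sprint_distance"]
redundant = [
    (names[i], names[j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(corr[i, j]) > 0.8
]
print(redundant)  # the high-speed/sprint pair is flagged as redundant
```

Pairs flagged in this way are candidates for dropping one member from a report, exactly as suggested for variables D to G in Figure 1.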
Furthermore, multicollinearity can be removed from a dataset, while retaining most of its variance, by conducting Principal Component Analysis (PCA). For instance, a training load dataset from rugby union was reduced to three principal components: ‘cumulative load’, ‘changes in load’, and ‘acute load’ (Williams et al., 2017). This approach provides an objective means to reduce a dataset to the most parsimonious set of metrics without losing meaningful variation or distinctions within the data.
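As a sketch of the mechanics, PCA can be run via singular value decomposition of the centred data matrix; in practice a library such as scikit-learn's PCA would do this for you. The load matrix and the 90% variance cut-off below are synthetic and illustrative, not from any published dataset:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic load matrix: 100 sessions x 4 correlated metrics, built from
# two underlying load dimensions (illustrative, not real tracking data).
latent = rng.normal(size=(100, 2))
mixing = np.array([[1.0, 0.2],
                   [0.9, 0.3],
                   [0.1, 1.0],
                   [0.2, 0.8]])
X = latent @ mixing.T + rng.normal(scale=0.1, size=(100, 4))

# PCA via singular value decomposition of the centred data
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = (S ** 2) / (S ** 2).sum()   # variance explained per component

# Keep the fewest components explaining at least 90% of the variance
k = int(np.searchsorted(np.cumsum(explained), 0.90)) + 1
scores = Xc @ Vt[:k].T   # reduced dataset: one column per retained component
print(k, scores.shape)
```

Because the four metrics were built from two underlying dimensions, the reduction recovers two components: the multicollinearity is removed while the meaningful variation is kept, mirroring the three-component reduction reported by Williams et al. (2017).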
By trying to avoid a reductionist approach to training load, practitioners may be left feeling overwhelmed by the array of metrics to analyse and present. However, we can simplify our datasets by grouping metrics, focussing on the most valuable, and applying data reduction techniques. While we must respect the complexity of the training process, working with the most parsimonious dataset allows us to focus on transforming data into meaning and action.