Why Every Practitioner Needs to Know Goodhart’s Law
Goodhart’s Law, named after British economist Charles Goodhart, was originally expressed as “Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes”. However, you’re probably more familiar with the reworking of the phrase by Marilyn Strathern:
“When a measure becomes a target, it ceases to be a good measure”.
As sport is overflowing with both measures and targets, this is an important decree for us to explore. How can the evolution of a measure into a target limit its value? What implication does this have for practitioners in the applied setting?
Finding Ways to Meet (or Beat) Targets
A measure becomes a target when a positive or negative connotation is attached to the outcome. This may be taken one step further with the addition of a reward or even a punishment. When a “carrot” or “stick” is introduced, people (athletes included) may look for strategies to maximise their output in relation to the target.
For instance, body weight may be regularly assessed in a sports setting. The initial justification may be as a simple and non-invasive check on body composition, potentially as a proxy for nutrition and lifestyle factors. A ranking system might be used to assess the measures, and sometimes punishments (e.g. fines) are involved. The threat of a punishment may urge some athletes to find ways to cheat the system, such as drinking large amounts of water to increase their weight or using the sauna to cut weight.
In such cases, not only is the original purpose for the assessment undermined, but action has been undertaken that can be detrimental to performance. Using a sauna to sweat out water weight prior to a weigh-in can leave an athlete dehydrated for the subsequent training session.
In other cases, a change in behaviour may be less deliberate. For instance, when an athlete is striving for a velocity-based training (VBT) target, they may adjust their technique in search of the number that will light up green on the screen, and may lose form in the process. Whether conscious or not, both of these examples demonstrate a measure being manipulated in a manner counterproductive to performance, all in pursuit of a target.
Imperfect Correlates with Performance
As we continue to discuss on this blog, team sport performance and injury risk are complex entities. No single measure can represent or predict them. Thus, we frequently adopt measures that are proxies for the ultimate goal. As there is no single measure of “performance” or even “football strength”, for example, surrogates are employed that are believed to be related to the ultimate outcome. However, any such proxy is imperfectly correlated with the goal.
“[N]early every measurement you can think of is an imperfect reflection of the true thing you want to measure. If that metric becomes a target, then you are likely to drift from your true goals.” - The Four Flavors of Goodhart’s Law
It is this potential to “drift from your true goals” that is the greatest danger for practitioners. We have previously discussed the hazards of naïve interventionism and why practitioners should consider potential downsides to any intervention they employ. According to Goodhart’s Law, turning measures into targets provides another example of potential naïve or negative interventionism.
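To make the idea of a proxy drifting from the goal more concrete, here is a minimal sketch in Python. It is a toy simulation, not an analysis of real athlete data, and every number in it is an assumption chosen for illustration: each hypothetical athlete has a hidden “performance” value, the test score is that value plus measurement noise, and once a target is set, anyone scoring below it finds a way to report exactly the target without their performance changing. The function names (`pearson`, `simulate`) are invented for this example.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def simulate(n_athletes=500, noise_sd=0.5, target=0.0, seed=42):
    """Compare a proxy's link to performance before and after it becomes a target.

    All values are illustrative assumptions, not real data.
    """
    rng = random.Random(seed)

    # Hidden "true" performance, which we can never observe directly.
    performance = [rng.gauss(0, 1) for _ in range(n_athletes)]

    # The proxy as a plain measure: performance plus measurement noise.
    as_measure = [p + rng.gauss(0, noise_sd) for p in performance]

    # The proxy as a target: athletes below the target game the test
    # until they just reach it, with no change in true performance.
    as_target = [max(score, target) for score in as_measure]

    return pearson(performance, as_measure), pearson(performance, as_target)


r_measure, r_target = simulate()
print(f"Correlation with performance as a measure: {r_measure:.2f}")
print(f"Correlation with performance as a target:  {r_target:.2f}")
```

In this toy setup the second correlation comes out lower than the first, because every score sitting exactly on the target no longer tells us anything about the athlete behind it. Real behaviour is messier than a simple floor on the scores, so treat this as an illustration of the mechanism rather than an estimate of its size.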
As well as body weight and VBT targets, there are many other examples from sports performance that could fall foul of Goodhart’s Law. In fact, a Twitter discussion by Des Ryan and Peter Vint provided one illustration.
I'm sure there is value in the above test measure in tracking development and guiding individualised athletic preparation programmes. If, however, this measure was transformed into a target, we may change the behaviour of the athlete.
We also risk chasing numbers that are, at best, an imperfect correlate not only of performance in team sports, but also of physical performance. The on-field success of the athlete above, despite their assessment scores, is a reminder of the complex and interdisciplinary nature of performance. (On the other hand, one occasion when athletes should be optimising for physical assessment targets rather than sporting performance is in the preparation for Scouting Combine events.)
"Key Performance Indicators"
Let’s contemplate a specific measure from physical preparation. Eccentric hamstring strength appears important for injury risk reduction, as well as for speed, which in many sports and positions can be a notable factor for physical performance.
It is common practice to utilise a target for eccentric hamstring testing, potentially derived from research or analysis of your own internal dataset. However, improving this measure or reaching the specified target will not necessarily cause an improvement in performance. It is imperfectly correlated with performance, injury risk, and strength (even hamstring strength itself is more complicated than a single test, with one movement and contraction type).
These examples demonstrate precisely why, in my opinion, the term “key performance indicator” is used too flippantly in the applied setting. How often is it actually an indicator of performance? More commonly, it is a measure, sometimes made into a target, that a practitioner consciously selects based on their belief that it is important to performance.
While such measures may indeed be important to performance, perhaps statistically correlated, remember they are imperfectly related to our ultimate goal of performance. They neither fully represent, nor necessarily cause, the outcome.
What to do with Goodhart’s Law?
So, do we just throw all targets out of the window? As always, our approach should be more nuanced than that. Depending on how they are utilised, objective measurements can be a useful tool to support preparation in the sporting environment.
Targets can add context to measures and can also help gain athlete buy-in to the process. I’m sure we’ve all witnessed how leaderboards and targets can assist with motivation during the testing process; after all, athletes are competitive beings! We should, however, consider whether such external motivation is appropriate in each circumstance, as this can be a confounding factor in itself.
Employing non-naïve interventionism, by considering both the potential benefits and harms of every intervention, also applies to creating targets from measures. We should consider if it is an appropriate target for the sport/position/individual, and thereafter, how a target might change an athlete’s behaviour (consciously or subconsciously).
If a target is then adopted, we should not underestimate the importance of our own interpersonal skills in integrating the target. Communicating the purpose and reasoning behind the target to the athlete will help to give them an appreciation of the process.
It is also important to show them how this measure fits into the overall puzzle of athleticism, performance and injury. In a conversation I had with professional rugby player Adam Byrne for Output Sports, Adam reiterated numerous times how important it is for athletes to be told the purpose behind any test or measure that is being collected from them.
Thereafter, it is important that practitioners observe how a measure is achieved. That may be through immediate feedback, such as observing and coaching during VBT, or more generally over time, such as by having a relationship with an athlete and understanding their ongoing tendencies, habits and off-field situation.
Goodhart’s Law reminds us of the peril of naïve interventionism. Assigning importance to a measure by creating an objective target has the potential to change behaviour. We are in danger of optimising for a metric that may not influence our ultimate goal (performance or injury) or, worse still, could potentially harm the goal. We should respect the complexity of the constructs we are dealing with (performance, injury, strength, wellbeing) and be cautious of reducing them to targets.
That said, I absolutely would still employ objective targets with physical assessments and VBT. Yet again we are served well by striving for non-naïve interventionism. We must consider the influence a target (plus any associated reward or punishment) can have on the data collection.
Plus, we need to utilise our communication and relationship skills to convey the context and relevance of this target to the athlete. While we appreciate that this is just one measure within the (much) wider, (more) complex performance puzzle, it is important our athletes understand that too.