Learning from Science Gone Astray
Reflecting on our own biases is the first step towards trying to address them. I am biased in favour of science and more often than not, I will trust the scientific dialogue. But should I? Are there times when I should not trust science? How can I identify those times? Can understanding this bias help me communicate with those who are less inclined to trust science? I decided to explore this debate through the book, “Why Trust Science?” by Professor Naomi Oreskes, an American Historian of Science.
When Science Goes Awry
In the “Science Awry” chapter, Oreskes presents historical examples of when scientists went astray. From these, we can learn when we are justified in not trusting science. In turn, we can treat these examples as lessons in what not to do, both in our own work and when consuming and critiquing the work of others. In Prof Oreskes’s words, these examples can help us to “recognize cases where it may be appropriate to be sceptical, to reserve judgment, or to ask with good reason for more research”. As scientists, we have an even greater responsibility to do this.
In brief, the historical examples discussed were:
The Limited Energy theory. Edward H. Clarke’s theory that higher education destroys the reproductive function of women. Thankfully, this theory dates back to the 1800s.
The rejection of continental drift. American earth scientists in the 1920s and 30s dismissed the evidence that the continents are not fixed.
Eugenics. The social movement that intended to improve the quality of the human species through selective breeding of desirable traits and sterilization of those deemed less desirable.
The historic dismissal of the link between hormonal birth control (i.e. the contraceptive pill) and depression.
The frequent claim in mainstream media that dental floss does not make a difference.
By reviewing these cases, Oreskes identifies five themes that are required to produce reliable knowledge, each of which was neglected in one or more of these examples. They are: consensus, method, evidence, values, and humility. I will discuss each below through the lens of sports performance.
Consensus
According to Oreskes, “scientific facts” are claims about which scientists have come to agreement. For each historical example discussed, notable empirically-informed dissent existed within the scientific community. In the case of eugenics, for example, prominent geneticists and social scientists objected to eugenic claims with evidence of the genetic and environmental influence on various traits, behaviours, and accomplishments.
As such, we should seek to establish whether there is consensus on a topic. Similarly, if consensus seems to exist, we should seek to determine whether the (dis)agreement is evidence-based. A recent example in the sports science literature is the debate on the acute:chronic workload ratio (ACWR) and, more specifically, its ability to predict injury. I discussed this debate in 2016 and 2018 (might be due another post!). This calculation (specifically ACWR, as opposed to training stress balance and other methods that already existed) has taken hold rapidly in the applied setting since the first paper around 2014.
Given such rapid uptake across the industry, you could be forgiven for perceiving this as consensus. However, the research process takes time, and this time course should be considered. Technology companies can act relatively quickly to add metrics or calculations to their software. They do so in response to customer demand, which can gain momentum from early studies, conversations, presentations/conferences and social media discussions. Further research exploring a concept takes time to conduct and publish, especially if it involves wider data collection. Hence the time lag between the early ACWR studies and the more recent reviews outlining the methodological limitations and concerns (such as Impellizzeri et al., 2019; Impellizzeri et al., 2020; Wang et al., 2020). Such reviews, in addition to early questioning of the ratio (such as Menaspa, 2017; Williams et al., 2017), highlight a lack of consensus regarding the ability of this approach to predict injury.
So when considering a possible “scientific fact”, it is important to consider these timeframes and not necessarily take social media agreement or technology inclusion as the scientific consensus we seek. We can benefit from intentionally seeking disagreement and studying the evidence put forward by those with divergent perspectives.
Method
Some of the issues in the historical examples stemmed from scientists favouring a particular method or discounting evidence obtained by other methods. Oreskes describes this as methodological fetishism. Evidence comes in many forms, and while the randomised double-blind trial may be considered the gold standard, that does not discount the potential insight from other forms. In fact, the hierarchy of evidence, frequently presented as a pyramid, is still being debated.
In the case of the pill and depression, evidence gathered via self-reporting was described as “iffy methods”. This jumped out at me. In sport, we utilise self-report as one method of gathering information on our athletes: ratings of perceived exertion, wellness questionnaires, and more informal conversations all take a subjective approach. Doubts are often cast on the honesty and reliability of these data. However, research by Anna Saw and colleagues has demonstrated such measures to be consistent with their objective counterparts and more sensitive than them.
Whilst we must acknowledge the complexity behind subjective information, especially when dealing with human physiology (and sociology), we should not dismiss it outright as “iffy”. As long as we understand the merits and limitations of the respective methods, a variety of information can serve as evidence.
Evidence
Obviously, scientific theory should be based on evidence. However, this is not always the case, as the historical examples demonstrate. A particular offender was the Limited Energy theory, which Clarke built on the basis of seven patients while providing no evidence that their reproductive systems were weakened. He cherry-picked his subjects and their symptoms in support of his theory. While such cherry-picking of evidence to support a theory (a.k.a. confirmation bias) can be inherent in humans, it is integral to the scientific process to set aside such bias and assess all the evidence that tests a hypothesis.
One of the greatest threats to this theme is the pandemic of pseudoscience throughout today’s society. Pseudoscience is built upon such confirmation bias. This makes the critical appraisal of the quantity and quality of evidence even more vital.
“By cloaking itself in the trappings of science, pseudoscience appeals to the part of us that recognizes science is a reliable way of knowing. But pseudoscience doesn’t adhere to science’s method. It’s masquerading.” - thinkingispower.com
Sports technology is one realm that has been identified as vulnerable to pseudoscience. Some (of course, not all) companies may employ such tactics, underpinned by social psychology, in their marketing to attract attention. Therefore, there is a need to critique their claims by seeking the evidence both for and against their theories.
The three themes of reliable knowledge thus far – consensus, method, and evidence – clearly overlap with each other. There is a responsibility to assess the level of evidence available, the quality of methods used, and whether this has established a consensus in the scientific community. The remaining two themes take a bit of a different approach, by acknowledging the role of the scientist behind the science and considering their values and humility.
Values
This book includes an interesting discussion around the place of values in science. In brief, scientists have traditionally wanted to portray the scientific method as a purely objective process, existing in a vacuum separate from values. However, science is carried out by scientists, so individual biases are involved, whether consciously or not. Furthermore, science as a social process, especially work with moral, ethical, social, political, or economic conclusions, involves judgement that can be influenced by values and bias.
Oreskes argues for increasing diversity in science, as a “homogenous community will be hard-pressed to realize which of its assumptions are warranted by evidence and which are not.” Without diversity, we may witness the “asymmetry of application”. The most obvious of the case studies in this respect was the Limited Energy theory, which, among many other fallacies, was asymmetrical. Apparently, higher education did nothing to limit a man’s reproductive capabilities; the theory applied only to women.
In this case, I reflect upon the strides we have made, but also the vast scope to further study the differences in performance science across different groups. Female athletes have been studied far less than males. In a recent systematic review of load monitoring research in professional soccer players, studies of female athletes represented only 5% of the samples. Meanwhile, I read with interest the renewed focus on understanding the effect of the reproductive cycle on athletic performance and health (such as research by Georgie Bruinvels and application by Dawn Scott, among many others).
Similarly, we now more widely acknowledge that youth athletes are not just small adults. This has been discussed in recent times in relation to physical development, talent scouting, injury, screening, nutrition, and sports drinks. We have a long way to go to address the asymmetry of application in sports research, but are making progress.
Humility
With reference to the historical examples, Oreskes states “their failings are a reminder that anyone engaged in scientific work should strive to develop a healthy sense of self-skepticism”. Science is about doubt and uncertainty; that is how we make progress. The scientific method actually sets out to disprove your own hypothesis. We should be just as sceptical of our own work as we are when consuming the work of others.
In today’s world of sensationalised scientific communication and social media promotion, it can be difficult to convey your humility, as well as to observe humility in others. Scientists can be judged on their publications and the impact factor of the journals, their h-index, or their Research Interest score. They may also be judged by their numbers of Instagram followers, Twitter likes, or LinkedIn connections.
With all this pressure and judgement based on 280 characters or a single picture, it can be difficult to display humility by admitting mistakes, let alone sharing such an admission. Yet Billy Hulin recently shared publicly his mistakes and regrets regarding workload and the ACWR (you may remember the lack of consensus on the topic discussed earlier!) in an editorial now available in Sport Performance and Science Reports.
In Think Again, Adam Grant describes “confident humility” as the sweet spot: having faith in our capability while appreciating that we might not have the right tools and solutions at that time. By emphasising this in our work, we can combat cognitive entrenchment, maintain doubt and uncertainty, and stay true to the scientific method of producing reliable, trustworthy knowledge.
In discussing case studies where science “went awry”, we can identify and consider the mistakes relevant to us as both producers and consumers of science. The five themes that can help produce reliable knowledge are as relevant to performance as other realms of science:
Consensus – seek evidence of both agreement and disagreement on a topic
Method – be aware of the types of evidence but try to avoid methodological fetishism
Evidence – understand the quantity and quality of evidence available
Values – acknowledge the scientific process involves humans and can be influenced by values and bias; seek diversity to avoid asymmetry of application
Humility – be humble and sceptical of your own work
Global Performance Insights acts as an External Teammate to teams and individuals, auditing your processes and helping you steer clear of pseudoscience. To make sure your scientific support has not gone awry, get in touch for a conversation.