Breaking Down Force Plate Analysis with PCA and K-means Clustering
In this guest post, Sports Scientist Ashmeet Anand shares how he combines coaching intuition with advanced statistical techniques to transform athletic assessment of force plate data.

We've all heard about the legendary "coach's eye" - that intuitive ability to assess athletes that comes from years of experience watching movement, performance, and potential. But here's the challenge that keeps me up at night:
If you were to group athletes using force plate data, how confident are you that your clusters would match an algorithm's?
Picture this: You've been working with an athlete for months, watching their movement patterns, noting their explosive capabilities. You'd naturally group them with your 'power athletes.' But what if the algorithm places them in a different group instead?
This isn't about being right or wrong - it's about discovering the hidden performance traits that our eyes might miss. The real magic happens when human judgment and data-driven perspectives collide. Maybe that athlete you see as 'powerful' actually shares force-time characteristics with athletes who excel at technical efficiency rather than physical/power output.
Creating Common Ground in Force Plate Analysis
I wanted to create a potential solution to this challenge. So, I’ve spent two months creating a tool that ingests your VALD ForceDecks jump data, runs Principal Component Analysis (PCA - explained below), and automatically buckets your athletes into physical performance clusters. All achieved in minutes.
The goal isn’t to replace human judgment. Instead, it creates a shared space where your entire staff can gather around the same data visualisation. When S&C coaches, sport scientists, and performance staff see the same clusters, it sparks conversations:
"Why is athlete X grouped here?"
"What patterns are we missing?"
Each person brings their experience and perspective to interpret what the data means.
Understanding PCA in Sports Performance
To understand how the app works and how you can upload your data (explained towards the end of this blog), first, let's get familiarised with Principal Component Analysis (PCA). It is a data reduction technique that simplifies large, complex datasets by highlighting the key factors that explain performance variation (7). Think of it as creating a “performance fingerprint”.
Instead of comparing athletes across 30+ individual force plate metrics (e.g., jump height, RFD, peak power, etc.), PCA reduces this complexity into principal components that capture underlying performance patterns (1). Learning techniques such as these has helped take my data science skills to another level since I last posted here about my professional development in Tableau and Python.
For example, PCA might reveal that athletes who excel in concentric power also tend to have specific eccentric characteristics, grouping these correlated traits into principal components. PCA has been used to develop performance indicators in professional rugby league (2) and to distinguish between elite and sub-elite kickboxing athletes in kicking performance (3).
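To make the idea concrete, here is a minimal sketch of PCA on force-plate-style data using scikit-learn. It is not the app's actual code: the data is synthetic, and the two hidden "traits" that drive the six metrics are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a force plate export: 20 athletes x 6 metrics.
# Two hidden traits drive the metrics, so the columns are correlated.
rng = np.random.default_rng(0)
traits = rng.normal(size=(20, 2))      # hypothetical underlying traits
loadings = rng.normal(size=(2, 6))     # how each trait shows up in each metric
metrics = traits @ loadings + 0.1 * rng.normal(size=(20, 6))

# Standardise first so metrics in different units (N, W, cm) are comparable.
scaled = StandardScaler().fit_transform(metrics)

# Keep two components: each athlete gets a 2-D "performance fingerprint".
pca = PCA(n_components=2)
scores = pca.fit_transform(scaled)

print(scores.shape)                    # (20, 2)
print(pca.explained_variance_ratio_)   # share of variance per component
```

Because the six metrics here are built from only two traits, two components capture most of the variation. Real jump data will usually be messier than this.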

Category-Based Analysis: Speaking Your Language
Since most of the performance world communicates in the language of speed, power, and force production, I've added category-based PCA to be run using only those specific metrics. Compare your athletes as you see fit, and test your hypotheses based on the outcomes.
For example, you can run the analysis using Force Production (28 metrics), Jump Mechanics (13 metrics), and Rate of Force Development (21 metrics). The metrics under each of these umbrellas are combined to build a cluster, as shown in the picture below.
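In code, category-based PCA amounts to running the same reduction on a subset of columns. The sketch below assumes a hypothetical mapping from category to metric names. The names and the synthetic data are placeholders, not the real ForceDecks export headers or the app's category definitions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical metric names and category mapping, for illustration only.
columns = ["peak_force", "mean_force", "force_at_zero_velocity",
           "jump_height", "contraction_time", "rfd_0_100ms", "rfd_0_200ms"]
categories = {
    "Force Production": ["peak_force", "mean_force", "force_at_zero_velocity"],
    "Jump Mechanics": ["jump_height", "contraction_time"],
    "Rate of Force Development": ["rfd_0_100ms", "rfd_0_200ms"],
}

rng = np.random.default_rng(1)
data = rng.normal(size=(15, len(columns)))   # 15 synthetic athletes


def category_scores(data, columns, wanted, n_components=2):
    """Run PCA on only the columns that belong to one category."""
    idx = [columns.index(name) for name in wanted]
    subset = StandardScaler().fit_transform(data[:, idx])
    n = min(n_components, subset.shape[1])   # cannot exceed the column count
    return PCA(n_components=n).fit_transform(subset)


scores_by_category = {
    name: category_scores(data, columns, cols)
    for name, cols in categories.items()
}
for name, scores in scores_by_category.items():
    print(name, scores.shape)
```

Each category gets its own set of component scores, so the same athlete can sit in different places depending on which lens you choose.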

K-means Clustering: Finding Natural Groups
Once PCA has reduced the data, K-means clustering groups similar athletes together - like having an unbiased assistant coach who only sees the data patterns. The algorithm identifies natural clusters based on performance similarities, which might reveal unexpected groupings: your speedster might cluster with your power athlete due to similar force production patterns you hadn't noticed (6).
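As a rough sketch of this step (again on synthetic data, and not the app's actual code), K-means can be run on the PCA scores with scikit-learn. The three athlete "types" and the choice of three clusters are assumptions for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
# Three synthetic athlete "types", 8 athletes each, 10 metrics per athlete.
centres = rng.normal(scale=4.0, size=(3, 10))
metrics = np.vstack([c + rng.normal(size=(8, 10)) for c in centres])

# Reduce to two principal components, then cluster in that reduced space.
scaled = StandardScaler().fit_transform(metrics)
scores = PCA(n_components=2).fit_transform(scaled)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(scores)

for athlete, label in enumerate(labels, start=1):
    print(f"P{athlete}: cluster {label}")
```

The cluster numbers themselves are arbitrary labels. What matters is which athletes end up sharing one.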
Sports scientists have used K-means to create training groups for American football athletes (5), and a related method, K-nearest neighbours, has been used to find soccer talent (4). The beauty is that it's hypothesis-free - the algorithm doesn't know who your starters are or their positions, yet it often confirms (or challenges) coaching intuitions about athlete categories, such as Force Production and Speed/Velocity, as shown below.
Below is an example of the same athletes using:
Category-1 (Power Output)

Power Analysis: P5 and P12 cluster together but are spatially separated - they achieve similar peak power outputs through different neuromuscular strategies (one more force-dominant, one more velocity-dominant in the power equation).
Category-2 (Speed/Velocity)

Speed Analysis: P5, P7, and P12 cluster tightly together - this suggests they share similar rate coding, motor unit recruitment patterns, and movement velocity profiles regardless of their power generation differences.
Tying the two clusters together:
Physiological Insight: This reveals the force-velocity relationship in action - athletes can reach similar power zones (Force × Velocity = Power) via different paths along the curve.
Training Application: These three athletes would benefit from similar speed/reactivity training (plyometrics, reactive strength) but may need different strength training approaches based on their individual force-velocity profiles.
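The Force × Velocity = Power point can be checked with a couple of lines of arithmetic. The values below are hypothetical and are not taken from the athletes in the plots above.

```python
# Hypothetical values chosen for illustration only.
athletes = {
    "force-dominant": {"force_n": 2500.0, "velocity_m_s": 2.0},
    "velocity-dominant": {"force_n": 2000.0, "velocity_m_s": 2.5},
}

# Power (W) = Force (N) x Velocity (m/s)
powers = {
    name: values["force_n"] * values["velocity_m_s"]
    for name, values in athletes.items()
}

for name, power_w in powers.items():
    print(f"{name}: {power_w:.0f} W")
```

Both athletes land on 5000 W, one through higher force and one through higher velocity, which is why they can share a power cluster while sitting apart within it.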
The 10-Minute Workflow
Interested in trying this tool on your own dataset? You can access it here:
If you need an example dataset, you can use this:
This is a synthesized dataset, so it may not be entirely accurate!
Note: please be aware of data security and privacy considerations when uploading data to the platform.
Upload your ForceDecks CSV export - Drag and drop, that's it
Select your analysis focus - Full metrics vs. category-specific
Choose optimal cluster count - The app suggests based on your dataset
Review PCA variance explanation - See how much of your data story is captured
Explore athlete groupings - Interactive plots for team discussions
Export insights - Ready for training program adjustments
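For the cluster-count step above, one common heuristic is the silhouette score, which rewards clusters that are tight and well separated. How the app arrives at its suggestion is not described here, so treat this as an assumed approach on synthetic PCA scores rather than the app's method.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(3)
# Synthetic 2-D PCA scores with three well-separated groups of 10 athletes.
group_centres = np.array([[0.0, 0.0], [8.0, 0.0], [0.0, 8.0]])
scores = np.vstack(
    [c + rng.normal(scale=0.5, size=(10, 2)) for c in group_centres]
)

# Try several cluster counts and keep the one with the highest silhouette.
silhouettes = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(scores)
    silhouettes[k] = silhouette_score(scores, labels)

best_k = max(silhouettes, key=silhouettes.get)
print(best_k, round(silhouettes[best_k], 2))
```

With groups this cleanly separated, the score should favour three clusters. On real squads the peak is often flatter, which is where coaching judgment comes back in.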
Making It Stick in Your Workflow
Here are my suggestions for integrating this analysis longitudinally:
Monthly cluster reviews during periodization planning
New athlete assessments against existing clusters
Progress tracking - Watch athletes migrate between clusters as they develop
Real-time discussions - Pull up the app during team meetings
Performance Tracking Over Time

Get quick access to the preselected metrics or use other options with a single click to assess whether your strength programming has been paying off. The visual interface provides immediate insights into:
Trend analysis - Is your training moving athletes in the right direction?
Cluster migration - Watch athletes develop and move between performance groups
Program effectiveness - Validate your coaching interventions with objective data
Individual trajectories - Track specific athlete development patterns
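One way to sketch cluster migration is to fit the scaling, PCA, and K-means steps on a baseline session and then reuse that fitted pipeline on a later session, so both sessions share the same cluster definitions. This is an assumed approach for illustration on synthetic data, not a description of how the app tracks athletes.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
athletes = [f"P{i}" for i in range(1, 13)]

# Synthetic baseline: two profiles, 6 athletes each, 5 metrics per athlete.
profile_a = rng.normal(loc=0.0, size=(6, 5))
profile_b = rng.normal(loc=6.0, size=(6, 5))
baseline = np.vstack([profile_a, profile_b])

# Follow-up session: small changes for everyone, except P1, who now
# looks like the average athlete of the second profile.
followup = baseline + rng.normal(scale=0.1, size=baseline.shape)
followup[0] = baseline[6:].mean(axis=0)

# Fit on the baseline only, then reuse the fitted pipeline on the follow-up.
pipeline = make_pipeline(
    StandardScaler(),
    PCA(n_components=2),
    KMeans(n_clusters=2, n_init=10, random_state=0),
)
pipeline.fit(baseline)
before = pipeline.predict(baseline)
after = pipeline.predict(followup)

migrated = [name for name, b, a in zip(athletes, before, after) if b != a]
print(migrated)
```

Refitting the clusters from scratch at every session would also work, but the cluster labels and boundaries could then shift between sessions, making migrations harder to interpret.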
Technical Integration
Streamlit platform - Secure, web-based access
200 MB file limit - Approximately two years of standard force plate data
Currently compatible with VALD systems only!
Direct integration options - For teams wanting automated data ingestion
Export capabilities - Take your insights wherever you need them
Get Involved
I would love for people to reach out if they would like to:
Incorporate it into their database - Direct integration possibilities
Use the app alongside current data collection - Parallel analysis workflow
Pilot test with their teams - Early adopter opportunities
Let's test that coach's eye together and discover the stories your data has been trying to tell you.
FAQs
What is Principal Component Analysis?
Principal Component Analysis is a fundamental dimensionality reduction technique that transforms high-dimensional data into a lower-dimensional form while preserving as much variance as possible.
What is K-means clustering?
K-means clustering is an unsupervised machine learning algorithm that helps group data points into clusters based on their inherent similarity. Unlike supervised learning, where we train models using labeled data, K-means is used when we have data that is not labeled and the goal is to uncover hidden patterns or structures. For example, an online store can use K-means to segment customers into groups like "Budget Shoppers," "Frequent Buyers," and "Big Spenders" based on their purchase history.
What exactly am I looking at in the PCA plot?
Each dot is one athlete. Dots close together represent similar force plate profiles. The closer they are, the more alike their movement patterns. Athletes in the same colored cluster share similar characteristics and could potentially train together.
How many clusters should I use?
Start with 3-4 clusters for most teams. The app suggests an optimal number, but use your judgment too. If clusters seem too mixed (fast and slow athletes together), add one more cluster. Aim for groups that make practical sense for training.
What if the algorithm puts my best athlete in the 'worst' cluster?
Clusters aren't rankings - they're just different profiles. Your best athlete might succeed through technique or factors the force plate doesn't measure. Use this as a discussion point to understand why they're grouped this way.
How much data do I need before the clusters actually mean something?
A minimum of 10 athletes is needed, but 20+ is better for reliable patterns. For data points, aim for at least 3-4 testing sessions per athlete. More data provides more reliable clusters.
If an athlete moves from one cluster to another over time, is that good or bad?
It's just a change - whether it's good depends on your training goals. Moving toward your target performance profile is positive. Use it to track if your training is pushing athletes in the intended direction.
Can I use this with data from other force plate systems, or only VALD's ForceDecks?
Currently, only VALD data is supported because of data formatting. Other systems organise data differently, so uploads won't work correctly. Contact me (Ashmeet Anand) if you want to discuss expanding to your system.
References
1) Stone, J. D., Merrigan, J. J., Ramadan, J., Brown, R. S., Cheng, G. T., Hornsby, W. G., ... & Hagen, J. A. (2022). Simplifying external load data in NCAA Division-I men's basketball competitions: A principal component analysis. Frontiers in Sports and Active Living, 4, 795897.
2) Parmar, N., James, N., Hearne, G., & Jones, B. (2018). Using principal component analysis to develop performance indicators in professional rugby league. International Journal of Performance Analysis in Sport, 18(6), 938-949.
3) Vagner, M., Cleather, D. J., Kubový, P., Hojka, V., & Stastny, P. (2022). Principal component analysis can be used to discriminate between elite and sub-elite kicking performance. Motor Control, 27(2), 354-372.
4) Bazmara, M., & Jafari, S. (2013). K nearest neighbor algorithm for finding soccer talent. Journal of Basic and Applied Scientific Research, 3(4), 981-986.
5) Shelly, Z., Burch, R. F., Tian, W., Strawderman, L., Piroli, A., & Bichey, C. (2020). Using K-means clustering to create training groups for elite American football student-athletes based on game demands. International Journal of Kinesiology and Sports Science, 8(2), 47-63.
6) https://www.geeksforgeeks.org/machine-learning/k-means-clustering-introduction/