Breaking Down Force Plate Analysis with PCA and K-means Clustering

Guest
Oct 8, 2025

In this guest post, Sports Scientist Ashmeet Anand shares how he combines coaching intuition with advanced statistical techniques to transform athletic assessment of force plate data.

We've all heard about the legendary "coach's eye" - that intuitive ability to assess athletes that comes from years of experience watching movement, performance, and potential. But here's the challenge that keeps me up at night:

If you were to group athletes using force plate data, how confident are you that your clusters would match an algorithm's?

Picture this: You've been working with an athlete for months, watching their movement patterns, noting their explosive capabilities. You'd naturally group them with your 'power athletes.' But what if the algorithm places them in a different group instead?

This isn't about being right or wrong - it's about discovering the hidden performance traits that our eyes might miss. The real magic happens when human judgment and data-driven perspectives collide. Maybe that athlete you see as 'powerful' actually shares force-time characteristics with athletes who excel at technical efficiency rather than physical/power output.

Creating Common Ground in Force Plate Analysis

I wanted to create a potential solution to this challenge. So, I’ve spent two months creating a tool that ingests your VALD ForceDecks jump data, runs Principal Component Analysis (PCA - explained below), and automatically buckets your athletes into physical performance clusters. All achieved in minutes.

The goal isn’t to replace human judgment. Instead, it creates a shared space where your entire staff can gather around the same data visualisation. When S&C coaches, sport scientists, and performance staff see the same clusters, it sparks conversations:

"Why is athlete X grouped here?"
"What patterns are we missing?"

Each person brings their experience and perspective to interpret what the data means.

Understanding PCA in Sports Performance

To understand how the app works and how you can upload your data (explained towards the end of this blog), first, let's get familiarised with Principal Component Analysis (PCA). It is a data reduction technique that simplifies large, complex datasets by highlighting the key factors that explain performance variation (7). Think of it as creating a “performance fingerprint”.

Instead of comparing athletes across 30+ individual force plate metrics (e.g., jump height, RFD, peak power, etc.), PCA reduces this complexity into principal components that capture underlying performance patterns (1). Learning techniques such as these has helped take my data science skills to another level since I last posted here about my professional development in Tableau and Python.

For example, PCA might reveal that athletes who excel in concentric power also tend to have specific eccentric characteristics, grouping these correlated traits into principal components. Research has shown PCA effectively identifies playing positions in rugby based on force plate profiles (2) and distinguishes between elite and sub-elite kickboxing athletes in kicking performance (3).
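To make the idea concrete, here is a minimal sketch of PCA on force-plate-style data. It is not the app's code: it assumes scikit-learn, and the 20 athletes and 6 metrics are synthetic values generated from two hidden traits so that the metrics are correlated.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Synthetic stand-in for a force plate export: 20 athletes x 6 metrics.
# Two hidden traits drive the metrics, so groups of metrics are correlated.
n_athletes = 20
trait_a = rng.normal(size=n_athletes)
trait_b = rng.normal(size=n_athletes)


def noisy(trait):
    """Return the trait with a little measurement noise added."""
    return trait + rng.normal(scale=0.1, size=trait.shape)


metrics = np.column_stack(
    [noisy(trait_a), noisy(trait_a), noisy(trait_a),
     noisy(trait_b), noisy(trait_b), noisy(trait_b)]
)

# Standardise first so metrics measured in different units contribute equally.
scaled = StandardScaler().fit_transform(metrics)

# Reduce the 6 correlated metrics to 2 principal components.
pca = PCA(n_components=2)
scores = pca.fit_transform(scaled)

print(scores.shape)                   # one row per athlete, one column per component
print(pca.explained_variance_ratio_)  # share of variance captured by each component
```

Because the six synthetic metrics are driven by only two traits, two components capture nearly all of the variation. Real jump data will rarely be this tidy.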

Force Plate Analysis screen showing PCA clustering map with colored dots representing athletes. Slider set to 3 clusters, text notes analysis. — Screen capture of the Force Plate PCA Tool

Category-Based Analysis: Speaking Your Language

Since most of the performance world communicates in the language of speed, power, and force production, I've added a category-based PCA that runs using only the metrics in a chosen category. Compare your athletes as you see fit, and test your hypotheses based on the outcomes.

For example, you can choose Force Production (28 metrics), Jump Mechanics (13 metrics), or Rate of Force Development (21 metrics). The metrics under each umbrella are combined to build the clusters, as shown in the picture below.

Menu interface for Enhanced PCA Analysis: Category-Based. Options for metric categories with descriptions. Dark background, instructional text. — Category-based PCA
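A category-based run can be sketched as "pick the columns for one category, then run PCA on just those". The column names and the category mapping below are hypothetical, chosen only for illustration. They are not the app's actual metric lists, and the data is random.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical column names and category mapping (illustration only).
columns = ["peak_force", "mean_force", "jump_height",
           "contraction_time", "rfd_100ms", "rfd_200ms"]
categories = {
    "Force Production": ["peak_force", "mean_force"],
    "Jump Mechanics": ["jump_height", "contraction_time"],
    "Rate of Force Development": ["rfd_100ms", "rfd_200ms"],
}

rng = np.random.default_rng(0)
data = rng.normal(size=(15, len(columns)))  # synthetic: 15 athletes


def category_pca(data, columns, wanted, n_components=2):
    """Standardise only the requested columns and return their PCA scores."""
    idx = [columns.index(name) for name in wanted]
    scaled = StandardScaler().fit_transform(data[:, idx])
    return PCA(n_components=n_components).fit_transform(scaled)


# One set of PCA scores per category, each built from that category's metrics only.
category_scores = {
    name: category_pca(data, columns, wanted)
    for name, wanted in categories.items()
}

for name, scores in category_scores.items():
    print(name, scores.shape)
```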

K-means Clustering: Finding Natural Groups

Once PCA has reduced the data, K-means clustering groups similar athletes together - like having an unbiased assistant coach who only sees the data patterns. The algorithm identifies natural clusters based on performance similarities, which might reveal unexpected groupings: your speedster might cluster with your power athlete due to similar force production patterns you hadn't noticed (6).

Sports scientists have used related machine learning methods to identify soccer talent (4), and K-means to create training groups (5). The beauty is that it's hypothesis-free - the algorithm doesn't know who your starters are or their positions, yet often confirms (or challenges) coaching intuitions about athlete categories, such as Force Production and Speed/Velocity as shown below.
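As a rough sketch of the PCA-then-K-means step, assuming scikit-learn and synthetic data (three made-up groups of eight athletes with clearly different profiles), the clustering looks like this. The algorithm is never told which group an athlete came from.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Synthetic data: 3 groups of 8 athletes, 4 metrics each, with distinct profiles.
centers = np.array([[0, 0, 0, 0], [5, 5, 0, 0], [0, 0, 5, 5]], dtype=float)
metrics = np.vstack([c + rng.normal(scale=0.5, size=(8, 4)) for c in centers])

# Step 1: standardise and reduce to two principal components.
scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(metrics))

# Step 2: group athletes by their position in the reduced space.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(scores)

print(labels)  # cluster index per athlete; the numbering itself is arbitrary
```

The cluster numbers carry no ranking. Only which athletes share a label matters.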

Below is an example of the same athletes using:

Category-1 (Power Output)

Scatter plot titled Enhanced PCA Analysis - Power Output shows dots in red, blue, green representing clusters on PC1 and PC2 axes.

Power Analysis: P5 and P12 cluster together but are spatially separated - they achieve similar peak power outputs through different neuromuscular strategies (one more force-dominant, one more velocity-dominant in the power equation).

Category-2 (Speed/Velocity)

Scatter plot titled "Enhanced PCA Analysis - Speed/Velocity" showing three clusters: red, blue, and green. Axes: PC1, PC2.

Speed Analysis: P5, P7, and P12 cluster tightly together - they share similar rate coding, motor unit recruitment patterns, and movement velocity profiles regardless of their power generation differences.

Tying the two clusters together:

Physiological Insight: This reveals the force-velocity relationship in action - athletes can reach similar power zones (Force × Velocity = Power) via different paths along the curve.

Training Application: These three athletes would benefit from similar speed/reactivity training (plyometrics, reactive strength) but may need different strength training approaches based on their individual force-velocity profiles.
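The "different paths to the same power" idea can be checked with simple arithmetic. The numbers below are illustrative only and are not taken from P5, P7, or P12.

```python
# Illustrative values only: two athletes with different force-velocity strategies.
athletes = {
    "force_dominant": {"force_n": 2400.0, "velocity_m_s": 1.5},
    "velocity_dominant": {"force_n": 1800.0, "velocity_m_s": 2.0},
}

# Power (W) = Force (N) x Velocity (m/s)
power_w = {
    name: values["force_n"] * values["velocity_m_s"]
    for name, values in athletes.items()
}

for name, watts in power_w.items():
    print(f"{name}: {watts:.0f} W")  # both print 3600 W
```

Both athletes land on 3600 W, yet one gets there with more force and the other with more velocity, which is why similar power outputs can still call for different strength work.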

The 10-Minute Workflow

Interested in trying this tool on your own dataset? You can access it here:

👉 ForcePlate - Analysis

If you need an example dataset, you can use this:

This is a synthesized dataset, so it may not be entirely accurate!

Note: please be aware of data security and privacy considerations when uploading data to the platform.

Upload your ForceDecks CSV export - Drag and drop, that's it
Select your analysis focus - Full metrics vs. category-specific
Choose optimal cluster count - The app suggests based on your dataset
Review PCA variance explanation - See how much of your data story is captured
Explore athlete groupings - Interactive plots for team discussions
Export insights - Ready for training program adjustments
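For readers curious about what the cluster-count and variance steps could look like in code, here is a minimal sketch. How the app actually suggests a cluster count is not described in this post, so the silhouette score used here is an assumption, one common way to compare candidate counts. The data is synthetic (three made-up groups of ten athletes).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)

# Synthetic data: 3 groups of 10 athletes, 4 metrics each.
centers = np.array([[0, 0, 0, 0], [6, 6, 0, 0], [0, 0, 6, 6]], dtype=float)
metrics = np.vstack([c + rng.normal(scale=0.5, size=(10, 4)) for c in centers])

scaled = StandardScaler().fit_transform(metrics)
pca = PCA(n_components=2).fit(scaled)
scores = pca.transform(scaled)

# "Review PCA variance explanation": share of total variance kept by the 2 components.
variance_captured = pca.explained_variance_ratio_.sum()

# "Choose optimal cluster count" (assumed method): try several counts and
# keep the one with the highest silhouette score.
silhouettes = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(scores)
    silhouettes[k] = silhouette_score(scores, labels)

suggested_k = max(silhouettes, key=silhouettes.get)

print(f"Variance captured: {variance_captured:.1%}")
print(f"Suggested clusters: {suggested_k}")
```

A suggestion like this is a starting point. The groups still need to make practical sense for training.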

Making It Stick in Your Workflow

Here are my suggestions for integrating this analysis longitudinally:

Monthly cluster reviews during periodization planning
New athlete assessments against existing clusters
Progress tracking - Watch athletes migrate between clusters as they develop
Real-time discussions - Pull up the app during team meetings

Performance Tracking Over Time

Force Plate Analysis screen showing performance trends over time, with line graphs showing athlete outputs over time. — Performance tracking dashboard showing time-series data and metrics

Get quick access to the preselected metrics or use other options with a single click to assess whether your strength programming has been paying off. The visual interface provides immediate insights into:

Trend analysis - Is your training moving athletes in the right direction?
Cluster migration - Watch athletes develop and move between performance groups
Program effectiveness - Validate your coaching interventions with objective data
Individual trajectories - Track specific athlete development patterns
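Cluster migration can be sketched by fitting the scaling, PCA, and clustering once on a baseline test, then assigning a later test to those same clusters. This is an assumed approach for illustration, not necessarily how the dashboard works. The athlete IDs and data are synthetic, and P1's follow-up values are deliberately shifted to resemble the second group.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
athlete_ids = [f"P{i}" for i in range(1, 13)]

# Baseline test: two synthetic groups of 6 athletes, 4 metrics each.
low = rng.normal(loc=0.0, scale=0.5, size=(6, 4))
high = rng.normal(loc=5.0, scale=0.5, size=(6, 4))
baseline = np.vstack([low, high])

# Follow-up test: small session-to-session changes for everyone...
follow_up = baseline + rng.normal(scale=0.1, size=baseline.shape)
# ...except P1, whose profile now resembles the second group.
follow_up[0] = 5.0 + rng.normal(scale=0.5, size=4)

# Fit on the baseline only, so both time points are judged against the same clusters.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=2),
    KMeans(n_clusters=2, n_init=10, random_state=0),
)
before = model.fit_predict(baseline)
after = model.predict(follow_up)

migrated = [
    athlete
    for athlete, old, new in zip(athlete_ids, before, after)
    if old != new
]
print(migrated)
```

Refitting the clusters at every time point would move the cluster boundaries themselves, which makes it harder to tell whether the athlete or the reference changed.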

Technical Integration

Streamlit platform - Secure, web-based access
200 MB file limit - Approximately two years of standard force plate data
Currently compatible with VALD systems only!
Direct integration options - For teams wanting automated data ingestion
Export capabilities - Take your insights wherever you need them

Get Involved

I would love for people to reach out if they would like to:

Incorporate it into their database - Direct integration possibilities
Use the app alongside current data collection - Parallel analysis workflow
Pilot test with their teams - Early adopter opportunities

Let's test that coach's eye together and discover the stories your data has been trying to tell you.

FAQs

What is Principal Component Analysis?

Principal Component Analysis is a fundamental dimensionality reduction technique that transforms high-dimensional data into a lower-dimensional form while preserving as much variance as possible.

What is K-means clustering?

K-Means Clustering is an unsupervised machine learning algorithm that helps group data points into clusters based on their inherent similarity. Unlike supervised learning, where we train models using labeled data, K-Means is used when we have data that is not labeled and the goal is to uncover hidden patterns or structures. For example, an online store can use K-Means to segment customers into groups like "Budget Shoppers," "Frequent Buyers," and "Big Spenders" based on their purchase history.

What exactly am I looking at in the PCA plot?

Each dot is one athlete. Dots close together represent similar force plate profiles. The closer they are, the more alike their movement patterns. Athletes in the same colored cluster share similar characteristics and could potentially train together.

How many clusters should I use for the PCA plot?

Start with 3-4 clusters for most teams. The app suggests an optimal number, but use your judgment too. If clusters seem too mixed (fast and slow athletes together), add one more cluster. Aim for groups that make practical sense for training.

What if the algorithm puts my best athlete in the 'worst' cluster?

Clusters aren't rankings - they're just different profiles. Your best athlete might succeed through technique or factors the force plate doesn't measure. Use this as a discussion point to understand why they're grouped this way.

How much data do I need before the clusters actually mean something?

A minimum of 10 athletes, but 20+ is better for reliable patterns. For data points, aim for at least 3-4 testing sessions per athlete. More data provides more reliable clusters.

If an athlete moves from one cluster to another over time, is that good or bad?

It's just a change - whether it's good depends on your training goals. Moving toward your target performance profile is positive. Use it to track if your training is pushing athletes in the intended direction.

Can I use this with data from other force plate systems, or only VALD's ForceDecks?

Currently, only VALD data is supported because of data formatting. Other systems organise data differently, so uploads won't work correctly. Contact me (Ashmeet Anand) if you want to discuss expanding to your system.

References

1) Stone, J. D., Merrigan, J. J., Ramadan, J., Brown, R. S., Cheng, G. T., Hornsby, W. G., ... & Hagen, J. A. (2022). Simplifying external load data in NCAA Division-I men's basketball competitions: A principal component analysis. Frontiers in Sports and Active Living, 4, 795897.

2) Parmar, N., James, N., Hearne, G., & Jones, B. (2018). Using principal component analysis to develop performance indicators in professional rugby league. International Journal of Performance Analysis in Sport, 18(6), 938-949.

3) Vagner, M., Cleather, D. J., Kubový, P., Hojka, V., & Stastny, P. (2022). Principal component analysis can be used to discriminate between elite and sub-elite kicking performance. Motor Control, 27(2), 354-372.

4) Bazmara, M., & Jafari, S. (2013). K nearest neighbor algorithm for finding soccer talent. Journal of Basic and Applied Scientific Research, 3(4), 981-986.

5) Shelly, Z., Burch, R. F., Tian, W., Strawderman, L., Piroli, A., & Bichey, C. (2020). Using K-means clustering to create training groups for elite American football student-athletes based on game demands. International Journal of Kinesiology and Sports Science, 8(2), 47-63.

6) https://www.geeksforgeeks.org/machine-learning/k-means-clustering-introduction/

7) https://www.youtube.com/watch?v=FgakZw6K1QQ
