
Data Scientist for football performance understanding - March 2026

remote/Herzo
For one of our clients in the sports clothing industry, we are looking for a freelance Data Scientist for football performance understanding.



Overview:
This project aims to advance football performance understanding and product innovation by integrating large-scale connected ball data with StatsBomb event datasets. Using advanced analytics and machine-learning techniques, the project identifies gender-specific tactical and technical patterns, uncovers key performance-determining factors (PDFs), and maps how actions such as passing, shooting, and dribbling influence match outcomes. In parallel, the project establishes robust data-engineering foundations (high-quality data pipelines, validated data fusion, and scalable structures) to achieve reproducible insights and inform future athlete performance research and product development.

The purpose of the project is to:
1) Identify meaningful, data-driven insights into the tactical and technical distinctions between the men’s and women’s game to inform product innovation.
2) Deepen the understanding of key performance drivers in football and quantify them based on large-scale datasets, with a clear focus on differences between the men’s and women’s game.

Tasks:
  • Independent implementation in R/Python to identify gender-specific performance patterns
  • Creation of an integrated analysis using connected ball and StatsBomb data
  • Development of an enhanced understanding of technical in-game actions
  • Provision of technical consultancy regarding the optimal structure, integration, and validation of football-related data sources for robust analytics, research, and product development
  • Validation of the fusion of StatsBomb and connected match ball data
1. Women’s Football / Patterns of Play Research
  • Integrated Analysis Using Connected Ball and StatsBomb Data: Carrying out an integrated analysis that combines connected ball datasets with StatsBomb event data extracted from real match scenarios. The dataset will cover major tournaments including EURO24, WEURO25, WC22, and WWC23.
  • Identification of Gender-Specific Performance Patterns: Identification of genuine, evidence-based areas of differentiation in patterns of play between men’s and women’s football, using advanced analytical methods based on the contractor’s own expertise. Provision of insights regarding ball-interaction characteristics, tactical tendencies, and tempo differences to inform product opportunities.
2. Performance Determining Factors (PDFs) in Football
  • Enhanced Understanding of Technical In-Game Actions: Refinement of the current knowledge base around PDFs related to the following events: shooting, passing, dribbling, and running with the ball. Application of granular in-game performance data to map how these actions influence outcomes across different contexts, phases of play, and competitive levels.
  • Big Data Feature Identification: Application of large-scale analytical methods, including feature engineering, clustering, and machine learning techniques, to identify the most relevant performance features associated with PDFs. Development of a framework that links raw data signatures to performance outcomes in a reproducible, methodical way.
  • Gender Comparison Focused on PDFs: Investigation and highlighting of the critical differences between men’s and women’s football as they relate to PDFs. This includes the identification of specific performance behaviors, physical and technical tendencies, and contextual factors that differentiate the two games, as well as the provision of professional/technical consultancy for a more nuanced understanding of performance demands and product needs.
3. Data Engineering Requirements for Football Data
  • Data Engineering for StatsBomb and COMB Data Pipeline: Specification of data models, transformations, metadata standards, and ingestion processes, based on the contractor’s own expertise and on client feedback provided in the weekly project meeting. The aim is to set up a dataset that is optimized for efficient consumption and scalable analyses.
  • Validation of StatsBomb–COMB Data Fusion: Provision of technical consultancy regarding the fidelity, harmonization, and consistency of merged StatsBomb and COMB datasets, focusing specifically on passes and shots. This involves developing validation metrics, identifying discrepancies, and providing recommendations to improve data fidelity and harmonization.


The goal of the contractor’s work is the independent generation of actionable, data-driven insights into football performance through the integration and analysis of advanced football datasets, and the enablement of better athlete performance research and innovative product development.

TIMELINES:
  • After 3 months: Establishment of data foundations by defining data models and building the initial StatsBomb/connected ball ingestion pipeline
  • After 6 months: Data ingestion scaled across all datasets, completion of robust data fusion validation, and development of a first version of the performance-determining factor (PDF) features and the patterns-of-play analysis
  • After 9 months: Contextual and robustness checks of analytical models and translation of insights into early product hypotheses and dashboards
  • After 12 months: Finalization of validated models, production of product opportunity briefs, documentation of code and analytics, and delivery of comprehensive gender-based performance insights

Start: 01.03.2026
Duration: until 28.02.2027
Capacity: 40 hours per week
Location: 50% remote, 50% onsite (Herzogenaurach, Germany)

 
