The eighth edition of the master’s program continues its journey, this time with a session led by Marino Arnáiz, Head of Data at Sevilla FC, who presented “The Life Cycle of Data in Professional Football.”
The session opened with a brief personal introduction, in which Arnáiz explained the role of a data engineer within a football club and how that work feeds into innovation and analysis projects. For context, he highlighted that data has become a key strategic asset for sporting decisions, from scouting to tactical preparation. The goal of the talk was to help attendees understand the complete data journey, from its origin to its end use in the applications employed by analysts and scouts.
Next, he addressed the origin of data, covering the different types generated in a football environment (event data, tracking data, GPS data) and the main providers that supply them (Opta, Wyscout, SkillCorner, etc.). He also analyzed the challenges of data acquisition, such as quality, formats, and update frequency, with a visual example of the flow a raw dataset follows before processing.
The technical section explored ingestion and storage in depth, describing how data pipelines (ETL) are designed and automated using APIs, cron jobs, or streaming flows. It also covered the design of the Data Lake, database management, and the importance of data lineage for keeping data traceable and auditable.
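To make the ingestion and lineage idea concrete, here is a minimal sketch in Python of how a raw provider payload might be landed in a data lake together with a lineage record. It is an illustration only, not the pipeline presented in the session: the function name `ingest_raw`, the folder layout, and the `lineage.jsonl` log are all assumptions made for this example.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def ingest_raw(payload: list[dict], provider: str, dataset: str, lake_root: Path) -> dict:
    """Land a raw payload in the lake and append a lineage record (hypothetical layout)."""
    ingested_at = datetime.now(timezone.utc)
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    checksum = hashlib.sha256(body).hexdigest()

    # Assumed partitioning: raw/<provider>/<dataset>/<ingestion date>/
    target_dir = lake_root / "raw" / provider / dataset / ingested_at.strftime("%Y-%m-%d")
    target_dir.mkdir(parents=True, exist_ok=True)
    data_path = target_dir / f"{checksum[:12]}.json"
    data_path.write_bytes(body)

    # Lineage record: enough to trace downstream rows back to this exact file.
    lineage = {
        "provider": provider,
        "dataset": dataset,
        "path": str(data_path),
        "sha256": checksum,
        "row_count": len(payload),
        "ingested_at": ingested_at.isoformat(),
    }
    with (lake_root / "lineage.jsonl").open("a", encoding="utf-8") as log:
        log.write(json.dumps(lineage) + "\n")
    return lineage
```

In a real deployment a function like this would typically be triggered by a scheduler such as a cron job after calling the provider's API; the checksum makes it possible to detect whether the same file has been delivered twice.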
The talk then moved on to the transformation and modeling phase, where data is cleaned, normalized, and structured into analytical models (dimensions, facts, hierarchies). He showed examples of derived metrics, such as those used in ProVision, and emphasized the importance of maintaining consistency and validation throughout the process.
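As a rough illustration of this phase, the Python sketch below cleans a handful of raw events, splits them into a player dimension and an event fact table, and computes one derived metric. The field names (`player_id`, `type`, `successful`), the functions `build_model` and `pass_completion`, and the pass completion metric itself are assumptions chosen for the example; they are not the models or metrics used in ProVision.

```python
from collections import defaultdict


def build_model(raw_events: list[dict]) -> tuple[dict, list[dict]]:
    """Clean raw events and split them into a player dimension and an event fact table."""
    dim_player: dict[int, dict] = {}
    fact_event: list[dict] = []
    for row in raw_events:
        player_id = row.get("player_id")
        match_id = row.get("match_id")
        # Normalization: provider strings may differ in case and spacing.
        event_type = str(row.get("type", "")).strip().lower()
        if player_id is None or match_id is None or not event_type:
            continue  # Cleaning: skip rows that cannot be attributed or typed.
        dim_player[player_id] = {
            "player_id": player_id,
            "name": str(row.get("player_name", "")).strip(),
        }
        fact_event.append({
            "match_id": match_id,
            "player_id": player_id,
            "event_type": event_type,
            "successful": bool(row.get("successful", False)),
        })
    return dim_player, fact_event


def pass_completion(fact_event: list[dict]) -> dict[int, float]:
    """Derived metric: share of completed passes per player."""
    attempts: dict[int, int] = defaultdict(int)
    completed: dict[int, int] = defaultdict(int)
    for fact in fact_event:
        if fact["event_type"] == "pass":
            attempts[fact["player_id"]] += 1
            completed[fact["player_id"]] += int(fact["successful"])
    return {pid: completed[pid] / total for pid, total in attempts.items()}
```

Keeping the cleaning rules in one place, before any metric is computed, is one simple way to get the consistency the paragraph above describes: every derived metric reads from the same validated fact table.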
In the final part, he explained how data is made available to end users through internal and external APIs and how it is integrated into workflow applications such as AiFootball. A complete example traced the full cycle, from initial capture to final visualization in a scouting tool, and underscored the importance of continuous feedback from analysts and scouts for improving models and processes.
The talk concluded with a Q&A and debate session, where concerns about the practical application of Big Data in sports performance and the optimization of information flows in professional clubs were addressed.
This masterclass is one of many offered in the master’s program, reflecting the importance of data today. Information and technology have transformed what was once a luxury into what is now a strategic necessity within clubs.


