The guided eye
- Moroni Mariano
- Nov 18, 2024
- 4 min read
Data can guide the scout's eye to the right place. To do so, we must use parameters tied to the specific search requirement. We will work through some examples to see how statistics help us overcome obstacles and quickly bring us closer to the available market options that best match our objectives.
The search for a footballer always has a specific goal. Playing characteristics, age, biotype, physical attributes, technical qualities, and similar data form a profile that the selected candidates should match as closely as possible.
As an example, we will attempt in this article to show the different characteristics of full-backs in Argentina's first division. It is interesting to see how with a simple tool like Tableau and data from Wyscout, we can perform various analyses, focused on different aspects that lead to different conclusions.
In decision-making, it is imperative to broaden the scope of the analysis to consider as many variables as possible, thereby reducing the margin of error. This is why Big Data contributes so much to the world of scouting: it saves a scarce and irreplaceable resource, working time. In the past, a scout needed to watch 50 players to find 5 of interest; today, with the information available, that number can be greatly reduced, allowing us to reach players of interest more quickly.
There are two phrases attributed to one of the most famous sporting directors that I would like to mention here. One is, "Without data, I don't sign any player; with data alone, I don't either." The other is, "Matches are no longer watched; players within matches are watched." Both phrases describe how anyone working in football operates today, with precise information about practically every player on the planet.
When watching a match today, one can already know the career history of the players on the field. This gives us the chance to confirm or question certain characteristics we already know about a player and to observe many others that data cannot provide, such as tracking back into defensive positions after an attack or emotional reactions during a match.
Let's begin this analysis with a single aspect of play: defensive duels. Beyond the names that appear on the graph, the aim is to show, in a single visualization, which players engage in the most defensive duels with the highest effectiveness. The players at the top right of the graph are those who participate in the most duels and have the highest efficiency.
When clubs request a search for a specific position, they define their needs with words, and one must have the skill to translate those words into data. When responding to a specific search, the use of Big Data must adapt to the request. One cannot have a rigid search matrix, as it will not be able to respond to different requests and will instead be trapped in rigid, contextless results. It is reassuring when making a recommendation to have reached the player or players of interest through different search channels and not just by finding that they excel in one particular characteristic.
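The idea of reaching a player through several search channels can be sketched in a few lines. This is only an illustration: the channel names and player labels below are hypothetical, and each set stands for the shortlist produced by one independent search.

```python
from collections import Counter

# Hypothetical shortlists, one per search channel (not real player data).
searches = {
    "defensive duels": {"Player A", "Player C", "Player D"},
    "interceptions": {"Player A", "Player B", "Player C"},
    "successful attacking actions": {"Player C", "Player E"},
}

# Count in how many independent searches each player appears.
hits = Counter(name for found in searches.values() for name in found)

# Most frequently found players first; ties broken alphabetically.
ranked = sorted(hits.items(), key=lambda item: (-item[1], item[0]))

for name, count in ranked:
    print(f"{name}: found in {count} of {len(searches)} searches")
```

A player who surfaces in several channels is a stronger recommendation than one who tops a single list.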
When the search requires a full-back who "closes the flank" or is defensively solid, we can start by examining their defensive behavior. Initially, we must define our filters for the search. For the cases we will analyze later, these filters will be players who have played more than 600 minutes, with more than 55% efficiency in defensive duels, and an average of 3 interceptions per match.
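As a minimal sketch, the three filters could be applied as follows. The field names and the sample rows are assumptions made for illustration, and "an average of 3 interceptions" is read here as "at least 3", which is also an assumption.

```python
from dataclasses import dataclass

@dataclass
class FullBack:
    name: str
    minutes: int                    # total minutes played
    def_duel_win_pct: float         # defensive duels won, in percent
    interceptions_per_match: float  # average interceptions per match

# Thresholds taken from the text.
MIN_MINUTES = 600
MIN_DUEL_PCT = 55.0
MIN_INTERCEPTIONS = 3.0  # assumption: read as "at least 3 per match"

def passes_filters(p: FullBack) -> bool:
    """True when the player clears all three search filters."""
    return (
        p.minutes > MIN_MINUTES
        and p.def_duel_win_pct > MIN_DUEL_PCT
        and p.interceptions_per_match >= MIN_INTERCEPTIONS
    )

# Hypothetical sample rows, not real player data.
players = [
    FullBack("Player A", 1450, 61.2, 3.4),
    FullBack("Player B", 420, 70.0, 4.1),   # too few minutes
    FullBack("Player C", 980, 52.3, 3.8),   # duel efficiency too low
    FullBack("Player D", 1100, 58.0, 2.1),  # too few interceptions
]

shortlist = [p.name for p in players if passes_filters(p)]
print(shortlist)
```

Keeping the thresholds as named constants makes it easy to adapt the same search to a different club request.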

In this graph, in addition to the number of duels and their efficiency in the defensive area, we also see circles of different colors and sizes. The color of each circle indicates the number of fouls committed: green marks the highest values, yellow the intermediate ones, and red the lowest. The size of each circle indicates successful attacking actions: the larger the circle, the greater the participation.
Thus, in a simple visualization, we can relate four factors: average defensive duels per 90 minutes + defensive duel efficiency % + average fouls committed per 90 minutes + average successful attacking actions per 90 minutes.
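The four-factor encoding can be sketched without any charting tool. The example below is not how Tableau computes its marks; it assumes the color bands are thirds of the sample, that circle size scales linearly, and that duels sit on the x axis and efficiency on the y axis. All rows are hypothetical.

```python
# Hypothetical rows: (name, defensive duels per 90, duel efficiency %,
# fouls per 90, successful attacking actions per 90). Not real player data.
rows = [
    ("Player A", 9.5, 62.0, 1.9, 2.0),
    ("Player B", 6.1, 57.5, 0.8, 4.0),
    ("Player C", 7.8, 66.3, 1.2, 3.0),
]

def color_for(value, values):
    """Green for the top third, yellow for the middle, red for the bottom."""
    ranked = sorted(values)
    n = len(ranked)
    low_cut, high_cut = ranked[n // 3], ranked[(2 * n) // 3]
    if value >= high_cut:
        return "green"
    if value >= low_cut:
        return "yellow"
    return "red"

def size_for(value, values, min_size=50.0, max_size=500.0):
    """Scale a value linearly into a circle size between min_size and max_size."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return (min_size + max_size) / 2
    return min_size + (value - lo) * (max_size - min_size) / (hi - lo)

fouls = [r[3] for r in rows]
attacks = [r[4] for r in rows]

# One mark per player: position, color and size carry the four factors.
marks = {
    name: {
        "x": duels,                      # defensive duels per 90
        "y": efficiency,                 # defensive duel efficiency %
        "color": color_for(f, fouls),    # fouls committed per 90
        "size": size_for(a, attacks),    # successful attacking actions per 90
    }
    for name, duels, efficiency, f, a in rows
}

for name, mark in marks.items():
    print(name, mark)
```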
From another perspective, we will now try to get a broader view of the performance, both offensive and defensive, of the full-backs in Argentine football. To that end, this graph plots each player's efficiency in attack against their efficiency in defense.

Players positioned further to the right have higher defensive effectiveness, and those positioned higher have greater offensive effectiveness. The ideal case is the top right, where a player fulfills both functions. The color of each circle indicates the average number of offensive duels per match: green marks the highest values, yellow the intermediate ones, and red the lowest. The size of each circle indicates the average number of defensive duels per match: the larger the circle, the greater the participation.
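Reading the quadrants can be sketched as follows. The graph itself does not state where the dividing lines fall, so this example assumes the group median on each axis as the cut-off; the efficiency figures are hypothetical.

```python
from statistics import median

# Hypothetical rows: (name, defensive efficiency %, offensive efficiency %).
rows = [
    ("Player A", 62.0, 48.0),
    ("Player B", 55.0, 52.0),
    ("Player C", 66.0, 55.0),
    ("Player D", 58.0, 40.0),
]

# Assumption: the quadrant boundaries are the group medians.
def_cut = median(r[1] for r in rows)
off_cut = median(r[2] for r in rows)

def quadrant(def_pct, off_pct):
    """Name the quadrant: defense on the x axis, attack on the y axis."""
    horizontal = "right" if def_pct >= def_cut else "left"
    vertical = "top" if off_pct >= off_cut else "bottom"
    return f"{vertical} {horizontal}"

quadrants = {name: quadrant(d, o) for name, d, o in rows}
for name, q in quadrants.items():
    print(f"{name}: {q}")
```

Only players landing in "top right" combine above-median effectiveness in both attack and defense within this sample.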
In this visualization, we again see four related variables. On one side, the position on the graph defines efficiency in attack and defense. On the other, the colors and sizes of the circles show whether this efficiency also comes with a high number of participations. The same efficiency figure means something different for a player who engages in few duels than for one who engages in many. Another important factor is whether these figures come from a player who plays more than 60 minutes per match or from one who comes off the bench for a handful of minutes.
One of the most common mistakes in Big Data arises with players who contribute coming off the bench. The system extrapolates what was done in the minutes played, assuming that the same rate can be sustained over 90 minutes. A player's intensity is not the same throughout an entire match, so this is a point to be especially careful about.
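A short worked example shows how the extrapolation inflates a substitute's numbers. The per-90 formula is the usual one (total times 90 divided by minutes played); the match figures are hypothetical, and reusing the 600-minute filter as a reliability threshold is an assumption made for this sketch.

```python
MIN_RELIABLE_MINUTES = 600  # assumption: reuse the minutes filter from the search

def per_90(total, minutes):
    """Scale a raw count to a per-90-minutes rate."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return total * 90 / minutes

def is_reliable(minutes):
    """Flag whether the sample of minutes is large enough to trust the rate."""
    return minutes > MIN_RELIABLE_MINUTES

# Hypothetical figures: a starter over a full match and a late substitute.
starter_rate = per_90(total=6, minutes=90)     # 6 duels won in 90 minutes
substitute_rate = per_90(total=2, minutes=15)  # 2 duels won in 15 minutes

print(f"Starter: {starter_rate:.1f} per 90")
print(f"Substitute: {substitute_rate:.1f} per 90")
print("Substitute sample reliable:", is_reliable(15))
```

On paper the substitute doubles the starter's rate, yet the figure rests on 15 minutes of fresh legs, which is why a minutes threshold belongs alongside any per-90 metric.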
To conclude, Big Data offers an enormous range of possibilities and combinations. Just as not all musicians write the same song, not all data analysts arrive at the same players. The skill of the person conducting the search is what turns the data into a result that meets the pre-established objectives.
Source: DataMoroni