Using the substitute_in minute, you can create a histogram of when players enter a match.
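As a rough sketch of that histogram, the snippet below buckets entry minutes into 15-minute bins with pandas. The rows are made up for illustration, and it assumes `substitute_in` holds the minute a player came on and is empty for starters; the real file may name or encode this differently.

```python
import pandas as pd

# Toy rows standing in for the real file; all values are made up.
# Assumption: `substitute_in` is the entry minute, missing for starters.
df = pd.DataFrame({
    "player": ["A", "B", "C", "D", "E", "F"],
    "substitute_in": [46, 61, 75, 88, None, 67],
})

# Keep only appearances that began from the bench.
minutes = df["substitute_in"].dropna()

# Bucket entry minutes into 15-minute bins: [0, 15), [15, 30), ...
bins = list(range(0, 121, 15))
binned = pd.cut(minutes, bins=bins, right=False)

# Count entries per bin, keeping the bins in chronological order.
histogram = binned.value_counts(sort=False)
print(histogram)
```

Passing `histogram` to `.plot(kind='bar')` would turn these counts into the chart itself.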
An informative article based on World Cup player appearance data could cover several areas:
This dataset is structured in a "long" format, where a single player appears multiple times, once for every World Cup tournament in which they participated.
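Goalkeepers and center-backs from finalists dominate the minutes-played ranking. In 2022, Emiliano Martínez (Argentina) would sit at or near the top with roughly 690 minutes before stoppage time: seven matches, two of which went to extra time. Hugo Lloris (France) would rank lower, since he was rested for one group match. But the real magic is historical: in 2014, Manuel Neuer played every single minute of Germany’s run, including the final.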
Here is how you load the dataset using pandas, as a starting point for answering a real question: Which substitute scored the most goals?

    import pandas as pd

    # Load data
    df = pd.read_csv('worldcup_appearances.csv')
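One possible way to approach the substitute question is sketched below. It is only an illustration: the rows are invented, and both `substitute_in` (entry minute, empty for starters) and `goals` (goals scored in that appearance) are hypothetical columns that the real file may not contain in this form.

```python
import pandas as pd

# Made-up rows; `goals` is a hypothetical per-appearance goal count.
df = pd.DataFrame({
    "player": ["A", "A", "B", "C", "C"],
    "substitute_in": [60, None, 75, 46, 80],
    "goals": [1, 2, 0, 1, 1],
})

# Appearances that began from the bench have a non-null entry minute.
sub_apps = df[df["substitute_in"].notna()]

# Total goals per player, counting substitute appearances only.
sub_goals = sub_apps.groupby("player")["goals"].sum().sort_values(ascending=False)

top_player = sub_goals.index[0]
print(top_player, sub_goals.iloc[0])
```

Goals scored after starting a match are excluded by the filter, which is why player A's two-goal start does not count here.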
Calculate the average minute of the first substitution per decade.
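A minimal sketch of that calculation follows, using invented rows. The column names `year`, `match_id`, `team_id`, and `substitute_in` are assumptions made for the example, not confirmed names from the file.

```python
import pandas as pd

# Made-up rows; column names are assumptions for illustration.
df = pd.DataFrame({
    "year": [1990, 1990, 1990, 2014, 2014, 2014],
    "match_id": ["m1", "m1", "m2", "m3", "m3", "m4"],
    "team_id": ["T1", "T1", "T2", "T3", "T3", "T4"],
    "substitute_in": [70, 82, 60, 46, 75, 58],
})

# Drop starters (no entry minute) before looking for first substitutions.
subs = df.dropna(subset=["substitute_in"])

# Earliest substitution made by each team in each match.
first_sub = subs.groupby(
    ["year", "match_id", "team_id"], as_index=False
)["substitute_in"].min()

# Map each tournament year to its decade, e.g. 2014 -> 2010.
first_sub["decade"] = (first_sub["year"] // 10) * 10

# Average first-substitution minute per decade.
avg_first_sub = first_sub.groupby("decade")["substitute_in"].mean()
print(avg_first_sub)
```

Taking the minimum per team and match first matters: averaging all substitution minutes directly would measure when substitutions happen in general, not when the first one is made.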
    import matplotlib.pyplot as plt

    # Visualizing the top 10 players by appearances
    # (value_counts() counts how many rows each player has in df)
    top_players = df['player'].value_counts().head(10)
    plt.figure(figsize=(10, 6))
    top_players.plot(kind='bar')
    plt.title('Top Players by World Cup Appearances')
    plt.xlabel('Player')
    plt.ylabel('Number of Appearances')
    plt.show()
By aggregating player_id counts grouped by team_id and tournament_id, analysts can determine the experience level of a squad.
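The sketch below shows one way this aggregation might look. The rows are invented, and the "experience" score (average number of earlier tournaments per squad member) is an illustrative choice rather than an established metric. It also assumes that tournament_id values sort chronologically as strings.

```python
import pandas as pd

# Made-up squad rows, one per player per tournament (the "long" format).
df = pd.DataFrame({
    "tournament_id": ["WC-2010", "WC-2010", "WC-2014", "WC-2014", "WC-2014"],
    "team_id": ["T1", "T1", "T1", "T1", "T1"],
    "player_id": ["P1", "P2", "P1", "P2", "P3"],
})

# Squad size: distinct players per team per tournament.
squad_size = df.groupby(["tournament_id", "team_id"])["player_id"].nunique()

# Earlier tournaments per player: 0 for a debut, 1 for a second World Cup, ...
# Assumption: tournament_id sorts chronologically as a string.
df = df.sort_values("tournament_id")
df["prior_tournaments"] = df.groupby("player_id").cumcount()

# Average earlier tournaments per squad as a rough experience score.
experience = df.groupby(["tournament_id", "team_id"])["prior_tournaments"].mean()
print(squad_size)
print(experience)
```

If the file holds one row per match rather than per tournament, the rows would need to be de-duplicated on player_id and tournament_id before the `cumcount()` step.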