tl;dr
I used R to animate the goal that won Manchester City the 2011/12 Premier League title in breathtaking fashion.
Inspired by Ryo Nakagawara, who makes awesome R-related soccer content that you can find on his site and on Twitter.1
The problem
Soccer has run dry.
Leagues have been cancelled or decided on a contentious points-per-game basis given that there’s no precedent. The fate of the 2019/20 English Premier League season is still unknown.2
I figured it would be a good time to revisit a season that finished in the most emphatic fashion; one that was decided in the final minute of the final game.
The game
City and United were level on points at the top of the Premier League as they entered their final matches of the 2011/12 season. Only goal difference separated them.
Pos | Team | Played | GD | Points |
---|---|---|---|---|
1 | Manchester City | 37 | +63 | 86 |
2 | Manchester United | 37 | +55 | 86 |
As the game entered the closing stages, a dominant City somehow found themselves 2-1 down to a lacklustre Queens Park Rangers side.
After sustained pressure, Edin Dzeko scored a towering header from a corner after 92 minutes. The game was level at 2-2, but it wouldn’t be enough to win the title; one more goal was needed.
Meanwhile, Manchester United had won their concurrent game at Sunderland and had every right to think the title was theirs.
Pos | Team | Played | GD | Points |
---|---|---|---|---|
1 | Manchester United | 38 | +56 | 89 |
2 | Manchester City | 38 | +63 | 87 |
But after 93 minutes, City’s Nigel De Jong burst into QPR’s half. Sergio Agüero stepped forward, received the ball and beat his man. He passed to Mario Balotelli and continued his run into the box. Super Mario slid to the ground and pushed the ball into Agüero’s path.
The rest is history: Sergio received the ball, beat a slide tackle and smashed the ball into the goal. Cue commentator Martin Tyler screaming ‘AGÜEROOOOO!’.
Pos | Team | Played | GD | Points |
---|---|---|---|---|
1 | Manchester City | 38 | +64 | 89 |
2 | Manchester United | 38 | +56 | 89 |
City had done the impossible to win their first Premier League trophy and first top-flight title in 44 years.
Reliving the moment
So the sensible thing to do is to use R to make a gif of the player movements in the build-up to the goal.
You may have seen something like this before from Ryo Nakagawara and others. I took a slightly different approach to Ryo, but the result is basically the same.
You need three packages:3
- {ggplot2}, created by Hadley Wickham, to provide the plotting framework
- {ggsoccer}, by Ben Torvaney, for the grid and pitch theme
- {gganimate}, by Thomas Lin Pedersen, for animating each step and interpolating between them
# Load packages
library(ggplot2) # Create Elegant Data Visualisations Using the Grammar of Graphics
library(ggsoccer) # Plot Soccer Event Data
library(gganimate) # A Grammar of Animated Graphics
library(tibble) # Simple Data Frames
I also used {tibble} to create data frames with tribble(), but this isn’t a requirement.
Set coordinates
You need to start with coordinate data for the players and ball. {ggsoccer} defaults to a 100- by 100-unit pitch on which to plot these data. But where do you get the data from?
You could use Opta’s premium service for accessing player-tracking data. My approach was more… artisanal. I just watched some grainy YouTube videos and roughly guessed where the players were.
A really nice interactive tool that makes the process easier is the soccer event logger by Ben Torvaney, creator of {ggsoccer}.
Players
The first data frame contains each player’s coordinates, with a row for each frame of the final animation. I added the player name so it could be used as a label.
I chose to focus on the three active players in the build-up to the goal. This made the final graphic clearer, yes, but more importantly it meant I had fewer data points to input.
I created the data frame using tribble() from the {tibble} package because I found it easier to input the data in a row-wise fashion. It’s also easy to write a comment per line to explain what’s happening.
# Player position data
players <- tribble(
~frame, ~name, ~x, ~y, # column names
1, "De Jong", 50, 50, # advances from own half
2, "De Jong", 56, 50, # advances into oppo half
3, "De Jong", 64, 50, # passes to Agüero
4, "De Jong", 64, 50, # off the ball
5, "De Jong", 64, 50, # off the ball
6, "De Jong", 64, 50, # off the ball
7, "De Jong", 64, 50, # off the ball
8, "De Jong", 64, 50, # off the ball
1, "Agüero", 85, 70, # diagonal run to meet ball from De Jong
2, "Agüero", 80, 65, # diagonal run to meet ball from De Jong
3, "Agüero", 75, 60, # receives pass from De Jong
4, "Agüero", 76, 63, # beats defender, passes to Balotelli
5, "Agüero", 80, 50, # advances to edge of box
6, "Agüero", 87, 38, # receives pass from Balotelli
7, "Agüero", 93, 36, # shot
8, "Agüero", 94, 33, # goal
1, "Balotelli", 83, 61, # waiting on edge of box
2, "Balotelli", 83, 61, # waiting on edge of box
3, "Balotelli", 83, 61, # waiting on edge of box
4, "Balotelli", 83, 57, # waiting on edge of box
5, "Balotelli", 83, 55, # receives pass from Agüero
6, "Balotelli", 83, 55, # passes to Agüero
7, "Balotelli", 83, 54, # off the ball
8, "Balotelli", 83, 54, # off the ball
)
So each player has coordinates for each time step.
# Preview the data frame
head(players[order(players$frame), ])
## # A tibble: 6 x 4
## frame name x y
## <dbl> <chr> <dbl> <dbl>
## 1 1 De Jong 50 50
## 2 1 Agüero 85 70
## 3 1 Balotelli 83 61
## 4 2 De Jong 56 50
## 5 2 Agüero 80 65
## 6 2 Balotelli 83 61
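Hand-typed coordinates are easy to get wrong, so a quick completeness check can help before plotting. This is a minimal sketch in base R: has_complete_frames() is a hypothetical helper (not part of any package) and demo is a small stand-in for the players data frame.

```r
# Hypothetical helper: TRUE if every name appears exactly once in every frame
has_complete_frames <- function(df) {
  counts <- table(df$name, df$frame)  # cross-tabulate names against frames
  all(counts == 1)
}

# Small stand-in for `players` (two players, three frames)
demo <- data.frame(
  frame = c(1, 2, 3, 1, 2, 3),
  name  = rep(c("De Jong", "Balotelli"), each = 3),
  x     = c(50, 56, 64, 83, 83, 83),
  y     = c(50, 50, 50, 61, 61, 61)
)

has_complete_frames(demo)        # full grid of name x frame
has_complete_frames(demo[-6, ])  # one row dropped, so the check fails
```

The same call should work on the full players data frame, since it only relies on the name and frame columns.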
Ball
I put the coordinate data for the ball in a separate data frame. This made it easier to specify and modify the ball and player data separately.
# Ball position data
ball <- tribble(
~frame, ~x, ~y,
1, 51, 50, # De Jong possession
2, 57, 50, # De Jong pass
3, 74, 60, # received by Agüero
4, 77, 63, # Agüero pass
5, 83, 54, # received by Balotelli
6, 88, 38, # received by Agüero
7, 94, 36, # Agüero shot
8, 100, 46 # goal
)
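Because the ball and player coordinates are guessed separately, it can be worth checking that the ball sits next to whoever is meant to have it in each frame. The sketch below uses base R only; demo_players and demo_ball are cut-down stand-ins for the real data frames, and the ball_x and ball_y column names are my own for illustration.

```r
# Cut-down stand-ins for the `players` and `ball` data frames
demo_players <- data.frame(
  frame = c(1, 1, 2, 2),
  name  = c("De Jong", "Balotelli", "De Jong", "Balotelli"),
  x     = c(50, 83, 56, 83),
  y     = c(50, 61, 50, 61)
)
demo_ball <- data.frame(
  frame  = c(1, 2),
  ball_x = c(51, 57),  # renamed so they don't clash with the player columns
  ball_y = c(50, 50)
)

# Attach the ball position to every player row for the same frame
merged <- merge(demo_players, demo_ball, by = "frame")

# Straight-line distance from each player to the ball, in pitch units
merged$dist <- sqrt(
  (merged$x - merged$ball_x)^2 + (merged$y - merged$ball_y)^2
)

# Keep the closest player in each frame
closest <- do.call(
  rbind,
  lapply(split(merged, merged$frame), function(d) d[which.min(d$dist), ])
)

closest[, c("frame", "name", "dist")]
```

If the closest player in a frame isn't the one named in that row's comment, the coordinates probably need another look.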
Graphics
The first step in producing the animation is to create a single plot object that contains all the points. {gganimate} takes the object and animates it frame by frame, interpolating the data points between each time step.
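To get a feel for what that interpolation means, here is a rough base R sketch using approx(). It assumes straight-line (linear) easing and is only an illustration of the idea, not how {gganimate} does the work internally. The values are the ball's x coordinates for the first three frames.

```r
# Ball x coordinates at frames 1 to 3, taken from the ball data
frames <- 1:3
ball_x <- c(51, 57, 74)

# Linearly interpolate four steps between frame 1 and frame 2
tween <- approx(x = frames, y = ball_x, xout = seq(1, 2, by = 0.25))

tween$y
#> [1] 51.0 52.5 54.0 55.5 57.0
```

Each in-between value becomes its own rendered frame, which is why the ball appears to glide rather than jump.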
Static plot
To produce the plot object:
- Plot the pitch area
- Add ball data first so the points will appear ‘under’ the player points
- Add player points and labels
- Add a title
# Plot all the data
plot <-
ggplot() + # blank canvas
annotate_pitch( # plot 100 * 100 unit pitch
colour = "white", fill = "#7fc47f", limits = FALSE
) +
theme_pitch() + # theme removes plotting elements
coord_flip( # rotate and crop pitch
xlim = c(49, 101), ylim = c(-12, 112)
) +
geom_point( # add ball data
data = ball,
aes(x = x, y = 100 - y),
colour = "black", fill = "white", pch = 21, size = 2
) +
geom_point( # add player data
data = players,
aes(x = x, y = 100 - y),
colour = "black", fill = "skyblue", pch = 21, size = 4
) +
geom_text( # add player labels
data = players, aes(x = x, y = 100 - y, label = name),
hjust = -0.2, nudge_x = 1
) +
ggtitle( # add title
label = "MCY [3]-2 QPR",
subtitle = "93:20 GOAL Sergio Agüero"
)
I’ve chosen to rotate the plot and crop it because we only need to see one half of the pitch. Note that this means the y-aesthetic for the points is set to 100 - y.
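If repeating 100 - y in every aes() call feels error-prone, one alternative is to mirror the coordinate once in the data. This is a small sketch with a made-up two-row sample in the same shape as the player data; the y_plot column name is my own.

```r
# Two-row sample in the same shape as the player data
sample_players <- data.frame(
  name = c("De Jong", "Balotelli"),
  x    = c(50, 83),
  y    = c(50, 61)
)

# Mirror y across the 0-100 width of the pitch, once, up front
sample_players$y_plot <- 100 - sample_players$y

sample_players$y_plot
#> [1] 50 39
```

Each layer could then map y = y_plot instead of computing the flip inline.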
The output plot object is composed of all the frames that we set out in the players and ball data sets. You wouldn’t plot this object as-is, but here’s what it looks like:
plot
{gganimate} will take each time step, specified by the frame variable, to render the animation. Here’s each of those frames from the players data.
plot + facet_wrap(~ frame) + ggtitle(NULL, NULL)
Animated plots
{gganimate} turns the static plot into an animation in one step.
The transition_states() function builds on top of the plot object. I specified the time-step variable; the relative lengths of the pause at each frame and of the interpolated transition between frames; and whether or not the animation should transition from the last frame back to the first.
# Animate the plot
animation <-
plot + # the plot object
transition_states(
frame, # time-step variable
state_length = 0.01, # relative length of the pause at each frame
transition_length = 1, # relative length of the transition between frames
wrap = FALSE # don't transition from the last frame back to the first
)
You can use the animate() function to render it.
animate(animation)
AGÜEROOOOO!
You can save the result as a gif with anim_save(), which works like ggsave() from {ggplot2}: the default is to save the latest animation to your working directory.
anim_save("9320.gif")
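For more control over the output, animate() accepts arguments such as nframes, fps and end_pause, and anim_save() can be given the rendered animation explicitly. The sketch below uses a made-up one-point data set rather than the goal data, assumes the {gifski} package is installed for gifski_renderer(), and writes to a temporary directory. The specific numbers are arbitrary choices of mine.

```r
library(ggplot2)
library(gganimate)

# Toy data: one point moving through three states (made-up values)
toy <- data.frame(frame = 1:3, x = c(10, 50, 90), y = c(50, 60, 40))

toy_animation <-
  ggplot(toy, aes(x = x, y = y)) +
  geom_point(size = 4) +
  transition_states(
    frame,
    transition_length = 1,
    state_length = 0.01,
    wrap = FALSE
  )

# Render with explicit settings instead of the defaults
rendered <- animate(
  toy_animation,
  nframes = 30,                  # total frames in the gif
  fps = 10,                      # frames per second
  end_pause = 5,                 # hold the final frame for 5 frames
  width = 400, height = 300,     # passed on to the graphics device
  renderer = gifski_renderer()   # assumes {gifski} is installed
)

# Save a specific animation object rather than relying on the latest one
out_file <- file.path(tempdir(), "toy.gif")
anim_save(out_file, animation = rendered)
```

Holding the last frame with end_pause gives the viewer a moment to take in the final position before the gif restarts.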
Luckily the gif keeps looping so you can keep watching until a decision is made on how the current Premier League season will end.
Session info
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 3.6.1 (2019-07-05)
## os macOS Sierra 10.12.6
## system x86_64, darwin15.6.0
## ui X11
## language (EN)
## collate en_GB.UTF-8
## ctype en_GB.UTF-8
## tz Europe/London
## date 2020-05-04
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## package * version date lib source
## assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.0)
## blogdown 0.17 2019-11-13 [1] CRAN (R 3.6.0)
## bookdown 0.18 2020-03-05 [1] CRAN (R 3.6.0)
## cli 2.0.2 2020-02-28 [1] CRAN (R 3.6.0)
## colorspace 1.4-1 2019-03-18 [1] CRAN (R 3.6.0)
## crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.0)
## digest 0.6.25 2020-02-23 [1] CRAN (R 3.6.0)
## dplyr 0.8.5 2020-03-07 [1] CRAN (R 3.6.0)
## ellipsis 0.3.0 2019-09-20 [1] CRAN (R 3.6.0)
## evaluate 0.14 2019-05-28 [1] CRAN (R 3.6.0)
## fansi 0.4.1 2020-01-08 [1] CRAN (R 3.6.0)
## farver 2.0.3 2020-01-16 [1] CRAN (R 3.6.0)
## gganimate * 1.0.5 2020-02-09 [1] CRAN (R 3.6.0)
## ggplot2 * 3.3.0.9000 2020-03-11 [1] Github (tidyverse/ggplot2@86c6ec1)
## ggsoccer * 0.1.5.9000 2020-04-29 [1] Github (torvaney/ggsoccer@f2d55dc)
## glue 1.4.0 2020-04-03 [1] CRAN (R 3.6.2)
## gtable 0.3.0 2019-03-25 [1] CRAN (R 3.6.0)
## hms 0.5.3 2020-01-08 [1] CRAN (R 3.6.0)
## htmltools 0.4.0 2019-10-04 [1] CRAN (R 3.6.0)
## knitr 1.28 2020-02-06 [1] CRAN (R 3.6.0)
## labeling 0.3 2014-08-23 [1] CRAN (R 3.6.0)
## lifecycle 0.2.0 2020-03-06 [1] CRAN (R 3.6.0)
## magick 2.2 2019-08-26 [1] CRAN (R 3.6.0)
## magrittr 1.5 2014-11-22 [1] CRAN (R 3.6.0)
## munsell 0.5.0 2018-06-12 [1] CRAN (R 3.6.0)
## pillar 1.4.3 2019-12-20 [1] CRAN (R 3.6.0)
## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 3.6.0)
## plyr 1.8.5 2019-12-10 [1] CRAN (R 3.6.0)
## prettyunits 1.1.1 2020-01-24 [1] CRAN (R 3.6.0)
## progress 1.2.2 2019-05-16 [1] CRAN (R 3.6.0)
## purrr 0.3.4 2020-04-17 [1] CRAN (R 3.6.2)
## R6 2.4.1 2019-11-12 [1] CRAN (R 3.6.0)
## Rcpp 1.0.4.6 2020-04-09 [1] CRAN (R 3.6.1)
## rlang 0.4.5 2020-03-01 [1] CRAN (R 3.6.0)
## rmarkdown 2.1 2020-01-20 [1] CRAN (R 3.6.0)
## scales 1.1.0 2019-11-18 [1] CRAN (R 3.6.0)
## sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.0)
## stringi 1.4.6 2020-02-17 [1] CRAN (R 3.6.1)
## stringr 1.4.0 2019-02-10 [1] CRAN (R 3.6.0)
## tibble * 3.0.1 2020-04-20 [1] CRAN (R 3.6.2)
## tidyselect 1.0.0 2020-01-27 [1] CRAN (R 3.6.0)
## tweenr 1.0.1 2018-12-14 [1] CRAN (R 3.6.0)
## utf8 1.1.4 2018-05-24 [1] CRAN (R 3.6.0)
## vctrs 0.2.4 2020-03-10 [1] CRAN (R 3.6.1)
## withr 2.2.0 2020-04-20 [1] CRAN (R 3.6.2)
## xfun 0.13 2020-04-13 [1] CRAN (R 3.6.2)
## yaml 2.2.1 2020-02-01 [1] CRAN (R 3.6.0)
##
## [1] /Users/matt.dray/Library/R/3.6/library
## [2] /Library/Frameworks/R.framework/Versions/3.6/Resources/library
1. Also a fellow builder of {brickr} soccer players.
2. But do check out posts by Ben Torvaney and Robert Hickman on predicting Premier League outcomes with R.
3. An aside: I used the {annotater} RStudio Addin by Luis D Verde to annotate these library calls.