The advanced data set allows for a more detailed look at how players distribute the ball
The data set provided by the MCFC Analytics project so far has excited analysts and football fans alike. Some are using it to help predict teams’ chances of winning the Premier League, whilst others are using it to analyse their team’s next opponent.
One of the first Tableau visuals I made showed how teams passed the ball. This was done using the standard MCFC Analytics data set. It showed the number of passes forwards, backwards, left and right.
The advanced data set, however, shows the x and y co-ordinates of where each pass started and also the x and y co-ordinates of where it ended. This lends itself to a more detailed (and therefore more actionable) data visual. d3.js has some great visuals, and one of them shows flight routes across the USA. So can we make a similar visual in Tableau Public using this advanced data set?
The visual below shows an aggregation of all passes and their direction. It’s pretty and looks like a firework, but it makes less sense than having one per player. The green shading indicates the minute in which the pass happened (light green indicates the start of the match, dark green the approach of the 90th minute).
Next we split these passes out by position. Now we can start to actually find insights in the data. Goalkeepers’ passing is generally the longest (not a surprise I know). Maybe you’d use this data to see how defenders switch their style of passing once they go a goal up/down. Reassuringly the visual shows Goalkeepers passing forwards only, so I guess I manipulated the data correctly (more on that later).
Tableau Public also lets you make animations. By using the “Pages shelf” we can insert a field or metric (in this case we’ll add “minute”) and the visual will cycle through each of its values. So in this case, the visual will cycle from the start of the match (minute = 0) to the end (minute = 94). You can also use the “Pages shelf” to leave a “trail” so your animation grows over time. Sadly the Pages shelf isn’t yet supported in dashboards and only works locally in worksheets, so the video below shows the effect:
We can then plot all the players onto a dashboard in the way they lined up:
The above visual is actually a series of maps, each showing a series of “a to b” plots. All passes start at a longitude and latitude of (0,0) then go off in the desired direction. The underlying map is then “washed out”. One flaw is that not all the maps are to exactly the same scale: to be sized nicely, the strikers’ charts are slightly more zoomed in, which distorts the data because their passes look longer.
I first read the f24 XML data into Excel using the ribbon menu Data>From Other Sources>From XML Data Import (I’m using Excel 2010; Excel 2007 should be similar). I did the same for the f7 XML file and saved both of these flat-looking Excel files as CSVs.
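If you don’t have Excel to hand, a similar flattening could be sketched in Python. This is only an illustration, not what I actually ran: the sample XML, the element names (Event, Q) and the attribute names are assumptions made for the example, so check them against the real file before relying on it.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Tiny stand-in for an event feed. The element and attribute names here
# are assumptions for illustration, not a description of the real file.
SAMPLE_XML = """
<Games>
  <Game id="1">
    <Event id="101" event_id="10" team_id="43" player_id="7" x="50.0" y="40.0">
      <Q qualifier_id="140" value="62.5"/>
      <Q qualifier_id="141" value="35.0"/>
    </Event>
    <Event id="102" event_id="10" team_id="30" player_id="9" x="20.0" y="70.0"/>
  </Game>
</Games>
"""


def flatten_events(xml_text):
    """Return one flat row per (event, qualifier) pair.

    Events without qualifiers still produce a single row with blank
    qualifier columns, so no event is lost.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for event in root.iter("Event"):
        base = dict(event.attrib)
        qualifiers = event.findall("Q")
        if not qualifiers:
            rows.append({**base, "qualifier_id": "", "value": ""})
        for q in qualifiers:
            rows.append({
                **base,
                "qualifier_id": q.get("qualifier_id"),
                "value": q.get("value"),
            })
    return rows


def rows_to_csv(rows):
    """Serialise the flat rows to CSV text, using the first row's keys as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = flatten_events(SAMPLE_XML)
csv_text = rows_to_csv(rows)
```

The event attributes are repeated on every qualifier row, which is the same flat look you get from the Excel import.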
The next step involved SAS. I read in both CSV files and joined the two together so that I knew which player caused each event. I then made a new unique event code by joining the club (“M” for Man City and “B” for Bolton) with the event ID. This is necessary because the event ID starts at 0 for each team and counts upwards separately, so there may be two events with event ID 10, for example: one for Man City and one for Bolton, whatever these might be.
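I did this step in SAS, but as a rough illustration the same logic could look like this in Python. The column names (team, event_id, player_id), the player names and the sample rows are all made up for the example.

```python
# Club prefixes as described above: "M" for Man City, "B" for Bolton.
TEAM_PREFIX = {"Man City": "M", "Bolton": "B"}

# Hypothetical sample rows standing in for the two CSV files.
events = [
    {"team": "Man City", "event_id": 10, "player_id": "p1"},
    {"team": "Bolton", "event_id": 10, "player_id": "p2"},
]
players = {"p1": "Player A", "p2": "Player B"}


def add_keys_and_players(events, players):
    """Attach the player name and a club-prefixed unique event key to each event."""
    keyed = []
    for event in events:
        keyed.append({
            **event,
            "player_name": players.get(event["player_id"], "unknown"),
            "event_key": f'{TEAM_PREFIX[event["team"]]}{event["event_id"]}',
        })
    return keyed


keyed = add_keys_and_players(events, players)
```

Both sample events share event ID 10, but their keys come out as M10 and B10, so they can no longer be confused.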
I then deduped the data so that I had one row per successful pass. From here I re-joined the other qualifier IDs (the other attributes of the event). I then used the original x and y co-ordinates together with the final x and y co-ordinates to calculate the net shift in x and y.
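As a minimal sketch of the dedupe and net-shift calculation (again in Python rather than SAS), assuming each row already carries a unique event key, a success flag and start/end co-ordinates. The field names (event_key, successful, x, y, end_x, end_y) and the sample values are hypothetical.

```python
# Hypothetical rows: one pass appears twice, one pass is unsuccessful.
raw_rows = [
    {"event_key": "M10", "successful": True, "x": 50.0, "y": 40.0, "end_x": 62.5, "end_y": 35.0},
    {"event_key": "M10", "successful": True, "x": 50.0, "y": 40.0, "end_x": 62.5, "end_y": 35.0},
    {"event_key": "B10", "successful": False, "x": 30.0, "y": 20.0, "end_x": 45.0, "end_y": 25.0},
    {"event_key": "B11", "successful": True, "x": 20.0, "y": 70.0, "end_x": 10.0, "end_y": 70.0},
]


def net_shifts(rows):
    """Keep one row per successful pass and compute its net shift in x and y."""
    seen = set()
    passes = []
    for row in rows:
        if not row["successful"] or row["event_key"] in seen:
            continue
        seen.add(row["event_key"])
        passes.append({
            "event_key": row["event_key"],
            "dx": row["end_x"] - row["x"],  # positive means the ball moved up the x axis
            "dy": row["end_y"] - row["y"],
        })
    return passes


passes = net_shifts(raw_rows)
```

Subtracting the start from the end is what lets every pass be redrawn from a common origin later on.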
The Tableau bit
After transforming the data we’ll have a row for each pass that shows the shift in x and y. For each row we make a duplicate row which has x and y co-ordinates of 0,0. Like below:
PosID is then used (as described in this tutorial) to plot paths on a map.
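The row duplication can be sketched in a few lines of Python. The field names and the PosID values (1 for the origin row, 2 for the end row) are assumptions for illustration; any ordering value that puts the origin first should do the same job.

```python
# Hypothetical passes with their net shift in x and y.
passes = [
    {"event_key": "M10", "dx": 12.5, "dy": -5.0},
    {"event_key": "B11", "dx": -10.0, "dy": 0.0},
]


def to_path_rows(passes):
    """Emit two rows per pass: the shared (0,0) origin, then the net shift."""
    rows = []
    for p in passes:
        # Origin row: every pass is drawn as if it started at (0,0).
        rows.append({"event_key": p["event_key"], "pos_id": 1, "x": 0.0, "y": 0.0})
        # End row: the net shift becomes the end point of the line.
        rows.append({"event_key": p["event_key"], "pos_id": 2, "x": p["dx"], "y": p["dy"]})
    return rows


path_rows = to_path_rows(passes)
```

Each pass then has exactly two points sharing one event key, which is the shape a path plot needs to draw a line from the origin to the end point.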
I hope this tutorial/example shows you just what varied things can be done with the MCFC Analytics data set (or any data set actually) by using Excel, Tableau and SAS. Perhaps you’ll use it to make accurate forecasts before going to Paddy Power’s team betting pages and making yourself a fortune on the findings, or maybe you’ll use it just to settle a long-running football argument you’ve had with mates in the pub.
The live Tableau Public workbook (which you can download to look at further):
Comments, as always, are welcome…