Tableau Public Vector Diagrams using the MCFC Analytics dataset

The advanced data set allows for a more detailed look at how players distribute the ball

The data set provided by the MCFC Analytics project has so far excited analysts and football fans alike. Some are using it to help predict a team’s chances of winning the Premier League, whilst others are using it to analyse their team’s next opponent.

One of the first Tableau visuals I made showed how teams passed the ball. It used the standard MCFC Analytics dataset and showed the number of passes forwards, backwards, left and right.

The advanced dataset, however, gives the x and y co-ordinates of where each pass was made and the final x and y co-ordinates of where it ended up. This lends itself to a more detailed (and therefore more actionable) visual. d3.js has some great visuals, one of which shows flight routes across the USA. So can we make a similar visual in Tableau Public using this advanced data set?

The Outputs

The visual below shows an aggregation of all passes and their direction. It’s pretty and looks like a firework, but it makes less sense than having one chart per player. The green shading indicates the minute in which the pass happened (light green for the start of the match, darker green towards the 90th minute).

Tableau Public Vector Diagram

Every successful pass made in the Bolton vs Man City game

Next we split these passes out by position. Now we can start to actually find insights in the data. Goalkeepers’ passing is generally the longest (not a surprise, I know). Maybe you’d use this data to see how defenders switch their style of passing once they go a goal up or down. Reassuringly, the visual shows goalkeepers passing forwards only, so I guess I manipulated the data correctly (more on that later).

Tableau Public Vector Diagram by Position

Tableau Public also allows you to make animations. By using the “Pages shelf” we can add a field or metric (in this case “minute”) and the visual will cycle through each of its values, here from the start of the match (minute = 0) to the end (minute = 94). You can also use the “Pages shelf” to leave a “trail” so your animation grows over time. Sadly the Pages shelf isn’t yet supported in dashboards and only works locally in worksheets, so the video below shows the effect:

We can then plot all the players onto a dashboard in the way they lined up:

Team Vector Diagram by Position

The above visual is actually a series of maps, each showing a series of “a to b” plots. All passes start at a longitude and latitude of (0,0) and then go off in the direction of the pass. The underlying map is then “washed out”. One flaw is that not all the maps are to exactly the same scale: to be sized nicely, the strikers’ charts seem slightly more zoomed in, which distorts the data by making their passes look longer.
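One possible way to check for the scale problem is to work out a single axis range that covers every pass and compare each chart against it. The sketch below is only an illustration, not part of the workbook: the function name `shared_extent` and the sample numbers are made up.

```python
def shared_extent(shifts_by_position):
    """Return one (min, max) axis range that fits every pass for every position.

    shifts_by_position maps a position name to a list of (dx, dy) shifts.
    The structure and names here are assumptions for illustration only.
    """
    largest = 0.0
    for shifts in shifts_by_position.values():
        for dx, dy in shifts:
            largest = max(largest, abs(dx), abs(dy))
    # A symmetric range keeps the (0,0) starting point in the centre.
    return (-largest, largest)


# Made-up sample shifts, not real match data.
sample = {
    "Goalkeeper": [(60.0, 5.0), (45.0, -10.0)],
    "Striker": [(8.0, -12.0), (-5.0, 3.0)],
}
axis_range = shared_extent(sample)
```

If every chart used the same range, a striker's short pass would be drawn shorter than a goalkeeper's long kick, rather than being stretched to fill its own panel.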

The process

I first read the f24 XML data into Excel using the ribbon menu Data > From Other Sources > From XML Data Import (I’m using Excel 2010; Excel 2007 should be similar). I did the same for the f7 XML file and saved both of these flat-looking Excel files as CSVs.
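For anyone without Excel to hand, the same flattening could probably be done in a few lines of Python. The sketch below is a rough equivalent under stated assumptions: the element names (`Event`, `Q`), the attribute names and the sample values are made up for illustration and may not match the real f24 file exactly.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Tiny made-up sample. Element and attribute names are assumptions,
# not a guaranteed description of the real f24 file.
SAMPLE_XML = """
<Games>
  <Game id="1">
    <Event id="101" event_id="10" team_id="1" min="3" x="40.0" y="50.0">
      <Q id="1" qualifier_id="140" value="55.0"/>
      <Q id="2" qualifier_id="141" value="60.0"/>
    </Event>
    <Event id="102" event_id="10" team_id="2" min="4" x="30.0" y="20.0">
      <Q id="3" qualifier_id="140" value="35.0"/>
    </Event>
  </Game>
</Games>
"""


def flatten_events(xml_text):
    """Return one dict per (event, qualifier) pair, like a flat spreadsheet."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for event in root.iter("Event"):
        base = dict(event.attrib)
        # Events without qualifiers still get one row.
        qualifiers = event.findall("Q") or [None]
        for q in qualifiers:
            row = dict(base)
            if q is not None:
                # Prefix qualifier attributes so they don't clash with the event's.
                row.update({f"q_{key}": value for key, value in q.attrib.items()})
            rows.append(row)
    return rows


def rows_to_csv(rows):
    """Write the flat rows to CSV text with a sorted header."""
    fieldnames = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = flatten_events(SAMPLE_XML)
csv_text = rows_to_csv(rows)
```

Each event is repeated once per qualifier, which gives the same kind of flat table that the later dedupe step has to tidy up.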

The next step involved SAS. I read in both CSV files and joined the two together so that each event is linked to the player who caused it. I then made a new unique event code by joining the club (“M” for Man City and “B” for Bolton) with the event ID. This is necessary because the event ID starts at 0 for each team and counts upwards separately, so there may be two events with event ID 10, for example: the 10th event for Man City and the 10th event for Bolton, whatever these might be.
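The join and the unique event code can be sketched in Python instead of SAS. This is a minimal illustration, not the original SAS program: the column names, player names and the helper `add_player_and_key` are all hypothetical.

```python
# Made-up rows standing in for the two CSV files; column names are assumptions.
events = [
    {"team": "Man City", "event_id": 10, "player_id": "p1"},
    {"team": "Bolton", "event_id": 10, "player_id": "p7"},
    {"team": "Man City", "event_id": 11, "player_id": "p2"},
]
players = [
    {"player_id": "p1", "name": "Player A", "position": "Goalkeeper"},
    {"player_id": "p2", "name": "Player B", "position": "Defender"},
    {"player_id": "p7", "name": "Player C", "position": "Striker"},
]

# "M" for Man City and "B" for Bolton, as described in the text.
TEAM_PREFIX = {"Man City": "M", "Bolton": "B"}


def add_player_and_key(events, players):
    """Attach player details to each event and build a unique event code."""
    players_by_id = {player["player_id"]: player for player in players}
    joined = []
    for event in events:
        player = players_by_id.get(event["player_id"], {})
        row = dict(event)
        row["player_name"] = player.get("name")
        row["position"] = player.get("position")
        # Event IDs restart for each team, so prefix them with the club letter.
        row["unique_event"] = f'{TEAM_PREFIX[event["team"]]}{event["event_id"]}'
        joined.append(row)
    return joined


joined = add_player_and_key(events, players)
unique_codes = [row["unique_event"] for row in joined]
```

Here the two events that share event ID 10 end up as "M10" and "B10", so they can no longer be confused.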

I then deduped the data so that I had one row per successful pass. From here I re-joined the other qualifier IDs (the other attributes of the event) and used the original x and y co-ordinates together with the final x and y co-ordinates to calculate the net shift in x and y.
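As a rough Python sketch of this step (again, not the SAS code itself), the flat rows can be collapsed to one row per pass and the net shift calculated as end point minus start point. The field names `end_x` and `end_y` and the sample values are assumptions made for the example.

```python
# Made-up flat rows: one row per (pass, qualifier), so each pass appears twice.
flat_rows = [
    {"unique_event": "M10", "x": 40.0, "y": 50.0, "qualifier": "end_x", "value": 55.0},
    {"unique_event": "M10", "x": 40.0, "y": 50.0, "qualifier": "end_y", "value": 60.0},
    {"unique_event": "B10", "x": 30.0, "y": 20.0, "qualifier": "end_x", "value": 25.0},
    {"unique_event": "B10", "x": 30.0, "y": 20.0, "qualifier": "end_y", "value": 20.0},
]


def net_shifts(flat_rows):
    """Collapse to one row per pass and compute the shift in x and y."""
    passes = {}
    for row in flat_rows:
        key = row["unique_event"]
        # The first row seen for a pass supplies its starting co-ordinates.
        one_pass = passes.setdefault(
            key, {"unique_event": key, "x": row["x"], "y": row["y"]}
        )
        # Re-join the qualifier value as its own column.
        one_pass[row["qualifier"]] = row["value"]

    result = []
    for one_pass in passes.values():
        # Skip passes that are missing either end co-ordinate.
        if "end_x" in one_pass and "end_y" in one_pass:
            shifted = dict(one_pass)
            shifted["dx"] = one_pass["end_x"] - one_pass["x"]
            shifted["dy"] = one_pass["end_y"] - one_pass["y"]
            result.append(shifted)
    return result


shifts = net_shifts(flat_rows)
```

A positive `dx` in this sketch means the ball ended up further along the x axis than it started; which direction that is on the pitch depends on how the source data defines its axes.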

The Tableau bit

After transforming the data we’ll have a row for each pass that shows the shift in x and y. For each row we make a duplicate row which has x and y co-ordinates of (0,0), as shown below:

SAS Output

PosID is then used (as described in this tutorial) to plot paths on a map.
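The row duplication can be sketched as follows. This is an illustration under assumptions: the PosID values 1 (start) and 2 (end), the column names and the sample shifts are chosen for the example and may differ from the actual SAS output.

```python
# Made-up shifts, one row per pass.
shifts = [
    {"unique_event": "M10", "dx": 15.0, "dy": 10.0},
    {"unique_event": "B10", "dx": -5.0, "dy": 0.0},
]


def to_path_rows(shifts):
    """Turn each pass into two rows: a (0,0) start point and an end point."""
    rows = []
    for shift in shifts:
        # PosID 1 is the duplicate row pinned to the origin (assumed numbering).
        rows.append({"unique_event": shift["unique_event"], "PosID": 1, "x": 0.0, "y": 0.0})
        # PosID 2 carries the net shift, so the line runs from (0,0) to (dx, dy).
        rows.append(
            {"unique_event": shift["unique_event"], "PosID": 2, "x": shift["dx"], "y": shift["dy"]}
        )
    return rows


path_rows = to_path_rows(shifts)
```

With the data in this shape, the unique event code identifies which line a point belongs to and PosID gives the order in which the two points are joined.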

I hope this tutorial/example shows you just what varied things can be done with the MCFC Analytics data set (or any data set, actually) using Excel, Tableau and SAS. Perhaps you’ll use the data to make accurate forecasts before going to Paddy Power’s team betting pages and making yourself a fortune on the findings, or maybe you’ll use it just to settle a long-running football argument you’ve had with mates in the pub.

The live Tableau Public workbook (which you can download to look at further):

Comments, as always, are welcome…
