Hi all,
A few weeks back I wrote a basic StatsBomb Python script and talked through how we can join the data so it is ready to use in Tableau for freeze frame shot maps.
I want to take that blog one step further and start to look at pitch visibility. This blog will feel like starting halfway through a book, so please forgive me and read the previous StatsBomb Shot Freeze Frame post before moving on to this one.
Let’s pick up from where we ended last time. We have our Alteryx flow that creates the grid for our shot maps with freeze frames, we have identified each player’s team, and we have the shots that are on target.
So how do we add in pitch visibility?
Well, you may have noticed that the previous datasets include visibility in the match frame data. This is a list of co-ordinates marking out the part of the pitch that could be seen when the freeze frame was taken (hence why all our players remain within the polygon).
You can see there are quite a few duplicates, as every player row for a given ID repeats the same visibility fields.
Let’s revisit our Alteryx flow. (Of course, this is free to download from my GitHub repo under the title)
The amended workflow, “Pitch Visibility”, now has a new dataset feeding into it. To get that new dataset, I amended our previous Python script to handle the X & Y co-ordinates from the freeze frames dataset.
Here is what that script looks like:
You will see we have added a couple of for loops to the Python code from last time. These split the visible area out into individual co-ordinates, so each co-ordinate gets its own row.
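As a rough sketch of that idea (not the exact script from the repo), the snippet below assumes each frame record carries an `event_uuid` and a `visible_area` stored as one flat list of alternating x and y values. The function name `visible_area_rows`, the output column names and the sample frame are all made up for illustration.

```python
import csv
import io


def visible_area_rows(frames):
    """Return one row per visible-area vertex for every frame."""
    rows = []
    # First loop: one pass per freeze frame.
    for frame in frames:
        flat = frame["visible_area"]  # assumed layout: [x0, y0, x1, y1, ...]
        # Second loop: step through the flat list two values at a time
        # so each x is paired with the y that follows it.
        for i in range(0, len(flat) - 1, 2):
            rows.append(
                {
                    "id": frame["event_uuid"],
                    "point": i // 2,  # position of the vertex in the source list
                    "x": flat[i],
                    "y": flat[i + 1],
                }
            )
    return rows


# Hypothetical sample frame, not real StatsBomb data.
frames = [
    {
        "event_uuid": "shot-1",
        "visible_area": [80.0, 0.0, 120.0, 0.0, 120.0, 80.0, 75.0, 80.0, 80.0, 0.0],
    }
]
rows = visible_area_rows(frames)

# Write the rows as CSV text, ready to feed into the next prep step.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "point", "x", "y"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Swapping `io.StringIO()` for an open file handle would write the same rows to a CSV on disk.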
This code leaves us with one issue that we will tackle in Alteryx: knowing the order in which the points are joined together (our path) to create the equivalent of a convex hull. We can’t assume that the row order of the dataset would join the polygon up the correct way.
The new steps in the workflow first find the unique records in the visibility data. We then create a centroid (point) from each x and y pair, because we use Poly Build on those points to find the convex hull. Once we have the sequence in which the points are joined for each ID, we extract that information using the Poly Split and information tools.
Above is an example of what that looks like. After that, it is a simple case of joining this data back onto our original data on the common ID field.
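If you would rather prototype the ordering step outside Alteryx, here is a minimal Python sketch of the same idea. To be clear, this swaps in a standard monotone chain convex hull in place of the Poly Build and Poly Split tools, so treat it as an approximation of the workflow rather than a copy of it. The `id`, `x` and `y` column names, the helper names and the sample rows are assumptions for illustration.

```python
from collections import defaultdict


def cross(o, a, b):
    """Z component of (a - o) x (b - o); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Monotone chain hull: vertices in ring order, duplicates removed."""
    pts = sorted(set(points))  # unique records, sorted by x then y
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # Drop the last point of each half because it repeats the other half's start.
    return lower[:-1] + upper[:-1]


def hull_path_rows(rows):
    """Group points by id and number each hull vertex with a path order."""
    by_id = defaultdict(list)
    for row in rows:
        by_id[row["id"]].append((row["x"], row["y"]))
    out = []
    for event_id, pts in by_id.items():
        for path, (x, y) in enumerate(convex_hull(pts), start=1):
            out.append({"id": event_id, "path": path, "x": x, "y": y})
    return out


# Hypothetical visibility rows, deliberately out of order and with a duplicate.
visibility = [
    {"id": "shot-1", "x": 120.0, "y": 80.0},
    {"id": "shot-1", "x": 80.0, "y": 0.0},
    {"id": "shot-1", "x": 75.0, "y": 80.0},
    {"id": "shot-1", "x": 120.0, "y": 0.0},
    {"id": "shot-1", "x": 80.0, "y": 0.0},
]
path_rows = hull_path_rows(visibility)
for row in path_rows:
    print(row)
```

The `path` column is the field that would go onto Path in Tableau, and the `id` column is what you would join back to the shot data on. One design caveat: a convex hull drops any point that sits inside the outline or along a straight edge, which is fine when the visible area is a convex shape.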
You can find all the relevant CSV files in the repo to test for yourself. I am increasingly becoming a fan of the Poly Build and Poly Split tools!
Hopefully the Python script and Alteryx flow help you take your data prep to the next level.
Once we have the data prepped for Tableau, it’s just a case of creating the new layer and making sure we put the correct field onto Path.
Let me know if you have any questions.
Going further:
- Try writing all the data transformations in Python
- Add labels from the freeze frame data for each player’s name
LOGGING OFF,
CJ