Hi all,

Short Tableau one this week on creating rolling xG charts. I’ve previously written about match xG charts here.

Todays content will look at season xG moving averages.

Here is what we will create.

What is xG? Let’s take the FBRef definition.

Very simply, xG (or expected goals) is the probability that a shot will result in a goal based on the characteristics of that shot and the events leading up to it. Some of these characteristics/variables include:

Every shot is compared to thousands of shots with similar characteristics to determine the probability that this shot will result in a goal. That probability is the expected goal total. An xG of 0 is a certain miss, while an xG of 1 is a certain goal. An xG of .5 would indicate that if identical shots were attempted 10 times, 5 would be expected to result in a goal.

Without going into details, there are a number of xG models out there as to what makes up that probability of a goal being scored.

For today we will just look at the calculations needed to create the chart in Tableau.

A few shout outs of other cool xG tutorials:

This one from Maram using excel and powerpoint

Son of a Corner xG rolling plot using python

Statsbomb – how can xG be used

So how can we recreate this in Tableau?

Let’s take west ham’s xG shot data for the past two seasons. You can find it in the github repo at the top of the page.

First things first we need to make sure we order the data in the correct way. I do his in excel based on the match date and minute.

I then quickly create the de-duplicated version of the match IDs and rank them (they are now already in order)

The main reason I do this is because there is various breaks in football and so we cant necessarily use a date field in Tableau so we have to use the equivalent of what would be an order of matches. For me those matches start at the start of the 21/22 season and end at the end of the 22/23 season. We’ve limited our dataset to just look at West Ham matches.

Here’s what that will look like as data.

Now we have the data its time to split out whether the action corresponds to West Ham or the opposing team.

if [A Team] = ‘West Ham’ and [H A] = ‘a’ then ‘xG Created’
ELSEIF [H Team] = ‘West Ham’ and [H A] = ‘h’ then ‘xG Created’
ELSE ‘xG Conceded’
END

The above should make sense, pretty much says if its west ham, dependent on home or away then count it as created else its a conceded expected goal.

We can double check the calculation by bringing it into the view (I’ve left it in the workbook)

See how if we just look at the sum of xG with xG Flag on detail we have for each game (date) the flickering xG combinations – but it does feel a bit hectic on the page!

We can amend our calculation to be a moving calculation. In this example above i’ve chosen a moving calculation of the previous 5 games.

My only hesitation with this calculation is for the first 5 games, it feels like an inaccurate ‘rolling’ value. Perhaps it would be nice to have the 5 games of the season before to help flatten out the xG spike at the start of our chart.

Now this is theoretically the final version of our line chart version so you could stop here. But i’ve seen individuals take it even further doing area charts.

Create two new calculations to split out the different moving averages for West Ham and the Opponent.

xG Created

WINDOW_AVG(SUM(

if [xG Flag] = ‘xG Created’ then
[X G]
END

), -5, 0)

and

xG Conceded

WINDOW_AVG(SUM(

if [xG Flag] = ‘xG Conceded’ then
[X G]
END

), -5, 0)

The -5 and 0 here refer to how many forwards and backwards we want to look at in our moving average.

To finalise our chart we need to create a few calculations to figure out what value is greater out of the conceded and created.

By now plotting these new calculations as measure values in the rows column we can create an area chart as desired.

xG Greater is as below.

if [xG Conceded] < [xG Created] then [xG Created]
ELSE [xG Conceded]
END

Of course, xG smaller is the opposite of this.

Now How to build from scratch:

Open a new sheet

Drag Order onto columns as an unaggregated dimension.

Add measure values to the rows.

Add measure names to filter and only filter for measure values of:

  • xG conceded, xG created, xG Greater Value, xG Smaller value

Add measure names to column.

Change the marks card to area.

Currently it will have it so that they are stacked. Go to Analysis stack marks and turn it off.

Re-arrange your marks to have xG Smaller Value at the front, make this white in colour.

xG Conceded should be the next mark in the colour range order as it will show where the opposing teams aggregated xG was higher than west hams.

xG Greater value is when West Hams xG was greater than the opposition so this should be the third mark of colour.

xG Created isn’t necessarily needed but i left it in just so you can check your calculations are correct.

(For our marks we may need to turn the stack marks off under analysis, thank you Deb Simmonds for helping me with this aspect! – I was a little confused about why the area and line version originally didnt look the same)

And there we have it.

I think the only think I want to close out on is saying, Im unsure on how to best approach the idea of rolling averages of xG in terms of the start and end if using the idea of previous x number of games. Maybe I need more data, but this tutorial really is focusing on the calculations as oppose to your own implementation. If you think a different value is better served here by all means amend where necessary.

The only other thing is, where the data has fewer marks of games there becomes natural ‘cross over’ points between xG created and conceded where the lines cross one another, I don’t think this is the prettiest in style but unsure on alternative methods to overcome it.

Final touches are cosmetic, I added some reference lines to show the split in season games, made colours indicate based on team brand, and added some padding and titles to help indicate what the chart was showing.

Remember all my resources are free to download on GitHub and Tableau Public

Going further:

  • Try create the chart using your own data.
  • Try look at all teams xG charts within a season
  • Add in some extra formatting to show average across the whole season

West Ham are massive.

LOGGING OFF,

CJ