This week we have something really quite special. Will Sutton and I sat down to talk about the supplementary skills that will elevate you on your data journey. I think I speak for both Will and me when I say that we may be better ‘known’ for Tableau-related concepts, but we were both developers at one time!
For those needing an introduction – Will is a Tableau Visionary who won the Iron Viz competition last year and will be judging this year! So if you are out in Vegas in May for Tableau Conference (TC), do make time to say hi.
On to the chat today – let’s start with a big one.
Q. Recently we’ve seen some huge advancements in tech, in particular with AI, where ChatGPT can write code after a short prompt. Is it still worth learning to code?
W: Is the juice worth the squeeze? When I began my career with data I knew Excel and was just learning Tableau. Regularly I avoided anything that would involve learning to code. I had these great no-code tools, why should I spend hours learning a new piece of tech when I already can do something similar with the tools I already have?
As much as I tried to avoid coding, one day I had to face it. The call came in: “Oh hey Will, Ricky’s put together this awesome script in R to pull in data from the US Securities Exchange but there’s an issue with it – could you take a look at it please.” So I was sent this R file and asked to get it working. I downloaded R and RStudio and was plunged into the deep end.
When they talk about coding as a language they aren’t joking. I looked over this script and it might as well have been written in a foreign dialect. I knew I couldn’t figure it out immediately, but could I get some of it to work? Line by line I went through this script and looked up everything: Stack Overflow, tutorials, forums. It took an age, but eventually I worked out what the script was doing and fixed the error.
It was a slow process but one I won’t forget, because it opened my eyes to what could be achieved with coding. I clicked a button and suddenly I had a dataset ready to bring into Tableau. A task that would normally have taken me hours was done in a few minutes. Amazing!
I was forced out of my comfort zone and it expanded my skillset as an analyst. I was used to visualising and analysing data but now I could automate and transform new data sets too.
While ChatGPT is an awesome tool, I don’t think the need for coding skills is going away anytime soon. We will still need people to build and implement this technology in the future. If anything, ChatGPT has reduced the barriers to learning to code. It’s like having an on-demand expert you can call up any time, day or night, to ask questions.
CJ: I love that idea of it reducing barriers to entry! It’s funny how we also have a spectrum from code to no code, with tools that sit somewhere in between, the likes of Alteryx and Tableau Prep, which I would say are like a front-end GUI to coding. ChatGPT will also unlock everything along that spectrum.
CJ: One of the stumbling blocks is knowing which language to start with. What do you recommend for this?
W: Oh absolutely, the debate over which is better, Python or R, still goes on. It’s a question worth asking too, as you want to invest your time wisely and see a return for your efforts.
Hiring managers will tell you the language shouldn’t matter, only what you can do with it. This is true in part, but it doesn’t consider integrating this person into a team. For example, I worked alongside a very skilled developer who coded applications in Delphi and could build great systems, but no one else on the team knew Delphi, so sharing code and knowledge became a challenge.
Pick a language that’s in regular use for your role or the role you want. For data analysts generally, it’s a choice of SQL, Python and R. Aim to develop skills in one of these three. Additional languages are easier to pick up when you have some working knowledge of coding.
When deciding on the right coding language, consider the people around you. It’s hard to learn a new language without support. I developed skills in R at the BBC purely because it was a commonly used language within the team. I had the support of colleagues who had attempted similar problems before, and I could work from the code they had produced.
I found SQL skills incredibly valuable when pulling data from databases into Tableau, but it’s difficult to gain these skills outside of a work environment. After teaching SQL courses, I recommend Danny Ma’s 8 Week SQL Challenge. It’s all about understanding the functionality in SQL and solving the kind of problems you’ll face at work. I’ve put together solutions and tutorials for the first few challenges on GitHub.
CJ: What would be a good starting point for picking up coding?
W: Direction is often a challenge. There are so many paths to take with coding. I recall Ann Jackson speaking about having “T-shaped skills”, where you have a particular strength, e.g. data visualisation, and then add a skill to broaden into a new area or improve efficiency, which is where coding comes in.
For example, I can reduce my time spent analysing data by running my dataset through a Python package called “ydata-profiling” to return an overview report of the data I’m working with.
In one example I took a dataset of Call of Duty players and generated an HTML report providing exploratory data analysis. I especially like the correlation plots at the end of the report.
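For readers who want to try this, a minimal sketch of generating that kind of report might look like the following. It assumes the pandas and ydata-profiling packages are installed. The function name, the report title and the file name players.csv are placeholders for illustration, not taken from Will’s project.

```python
from pathlib import Path


def build_profile_report(csv_path: str, html_path: str = "profile_report.html") -> Path:
    """Read a CSV file and write an exploratory HTML report next to it."""
    # Imported inside the function so the file still loads if the
    # third-party packages are not installed yet.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # The title is free text shown at the top of the report.
    profile = ProfileReport(df, title="Player data overview")
    profile.to_file(html_path)
    return Path(html_path)


# Example call (players.csv is a hypothetical file name):
# build_profile_report("players.csv")
```

Opening the resulting HTML file in a browser shows the per-column summaries, missing value counts and correlation plots.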
Or I could use code to build a dataset for me. I was having issues with my home broadband speeds and wanted to check if it was a one-off or an issue I needed fixing. So I coded up a script in R using the package “speedtest” that would check my broadband speed and write it to a file.
I then set the script to run every hour on my PC using Windows Task Scheduler and visualised the results in Tableau. This is a nice double win for the portfolio too, as you can talk about the code and the visualisation to illustrate a range of skills in one go.
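Will’s script was written in R. As a rough Python sketch of the same idea, the logging half can be as small as the function below. The speed test itself is passed in as a function, because the measuring library is a choice left to the reader; `fake_measure`, the column names and the default file name are all assumptions made for illustration.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable, Tuple

FIELDS = ["timestamp_utc", "download_mbps", "upload_mbps"]


def log_speed(measure: Callable[[], Tuple[float, float]],
              csv_path: str = "broadband_log.csv") -> dict:
    """Run one measurement and append it to a CSV file."""
    download, upload = measure()
    row = {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "download_mbps": round(download, 2),
        "upload_mbps": round(upload, 2),
    }
    path = Path(csv_path)
    # Write the header only the first time, so repeated runs keep
    # appending rows to one tidy file.
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(row)
    return row


def fake_measure() -> Tuple[float, float]:
    """Placeholder returning fixed (download, upload) speeds in Mbps.

    Swap this for a call to a real speed test library.
    """
    return 52.314, 9.876
```

Scheduling a script that calls `log_speed(...)` once an hour builds up a file with one row per run, which Tableau can read directly as a text file data source.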
CJ: Could coding help with the visual elements in Tableau?
W: If you’re familiar with polygons or lines in Tableau, you’ll know they let you draw shapes from x, y coordinates and a path. Knowing this, you can code up a script to generate these points for you, which is what I did in 2021 with my Snowflake viz.
The script itself draws lines of random length and angles to create snowflake shapes, which I’ve then distributed in Tableau.
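To make the idea concrete, here is a small Python sketch of generating line coordinates for Tableau. It is not Will’s script: the six-fold symmetry, the ranges for lengths and angles, and the column names are all assumptions chosen to get a snowflake-like result. Each line becomes two rows that share a `line_id`, with `path` giving the draw order.

```python
import csv
import math
import random


def snowflake_rows(flake_id, arms=6, branches=3, seed=None):
    """Return (flake_id, line_id, path, x, y) rows for one snowflake."""
    rng = random.Random(seed)
    spine = rng.uniform(0.6, 1.0)  # length of each main arm
    # Design one arm at random, then repeat it around the centre.
    design = [
        (rng.uniform(0.2, 0.9) * spine,      # distance along the arm
         rng.uniform(0.1, 0.4),              # branch length
         math.radians(rng.uniform(30, 70)))  # branch angle off the arm
        for _ in range(branches)
    ]
    rows = []
    line_id = 0

    def add_line(x0, y0, x1, y1):
        nonlocal line_id
        rows.append((flake_id, line_id, 1, round(x0, 4), round(y0, 4)))
        rows.append((flake_id, line_id, 2, round(x1, 4), round(y1, 4)))
        line_id += 1

    for arm in range(arms):
        theta = 2 * math.pi * arm / arms
        add_line(0.0, 0.0, spine * math.cos(theta), spine * math.sin(theta))
        for dist, length, angle in design:
            bx, by = dist * math.cos(theta), dist * math.sin(theta)
            for side in (-1, 1):  # mirror the branch on both sides of the arm
                phi = theta + side * angle
                add_line(bx, by,
                         bx + length * math.cos(phi),
                         by + length * math.sin(phi))
    return rows


def write_snowflakes(path, count=5, seed=0):
    """Write several snowflakes to one CSV file for Tableau."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["flake_id", "line_id", "path", "x", "y"])
        for flake_id in range(count):
            writer.writerows(snowflake_rows(flake_id, seed=seed + flake_id))
```

In Tableau, one way to draw this is to put x on Columns and y on Rows as dimensions or unaggregated measures, set the mark type to Line, place `path` on Path, and add `flake_id` and `line_id` to Detail.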
In 2022 I wanted to distribute the snowflakes in a more random arrangement. I found the packed circles layout to be just what I wanted, if I could replace the circles with snowflakes. After some work, I was able to hack the co-ordinates out of the packedbubbles library in R and distribute my snowflakes in Tableau.
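The sketch below illustrates the general idea of swapping circles for shapes, using Python. It does not reproduce the packed circles layout Will used: as a simpler stand-in, it scatters non-overlapping circles by trial and error, then scales a unit-sized shape to fit each circle. The function names and default sizes are assumptions for illustration.

```python
import math
import random


def scatter_circles(count, width=100.0, height=100.0,
                    r_min=2.0, r_max=8.0, seed=None, max_tries=10_000):
    """Place up to `count` non-overlapping circles; return (x, y, r) tuples."""
    rng = random.Random(seed)
    placed = []
    tries = 0
    while len(placed) < count and tries < max_tries:
        tries += 1
        r = rng.uniform(r_min, r_max)
        x = rng.uniform(r, width - r)   # keep the whole circle on the canvas
        y = rng.uniform(r, height - r)
        # Keep the candidate only if it clears every circle placed so far.
        if all(math.hypot(x - px, y - py) >= r + pr for px, py, pr in placed):
            placed.append((x, y, r))
    return placed


def place_shape(points, centre_x, centre_y, radius):
    """Scale unit-sized (x, y) points by `radius` and move them to a centre."""
    return [(centre_x + radius * x, centre_y + radius * y) for x, y in points]
```

Each circle then acts as a slot: a shape drawn within a radius of 1 around the origin is scaled by the circle’s radius and shifted to its centre before the points are exported.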
CJ: What I enjoy about Tableau Public is that each piece of work goes to creating your own portfolio which is valuable when showcasing your skills. How would you demonstrate your coding skills?
W: GitHub! It’s another tool to learn but very useful for displaying your work and collaborating with others. GitHub hosts Git repositories: Git is version control software that tracks changes to a folder of files, called a “repo”, and GitHub gives you a centralised location to push those changes to, or to roll back any modifications. This way teams can be sure they’re working with the latest version of a file or script.
For the portfolio side, GitHub displays a readme, which is a webpage-style document that shows up when someone accesses the code “repo”. Typically you can use the readme to explain the code and introduce it to a new user, giving an overview of some of the functions. You can use it for more than just project write-ups, however. Think about blog posts or challenge solutions; we even used it to start hosting the #GamesNightViz community project.
A profile readme is a great landing page to show off your work and projects, for example on my page I’ve used images to highlight certain projects I’ve written up so it has a similar effect to clicking a viz for Tableau Public.
Lastly, GitHub lets you host a website via GitHub Pages if you want to branch into web development or would like a more customisable portfolio space. Mine needs a bit of an update, but this website was developed using R Markdown and Jekyll. Jekyll handles the themes and the blog functionality, while Markdown is a lightweight markup language that formats the documents. It takes a bit of learning, but it’s free to host the site and there are some great examples available to start working from.
CJ: I have to admit I too am a huge fan of GitHub, and we recently copied/stole the idea from Will to move our SportsVizSunday initiative there. It allows us to add code and datasets and have essentially a one-stop shop for all things data and tools. Truth be told, it also saves us a whole bunch on storage and cost by not having to host everything on the website.
I want to thank Will for making time in his busy schedule to share some of his journey and tool stack that he has adapted over the years to be able to help people see and understand data.