I would guess that, at some point, some of you have faced the question: “Which type of chart should I use for a particular piece of data?” Nowadays, any BI tool lets you choose between various charts. Naturally, you want to pick the one that best helps you get the answers you need. As a data analyst, I build lots of visualizations and have a better understanding of their usage than many people. In this blog I’d like to list my favourite types of charts/graphs and briefly describe when to use them and with what data.
Let’s start with the most common and my favourite one: the bar chart. I could write quite a lot about this type of vis because I always start with it when I try to visualize any data. However, as this is a short post, I will only say that it is most useful when you want to show a distribution of data values so they can be compared quickly:
Side-by-side bars are really good when you want to compare the performance of metric values across different subgroups of your data or to track changes over time:
There is also another variation of a bar chart – stacked bars which help you break down your data even further. I like using a 100% stacked bar to see the distribution of values on one line:
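To make the 100% stacked bar a bit more concrete, here is a minimal sketch of the arithmetic behind it: each segment is rescaled to its share of the bar's total, so every bar adds up to 100%. The function name `percent_shares` and the ticket numbers are made up for illustration.

```python
def percent_shares(counts):
    """Convert raw counts per segment into percentages of the bar total."""
    total = sum(counts.values())
    if total == 0:
        # An empty bar has no meaningful shares; report zeros instead of dividing by zero.
        return {segment: 0.0 for segment in counts}
    return {segment: 100.0 * value / total for segment, value in counts.items()}


# Hypothetical ticket counts per ticket type for two matches.
tickets = {
    "Match A": {"Adult": 600, "Child": 150, "Senior": 250},
    "Match B": {"Adult": 300, "Child": 300, "Senior": 400},
}

for match, counts in tickets.items():
    shares = percent_shares(counts)
    print(match, {segment: round(share, 1) for segment, share in shares.items()})
```

Because every bar is normalised to the same length, you compare proportions rather than absolute volumes, which is exactly why the two matches above become comparable even though their totals could differ.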
When a line connects several distinct data points, usually over short or long periods of time, and creates a continuous evolution, it is a line graph. Line charts are great for viewing trends and patterns in data over time – they visualize data changes at a glance and become even more informative when you add a trend line. They work well for showing movement over time, such as ticket sales or athlete registrations:
A variation of a line chart is an area chart, which is very similar to a stacked bar chart. It allows users to see changes in a series of data and where they occurred while comparing two or more categories:
So, I would use a line graph to see an overall trend or to compare two categorical values over time, but an area graph to display the contribution of each category to the overall amount.
Pie charts are very widely used in the business world and are best used when you are trying to compare parts of a whole. I love to use a variation on a pie chart, a donut chart, which has a round hole in the center where you can put a total value, which makes it more informative:
I would recommend using it for no more than 7 categories, ideally for 2-4. Otherwise, you will find it difficult to compare different sections of the pie chart.
Also, as a tip, I would use a pie chart alongside other charts and graphs to drill down into the data, as in the image below. The addition of the map provides further context:
Whenever you have targets or goals, for example monthly sales per department, a bullet chart is the right visualization for you. A bullet graph is a variation of a bar chart with which you can quickly compare progress against a goal. It can be used to track retail/ticket sales performance against set targets. As a hint, always use a different colour for the bars so that it is clear how performance measures up against the targets:
What I really like is that it saves space – it requires less real estate and can be oriented horizontally or vertically, depending on the space available. However, this chart is best suited to a quick analysis of “how the business is doing right now” rather than complex analysis.
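The comparison a bullet chart encodes is simple enough to sketch in a few lines. The snippet below is only an illustration, with a made-up helper `progress_against_target` and hypothetical sales figures: it computes the actual value as a percentage of the target and prints a crude text bar.

```python
def progress_against_target(actual, target):
    """Return actual as a percentage of target, plus a simple status label."""
    if target <= 0:
        raise ValueError("target must be positive")
    percent = 100.0 * actual / target
    status = "on or above target" if actual >= target else "below target"
    return percent, status


# Hypothetical monthly sales per department: (actual, target).
monthly_sales = {
    "Retail": (42_000, 50_000),
    "Tickets": (66_000, 60_000),
}

for department, (actual, target) in monthly_sales.items():
    percent, status = progress_against_target(actual, target)
    # Crude text bar: one '#' per 5% of target, capped at 30 characters (150%).
    bar = "#" * min(int(percent // 5), 30)
    print(f"{department:<8} {bar:<30} {percent:5.1f}% ({status})")
```

In a real bullet chart the bar plays the role of the `#` characters and the target is drawn as a marker line, so a glance tells you which departments have crossed it.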
Location data can play an important role in the decision-making process. If your database has address lines, town or city, region, postcode, or country, I would highly recommend plotting this data on a map. The country field can be used for a global map – the highest level in the location data hierarchy. We use Tableau software, which is designed to make the most of geographical data. With instant geocoding, Tableau automatically turns the location data into rich and interactive maps.
If your database contains postcode information, that will allow you to do even deeper location analysis. Moreover, if you also have a list of store addresses for your merchandise, or sponsor locations, it is worth adding them to the map as well – you can then show the distance from your fans to these destinations.
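If you ever need the fan-to-store distance as a number rather than just on a map, a common approach is the haversine (great-circle) formula. The sketch below assumes the postcodes have already been geocoded to latitude/longitude; the store names are hypothetical and the coordinates are approximate city-centre values used only as sample input. It is not a description of how Tableau calculates distances.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_store(fan, stores):
    """Return (store name, distance in km) for the store closest to the fan."""
    distances = ((name, haversine_km(*fan, *coords)) for name, coords in stores.items())
    return min(distances, key=lambda pair: pair[1])


# Hypothetical stores with approximate city-centre coordinates (lat, lon).
stores = {
    "London store": (51.5074, -0.1278),
    "Manchester store": (53.4808, -2.2426),
}
fan = (52.4862, -1.8904)  # a fan located roughly in central Birmingham

name, distance = nearest_store(fan, stores)
print(f"Nearest: {name}, about {distance:.0f} km away")
```

Haversine treats the Earth as a sphere and measures straight-line distance, so it will understate real travel distance by road; for a quick "how far are my fans from my stores" view that is usually good enough.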
If you want to reveal patterns or relative concentrations on a map, I would choose a variation of the map called a heatmap or density map. Tableau creates a heatmap by grouping overlaying marks and color-coding them based on the number of marks in the group. Looking at the map on the left, which is a copy of the one above, we can better understand the areas where fans are concentrated and answer my previous question: “Where do my fans/customers live?”
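The idea of "counting the marks in a group" can be illustrated with plain grid binning. To be clear, this is a simplified stand-in and not Tableau's actual algorithm: I just snap each point to a square grid cell and count how many points land in each cell. The function name `density_grid`, the cell size and the sample points are all assumptions for the example.

```python
import math
from collections import Counter


def density_grid(points, cell_size):
    """Count points per square grid cell; cell_size is the cell side in degrees."""
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    counts = Counter()
    for lat, lon in points:
        # floor() keeps negative coordinates in the correct cell.
        cell = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        counts[cell] += 1
    return counts


# Hypothetical fan locations as (latitude, longitude).
fans = [(51.51, -0.12), (51.52, -0.13), (51.49, -0.10), (53.48, -2.24)]

for cell, count in density_grid(fans, cell_size=0.5).most_common():
    print(cell, count)
```

The cells with the highest counts are the ones a density map would colour most intensely.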
A highlight table, or heat map, is one of my favourite versions of data visualization, especially when you have quite a lot of important figures you want to display and, at the same time, don’t want to overwhelm the chart. A highlight table uses colour to grab the viewer’s attention, while still presenting precise figures. As you can see in the image below, the cells are coloured on a diverging spectrum, with the lowest difference from the median temperature coloured a cold blue and the highest one coloured a dark orange. As a result, we see a clear trend of increasing temperature over time:
I highly recommend using colour when you have a table with numbers, and this is why – please compare the previous version to the crosstab view below. Would you be able to see any trends looking just at a table with raw figures? I assume that you potentially could, but it would take some time to answer the question: “What is the highest and lowest difference from median global temperature?”:
I hope you see the difference now. So, the highlight table is very useful because it uses colour to draw the eye to the highest and lowest values at a glance.
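For the curious, a diverging colour scale like the one described can be sketched as a linear blend from a neutral midpoint towards blue on one side and orange on the other. The RGB values, the function names and the sample differences below are my own choices for illustration, not the palette of any particular tool.

```python
BLUE = (49, 104, 142)      # assumed "cold blue" end of the scale
NEUTRAL = (240, 240, 240)  # assumed colour at the midpoint
ORANGE = (204, 85, 0)      # assumed "dark orange" end of the scale


def lerp_colour(start, end, t):
    """Blend two RGB colours; t=0 gives start, t=1 gives end."""
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))


def diverging_colour(value, low, mid, high):
    """Map a value to RGB on a blue-neutral-orange scale centred on mid."""
    if not low < mid < high:
        raise ValueError("expected low < mid < high")
    value = max(low, min(high, value))  # clamp values outside the scale
    if value <= mid:
        return lerp_colour(NEUTRAL, BLUE, (mid - value) / (mid - low))
    return lerp_colour(NEUTRAL, ORANGE, (value - mid) / (high - mid))


def to_hex(rgb):
    """Format an RGB tuple as a hex colour string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Hypothetical differences from a median value, one per table cell.
differences = [-0.8, -0.3, 0.0, 0.4, 0.9]
for diff in differences:
    print(f"{diff:+.1f} -> {to_hex(diverging_colour(diff, low=-1.0, mid=0.0, high=1.0))}")
```

Centring the scale on the median (here 0.0) is the important design choice: cells below it drift towards blue and cells above it towards orange, so the sign of the difference is visible before you read a single number.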
A scatter plot uses dots to represent values for two different numeric variables, for example, height and weight, or minutes played vs goals scored by a player. Scatter plots are used to observe relationships or, let’s say, correlations between variables, which can be positive, negative or absent (uncorrelated) depending on the slope of a linear relationship:
Here, every dot represents a football player – goals scored are on the x-axis, and minutes played are plotted on the y-axis. As the angle of each line to the x-axis is less than 90 degrees, the linear relationship is positive for each age group. However, looking at the slope of each line, “<18” aged players tend to spend fewer minutes on the pitch before they score a goal. As expected, we run into the issue of overplotting because we have quite a lot of data points, but the overall trend is clear. Also, you can use a scatter graph to find outliers in your data, like in the chart below:
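If you want to put numbers on what the trend lines show, the two usual quantities are the Pearson correlation coefficient (direction and strength of the linear relationship) and the least-squares slope (here, extra minutes played per extra goal). Below is a minimal sketch with hypothetical player data; the function names are mine.

```python
import math


def _centred_sums(xs, ys):
    """Return (covariance sum, x variance sum, y variance sum) about the means."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov, var_x, var_y


def pearson_r(xs, ys):
    """Pearson correlation: +1 perfectly positive, -1 perfectly negative, ~0 uncorrelated."""
    cov, var_x, var_y = _centred_sums(xs, ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


def least_squares_slope(xs, ys):
    """Slope of the best-fit line y = a + b*x (change in y per unit of x)."""
    cov, var_x, _ = _centred_sums(xs, ys)
    if var_x == 0:
        raise ValueError("slope is undefined when x is constant")
    return cov / var_x


# Hypothetical players: goals scored (x) and minutes played (y).
goals = [1, 2, 3, 5, 8]
minutes = [200, 340, 520, 810, 1300]

print(f"r = {pearson_r(goals, minutes):.3f}")
print(f"slope = {least_squares_slope(goals, minutes):.1f} minutes per goal")
```

A smaller slope on this particular chart means fewer minutes per additional goal, which is the comparison I made between the age groups above.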
I do not use box-and-whisker plots, or boxplots, very often. However, they can be a useful tool as a standardized way of displaying a dataset based on an important five-number summary: the minimum, the maximum, the sample median, and the first and third quartiles. The minimum and maximum are the smallest and largest data points excluding any outliers; the median is the middle value of the dataset; the lower and upper quartiles are the medians of the lower and upper halves of the dataset. Boxplots are great for quickly comparing distributions between different categories:
I would suggest you don’t even try to put data labels on the chart; let the viewer focus just on the outliers.
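Here is a small sketch of how the five-number summary can be computed, following the definitions above (quartiles as medians of the lower and upper halves). One thing I have not defined in this post is what counts as an outlier, so the code assumes the common 1.5 × IQR rule; other tools may use a different convention. The function name and the sample values are made up.

```python
from statistics import median


def box_plot_summary(values):
    """Five-number summary with outliers flagged by the 1.5 * IQR rule (an assumption)."""
    data = sorted(values)
    n = len(data)
    if n < 4:
        raise ValueError("need at least 4 values")
    half = n // 2
    lower = data[:half]
    upper = data[half + (n % 2):]  # skip the middle value when n is odd
    q1, q3 = median(lower), median(upper)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "min": inside[0],    # smallest value that is not an outlier
        "q1": q1,
        "median": median(data),
        "q3": q3,
        "max": inside[-1],   # largest value that is not an outlier
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Hypothetical values with one obvious outlier.
print(box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 50]))
```

The whiskers of the boxplot end at `min` and `max`, the box spans `q1` to `q3`, and anything in `outliers` is drawn as an individual dot.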
I hope my suggestions are of use to you, but there are no strict rules for how data should be displayed. The main idea of any visualisation is that it should be clear and tell you a story – it should speak for itself. Before sharing it with your team or director, take another look at it and ask yourself: “Can anyone easily understand what is presented on that chart?”
If you find that you’re restricted by common chart types, you can try more complex ones. There are a number of other chart types I did not touch on today, as I wanted to focus on those I use a lot. The Gantt chart, bubble chart, histogram chart, bullet chart and treemap are the ones I will talk about in my next post.