A column chart uses vertical columns to represent the value of each category, and a bar chart uses horizontal bars to do the same. Column charts and bar charts are great for showing the relative difference in size between categories. We often refer to column charts and bar charts collectively as ‘category charts’.
Column chart or bar chart?
The choice of vertical or horizontal bars is largely a styling choice (Figure 1, Figure 2). I recommend using vertical columns wherever possible (Figure 2), because the basic charting convention is to place the independent variable on the X (horizontal) axis and the dependent variable on the Y (vertical) axis. The independent variable is the thing that is fixed, like time or a category, and the dependent variable is the thing that varies in relation to that fixed variable. By sticking with a consistent ‘independent variable on the X axis’ approach, the reader can interpret a column chart, a scatter chart, a histogram, or a box plot in the same basic way.
Figure 1: A bar chart comparing two different collections (daylight saving and no daylight saving) across three different time periods
The possible exception, where horizontal bars might trump vertical columns, is where there are a large number of categories. Our typical page orientation is portrait rather than landscape, so the page is longer than it is wide. Hence you can fit more categories on a page if you stack them vertically (using horizontal bars). We are also accustomed to scrolling vertically (and not horizontally) on web pages, so I can see the attraction of a bar chart over a column chart for large numbers of categories.
Comparing groups with a column chart
The column chart is the workhorse of data visualisation. There is hardly a data set that cannot have at least some aspect of it presented using a column chart or bar chart. Bar charts and column charts are simple to understand because humans are pretty good at judging the difference in the height of columns or the length of bars. They are easy to create and are an efficient use of space. Bar charts and column charts are also great for comparing data by placing columns or bars side by side (see Figure 2). No surprise then that we use column charts in most Data Curio posts.
Figure 2: A column chart comparing two different collections (daylight saving and no daylight saving) across three different time periods
Start with a zero base
The only real rule when it comes to using column charts or bar charts is that the value axis (Y for column charts and X for bar charts) should start at zero, because the purpose of the chart is to show the relative difference between categories. Compare Figure 3 to Figure 2. These charts show the same data, but at first glance Figure 3 suggests a very dramatic decrease in energy use over time. Figure 3 also suggests that in 2011 the no daylight saving group used almost three times as much energy as the daylight saving group. However, from Figure 2 (zero based) we can clearly see that the difference is closer to 30% than 300%.
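The size of this distortion can be checked with a little arithmetic. The sketch below is illustrative only: the two values and the axis start of 85 are hypothetical numbers chosen to reproduce a "30% looks like 300%" effect, not the data behind Figure 2 or Figure 3, and `drawn_ratio` is a helper name made up for this example.

```python
def drawn_ratio(a: float, b: float, axis_start: float = 0.0) -> float:
    """Ratio of the drawn column heights for values a and b
    when the value axis starts at axis_start."""
    if axis_start >= min(a, b):
        raise ValueError("axis_start must be below both values")
    return (a - axis_start) / (b - axis_start)


# Hypothetical energy-use values (assumed for illustration).
no_daylight_saving = 130.0
daylight_saving = 100.0

# Zero-based axis: column heights are proportional to the values.
true_ratio = drawn_ratio(no_daylight_saving, daylight_saving)

# Axis starting at 85: the columns are drawn 45 and 15 units tall.
truncated_ratio = drawn_ratio(no_daylight_saving, daylight_saving, axis_start=85.0)

print(f"zero-based axis: {true_ratio:.1f}x")
print(f"axis from 85:    {truncated_ratio:.1f}x")
```

With a zero base the taller column is 1.3 times the shorter one, which matches the real 30% difference. With the axis starting at 85 the same two values are drawn 3 times apart.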
There are possible exceptions to the zero axis start, such as where you are trying to highlight subtle differences. In these cases the basic approach is to make sure that the reader knows what is going on by pointing out the non-zero start.
An alternative is to use a line chart, especially with temporal data, as it is acceptable to have a line series chart which does not start at zero. That is because line series charts are for showing a trend over time, hence the shape of the line is more important than the absolute values.
Figure 3: A column chart that doesn’t have a zero value start
Sort your category order based on values
You will commonly have a collection of categories that are sorted based on the category name (Figure 4). I find it easier to interpret Figure 5 which has the category order sorted according to the values. I can clearly see from Figure 5 that after mining, the second and third largest contributors to the Australian GDP are finance and home ownership – not quite so easy in Figure 4.
Sorting the category order by value is definitely not a hard and fast rule. There are plenty of times where the order of categories should be based on some other criteria, such as keeping ‘like’ categories together. For our Figure 4 and 5 example, it may have made more sense to keep ‘Manufacturing’, ‘Retail trade’ and ‘Wholesale trade’ side by side because they are related steps in a supply chain.
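If your data starts out ordered by label, reordering it by value before charting is a one-liner in most tools. A minimal Python sketch, assuming the data is held as a label-to-value mapping; the numbers are hypothetical and are not the GDP figures behind Figure 4 and Figure 5:

```python
# Hypothetical category values (assumed for illustration only).
values = {
    "Construction": 8.0,
    "Finance": 9.5,
    "Manufacturing": 7.0,
    "Mining": 10.5,
    "Retail trade": 4.5,
}

# Alphabetical order by label, as in a typical raw table.
by_label = sorted(values)

# Largest value first, so the ranking can be read straight off the chart.
by_value = sorted(values, key=values.get, reverse=True)

print(by_label)
print(by_value)
```

Either list can then be used as the category order for the chart. Where related categories should sit side by side, a hand-written list of labels does the same job.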
Figure 4: Column chart with categories sorted based on label name
Figure 5: Column chart with categories sorted based on values
Stacked column charts
Stacked column charts are a great way to show how the parts of a column are made up. In Figure 6 you can clearly see that Tasmania has the lowest carbon emissions per person. You can also see that, as a proportion of the state total, Victoria’s CO2 emissions from industrial processes and agricultural activities are lower than those of all other states.
If you are interested in focusing solely on the proportions and not the absolute values, then you can rescale the Y axis to a 0-100% scale (Figure 7). This is effectively the same as presenting a collection of pie charts but is a much more efficient use of space.
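Rescaling to 100% simply means dividing each part by its column total. A rough sketch of that calculation follows; the state names, component names, and numbers are all hypothetical and are not the emissions data in Figure 6 or Figure 7, and `to_percent_shares` is a helper name made up for this example.

```python
from __future__ import annotations


def to_percent_shares(stack: dict[str, float]) -> dict[str, float]:
    """Convert the parts of one stacked column into percentages of its total."""
    total = sum(stack.values())
    if total <= 0:
        raise ValueError("column total must be positive")
    return {part: 100.0 * value / total for part, value in stack.items()}


# Hypothetical per-person emissions by source (assumed for illustration).
emissions = {
    "State A": {"Energy": 12.0, "Transport": 4.0, "Industry and agriculture": 4.0},
    "State B": {"Energy": 3.0, "Transport": 2.0, "Industry and agriculture": 5.0},
}

# Every column now sums to 100, so only the proportions remain visible.
shares = {state: to_percent_shares(parts) for state, parts in emissions.items()}

for state, parts in shares.items():
    print(state, parts)
```

Note what is lost in the rescaling: State A emits twice as much in total as State B in this made-up data, but both columns end up the same height.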
Figure 6: Stacked column chart
Figure 7: Stacked column chart – scaled to 100%
Making a column or bar chart in Truii