Archive: May 2013

  1. Displaying quantitative information

    Over the last week or so I’ve immersed myself in data-driven documents. This was partly inspired by the release of Panic’s new Status Board iPad app, which displays lots of different bits of data beautifully on one screen.

    However, I’d already been looking at ways to generate dynamic sales data for a client on their website’s main administration page. This page already aggregated recent sales information, including details of open orders and the most popular products. But I wanted to be able to give a graphical representation of some of this data for comparative reasons, so the client could see at a glance how well things were going compared to last month, and the same time last year.

    To begin with – and to satisfy my desire to test out the Status Board app – I wrote some PHP code to generate a JSON encoded string that was compatible with the Status Board app. After a little trial and error I managed to get this working. If you have the app installed and are viewing this page on an iPad, you can see an example by following this link:

    Example graph

    Then I started looking at the best way of representing the same information on a web page.

    Although I’ve used jQuery’s Flot plugin before, the power and flexibility of d3.js looked a better long-term bet.

    I started off looking at Rickshaw.js, a plugin built on top of the d3.js library. It masks some of the complexity of d3.js and makes the creation of good-looking charts a fairly simple process. But I encountered a few limitations with Rickshaw, and also decided that ultimately it would be a better idea to understand the framework which underpinned it – especially as I have a government project on the horizon which may well benefit from extensive data visualisation.

    So I began to look at using the d3.js library directly.

    There are actually some excellent tutorials up at d3js.org which helped me to understand the basics of how to manipulate data to produce a chart. However, none of the examples showed how to use JSON data to generate charts with multiple x axes – so a little trial and error with judicious use of console.log() was required before I was able to understand the underlying data structures for the charts fully.

    Having done so, I was able to modify the Status Board JSON string to produce something more suited to the d3.js chart implementation. As my application is written in PHP, I used PHP to achieve this, but it could also be done directly in JavaScript.

    Here’s the JSON data for a few years’ worth of sales by month:

    [{"title":"Jan","2010":408.97,"2011":551.4,"2012":679.78},{"title":"Feb","2010":491.37,"2011":1033.8,"2012":471.93},{"title":"Mar","2010":821.08,"2011":902.49,"2012":761.14},{"title":"Apr","2010":1265.45,"2011":1025.34,"2012":831.18},{"title":"May","2010":737.48,"2011":1163.13,"2012":1070},{"title":"Jun","2010":958.03,"2011":493.63,"2012":1095.29},{"title":"Jul","2010":351.21,"2011":907.85,"2012":919.6},{"title":"Aug","2010":777.62,"2011":696.24,"2012":363},{"title":"Sep","2010":737.34,"2011":2283.07,"2012":1199.4},{"title":"Oct","2010":1876.57,"2011":1552.31,"2012":485.85},{"title":"Nov","2010":994.3,"2011":2021.12,"2012":504.4},{"title":"Dec","2010":1760.62,"2011":1669.09,"2012":1079.29}]
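    For anyone wanting to do the reshaping in JavaScript rather than PHP, here is a minimal sketch. It assumes the Status Board graph format is an object with a graph.datasequences array, where each sequence has a title (here, a year) and a list of datapoints with title and value fields. The statusBoard object, its made-up values and the toRows helper are all hypothetical, not the code used for this chart.

```javascript
// Hypothetical Status Board-style input: one data sequence per year,
// each holding one data point per month. Values are made up for illustration.
var statusBoard = {
    graph: {
        title: "Sales",
        datasequences: [
            { title: "2010", datapoints: [{ title: "Jan", value: 100 }, { title: "Feb", value: 200 }] },
            { title: "2011", datapoints: [{ title: "Jan", value: 150 }, { title: "Feb", value: 250 }] }
        ]
    }
};

// Pivot the sequences into one row per month, with one property per year.
function toRows(board) {
    var rows = [];
    var index = {}; // month title -> row object, so each month gets exactly one row
    board.graph.datasequences.forEach(function (seq) {
        seq.datapoints.forEach(function (point) {
            if (!index.hasOwnProperty(point.title)) {
                index[point.title] = { title: point.title };
                rows.push(index[point.title]);
            }
            // Store the value as a number so the chart can scale it later
            index[point.title][seq.title] = Number(point.value);
        });
    });
    return rows;
}

var rows = toRows(statusBoard);
console.log(JSON.stringify(rows));
```

    Each row ends up with a title plus one numeric property per year, which is the same shape as the sales data above.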

    Once I had the data in the right format, I was able to create a bar chart showing multiple years of sales by month, as shown below.


    [This is actually generated from the data, so you can view the source to see how it’s actually done.]

    So how is this generated?

    We start by parsing the JSON data so that numbers are handled properly (where in this case ‘data’ is a string containing the JSON data):

    var series = JSON.parse(data);
    

    From this we create an array of keys used for the x axis. This uses the d3.keys function to iterate through the first sequence of data in the series JSON data (created above) to pull out each key and return it – as long as the key is not named ‘title’, which refers to the name of that data sequence rather than an axis point:

    var x_axis_keys = d3.keys(series[0]).filter(function(key) { return key !== "title"; });
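    To see what that filter produces without loading d3, here is a small sketch that uses the built-in Object.keys as a stand-in for d3.keys (that substitution is my assumption for illustration). One thing worth knowing: modern JavaScript engines list number-like keys such as "2010" first, in ascending order, wherever they appear in the JSON.

```javascript
// First row of the sales data shown earlier
var firstRow = { "title": "Jan", "2010": 408.97, "2011": 551.4, "2012": 679.78 };

// Object.keys stands in for d3.keys in this sketch
var keys = Object.keys(firstRow).filter(function (key) {
    return key !== "title"; // keep every key except the row label
});

console.log(keys); // [ '2010', '2011', '2012' ]
```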
    

    Next, we set some values for how big the chart is going to be, with some padding for each axis and the legend we are going to display to the right of the chart:

    var margin = {top: 20, right: 60, bottom: 60, left: 40},
        width  = 500 - margin.left - margin.right,
        height = 300 - margin.top - margin.bottom;
    

    Although this chart is going to be dealing with time data (it’s returning values for every month of a given year), the data is always going to return a value – it won’t skip any months – so there is no need to use a continuous (time or linear) scale. This is just as well, because continuous scales don’t really work very well with bar charts, especially when it comes to labelling the x axis. So we set an ordinal scale instead, with a range starting 20 pixels from the left (to add a little padding) and running up to the width of the chart as defined above. The .1 is the padding between bands, expressed as a fraction of the space allotted to each band.

    var x0 = d3.scale.ordinal()
        .rangeRoundBands([20, width], .1)
        ;
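    To get a feel for what the banding does, here is a simplified sketch in plain JavaScript. It is not d3’s implementation (in particular it skips the rounding that rangeRoundBands applies), and bandLayout is a hypothetical helper. It assumes the same padding fraction is used between bands and at both outer edges.

```javascript
// Simplified band layout: the same idea as an ordinal band scale, without rounding.
function bandLayout(domain, range, padding) {
    var start = range[0], stop = range[1];
    // One step = one band plus the gap after it; the extra "+ padding"
    // accounts for the gap before the first band.
    var step = (stop - start) / (domain.length + padding);
    var bandWidth = step * (1 - padding);
    var positions = {};
    domain.forEach(function (name, i) {
        positions[name] = start + step * padding + i * step; // left edge of each band
    });
    return { bandWidth: bandWidth, positions: positions };
}

var months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

// 400 is the inner width from the margin calculation: 500 - 40 - 60
var layout = bandLayout(months, [20, 400], 0.1);
console.log(layout.bandWidth, layout.positions.Jan, layout.positions.Dec);
```

    With twelve months the bands come out a little under 30 pixels wide, and the last band finishes just short of the right-hand edge.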
    

    Then we create another x axis ordinal scale. This will be used for calculating where to place the bar for each data sequence within the bands defined by x0. For now, we’ll just use the defaults.

    var x1 = d3.scale.ordinal();
    

    The final axis scale that we set up is for the y axis. This does use a linear scale, because we want the y axis to display an accurate representation of the figures passed to it. It has a range set from the height of the chart to 0.

    var y = d3.scale.linear()
        .range([height, 0])
        ;
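    The reversed range can look odd at first, so here is a tiny stand-in for a linear scale in plain JavaScript. The linearScale helper is hypothetical and the domain maximum of 2000 is just an illustrative figure; the point is only to show how values map once the range runs from the height down to 0.

```javascript
// Minimal stand-in for a linear scale: maps a domain interval onto a range interval.
function linearScale(domain, range) {
    return function (value) {
        // t is how far along the domain the value sits (0 at the start, 1 at the end)
        var t = (value - domain[0]) / (domain[1] - domain[0]);
        return range[0] + t * (range[1] - range[0]);
    };
}

var height = 220; // 300 - 20 (top margin) - 60 (bottom margin)
var y = linearScale([0, 2000], [height, 0]);

console.log(y(0));    // 220 (bottom of the chart)
console.log(y(2000)); // 0   (top of the chart)
console.log(y(1000)); // 110 (halfway up)
```

    Because SVG measures y from the top, the largest value maps to 0 and the smallest to the full height.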
    

    In each of the above scales we haven’t yet defined a domain to map to the ranges (for more on this, see Setting Scales Domains and Ranges in d3.js). However, we will soon. In the meantime, let’s assign the right scale to each axis.

    First, the x0 scale is assigned to xAxis and set to display at the bottom of the chart.

    var xAxis = d3.svg.axis()
        .scale(x0)
        .orient("bottom")
        ;
    

    And then the y scale is assigned to yAxis and set to display at the left of the chart. The number returned for the y axis has also been divided by 100 to give a more readable display on the axis.

    var yAxis = d3.svg.axis()
        .scale(y)
        .orient("left")
        .tickFormat(function(d) { return (d/100);})
        ;
    

    Now we can create the SVG holder for the chart, which is set to appear within the #chart_container element on our HTML page. A class is assigned, along with the width and height. The g element is a container element, much like the div element in HTML. Setting a transform on this container affects how its child elements are positioned – in this case we’re just adding a little padding based on our original height and width calculations.

    var svg = d3.select("#chart_container").append("svg")
        .attr("class", "chart")
        .attr("width", width + margin.left + margin.right)
        .attr("height", height + margin.top + margin.bottom)
      .append("g")
        .attr("transform", "translate(" + margin.left + "," + margin.top + ")");
    

    Now it’s time to start processing and displaying some of the data on the chart.

    First we use the x_axis_keys array created earlier and JavaScript’s map function to generate x and y values for each data point.

    series.forEach(function(d) {
        d.parts = x_axis_keys.map(function(x) { return {x: x, y: d[x]}; });
     });
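    To make the resulting structure concrete, this sketch runs the same mapping over the first two months of the sales data and prints the parts array for January. The x_axis_keys array is written out by hand here so that the snippet runs on its own.

```javascript
// First two rows of the sales data shown earlier
var series = [
    { "title": "Jan", "2010": 408.97, "2011": 551.4, "2012": 679.78 },
    { "title": "Feb", "2010": 491.37, "2011": 1033.8, "2012": 471.93 }
];
var x_axis_keys = ["2010", "2011", "2012"];

// Attach one {x, y} pair per year to every month
series.forEach(function (d) {
    d.parts = x_axis_keys.map(function (x) { return { x: x, y: d[x] }; });
});

console.log(JSON.stringify(series[0].parts));
// [{"x":"2010","y":408.97},{"x":"2011","y":551.4},{"x":"2012","y":679.78}]
```

    Each month keeps its original properties and gains a parts array, which is what the bars are later bound to.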
    

    Moving on to the axes, we get the ‘title’ value for each tick on the x axis and use it for the x0 domain, which itself is mapped to the ordinal range we set earlier.

    x0.domain(series.map(function(d) { return d.title; }));
    

    Then we set the domain of x1 to the keys in x_axis_keys and give it a range from 0 to x0.rangeBand(), which returns the width of each band on the x0 scale (the space allotted to each month, based on the x0 range and the number of months). The x1 scale then divides that band between the keys.

    x1.domain(x_axis_keys).rangeRoundBands([0, x0.rangeBand()]);
    

    Finally, we sort out the y axis. The domain for this axis starts at 0 and goes up to the maximum y value we have stored in the data – which is calculated by the d3.max function iterating through each of the y values we set in d.parts in the data series and returning the maximum value.

    y.domain([0, d3.max(series, function(d) { return d3.max(d.parts, function(d) { return d.y; }); })]);
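    The nested d3.max calls are easier to follow with a plain JavaScript equivalent. In this sketch maxOf is a hypothetical helper standing in for d3.max with an accessor, and only three months of the sales data are included to keep it short.

```javascript
// Three months from the sales data, already in the d.parts shape
var series = [
    { title: "Jan", parts: [{ x: "2010", y: 408.97 },  { x: "2011", y: 551.4 },   { x: "2012", y: 679.78 }] },
    { title: "Sep", parts: [{ x: "2010", y: 737.34 },  { x: "2011", y: 2283.07 }, { x: "2012", y: 1199.4 }] },
    { title: "Oct", parts: [{ x: "2010", y: 1876.57 }, { x: "2011", y: 1552.31 }, { x: "2012", y: 485.85 }] }
];

// Largest value of accessor(item) across an array
function maxOf(items, accessor) {
    return items.reduce(function (best, item) {
        return Math.max(best, accessor(item));
    }, -Infinity);
}

// The inner call finds the largest y within one month,
// the outer call finds the largest of those monthly maxima.
var yMax = maxOf(series, function (d) {
    return maxOf(d.parts, function (p) { return p.y; });
});

console.log(yMax); // 2283.07
```

    September 2011 is also the largest figure in the full data set, so the y domain for this chart works out as 0 to 2283.07.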
    

    Now we can do some actual drawing.

    First we draw the x axis.

    svg.append("g")
        .attr("class", "x axis")
        .attr("transform", "translate(0," + height + ")")
        .call(xAxis);
    

    For the example chart I’ve rotated the text on the x axis by 90 degrees, so that it can be read more easily. This means ensuring that the bottom padding is set high enough at the start. As well as rotating the text, it gets moved a little to the left so that it’s centred properly.

    svg.selectAll(".x.axis text")
        .style("text-anchor", "end")
        .attr("transform", function(d) {
            return "translate(-15,10)rotate(-90)";
        });
    

    And then the y axis, to which we’ve added a label:

    svg.append("g")
        .attr("class", "y axis")
        .call(yAxis)
      .append("text")
        .attr("transform", "rotate(-90)")
        .attr("y", 6)
        .attr("dy", ".71em")
        .style("text-anchor", "end")
        .text("Amount (£'00s)");
    

    Next, we can draw our actual bars. We colour-code the bars to differentiate them by creating another ordinal scale which can be mapped to each data sequence (obviously the more data sequences you use, the more colours you’ll need).

    var color = d3.scale.ordinal()
        .range(["#98abc5", "#8a89a6", "#7b6888", "#6b486b", "#a05d56", "#d0743c", "#ff8c00"]);
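    No domain is set on this colour scale, so colours are handed out in the order that values are first looked up, and the list starts again from the beginning once it runs out. The sketch below imitates that behaviour in plain JavaScript. The ordinalScale helper is a hypothetical stand-in rather than d3’s code, and it uses only three colours so that the wrap-around shows up.

```javascript
// Stand-in for an ordinal scale with a range but no explicit domain
function ordinalScale(range) {
    var seen = []; // inputs in the order they were first looked up
    return function (value) {
        var i = seen.indexOf(value);
        if (i === -1) {
            seen.push(value);
            i = seen.length - 1;
        }
        return range[i % range.length]; // recycle colours when they run out
    };
}

var color = ordinalScale(["#98abc5", "#8a89a6", "#7b6888"]);

console.log(color("2010")); // #98abc5
console.log(color("2011")); // #8a89a6
console.log(color("2012")); // #7b6888
console.log(color("2013")); // #98abc5 (wraps round to the first colour)
console.log(color("2011")); // #8a89a6 (same key, same colour)
```

    That recycling is why you need at least as many colours as data sequences if every bar in a group is to look different.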
    

    Then we generate the SVG elements used to hold the bars. Here the selectAll is actually creating one g element per entry in the series data as it goes. Each group is positioned according to a lookup on the x0 scale for the current title.

    var title = svg.selectAll(".title")
        .data(series)
      .enter().append("g")
        .attr("class", "title")
        .attr("transform", function(d) { return "translate(" + x0(d.title) + ",0)"; });
    

    And then we draw the bars, using the appropriate colour, and calculating the x and y points from the x1 ordinal scale and y linear scale respectively. The width is also derived from the x1 scale, while the height is calculated by reversing the y axis value – because SVG coordinates are calculated from the top left rather than the bottom left as required by this chart.

    title.selectAll("rect")
        .data(function(d) { return d.parts; })
      .enter().append("rect")
        .attr("width", x1.rangeBand())
        .attr("x", function(d) { return x1(d.x); })
        .attr("y", function(d) { return y(d.y); })
        .attr("height", function(d) { return height - y(d.y); })
        .style("fill", function(d) { return color(d.x); })
        ;
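    As a worked example of the y and height attributes, this sketch computes them for the tallest bar and for a zero value. It assumes the inner height of 220 from the margins above and the y domain maximum of 2283.07 from the sales data. The barGeometry helper is hypothetical, and y here is a hand-written equivalent of the linear scale rather than the d3 scale itself.

```javascript
var height = 220;   // inner chart height: 300 - 20 - 60
var yMax = 2283.07; // largest value in the sales data

// Same mapping as a linear scale with domain [0, yMax] and range [height, 0]
function y(value) {
    return height - (value / yMax) * height;
}

// The rect starts at the scaled y position and extends down to the x axis
function barGeometry(value) {
    var top = y(value);
    return { y: top, height: height - top };
}

var tallest = barGeometry(2283.07); // Sep 2011
var empty = barGeometry(0);

console.log(tallest); // { y: 0, height: 220 }
console.log(empty);   // { y: 220, height: 0 }
```

    The bar is drawn downwards from its top edge, so the bigger the value, the smaller the y attribute and the larger the height.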
    

    So the chart is drawn, and now the only thing left to do is add a legend to indicate what each of the bars for each month represents. This takes the keys from the x_axis_keys array and places them to the right of the chart alongside a small square colour-coordinated box.

    var legend = svg.selectAll(".legend")
        .data(x_axis_keys.slice())
       .enter().append("g")
        .attr("class", "legend")
        .attr("transform", function(d, i) { return "translate(0," + i * 20 + ")"; });
    
    legend.append("rect")
        .attr("x", width )
        .attr("width", 18)
        .attr("height", 18)
        .style("fill", color)
        ;
    
    legend.append("text")
        .attr("x", width + 25)
        .attr("y", 9)
        .attr("dy", ".35em")
        .text(function(d) { return d; });
    

    And that’s it! OK, it’s complicated and I’m just scratching the surface as to what’s possible – but it’s a start. Now I want to try my hand at some of the dynamic charts…