3/28 Lecture Notes: D3 and Final Project

I. Propose your Final Project

II. D3.

What is it? Let’s see some examples and ideas for how to learn more about it.

D3, short for Data-Driven Documents, is emerging as a popular open-source library for data visualization. This is partly because its creator, Mike Bostock, uses it to create stunning visualizations for the New York Times, where he works as an interactive graphics editor. You can view a collection of D3 interactives Bostock has worked on at his personal web site.

D3 is way too advanced to get into in much detail in this course, but if you are interested you can start to go through some online tutorials on your own. But first, fair warning: If you don’t have a good base-level understanding of HTML, CSS and some JavaScript, you are probably going to get stuck. A LOT. If that happens to you, don’t panic, because you’re not alone. Just head over to a site like Codecademy and learn the basic skills you need there, then go back to where you got stuck on D3.

Here are several tutorials to try, starting with the easiest:

III. NVD3

As noted above, D3 is way too advanced to get into in much detail in this 5-week course.

Instead, we will use NVD3, which is a set of reusable charts that are rendered using D3. In terms of the raw output, you will notice that the examples are very similar to Highcharts or other JavaScript charting libraries such as Dimple.

The difference is that by using NVD3 charts, you can start to just barely understand how D3 brings data together with HTML, JavaScript and CSS to create visual experiences. Think of it like a gateway drug to D3. After going through this exercise, you should absolutely expect to feel hemmed in and wonder how you can do more. When you reach that point, that’s when you may want to jump into the tutorials above.

Getting Started With NVD3

The first step is to download the NVD3 code. Click the Zip button at the upper right corner of the NVD3 site, unzip the file, and drag the resulting folder to your desktop.

Open the Examples folder, and you will see a different HTML file for each chart. The examples match up to those you see in the examples gallery. Find the file linePlusBarChart.html and drag it into an open browser window.

Let’s Examine Our Two-Axis Chart

You should see a chart with bars and lines, and two scales: one on the left, and one on the right. This type of chart is useful for comparing two types of data that may relate to each other, but are on completely different scales. For example, in this chart they’re comparing quantities in the millions of units with costs in the low hundreds of dollars. If you tried to put both on a single scale in Excel, the line representing dollars would be squashed too flat to show anything. This NVD3 chart gives each series its own scale, sized automatically from the data it’s given.
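To see why a single shared scale fails here, here is a minimal sketch of the arithmetic a chart does when it turns a value into a pixel height. The function name linearScale and all of the numbers are made up for illustration; this is not NVD3's own code.

```javascript
// Map a data value onto a pixel height with a simple linear scale.
// (Hypothetical helper, not part of NVD3 or D3.)
function linearScale(value, domainMax, pixelHeight) {
  return (value / domainMax) * pixelHeight;
}

// Illustrative numbers only, not taken from the NVD3 example file.
const quantities = [1271000, 2400000, 3100000]; // units
const prices = [72, 110, 145];                  // dollars
const chartHeight = 400;                        // pixels

// One shared scale: the largest quantity sets the top of the chart.
const sharedMax = Math.max(...quantities, ...prices);
const priceOnSharedAxis = linearScale(prices[2], sharedMax, chartHeight);

// Separate scale for prices: the largest price sets the top of the chart.
const priceOnOwnAxis = linearScale(prices[2], Math.max(...prices), chartHeight);

console.log(priceOnSharedAxis); // a small fraction of one pixel
console.log(priceOnOwnAxis);    // the full chart height
```

On the shared scale the highest price is drawn less than one pixel tall, which is why the chart needs a second axis.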

Getting Data Into the Chart

As with other JavaScript libraries, this one accepts data written directly into the script inside the HTML file. Let’s edit it!

First — and this is important — make a copy of that chart file and give it an original name. You will likely get tripped up the first time and need to start over. This way, you can play around with your new file and go back to square one if needed.

Next, open the file in a text editor such as TextWrangler or Sublime Text. You will see some HTML elements that should be familiar to you by now. Stop at line 49 where it says var testdata. See all the numbers below? That’s the array that the chart displays. Each point is a pair of numbers, and the second number in the pair determines the height of the bar or line at that point.

You will also notice that there are two sets of arrays: one labeled “Quantity” and the other “Price.” Obviously, the array under Quantity controls the bars that are aligned to the scale on the left, and the array under Price controls the line that is aligned with the scale on the right.
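As a rough picture of that layout, here is a trimmed-down sketch of what the testdata variable might look like with just one pair per series. The exact property names (especially bar) are an assumption based on how the example appears to be organized, the Price value is a made-up placeholder, and seriesByKey is a hypothetical helper added only to show how the pieces are reached.

```javascript
// Trimmed-down sketch; the real file has many more pairs per series.
var testdata = [
  {
    key: "Quantity",  // label shown in the chart legend
    bar: true,        // assumed flag that marks this series as bars
    values: [[1136005200000, 1271000.0]]
  },
  {
    key: "Price",     // label shown in the chart legend
    values: [[1136005200000, 70.0]] // placeholder value, not real data
  }
];

// Hypothetical helper: find one series by the text in its "key".
function seriesByKey(data, key) {
  return data.filter(function (series) {
    return series.key === key;
  })[0];
}

var firstQuantityPair = seriesByKey(testdata, "Quantity").values[0];
console.log(firstQuantityPair[0]); // the number on the left of the pair
console.log(firstQuantityPair[1]); // the number on the right of the pair
```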

Go back up to the Quantity array and look closely at each pair of numbers, and you will see something like this:

     [ 1136005200000 , 1271000.0]

This way of writing data follows the same pattern as JSON (JavaScript Object Notation), and you can read more about it at Json.org.
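If you want to see the JSON connection in action, here is a small sketch using JavaScript's built-in JSON.parse and JSON.stringify. The variable names are just for illustration.

```javascript
// The pair from the chart file, written as a piece of JSON text.
const text = "[ 1136005200000 , 1271000.0 ]";

// JSON.parse turns the text into a real JavaScript array.
const pair = JSON.parse(text);
console.log(pair.length); // 2
console.log(pair[0]);     // 1136005200000

// JSON.stringify goes the other way: from an array back to text.
// Spaces and the trailing ".0" are dropped because they carry no meaning.
const backToText = JSON.stringify([pair]);
console.log(backToText);  // [[1136005200000,1271000]]
```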

Compare those numbers with what’s in your chart, and you will eventually notice that the second number matches up with quantity. Great!

But what in the world is the number on the left? Do you have any guesses?

How Computers Calculate Dates

I’ll spare you the guesswork and just tell you. Those 13-digit numbers are dates that are formatted in a way that only computers understand. This is something that you would only be able to figure out by talking to a computer programmer, or doing some very specific web searches.

This will sound really bizarre, but those numbers represent the number of milliseconds (thousandths of a second) that have passed since January 1, 1970 (I know, it’s weird, just accept it). Many computer systems count time from that same starting point, usually in seconds; JavaScript counts in milliseconds, which is why these numbers have 13 digits. Have you ever had your computer crash, or moved a bunch of files from one computer to another and noticed that the creation dates of the files are set back to 1969? Now you know why. It’s because they have no date, so the computer treats the date as zero, and in U.S. time zones zero shows up as the evening of December 31, 1969.
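You can check this yourself in a browser console. The sketch below uses JavaScript's built-in Date object to turn the number from the chart file back into a readable date; the variable names are just for illustration.

```javascript
// The left-hand number from the pair in the chart file.
const stamp = 1136005200000;

// JavaScript's Date counts in milliseconds from January 1, 1970 (UTC).
console.log(new Date(stamp).toISOString()); // "2005-12-31T05:00:00.000Z"

// Zero is the starting point itself.
console.log(new Date(0).toISOString());     // "1970-01-01T00:00:00.000Z"

// Dividing by 1000 gives the count in seconds, which has 10 digits.
const seconds = Math.floor(stamp / 1000);
console.log(seconds);                       // 1136005200
```

The time of 05:00 UTC is midnight in the U.S. Eastern time zone, so the point stands for December 31, 2005 in that zone.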

This means that if you want to input a set of date-value pairs for your chart, you need to convert every date into that format. This is where Excel can help you. Open an Excel spreadsheet, or download this one, and add two columns: One with dates in MM/DD/YY format, one with the values you want to display for each date. In the third column, enter this formula:

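 =(A1-DATE(1970,1,1))*86400*1000

The formula counts the days between your date and January 1, 1970, then multiplies by 86,400 seconds per day and by 1,000 milliseconds per second. Make sure the A1 is referencing the date cell in the same row (so if you have column headers, you would probably want that to be A2). You will see a 13-digit number. Copy that formula all the way down the column. Now you have a list of dates the script can read. Each number stands for midnight UTC on that date, so a chart that formats dates in local time may label a point as the previous evening in U.S. time zones.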
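As a cross-check on the spreadsheet arithmetic, here is a short sketch that works through one date, January 1, 2014, two ways. It assumes Excel's default 1900 date system, in which January 1, 1970 is stored as day 25569 and January 1, 2014 as day 41640; the constant names are made up for illustration.

```javascript
// Excel stores each date as a running day count (a "serial number").
// These two serials assume Excel's default 1900 date system.
const SERIAL_1970_01_01 = 25569;
const SERIAL_2014_01_01 = 41640;

// 86,400 seconds per day times 1,000 milliseconds per second.
const MS_PER_DAY = 86400 * 1000;

// What the spreadsheet does: days since 1970, converted to milliseconds.
const fromSpreadsheetMath =
  (SERIAL_2014_01_01 - SERIAL_1970_01_01) * MS_PER_DAY;

// What JavaScript says for the same date (months are counted from 0).
const fromJavaScript = Date.UTC(2014, 0, 1);

console.log(fromSpreadsheetMath);                    // 1388534400000
console.log(fromJavaScript === fromSpreadsheetMath); // true
```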

Add your data to the script

The final step involves a lot of copying and pasting, or formatting in a text editor. You want to get all of your machine-readable date and data pairs into that script in exactly the same format as what you see, and remove all of the other data. If you are very careful and have a small amount of data you may be able to do it on the first try, but it’s very likely that you will have typos.

At the end, your array should look something like this:

[ [ 1388534400000 , 4234] , [ 1388620800000 , 234] , [ 1388707200000 , 53] , [ 1388793600000 , 3634] , [ 1388880000000 , 434] , [ 1388966400000 , 6433] , [ 1389052800000 , 43454] , [ 1389139200000 , 4354] , [ 1389225600000 , 54] , [ 1389312000000 , 34534] ]

The array under Price will look similar, but with different sets of numbers depending on the data you’re comparing.
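Since typos are the most likely thing to go wrong here, a quick check before you paste can save some frustration. The sketch below is one hypothetical way to do it; findProblems is a made-up helper, not part of NVD3, and the 13-digit test is only a rule of thumb that fits recent dates. The sample array contains two deliberate mistakes.

```javascript
// Hypothetical helper: list the problems found in an array of pairs.
function findProblems(values) {
  const problems = [];
  let lastDate = -Infinity;
  values.forEach(function (pair, i) {
    // Every entry must be a [date, value] pair.
    if (!Array.isArray(pair) || pair.length !== 2) {
      problems.push("entry " + i + " is not a [date, value] pair");
      return;
    }
    // Both items must be plain numbers.
    if (!Number.isFinite(pair[0]) || !Number.isFinite(pair[1])) {
      problems.push("entry " + i + " contains something that is not a number");
      return;
    }
    // Rule of thumb: recent dates in milliseconds have 13 digits.
    if (String(pair[0]).length !== 13) {
      problems.push("entry " + i + " has a date that is not 13 digits");
    }
    // Dates should move forward from one entry to the next.
    if (pair[0] <= lastDate) {
      problems.push("entry " + i + " is not later than the entry before it");
    }
    lastDate = pair[0];
  });
  return problems;
}

// Sample with a repeated date (entry 2) and a missing value (entry 3).
const sample = [
  [1388534400000, 4234],
  [1388620800000, 234],
  [1388620800000, 53],
  [1388707200000]
];
const found = findProblems(sample);
found.forEach(function (message) {
  console.log(message);
});
```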

Finally, change the text next to “Key” above each array to whatever you are comparing. Drag your HTML file into an open browser window and you should see your data in there.

(Wondering how to get a lot of data into JSON format? There are converters online, like this one.)
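If you would rather not rely on an online converter, the conversion can also be sketched in a few lines of JavaScript. This example assumes your spreadsheet has been saved as plain comma-separated text with MM/DD/YY dates in the first column, no header row, and years that all fall in the 2000s. The helper names toEpochMillis and csvToPairs are made up for illustration.

```javascript
// Three sample rows in "date,value" form.
const csvText = [
  "01/01/14,4234",
  "01/02/14,234",
  "01/03/14,53"
].join("\n");

// Hypothetical helper: turn "MM/DD/YY" into milliseconds since 1970 (UTC).
// Assumes every two-digit year belongs to the 2000s.
function toEpochMillis(mmddyy) {
  const parts = mmddyy.split("/").map(Number);
  return Date.UTC(2000 + parts[2], parts[0] - 1, parts[1]);
}

// Hypothetical helper: turn the CSV text into an array of [date, value] pairs.
function csvToPairs(text) {
  return text.trim().split("\n").map(function (line) {
    const cells = line.split(",");
    return [toEpochMillis(cells[0].trim()), Number(cells[1])];
  });
}

const pairs = csvToPairs(csvText);

// Text that is ready to paste into the chart file.
console.log(JSON.stringify(pairs));
// [[1388534400000,4234],[1388620800000,234],[1388707200000,53]]
```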

Upload and Embed

As with every other code-based visualization, the final step is to upload the NVD3 folder to a web server, navigate to your chart file on the server, and put its URL into an iframe tag like this:

<iframe src="YOURURLHERE" width="100%" height="480"></iframe>

Put that code into a blog post along with a story explaining the data.

IV. NVD3, Assignment 6

Assignment 6 is to work on your own using your own data. Feel free to use one of the other charts at NVD3.org.

Due: Tuesday, April 4.

V. Remember to do the course evaluation.

Please spend the last 15 minutes of class doing it.