2/13/2018 Lecture Notes: D3 and NVD3
February 12, 2018
- Course evaluations: please make sure you do them. Unfortunately you have to do one for each of the 3-pack classes. This is a pain, but I really value your feedback and remember it is all anonymous.
- #NHData Finds.
- Review of assignment 5.
- Where did you fail? How did you get out of the sand traps? What did you learn?
- A few great examples:
- Amanda Dominguez: Drug overdoses
- Zachary: Fake News Twitter accounts
- Tiffany: Domain names of Donald Trump
- Maria: Restaurant health code violations. Excellent work with both the map and the chart!
What is D3? Let’s see some examples and ideas for how to learn more about it.
D3 (short for Data-Driven Documents) is emerging as a popular open-source library for data visualization. This is partly because its creator, Mike Bostock, uses it to create stunning visualizations for the New York Times as an interactive graphics editor. You can view a collection of D3 interactives Bostock has worked on at his personal web site. Here are a few good ones:
- Midterm election maps from 2014
- Interactive stories from the last Winter Olympics
- Tax rates across the country
- See examples of different types of visualizations here.
Here are several tutorials to try, starting with the easiest:
D3 is far too advanced to cover in much detail in this 5-week course, but if you are interested you can work through some online tutorials on your own.
Getting Started With NVD3
The first step is to download the NVD3 code. Click the Zip button at the upper right corner of the NVD3 site, unzip the file, and drag the resulting folder to your desktop.
Open the Examples folder, and you will see a different HTML file for each chart. The examples match up to those you see in the examples gallery. Find the file linePlusBarChart.html and drag it into an open browser window.
Let’s Examine Our Two-Axis Chart
You should see a chart with bars and lines, and two scales — one on the left, and one on the right. This type of chart is useful for comparing two types of data that may relate to each other, but are on completely different scales. For example, in this chart they’re comparing quantities in the millions of units with costs in the low hundreds of dollars. If you tried to create a chart like that in Excel the line representing dollars would be too small to display anything. This NVD3 chart auto-scales based on the data it’s given.
Getting Data Into the Chart
First — and this is important — make a copy of that chart file and give it an original name. You will likely get tripped up the first time and need to start over. This way, you can play around with your new file and go back to square one if needed.
Next, open the file in a text editor such as TextWrangler or Sublime Text. You will see some HTML elements that should be familiar to you by now. Stop at line 49 where it says var testdata. See all the numbers below? That’s the array that the chart displays. Each point is a pair of numbers: one says where the bar goes along the bottom of the chart, and the other determines its height.
You will also notice that there are two sets of arrays: one labeled “Quantity” and the other “Price.” Obviously, the array under Quantity controls the bars that are aligned to the scale on the left, and the array under Price controls the line that is aligned with the scale on the right.
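As a rough sketch, the data in the file has approximately this shape: one object per series, each with a key and a list of number pairs. Apart from the first Quantity pair, which is quoted in these notes, the numbers below are illustrative placeholders, and the `bar` flag is an assumption about how the example marks the series drawn as bars, so check both against your own copy of the file.

```javascript
// Trimmed sketch of the data shape behind the two-axis chart.
// Values are illustrative placeholders, not the full example data.
const testdata = [
  {
    key: "Quantity",   // label shown in the legend
    bar: true,         // assumed flag: draw this series as bars (left scale)
    values: [
      [1136005200000, 1271000.0],  // [date, quantity]
      [1138683600000, 1271000.0]
    ]
  },
  {
    key: "Price",      // drawn as the line (right scale)
    values: [
      [1136005200000, 71.89],      // [date, price]
      [1138683600000, 75.51]
    ]
  }
];

// Print each series name with how many points it has.
testdata.forEach(function (series) {
  console.log(series.key + ": " + series.values.length + " points");
});
```

Both series use the same dates in the first slot of each pair, which is what lets the chart line the bars and the line up against each other.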
Go back up to the Quantity array and look closely at each pair of numbers and you will see something like this:
[ 1136005200000 , 1271000.0]
Compare those numbers with what’s in your chart and you will eventually notice that the second number matches up with quantity. Great!
But what in the world is the number on the left? Do you have any guesses?
How Computers Calculate Dates
I’ll spare you the guesswork and just tell you. Those 13-digit numbers are dates that are formatted in a way that only computers understand. This is something that you would only be able to figure out by talking to a computer programmer, or doing some very specific web searches.
This will sound really bizarre, but those numbers represent the number of milliseconds that have passed since January 1, 1970 (I know, it’s weird, just accept it). Counting time from that date is common across computers in general: many systems count in seconds, while JavaScript, the language this script is written in, counts in milliseconds, which is why these numbers are 13 digits long. Have you ever had your computer crash, or moved a bunch of files from one computer to another and noticed that the creation dates of the files are set back to 1969? Now you know why. It’s because they have no date, so the computer treats the date as zero, which lands at the very start of 1970 (or late on December 31, 1969, in U.S. time zones).
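If you want to check this for yourself, here is a small sketch you can paste into your browser's JavaScript console. It takes the 13-digit number from the pair shown earlier and asks JavaScript to read it back as a date. The variable names are just ones picked for this example.

```javascript
// Turn a 13-digit timestamp back into a readable date.
const stamp = 1136005200000;            // the date number from the pair shown earlier
const asDate = new Date(stamp);         // JavaScript reads it as milliseconds

console.log(asDate.toISOString());      // 2005-12-31T05:00:00.000Z
console.log(new Date(0).toISOString()); // 1970-01-01T00:00:00.000Z
```

The output is in UTC (the "Z" at the end), so 05:00 UTC on December 31, 2005 is midnight on the U.S. East Coast.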
This means that if you want to input a set of date-value pairs for your chart, you need to convert every date into that format. This is where Excel can help you. Open an Excel spreadsheet, or download this one, and add two columns: One with dates in MM/DD/YY format, one with the values you want to display for each date. In the third column, enter this formula:
Make sure the A1 is referencing the cell directly to the left (so if you have column headers, you would probably want that to be A2). You will see a 13-digit number. Copy that formula all the way down the column. Now you have a list of dates the script can read.
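If you would rather not use a spreadsheet, the same conversion can be sketched in JavaScript. The idea is the same either way: count the days since January 1, 1970 and multiply by 86,400,000 (the number of milliseconds in a day). The function name `toTimestamp` is made up for this example, and it assumes dates written as MM/DD/YY with years in the 2000s.

```javascript
// Convert "MM/DD/YY" text into milliseconds since January 1, 1970 (UTC).
// Assumption: two-digit years mean 20YY.
function toTimestamp(text) {
  const parts = text.split("/").map(Number);
  const month = parts[0];
  const day = parts[1];
  const year = 2000 + parts[2];
  // Date.UTC counts months from 0, so January is 0.
  return Date.UTC(year, month - 1, day);
}

console.log(toTimestamp("01/02/14")); // 1388620800000
```

Using `Date.UTC` rather than your computer's local time keeps the result the same no matter what time zone you are in.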
Add your data to the script
The final step involves a lot of copying and pasting, or formatting in a text editor. You want to get all of your machine-readable date and data pairs into that script in exactly the same format as what you see, and remove all of the other data. If you are very careful and have a small amount of data you may be able to do it on the first try, but it’s very likely that you will have typos.
At the end, your array should look something like this:
[ [ 1388534400000 , 4234] , [ 1388620800000 , 234] , [ 1388707200000 , 53] , [ 1388793600000 , 3634] , [ 1388880000000 , 434] , [ 1388966400000 , 6433] , [ 1389052800000 , 43454] , [ 1389139200000 , 4354] , [ 1389225600000 , 54] , [ 1389312000000 , 34534] ]
The array under Price will look similar, but with different sets of numbers depending on the data you’re comparing.
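Since typos are so likely at this step, a quick check can save some head-scratching. The sketch below looks for the most common slips: an entry that is not a pair, something that is not a number, and a date that repeats or goes backwards. The function name `checkPairs` and the sample arrays are invented for this example.

```javascript
// Return a list of problems found in an array of [timestamp, value] pairs.
function checkPairs(pairs) {
  const problems = [];
  pairs.forEach(function (pair, i) {
    if (!Array.isArray(pair) || pair.length !== 2) {
      problems.push("Entry " + i + " is not a pair");
      return;
    }
    if (!Number.isFinite(pair[0]) || !Number.isFinite(pair[1])) {
      problems.push("Entry " + i + " contains something that is not a number");
      return;
    }
    // Compare with the previous entry's date, if there is a usable one.
    const previous = pairs[i - 1];
    if (i > 0 && Array.isArray(previous) && pair[0] <= previous[0]) {
      problems.push("Entry " + i + " date is not later than the one before it");
    }
  });
  return problems;
}

const good = [[1388534400000, 4234], [1388620800000, 234]];
const bad = [[1388620800000, 4234], [1388620800000, 234], [1388707200000]];

console.log(checkPairs(good)); // []
console.log(checkPairs(bad));  // lists two problems (entries 1 and 2)
```

An empty list means the array passed these particular checks. It does not prove the numbers themselves are right, so still compare the chart against your spreadsheet.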
Finally, change the text next to “Key” above each array to whatever you are comparing. Drag your HTML file into an open browser window and you should see your data in there.
(Power Tip: Wondering how to get a lot of data into JSON format? There are converters online, like this one. If you use it, upload a CSV, choose commas as the separator, make sure “First row is column names” is selected, and choose CSV to JSON Array as the output.)
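To give a feel for what such a converter does, here is a deliberately naive sketch. It assumes commas separate the columns, the first row holds column names, and no field contains a comma or quotation marks; real converters handle far more cases. The function name `csvToArray` and the sample text are made up for this example.

```javascript
// Naive CSV-to-array conversion. Assumptions: commas separate columns,
// the first row holds column names, and no field contains a comma or quotes.
function csvToArray(csvText) {
  const lines = csvText.trim().split(/\r?\n/);
  return lines.slice(1).map(function (line) {   // skip the header row
    return line.split(",").map(function (cell) {
      const trimmed = cell.trim();
      const asNumber = Number(trimmed);
      // Keep numbers as numbers; leave everything else as text.
      return trimmed !== "" && !Number.isNaN(asNumber) ? asNumber : trimmed;
    });
  });
}

const csv = "date,value\n1388534400000,4234\n1388620800000,234";
console.log(JSON.stringify(csvToArray(csv)));
// [[1388534400000,4234],[1388620800000,234]]
```

The result is an array of pairs in the same form the chart script expects.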
Upload and Embed
As with every other code-based visualization, the final step is to upload the NVD3 folder to a web server, navigate to your chart file on the server, and put its URL into an iframe tag like this:
<iframe src="YOURURLHERE" width="100%" height="480"></iframe>
Put that code into a blog post along with a story explaining the data.
IV. A data set to work with
See if you can use sorting and filtering with this data set on causes of death in NYC to generate a graph in NVD3 comparing two different factors (men vs. women, different causes, etc.).
V. NVD3, Assignment 6
Assignment 6 is to work on your own NVD3 chart using your own data. Feel free to use one of the other charts at NVD3.org.