Category: DataViz Lecture Notes

2/16/16 Lecture Notes

I. Propose your Final Project

II. D3.

What is it? Let’s see some examples and ideas for how to learn more about it.

D3, short for Data-Driven Documents, is emerging as a popular open-source library for data visualizations. This is partly because its creator, Mike Bostock, uses it to create stunning visualizations for the New York Times, where he is an interactive graphics editor. You can view a collection of D3 interactives Bostock has worked on at his personal web site.

D3 is way too advanced to get into in much detail in this course, but if you are interested you can start to go through some online tutorials on your own. But first, fair warning: If you don’t have a good base-level understanding of HTML, CSS and some JavaScript, you are probably going to get stuck A LOT. If that happens to you, don’t panic, because you’re not alone. Just head over to a site like Codecademy and learn the basic skills you need there, then go back to where you got stuck on D3.

Here are several tutorials to try, starting with the easiest:

III. NVD3

D3 is way too advanced to get into in much detail in this 5-week course, but if you are interested you can start to go through some online tutorials on your own.

Instead, we will use NVD3, which is a set of reusable charts that are rendered using D3. In terms of the raw output, you will notice that the examples are very similar to HighCharts or other Javascript charts, such as JQuery-Visualize.

The difference is that by using NVD3 charts, you can start to just barely understand how D3 brings data together with HTML, Javascript and CSS to create visual experiences. Think of it like a gateway drug to D3. After going through this exercise, you should absolutely expect to feel hemmed in and wonder how you can do more. When you reach that point, that’s when you may want to jump into the tutorials above.

Getting Started

The first step is to download the NVD3 code. Click the Zip button at the upper right corner of the NVD3 site, unzip the file and drag the resulting folder to your desktop.

Open the Examples folder, and you will see a different HTML file for each chart. The examples match up to those you see in the examples gallery. Find the file linePlusBarChart.html and drag it into an open browser window.

Let’s Examine Our Two-Axis Chart

You should see a chart with bars and lines, and two scales — one on the left, and one on the right. This type of chart is useful for comparing two types of data that may relate to each other, but are on completely different scales.   For example, in this chart they’re comparing quantities in the millions of units with costs in the low hundreds of dollars. If you tried to create a chart like that in Excel the line representing dollars would be too small to display anything. This NVD3 chart auto-scales based on the data it’s given.

Getting Data Into the Chart

As with other Javascript libraries, this one accepts data directly in HTML. Let’s edit it!

First — and this is important — make a copy of that chart file and give it an original name. You will likely get tripped up the first time and need to start over. This way, you can play around with your new file and go back to square one if needed.

Next, open the file in a text editor such as TextWrangler or Sublime Text. You will see some HTML elements that should be familiar to you by now. Stop at line 49 where it says var testdata. See all the numbers below? That’s the array that the chart displays. Each point consists of a pair of numbers, and the second number in each pair determines the height of the bar or line at that point.

You will also notice that there are two sets of arrays: one labeled “Quantity” and the other “Price.” Obviously, the array under Quantity controls the bars that are aligned to the scale on the left, and the array under Price controls the line that is aligned with the scale on the right.

Go back up to the Quantity array and look closely at each pair of numbers and you will see something like this:

     [ 1136005200000 , 1271000.0]

This data format is known as JSON (JavaScript Object Notation), and you can read more about it at Json.org.

Compare those numbers with what’s in your chart, and you will eventually notice that the second number matches up with quantity. Great!

But what in the world is the number on the left? Do you have any guesses?

How Computers Calculate Dates

I’ll spare you the guesswork and just tell you. Those 13-digit numbers are dates that are formatted in a way that only computers understand. This is something that you would only be able to figure out by talking to a computer programmer, or doing some very specific web searches.

This will sound really bizarre, but those numbers represent the number of milliseconds (thousandths of a second) that have passed since midnight, Greenwich time, on January 1, 1970 (I know, it’s weird; just accept it). This is true not just of this script, but of JavaScript in general, and many computer systems count time from the same starting point. Have you ever had your computer crash, or moved a bunch of files from one computer to another and noticed that the creation dates of the files are set back to 1969? Now you know why. It’s because they have no date, so the computer treats the date as zero, and in U.S. time zones zero shows up as the evening of December 31, 1969.
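
To see this for yourself, here is a minimal JavaScript sketch that turns one of the 13-digit numbers from the chart into a readable date and back again. It uses only JavaScript’s built-in Date object; the variable names are mine and are not part of NVD3. Note that the unit is milliseconds, which is why the numbers have 13 digits rather than 10.

```javascript
// A 13-digit timestamp taken from the chart data: milliseconds since
// midnight (UTC) on January 1, 1970.
var timestamp = 1136005200000;

// Passing the number to Date gives a real date object.
var asDate = new Date(timestamp);

// toISOString() always prints the date in UTC, so the result does not
// depend on the time zone of the computer running the code.
var isoText = asDate.toISOString();
console.log(isoText); // 2005-12-31T05:00:00.000Z

// Going the other way: Date.UTC() builds a timestamp from a calendar date.
// Months are counted from 0, so 0 means January.
var newYear2014 = Date.UTC(2014, 0, 1);
console.log(newYear2014); // 1388534400000

// A date of "zero" is the starting point itself.
var zeroText = new Date(0).toISOString();
console.log(zeroText); // 1970-01-01T00:00:00.000Z
```

In U.S. Eastern time, 05:00 UTC in December is midnight, so the first timestamp marks the start of December 31, 2005 for someone in New York.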

This means that if you want to input a set of date-value pairs for your chart, you need to convert every date into that format. This is where Excel can help you. Open an Excel spreadsheet, or download this one, and add two columns: One with dates in MM/DD/YY format, one with the values you want to display for each date. In the third column, enter this formula:

 =(A1-DATE(1970,1,1))*86400*1000

Make sure the A1 references the cell that holds the date in the same row (so if you have column headers, you would probably want that to be A2). You will see a 13-digit number. Copy that formula all the way down the column. Now you have a list of dates the script can read.
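
If you would rather skip the spreadsheet, the same conversion can be sketched in JavaScript. This is a minimal sketch and not part of NVD3: the function name toChartTimestamp and the sample dates are made up for illustration, and it assumes every two-digit year belongs to the 2000s. It counts from midnight UTC on January 1, 1970.

```javascript
// Convert a date written as MM/DD/YY into milliseconds since
// January 1, 1970 (UTC), the format the chart data uses.
// Assumption: two-digit years mean 20YY (so "14" is 2014).
function toChartTimestamp(dateText) {
  var parts = dateText.split("/");
  var month = parseInt(parts[0], 10);
  var day = parseInt(parts[1], 10);
  var year = 2000 + parseInt(parts[2], 10);
  // Date.UTC counts months from 0, so subtract 1.
  return Date.UTC(year, month - 1, day);
}

// Hypothetical sample dates, standing in for column A of the spreadsheet.
var sampleDates = ["01/01/14", "01/02/14", "01/03/14"];
var timestamps = sampleDates.map(toChartTimestamp);
console.log(timestamps);
// [ 1388534400000, 1388620800000, 1388707200000 ]
```

Each day adds 86,400,000 milliseconds (24 hours x 60 minutes x 60 seconds x 1,000), which is the same 86400*1000 that appears in the spreadsheet formula.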

Add your data to the script

The final step involves a lot of copying and pasting, or formatting in a text editor. You want to get all of your machine-readable date and data pairs into that script in exactly the same format as what you see, and remove all of the other data. If you are very careful and have a small amount of data you may be able to do it on the first try, but it’s very likely that you will have typos.

At the end, your array should look something like this:

[ [ 1388534400000 , 4234] , [ 1388620800000 , 234] , [ 1388707200000 , 53] , [ 1388793600000 , 3634] , [ 1388880000000 , 434] , [ 1388966400000 , 6433] , [ 1389052800000 , 43454] , [ 1389139200000 , 4354] , [ 1389225600000 , 54] , [ 1389312000000 , 34534] ]

The array under Price will look similar, but with different sets of numbers depending on the data you’re comparing.

Finally, change the text next to “Key” above each array to whatever you are comparing. Drag your HTML file into an open browser window and you should see your data in there.
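
Because typos are so likely at this step, it can help to check the data before loading the chart. The sketch below is one way to do that in plain JavaScript. It assumes each series is an object with a key label and a values array of pairs, which is how I understand the NVD3 example to be laid out; the function name findProblems, the keys and all of the numbers are invented for illustration. The 13-digit test is only a rough check that works for dates after September 2001.

```javascript
// Hypothetical finished data: two series, each with a "key" label and a
// "values" array of [timestamp, number] pairs.
var myData = [
  {
    key: "Visitors",
    values: [[1388534400000, 4234], [1388620800000, 234], [1388707200000, 53]]
  },
  {
    key: "Ticket price",
    values: [[1388534400000, 12.5], [1388620800000, 12.5], [1388707200000, 14]]
  }
];

// Return a list of human-readable problems; an empty list means the
// series passed every check.
function findProblems(series) {
  var problems = [];
  series.values.forEach(function (pair, i) {
    if (!Array.isArray(pair) || pair.length !== 2) {
      problems.push(series.key + " item " + i + " is not a pair");
      return;
    }
    // Rough check: recent dates in milliseconds have exactly 13 digits.
    if (typeof pair[0] !== "number" || String(pair[0]).length !== 13) {
      problems.push(series.key + " item " + i + " has a bad date");
    }
    if (typeof pair[1] !== "number" || isNaN(pair[1])) {
      problems.push(series.key + " item " + i + " has a bad value");
    }
    // Dates should move forward, never repeat or go backward.
    var previous = series.values[i - 1];
    if (i > 0 && Array.isArray(previous) && pair[0] <= previous[0]) {
      problems.push(series.key + " item " + i + " is out of date order");
    }
  });
  return problems;
}

myData.forEach(function (series) {
  console.log(series.key, findProblems(series));
});
```

An empty list next to each key means the pairs are well formed. It does not prove the numbers themselves are right, so still compare a few points against your spreadsheet.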

(Wondering how to get a lot of data into JSON format? There are converters online, like this one.)
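
For a sense of what such a converter does, here is a minimal JavaScript sketch that turns comma-separated text into the array of pairs the chart expects. It is an illustration under simple assumptions, not a full CSV parser: the function name csvToPairs and the sample data are made up, the first line is assumed to be a header, and no cell may contain a comma.

```javascript
// Turn spreadsheet rows, saved as comma-separated text, into
// [timestamp, value] pairs.
// Assumptions: the first line is a header, column 1 holds the 13-digit
// timestamp, column 2 holds the value, and no cell contains a comma.
function csvToPairs(csvText) {
  return csvText
    .trim()
    .split("\n")
    .slice(1) // skip the header row
    .map(function (line) {
      var cells = line.split(",");
      return [Number(cells[0]), Number(cells[1])];
    });
}

// Hypothetical export from a spreadsheet of dates and values.
var csvText = [
  "timestamp,value",
  "1388534400000,4234",
  "1388620800000,234",
  "1388707200000,53"
].join("\n");

var pairs = csvToPairs(csvText);

// JSON.stringify writes the array out as text you can paste into the script.
var jsonText = JSON.stringify(pairs);
console.log(jsonText);
// [[1388534400000,4234],[1388620800000,234],[1388707200000,53]]
```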

Upload and Embed

As with every other code-based visualization, the final step is to upload the NVD3 folder to a web server, navigate to it on the server, and put the URL into iFrame codes like this:

<iframe src="YOURURLHERE" width="100%" height="480"></iframe>

Put that code into a blog post along with a story explaining the data.
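
If you embed several charts, a tiny helper can build the embed code for you and keep the quotes straight. This is a sketch only: the function name buildEmbedCode and the example.com address are placeholders, not anything provided by NVD3 or the class server.

```javascript
// Build the embed code for a chart that has been uploaded to a server.
// Straight quotes (") are required; curly quotes pasted from a word
// processor will break the tag.
function buildEmbedCode(chartUrl, heightInPixels) {
  return '<iframe src="' + chartUrl + '" width="100%" height="' +
    heightInPixels + '"></iframe>';
}

// Hypothetical address; replace it with the URL of your own uploaded file.
var embedCode = buildEmbedCode("http://example.com/nvd3/mychart.html", 480);
console.log(embedCode);
```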

IV. NVD3, Assignment 8

Assignment 8 is to create an NVD3 chart on your own, using your own data. Feel free to use one of the other charts at NVD3.org. Upload it, create a URL and embed it in a blog post. Full instructions: http://journovationsu.org/assignment-8-nvd3-charts-2/

Due: Tuesday, Feb. 23.

 

V. Remember to do course evaluation.

Please spend the last 15 minutes of class doing it.

2/11/16 Lecture Notes

Agenda for today:

  • #NHData Finds
  • How’s HighCharts going?
  • Searchable, sortable data tables
  • Assignment 7, due Thursday, Feb. 18
  • Work on Assignment 6, start on Assignment 7 if you like.

 

 

2/9/2016 Lecture Notes: HighCharts

12:30-12:40

– #NHData Finds

12:40-1

– Review Timeline homework. Where did you get stuck?
– Reminder of how to FTP and embed.
– Who rocked it with code?

1-1:50

Highcharts demonstration.
– Downloading code
– Finding a chart you like
– Customizing the data
– Load locally
– Upload to FTP
– Embed in a blog post

Remaining time: work on Assignment 6

2/4/16 Lecture Notes: Timelines

I. A CartoDB 

Irfan Uraizee, BDJ senior, created this poverty map of Onondaga County using skills he learned in this class. He’s now an interactive editor at the Sun Sentinel in Fort Lauderdale.
http://www.thenewshouse.com/story/new-study-finds-unfair-housing-discrimination-university-neighborhood

II. #NHData Finds

What cool data visualizations have you found this week?

III. Timelines

We’ll look at two options for timelines:

a) Vertical timeline, in code

Timelines are common in news stories, and let’s face it — if they’re nothing more than a long string of text, they’re BORING! We can do better with timelines on the Web.

Tabletop.js is one way to create an interactive timeline using pre-built HTML and Javascript code. You just have to change a few variables in the code and upload it somewhere. Everything else is updated through a Google doc.

As an example, let’s look at Professor Pacheco’s CV.

Lots and lots of text. Boring, huh? Now look at the interactive timeline version: http://journovationsu.org/dataviz_s2015/Assignment6/danpachecotimeline/, also embedded at the bottom of this post.

I only had to mess with a tiny bit of code, and now I can easily update my resume timeline in this Google Doc. That’s because this timeline uses Tabletop, an open source code package that displays data from Google Docs in web pages.
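
To give a rough idea of what happens behind the scenes, here is a minimal JavaScript sketch. It is not the code from the timeline above. It assumes Tabletop delivers one object per spreadsheet row, keyed by column header (its simpleSheet option, as I understand the library); the column names year, title and description, the sample rows, and the function name buildTimelineHtml are all hypothetical.

```javascript
// Rows in the shape Tabletop is assumed to deliver when simpleSheet is
// true: one object per spreadsheet row, keyed by column header.
var sampleRows = [
  { year: "2010", title: "Graduated", description: "Finished school." },
  { year: "2012", title: "First job", description: "Started working." }
];

// Build simple timeline markup, newest entry first.
function buildTimelineHtml(rows) {
  var sorted = rows.slice().sort(function (a, b) {
    return Number(b.year) - Number(a.year);
  });
  return sorted.map(function (row) {
    return "<div class=\"entry\"><h3>" + row.year + ": " + row.title +
      "</h3><p>" + row.description + "</p></div>";
  }).join("\n");
}

var timelineHtml = buildTimelineHtml(sampleRows);
console.log(timelineHtml);

// In a web page that has loaded Tabletop.js, the rows would come from the
// spreadsheet instead. The key below is a placeholder, not a real document.
if (typeof Tabletop !== "undefined") {
  Tabletop.init({
    key: "YOUR_PUBLISHED_SPREADSHEET_KEY",
    simpleSheet: true,
    callback: function (rows) {
      document.body.innerHTML = buildTimelineHtml(rows);
    }
  });
}
```

The point of the design is that the page never stores the resume itself. Edit a row in the spreadsheet, reload the page, and the markup is rebuilt from the new rows.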

Now it’s your turn to create your own timeline resume or portfolio. My friend and fellow data journalist Lisa Williams from the Independent News Network has created a friendly walkthrough of how to create your own timeline here:
http://dataforradicals.com/the-absurdly-illustrated-guide-to-your-first-data-driven-timeline/

Spend the rest of class going through the exercise. When you get to the point where you need to upload files to the server, go into Filezilla or Fetch and use the FTP server info provided on the first day of class when I asked you to upload your selfie.

Here’s an embedded version of my CV timeline:

b) Horizontal timeline
See the KnightLab timeline tool. Examples:

Assignment 5: Go through Lisa Williams’ Absurdly illustrated Guide to creating a vertical timeline and embed it in the class blog.

If you can’t get the vertical timeline to work, try the KnightLab timeline tool at http://timeline.knightlab.com.

As an incentive to not give up on the vertical timeline, which requires you to mess around with code, I will give 5 points extra credit to anyone who succeeds in posting a vertical timeline!

Due date: Tuesday, Feb. 9 by 9 a.m.

2/2 Lecture Notes: Doing More with CartoDB

I. #NHData Finds

What cool data visualizations have you found?

II. Common issues with Assignment 2

  • Not embedding into blog correctly (review how embed codes work again.)
  • Map of points: not filtering.
  • Remember to choose “Dataviz Turned-In Assignments” category, not “Dataviz Assignments” or “Assignments.” That makes your assignments show up on the Student Work category page I use to grade.
  • Otherwise, good job!

III. CartoDB discoveries?

IV. Instruction / Lab – How to merge data sets.

Please follow along from your own computer.

You can create a lot of interesting maps in CartoDB by overlaying data. For example, here’s a map I created that overlays a data set of wine consumption on top of another of beer consumption. These are really two maps, but they are layered on top of one another, as you can see from the layer pulldown.

But what if you want all of the information on wine and beer consumption to appear in the same infobox? And what if you want to also display data from a completely different data set, such as, for example, annual number of road deaths? For that, you have to merge data tables into a single table.

You could do this by hand in Excel, but it would be slow and error-prone. CartoDB will do it for you in a snap, as long as each data set has one geographically oriented column whose contents are exactly the same (for example, country names or country codes).
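
The idea behind a merge can be sketched in a few lines of JavaScript. This is an illustration of the concept, not how CartoDB does it internally. The function name mergeTables, the column names and every figure below are invented, so do not treat them as real statistics.

```javascript
// Two hypothetical tables that share a "country" column.
var wineTable = [
  { country: "France", wineLiters: 8.1 },
  { country: "Italy", wineLiters: 6.4 },
  { country: "Japan", wineLiters: 0.3 }
];
var beerTable = [
  { country: "France", beerLiters: 2.3 },
  { country: "Italy", beerLiters: 1.7 },
  { country: "japan ", beerLiters: 1.4 } // different case, trailing space
];

// Merge rows from both tables whenever the key column matches exactly.
function mergeTables(left, right, keyColumn) {
  // Index the right-hand table by its key for quick lookups.
  var lookup = {};
  right.forEach(function (row) {
    lookup[row[keyColumn]] = row;
  });
  var merged = [];
  left.forEach(function (row) {
    var match = lookup[row[keyColumn]];
    if (match) {
      // Copy the columns from both rows into one new row.
      merged.push(Object.assign({}, row, match));
    }
  });
  return merged;
}

var mergedRows = mergeTables(wineTable, beerTable, "country");
console.log(mergedRows);
// France and Italy merge; Japan is dropped because "japan " is not an
// exact match for "Japan".
```

This is why the matching column has to be exactly the same in both data sets. A stray space or a difference in capitalization is enough to lose a row, so clean up the key column before you merge.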

On your own time, I encourage you to go through a tutorial on CartoDB.com about how to merge two data tables that are in Carto’s data library. Today we’re going to walk through how to merge two data sets into this single map:

I got this information from the World Health Organization, which as it turns out has a lot of really interesting data in CSV format. This map has data merged from the following two data sets:

In class, I will show you how I used sorting and filtering to create separate tables for alcohol and wine consumption which I merged into a single data set, then merged that again with data on road deaths.

During the rest of class, please work on Assignment 4, which is to create a CartoDB map with data you find and report on. You should publish the map that tells the story around your data the best, but I will be awarding the most points to assignments that incorporate as many of the following as possible as long as the choices are appropriate to the story:

  • Data from more than one source, either in CartoDB’s library or from somewhere else.
  • Data that is imported from outside of CartoDB’s library.
  • Data that is filtered to hide extraneous information, either within CartoDB or before importing by using Excel.
  • Layered data sets.
  • Customized markers and visual effects (e.g. different icons chosen in CartoDB or uploaded).
  • Customized infoboxes, especially if you modify the HTML or CSS.

Good luck!

 

1/28 Lab: CartoDB Basics

CartoDB Basics

Today we will be using CartoDB, which lets you upload and manipulate data that is then displayed in a map which you can embed on your site.

Get a CartoDB account

First you must set up a standard free account. Go to http://cartodb.com and click Get Started to create your account.

Example Map

Here’s a tourist map I made that overlays three sets of data for tourists visiting New York City.

How did I make it? Let’s crack it open.

 

Three Mapping Tutorials

Complete these three tutorials in class. Each takes approximately 10 minutes:

1) Creating a simple map of points
http://developers.cartodb.com/tutorials/simple_points_map.html

2) Georeferencing
http://developers.cartodb.com/tutorials/how_to_georeference.html

3) Map Election Results
http://developers.cartodb.com/tutorials/electoral_map.html

Assignment 3

After you’re feeling comfortable with the CartoDB features, you can start working on Assignment 3, which is to post the results of the tutorials above. Due Monday, Feb. 1. 

Assignment 4

After you’ve done that, you can begin working on Assignment 4, which is to tell a story with data you find using a CartoDB map. Due Friday, Feb. 5.

1/21/16 Class 2 Lecture Notes

I. You

II. Excel

We’ll go through some basic features of Excel and how to create formulas.

  • Adding information as data versus text.
  • Add a formula.
  • Columns and rows.
  • Formula: using the equal sign for functions. Basic math.
  • Sum columns or rows.
  • Format cells to change cell type (text, number).
  • Making simple charts in Excel.
  • Common formulas:  adding, subtracting, dividing, multiplying, summing

III. Acquiring Data

  • New York State public data.
  • Getting data: formats to look for (CSV, JSON).
  • What if you get a big, fancy Excel document? How to dumb it down to a CSV.
  • Copying and pasting data as values versus as formulas.
  • Copying data from HTML tables.

IV. Sorting and Filtering Data

  • Show how to sort and filter data in Excel.
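
The same two operations exist in code, which is useful to know once you start working with JavaScript charts. Here is a minimal sketch using JavaScript’s built-in filter and sort. The rows, column names and numbers are hypothetical stand-ins for a downloaded data set, not real figures.

```javascript
// Hypothetical rows, standing in for a data set downloaded as CSV.
var rows = [
  { county: "Onondaga", year: 2014, permits: 120 },
  { county: "Albany", year: 2014, permits: 340 },
  { county: "Erie", year: 2013, permits: 510 },
  { county: "Monroe", year: 2014, permits: 275 }
];

// Filter: keep only the rows for 2014.
var only2014 = rows.filter(function (row) {
  return row.year === 2014;
});

// Sort: largest number of permits first. slice() copies the array so the
// filtered list itself is left in its original order.
var ranked = only2014.slice().sort(function (a, b) {
  return b.permits - a.permits;
});

var rankedCounties = ranked.map(function (row) { return row.county; });
console.log(rankedCounties); // [ 'Albany', 'Monroe', 'Onondaga' ]
```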

V. Sorting and Filtering Exercise

Assignment 2: Find some interesting data from the New York State open data site. Use sorting and filtering to home in on some interesting and easily digestible data points that could be used in a story. Create a graph of the data you find in Infogr.am. Create a blog post that includes a link to the raw data from the NYS site, explain how you filtered it, and embed the Infogr.am chart into the post. Due Tuesday morning.

1/19/16 Lecture Notes, Class 1

Tues 1/19 Class: Welcome!

I. Welcome! An Introduction to using data visualization to tell stories.

II. FERPA forms.

III. Class Blog, Rebelmouse
Most of the assignments will be filed by embedding widget code into the class blog at http://journovationsu.org. And you have an account! Walkthrough of how it works.

IV. Exercise: And now for a little magic. Introduce yourself through data!

1. Open this URL. How did that data get in there? Let’s find out together.

2. Pull out your cell phone and take a “selfie,” or have your neighbor do it for you.

3. Email the file to yourself and download the image to your desktop.

4. Open Cyberduck on your computer.

5. Log into the class FTP site. You can find the login info in Blackboard under the Access tab.

6. In Cyberduck, navigate to Exercise 1. Create a folder with your name in it, then drag the profile picture you created into it.

7. Follow the instructions here to create a direct URL to your image in a browser.

8. Go to this Google spreadsheet and fill out your info. Put your image URL into the correct column.

9. Take a look at this URL, to see class data populating in real time.

Congrats! You Participated in a Data Visualization
You not only introduced yourself to the class, but you participated in your very first interactive data-driven visualization. The data is all in a Google spreadsheet, and some free Javascript and JQuery code called Tabletop.js that we will use in a future class pulls all of that data into a web page. Try changing any of your information and you will see that the public web roster updates in real time.

V. Excel

Go through some basic features of Excel, and formulas.

  • Adding information as data
  • Add a formula
  • Columns and rows.
  • Formula: using the equal sign for functions. Basic math.
  • Sum columns or rows.
  • Select an area.
  • Format cells to change cell type (text, number).
  • Making charts in Excel.
  • Common formulas:  adding, subtracting, dividing, multiplying, summing

Assignment 1: Register for the class blog, fill out this survey. Due before next class.

Reading before next class: Excel basics: http://bit.ly/19aKVUa