Dataviz, Week 1, Class 1
February 21, 2017
NEW 300 / 600 – Professor Dan Pacheco
S.I. Newhouse School of Public Communications
I. Welcome! An Introduction to using data visualization to tell stories.
– Let’s look at a few examples.
II. Class syllabus and expectations.
– Tuesday: instruction
– Thursday: lab
– Assignments due every Monday morning.
III. Class Blog, Spundge, Rebelmouse
Most of the assignments will be filed by embedding widget code into the class blog at http://journovationsu.org.
IV. Exercise: Introduce yourself through data!
1. Open this class roster in a web browser, or from your smartphone: http://jsfiddle.net/pachecod/HxpFX/embedded/result/.
2. Pull out your cell phone and take a “selfie,” or have your neighbor do it for you.
3. Email the file to yourself and download the image to your desktop.
4. Go to http://journovationsu.org/wp-login.php?action=register and register for an account (note: account registration is frozen now that the class has started). Check your email to get your temporary password, then click the link in the email to log in with that password.
5. After logging in for the first time, click the link at the very top of the screen inviting you to edit your profile and assign a permanent password. Click the Update button to save your new password.
6. Upload the selfie you took to your profile by clicking the Choose File button at the bottom of the page, then Update Profile.
7. Return to your profile by going to http://journovationsu.org/wp-admin/profile.php. Scroll down and right-click your photo. Choose View Image. Copy the URL for the image.
8. Finally, go to this Google spreadsheet and paste your image URL into the image field. Fill out the rest of your information. (Note: the Google spreadsheet for this exercise is now frozen.)
9. Return to the Class Roster page in step 1 (http://jsfiddle.net/pachecod/HxpFX/embedded/result/). You will also see information from others in the class as they add theirs. Look at the home page of http://journovationsu.org and you should also see your info there. This is because the output of the link above is embedded into the page.
Congrats! You Participated in a Data Visualization
V. Assignments, due before next class:
1. Fill out class survey here (also linked into the class blog): http://journovationsu.org/2014/01/13/data-visualization-class-survey/
2. Read this summary from Jaimi Dowdell of IRE on Excel functions for journalists: http://bit.ly/19aKVUa. We will go over some very basic Excel functions in class, but you are expected to learn more advanced functions needed for your projects on your own.
Propose your Final Project
February 16, 2016
Your final project is due on Friday, March 11, but I want to make sure you’re working on it starting now. The first step is to write up a proposal, which will make up 10% of your final project’s grade.
As a reminder, your final project must use 3 dataviz tools we covered in class, or others you find, to tell a story. At least one of those tools must involve using open source code that you upload to the class server. Be sure to include the following in your proposal:
- A few paragraphs describing the story you hope to tell.
- What data do you need to acquire and analyze to tell this story?
- Where / how might you acquire this data?
- What tools are you thinking of using?
- What are some of the unknowns that you need to flesh out?
- How might things go wrong, and if so, what’s your plan B?
Proposal Due Date: Thursday, Feb. 25. But if you complete it by Thursday, you will receive 2 points of extra credit on your final grade and also get feedback from a data journalist who is visiting on Thursday.
Final Project Due Date: March 11.
Assignment 8: NVD3 Charts
February 16, 2016
NVD3, Assignment 7
Create an NVD3 chart using data you collect on your own. Convert the data to the JSON format, then put it into a chart that you choose. Upload that chart’s code to the server and embed it.
Due: Tuesday, Feb. 23.
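As a rough illustration of the “convert the data to the JSON format” step, here is a minimal JavaScript sketch. It assumes your data starts out as two-column CSV text, and that the chart you choose accepts a single series of label/value pairs, which is the shape NVD3’s discrete bar chart examples use. Other chart types expect different field names (often x and y), so check the example for the chart you pick. The sample numbers and the csvToSeries helper are made up for illustration.

```javascript
// Made-up sample data: a header row, then one row per category.
const csvText = `label,value
Apples,12
Oranges,7
Pears,3`;

// Hypothetical helper: turn two-column CSV text into one chart series.
function csvToSeries(text, seriesName) {
  const rows = text.trim().split('\n').slice(1); // drop the header row
  return [{
    key: seriesName,
    values: rows.map((row) => {
      const [label, value] = row.split(',');
      // Number() matters: a chart cannot plot the text "12" reliably.
      return { label: label.trim(), value: Number(value) };
    })
  }];
}

const chartData = csvToSeries(csvText, 'Fruit sold');

// Print the JSON so it can be pasted into a chart's data section.
console.log(JSON.stringify(chartData, null, 2));
```

This simple split on commas will not handle values that contain commas themselves, so clean those up in Excel first.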
2/11/16 Lecture Notes
February 11, 2016
Agenda for today:
- #NHData Finds
- How’s HighCharts going?
- Searchable, sortable data tables
- Assignment 7, due Thursday, Feb. 18
- Work on Assignment 6, start on Assignment 7 if you like.
Assignment 6: Using Highcharts
February 9, 2016
This assignment is designed to make you comfortable with modifying data in open source code, then publishing it to a web site. We will go through part of the assignment in class today. The rest of the assignment is due next Tuesday night at midnight.
Note that there are two deliverables for the assignment. The first is to embed the chart in this exercise into the class blog, which you can do today. The second is to create a chart with original data for a story you want to tell with a chart, FTP all of the required code to the site and embed your chart and story in the class blog.
Get Set Up
We will be using the Chrome browser for this exercise, not Firefox. Before we get started, go to http://jsfiddle.net. Create a free account, or log in if you already have one. Keep the browser open after you have logged in.
Review chart types
You can view all of the chart types that are possible here:
In this exercise, we will use the “Area-spline” chart to show two line graphs on the same axis:
Play around in JSFiddle
JSFiddle is a free sandbox that makes it easy to modify code and instantly see the results. You will be creating a “fork” (version) of some code I have set up for you. After you modify your version, you can keep messing around with it and even embed the output into a blog.
1. Go here to see the JSFiddle for this exercise:
2. Making sure you are logged in first, click the “Fork” button at the top. You now have your own fork.
4. Scroll down until you find the sections of code that start with xAxis. Notice the list of months. They only go through July. How do we get a full year’s worth of months into the chart? You need to type them in, being careful to copy exactly the style in the code. Of note: each month is surrounded by single quote marks, and there is a comma after the month. But this is also very important to notice: there is no comma after the very last month.
7. Add August through December, then click Update. Your chart will refresh, but there’s no data for the additional months. Why not? Oh yeah, because you have to add in those numbers!
8. Scroll down until you see “yAxis.” Put in a title for the Y axis, and a name for the units you are displaying. Be sure to include a space before what you put in for units or you will notice an embarrassing typo in your chart later.
9. To add numbers for units, scroll down to the “Series” section and put in a title for each series, and numbers for each month in the series.
10. Click update when you are done. Does your chart look like you want it to? Good! If not, keep trying until it looks right.
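To make the pieces from these steps easier to picture, here is a minimal sketch of a chart options object in the general shape Highcharts reads. It is not the code from the class fiddle: the titles and numbers are made up, and it assumes the units are set through the tooltip’s valueSuffix option. The seriesWithWrongLength helper is also hypothetical; it just flags any series whose count of numbers does not match the count of months, which is the usual reason new months show up with no data.

```javascript
// Chart options in the general shape Highcharts reads. All numbers are made up.
const chartOptions = {
  chart: { type: 'areaspline' },
  title: { text: 'Hypothetical monthly totals' },
  xAxis: {
    categories: [
      'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
      'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec' // no comma after the last month
    ]
  },
  yAxis: { title: { text: 'Cups sold' } },
  // The leading space keeps the tooltip from reading "12cups".
  tooltip: { valueSuffix: ' cups' },
  series: [
    { name: 'Coffee', data: [3, 4, 3, 5, 4, 10, 12, 11, 9, 7, 5, 4] },
    { name: 'Tea', data: [1, 3, 4, 3, 3, 5, 4] } // stops at July on purpose
  ]
};

// Hypothetical check: list each series whose data does not cover every month.
function seriesWithWrongLength(options) {
  const expected = options.xAxis.categories.length;
  return options.series
    .filter((s) => s.data.length !== expected)
    .map((s) => s.name);
}

const incomplete = seriesWithWrongLength(chartOptions);
console.log(incomplete); // lists "Tea", which only has seven numbers
```

In the fiddle itself you edit the options in place and click Update; the check above is only a way to spot a short series before you do.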
The last step is to embed the chart in a blog post along with a short story that explains the data.
11. Get a URL for the output of your chart. Click the Share button in JSFiddle, then copy the URL in “Share full screen result.” Test that URL in a new tab to make sure the chart appears correctly in a browser.
12. Because we’re embedding into a WordPress site, we will use the iFrame plugin’s short code to embed this chart. Put your URL into the “YOURURLHERE” section in this code snippet:
<iframe src="YOURURLHERE" width="100%" height="480"></iframe>
13. Go to the class WordPress Admin site at http://journovationsu.org/wp-admin/ and click the Posts > Add New button on the left. Enter a title. Making sure that the “Text” tab in the body area is active (not the Visual tab), paste this code into the box. Click Preview and make sure it appears in the preview. If it does, click Publish, then View to view the public post.
14. If the chart appears, the last step is to go back into the post and write a few paragraphs around the chart explaining to the public what it means.
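If you end up embedding several charts, the snippet from step 12 can be generated rather than edited by hand. The sketch below is one way to do that in JavaScript; buildEmbedCode is a hypothetical helper and the URL is a placeholder, not a real chart. Note that the attribute values use plain straight quotes. Curly quotes, which word processors and some web pages substitute automatically, will stop the embed from working.

```javascript
// Hypothetical helper: build the iframe embed code for a chart URL.
function buildEmbedCode(url, width = '100%', height = 480) {
  // Escape characters that would otherwise break out of the src attribute.
  const safeUrl = String(url)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
  return `<iframe src="${safeUrl}" width="${width}" height="${height}"></iframe>`;
}

// Placeholder URL standing in for your "Share full screen result" link.
const embedCode = buildEmbedCode('https://example.com/my-chart/');
console.log(embedCode);
```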
Now, Go Make Another Chart
To complete this assignment, create another chart using data for a story and embed it into a post in the class blog. You should do the following:
- Download the Highcharts code, which you can get here.
- Find an example for a chart you want to make. Open the same Examples folder in the Highcharts codebase you downloaded, edit the index.html file contained in that example folder (not the index at the top of the folder tree), and preview it in a browser (which you can do by dragging the file into an open browser window).
- When your chart works the way you like, rename the top-level folder to any name you choose, and upload the entire folder to the class server. You should be able to locate it in your browser and embed it into a second blog post just like you did in the first exercise.
Due: Monday, Feb. 15
Assignment 4- Marijuana Street Price per State vs. State Usage
February 5, 2016
While looking around on the internet for interesting data sets, I came across this data set of the street price of marijuana in every US state. I sorted this data by state, took the average price per state, and filtered it by year. This graphic shows the 2015 prices. The most obvious pattern on the map is that prices everywhere west of Colorado are much lower than the prices people are paying in the East.
On top of that, I added a layer showing the states with the highest percentage of marijuana use. For the most part, the West Coast states with the lowest street prices are also the states where more people use the drug.
The political ramifications of these maps are another interesting comparison. This map goes along well with another map I found that shows each state’s marijuana laws. States that have already legalized marijuana for recreational purposes have lower prices and more people using it. Given the political controversy surrounding the legalization of marijuana, this chart is of particular political interest. If this money were going to the US government and other sources instead of to dealers, it could have a large impact on the country.
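The preparation described in this post (average price per state, limited to one year) can also be done in code. Here is a minimal JavaScript sketch of that idea. The rows and prices are made up and are not the data behind the map, and averagePriceByState is a hypothetical helper name.

```javascript
// Made-up rows standing in for a price data set; real data would have many more.
const priceRows = [
  { state: 'Oregon', year: 2015, price: 210 },
  { state: 'Oregon', year: 2015, price: 190 },
  { state: 'Oregon', year: 2014, price: 230 },
  { state: 'New York', year: 2015, price: 340 },
  { state: 'New York', year: 2015, price: 360 }
];

// Hypothetical helper: keep one year, then average the prices for each state.
function averagePriceByState(rows, year) {
  const totals = {};
  for (const row of rows) {
    if (row.year !== year) continue; // filter by year
    if (!totals[row.state]) totals[row.state] = { sum: 0, count: 0 };
    totals[row.state].sum += row.price;
    totals[row.state].count += 1;
  }
  const averages = {};
  for (const state of Object.keys(totals)) {
    averages[state] = totals[state].sum / totals[state].count;
  }
  return averages;
}

const averages2015 = averagePriceByState(priceRows, 2015);
console.log(averages2015);
```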
2/2 Lecture Notes: Doing More with CartoDB
February 2, 2016
I. #NHData Finds
What cool data visualizations have you found?
- Your #NHData finds.
- Let’s talk about the Iowa Caucus.
- How would you make a map like you see on Washingtonpost.com with CartoDB? For starters, you would need a shapefile of counties.
II. Common issues with Assignment 2
- Not embedding into blog correctly (review how embed codes work again.)
- Map of points: not filtering.
- Remember to choose “Dataviz Turned-In Assignments” category, not “Dataviz Assignments” or “Assignments.” That makes your assignments show up on the Student Work category page I use to grade.
- Otherwise, good job!
III. CartoDB discoveries?
- Here’s a fun thing to try with data that has times in one column.
IV. Instruction / Lab – How to merge data sets.
Please follow along from your own computer.
You can create a lot of interesting maps in CartoDB by overlaying data. For example, here’s a map I created that overlays a data set of wine consumption on top of another of beer consumption. These are really two maps, but they are layered on top of one another, as you can see from the layer pulldown.
But what if you want all of the information on wine and beer consumption to appear in the same infobox? And what if you want to also display data from a completely different data set, such as, for example, annual number of road deaths? For that, you have to merge data tables into a single table.
You could do this by hand in Excel, but it would be a tedious process. CartoDB will do it for you in a snap, as long as each data set has one geographically oriented column whose contents match exactly (for example, country names or country codes).
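To see why the shared column has to match exactly, here is a minimal JavaScript sketch of a merge on a country column. It illustrates the general idea only and does not show how CartoDB performs its merge. The numbers are made up, and mergeOnColumn is a hypothetical helper that keeps only the rows found in both tables.

```javascript
// Made-up numbers for illustration only.
const wineRows = [
  { country: 'France', wineLitres: 5.9 },
  { country: 'Germany', wineLitres: 3.2 },
  { country: 'Japan', wineLitres: 0.3 }
];
const beerRows = [
  { country: 'France', beerLitres: 2.3 },
  { country: 'Germany', beerLitres: 6.2 },
  { country: 'japan', beerLitres: 1.4 } // lowercase "j": not an exact match
];

// Hypothetical helper: combine rows from two tables that share a column value.
function mergeOnColumn(left, right, column) {
  const lookup = new Map(right.map((row) => [row[column], row]));
  return left
    .filter((row) => lookup.has(row[column]))
    .map((row) => ({ ...row, ...lookup.get(row[column]) }));
}

const merged = mergeOnColumn(wineRows, beerRows, 'country');
console.log(merged); // France and Germany only; "Japan" and "japan" do not match
```

A stray space or a different spelling (“USA” versus “United States”) causes the same kind of silent mismatch, so it is worth cleaning the shared column before merging.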
On your own time, I encourage you to go through a tutorial on CartoDB.com about how to merge two data tables that are in Carto’s data library. Today we’re going to walk through how to merge two data sets into this single map:
I got this information from the World Health Organization, which as it turns out has a lot of really interesting data in CSV format. This map has data merged from the following two data sets:
In class, I will show you how I used sorting and filtering to create separate tables for alcohol and wine consumption which I merged into a single data set, then merged that again with data on road deaths.
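For those who prefer to see the filtering step as code, here is a minimal JavaScript sketch of splitting one downloaded table into separate tables. It assumes the file has one row per country per beverage type, which may not match the actual layout of the file you download. The values are made up and tableFor is a hypothetical helper.

```javascript
// Made-up rows: one row per country per beverage type.
const consumptionRows = [
  { country: 'Germany', beverage: 'Wine', litres: 3.2 },
  { country: 'France', beverage: 'Wine', litres: 5.9 },
  { country: 'France', beverage: 'Beer', litres: 2.3 },
  { country: 'Germany', beverage: 'Beer', litres: 6.2 }
];

// Hypothetical helper: keep one beverage type, sorted by country name.
function tableFor(rows, beverage) {
  return rows
    .filter((row) => row.beverage === beverage)          // filter
    .sort((a, b) => a.country.localeCompare(b.country))  // sort
    .map((row) => ({ country: row.country, litres: row.litres }));
}

const wineTable = tableFor(consumptionRows, 'Wine');
const beerTable = tableFor(consumptionRows, 'Beer');
console.log(wineTable);
console.log(beerTable);
```

Each resulting table has a single country column, which is the kind of shared column a merge needs.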
During the rest of class, please work on Assignment 4, which is to create a CartoDB map with data you find and report on. You should publish the map that tells the story around your data the best, but I will be awarding the most points to assignments that incorporate as many of the following as possible as long as the choices are appropriate to the story:
- Data from more than one source, either in CartoDB’s library or from somewhere else.
- Data that is imported from outside of CartoDB’s library.
- Data that is filtered to hide extraneous information, either within CartoDB or before importing by using Excel.
- Layered data sets.
- Customized markers and visual effects (e.g. different icons chosen in CartoDB or uploaded).
- Customized infoboxes, especially if you modify the HTML or CSS.
Assignment 3 Brianne Sabino
February 2, 2016
Simple Map of points:
Map election results:
Simple Data Visualization Tools That Require No Coding
January 26, 2016
There are many dataviz tools that require no coding knowledge or skills. You just enter information into forms or spreadsheets and go to town. Note that while they require no code, many can be enhanced with a little basic HTML. So we will start with a primer on HTML.
You can go through these three self-guided HTML teaching tools on your own.
Simple No-Code Dataviz Tools
Here are the ones we will go over today:
1) Infogr.am (http://infogr.am)
Infogr.am is an easy way to create simple charts and graphs, as well as scrolling infographics that you may notice people posting in places like Facebook and Tumblr. For journalism I think the graphs and charts work best, because you can embed them directly into stories to visually explain something you are reporting in text.
Take note of the Graphs and Charts tab at the top. Click the Charts tab to see all the different types of charts you can use. Choose your visualization type, double-click different parts of the interface to edit them, and copy and paste your data in. If you have trouble copying data from a web site, try starting from a summary sheet you make in Excel.
Click “Share” and copy the iFrame code at the bottom to embed into the blog. If you find that your chart is too wide for the blog post, you have two choices. You can manually change the width and height variables in the HTML code you copied, being very careful not to change anything other than those numbers, or you can try the “responsive” code which will make your chart shrink or stretch based on the width of the page where it’s embedded. The second option is good if you think your chart will be viewed by people on mobile devices, but be sure to test it out from a mobile device to be sure.
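Changing only the width and height numbers, and nothing else, is the kind of edit that is easy to get wrong by hand. Here is a minimal JavaScript sketch of the same change done in code. The embed string is a made-up example, not real Infogr.am output, and resizeEmbed is a hypothetical helper that assumes the width and height are written as quoted attributes.

```javascript
// Hypothetical helper: swap the width and height values and leave the rest alone.
function resizeEmbed(embedHtml, width, height) {
  return embedHtml
    // (\s) requires whitespace first, so attributes such as data-width are skipped.
    .replace(/(\s)width="[^"]*"/, `$1width="${width}"`)
    .replace(/(\s)height="[^"]*"/, `$1height="${height}"`);
}

// Made-up embed code standing in for what you copy from the Share dialog.
const originalEmbed =
  '<iframe src="https://example.com/chart" width="700" height="500"></iframe>';

const resizedEmbed = resizeEmbed(originalEmbed, 550, 400);
console.log(resizedEmbed);
```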
Here’s an example of a chart I made in Infogr.am using data from a previous exercise.
2) Easel.ly (http://easel.ly)
Think of Easel.ly as a quick infographic creator. You find a template you like, then start to manually edit it and add graphics from a built-in library. Charts can also be added and edited as spreadsheets, similar to Infogr.am.
3) Google Fusion Tables (http://tables.googlelabs.com)
Google Fusion Tables turns columns and rows of numbers in spreadsheets into visualizations. Once signed in, import a spreadsheet in .xls or .csv format (not .xlsx, which is Microsoft’s newer Excel format). Make sure your spreadsheet has column headers, or it won’t work.
Sometimes Google will add a tab that it calls a “card” that is the best choice for your data; for example, a “Map of latitude” will appear if your data includes geocoordinates. If you see a card that works for you, click it and see how it appears. If not, click the + sign to the right of that tab and choose Add a Chart. Click Done when your chart is set up the way you want it.
To publish your chart, you have to do two things:
1. Click the Share button at the very upper right of the browser, then “Change” next to Private under “Who can Access.” Select “Public on the Web” and then Save and Done.
2. Go to the Tools menu and choose Publish. You will see iframe tags here that you can embed in your blog post.
Here’s an example of a chart from Google Fusion Tables:
4) StoryMap (https://storymap.knightlab.com/)
From the Knight Lab at Medill, StoryMap lets you tell a story that’s broken up by points on a map. You can also use it to tell a story that moves across something that isn’t a map at all, such as a very detailed painting. Think of it like a timeline that takes place on a gigantic picture.
Prepare for Thursday
On Thursday we will crack open CartoDB, which is a powerful mapping service that lets you tweak some of the interface and mess with the code a little. You can think of it as the gateway drug to other dataviz tools we will use that do require you to mess around with code. You can get a head start on how to use CartoDB through these free video tutorials on their web site.
Assignment 2: NY College Graduation Rates
January 26, 2016
I looked at the New York state data because it was easy to download and sort through. I decided to look at the graduation rates for the public universities in New York State. I sorted the data by school location and by graduation rate for 2014 only (the most recent data). In the first graphic, I sorted the data to compare the graduation rate for each school and show which had the highest and which had the lowest. It is easy to see which schools are most successful at graduating students within four years.
In the second graph, I plotted each college’s graduation rate from 2009-2014 to show the trend in the college graduation rate over the years. I sorted the data again by school name, but this time grouped it by year so I had data for every year to put on the chart. It might also be interesting to look at each individual college and see why the graduation rate fluctuated from year to year, and where the students went.