Assignment 2: Sorting and Filtering in Excel
February 23, 2017
In this exercise we will sort and filter data to understand it, then interviewing the data to gain insights and, hopefully, tell simple stories.
Data Viz – Sorting and Filtering Exercise
Using New York State Cardiac Arrests
New York State has an open-data policy. You can get a lot of data from New York State-funded programs at https://data.ny.gov/.
In this exercise, we will be using data on salary information for New York State authorities: https://data.ny.gov/Transparency/Salary-Information-for-State-Authorities/unag-2p27
Click the Export button and choose CSV for Excel, then save to your desktop.
Let’s make a list of the biggest fat cats in state government. Go to the Data menu and choose Sort.
- Make sure the box labeled “My list has headers” is checked!
- Click in “Sort By” and choose “Base Annual Salary” (column L). Under “Order” choose “Largest to Smallest,” then click OK.
- Who’s making the big money?
- Change the sorting to smallest to largest. Who’s making the least?
- Do you see any trends in the types of people who make the most and least amount of money?
- Do you see any problems with missing data?
As you may notice, there are some records that have $0 for salaries, so we need to filter those out.
- Go to the Data menu and choose Filter. You should see downward-facing arrows above each column header.
- Click the arrow in the Base Annual Salary column.
- Uncheck the box next to $0.
Now, let’s remove all salaries that are below the Federal poverty line of $20,000. Create a Sort, and sort on Base Annual Salary from smallest to largest. Select all of the rows under $20,000 and choose Edit > Delete Row.
Finally, let’s use a combination of filtering, sorting and formulas to find the average salary of people in the “Managerial” group (column H).
Now you should be familiar with sorting and filtering. What trends do you notice in the data? Do you see anything that may suggest a potential story? Or did you find a good story?
Now let’s shift to a topic that’s in the news a lot lately: how much money U.S. presidential candidates are getting. The Federal Election Commission makes this available in spreadsheet form here: http://www.fec.gov/disclosurep/PDownload.do
Download the data from the state where you’re from (or if not from the U.S. you can just choose New York). Use Sorting and Filtering to create a list of donations to two candidates you follow. When you create each list, use the Excel SUM feature to tally up how much money the candidate has received.
As a final step, create a simple bar graph in Infogr.am showing how much they’ve received. In class I will show you how to embed this in a blog post.
The assignment: Find some interesting data from the New York State open data site. Use sorting and filtering to hone in on some interesting and easily digestible data points that could be used in a story. Create a graph of the data you find in Infogr.am. Create a blog post that includes a link to the raw data from the NYS site, explain how you filtered it, and embed the Infogr.am chart into the post. Due Monday morning.