journalism | design | textured backgrounds
investigative reporter at @thenyworld | mhkeller.com
my latest interactive and NewsBeast Labs post
A defining characteristic of this election cycle was Super PACs and the hundreds of millions of dollars outside groups were spending to influence races. Now that it’s all over, we wanted to see which outside groups spent their money on successful races and which did not. The result was our interactive Not-So-Super PACs: 2012’s Winners and Losers.
Super PACs abounded this cycle. So instead of trying to document and display all of them, we focused the narrative on how well the biggest spenders and their donors fared. To execute it, we used the Center for Responsive Politics’ analysis of FEC data to find how much each PAC had spent so far in each race, then manually went through and coded whether each race’s outcome was in line with or against the PAC’s interest. Then we added everything up.
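The "added everything up" step can be sketched roughly like this. It's only an illustration of the arithmetic: the row layout, PAC names and dollar amounts below are made up, not the real CRP figures or our actual spreadsheet.

```python
from collections import defaultdict

# Hypothetical rows: (pac, race, dollars_spent, outcome_matched_pac_interest).
# Names and amounts are invented for illustration only.
spending = [
    ("PAC A", "Race 1", 1_000_000, True),
    ("PAC A", "Race 2", 3_000_000, False),
    ("PAC B", "Race 1", 500_000, True),
    ("PAC B", "Race 3", 500_000, True),
]


def success_rates(rows):
    """Return {pac: share of its dollars spent on races that went its way}."""
    total = defaultdict(int)
    won = defaultdict(int)
    for pac, _race, dollars, success in rows:
        total[pac] += dollars
        if success:
            won[pac] += dollars
    # Skip PACs with no spending to avoid dividing by zero.
    return {pac: won[pac] / total[pac] for pac in total if total[pac] > 0}


rates = success_rates(spending)
for pac, rate in sorted(rates.items()):
    print(f"{pac}: {rate:.0%} of spending went to successful outcomes")
```

Weighting by dollars rather than by number of races means one very expensive loss can outweigh several cheap wins, which is the point of asking how well the money fared.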
Visualizing it
This idea went through a few iterations before we settled on what you see above. For a while, we’ve been wanting to use a tower graphic template - one of those vertical scroll layouts with a sticky table of contents - that I built a couple of months ago, but it never seems to work out. This time, after thinking about all of the detail we wanted to display, we thought bigger.
If you’re trying to visualize money flows, Sankey lines are a go-to. ProPublica did a great one showing overlapping Super PAC expenditures and you see them as flat graphics too. They show directionality and volume, which makes them great for money.
Getting the right data
Money was flowing from donors to PACs and then to races, so we used the JSON structure that D3 lays out for its network (and Sankey) layouts. You have a list of nodes (People and PACs) and a list of node-to-node links (X person gave $Y to Z PAC). We were working collaboratively in Google Docs, so we were able to write some formulas that would print out our data structure in JSON as we were editing the document. Very handy in case you need to correct any numbers or name spellings.
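For anyone who hasn't seen that nodes-and-links shape, here's a minimal sketch of building it from spreadsheet-style rows. The donor and PAC names are placeholders, and the index-based `source`/`target` fields follow the layout used in D3's Sankey example as I understand it, so check it against the version you're using.

```python
import json

# Hypothetical spreadsheet rows: (donor, pac, dollars). Names are illustrative.
rows = [
    ("Donor X", "PAC Z", 100),
    ("Donor Y", "PAC Z", 250),
    ("Donor X", "PAC W", 50),
]


def to_node_link(rows):
    """Build {"nodes": [...], "links": [...]}; links point at nodes by index."""
    index = {}
    nodes = []
    links = []

    def node_id(name):
        # Register each name once and reuse its position in the nodes list.
        if name not in index:
            index[name] = len(nodes)
            nodes.append({"name": name})
        return index[name]

    for donor, pac, dollars in rows:
        source = node_id(donor)
        target = node_id(pac)
        links.append({"source": source, "target": target, "value": dollars})
    return {"nodes": nodes, "links": links}


graph = to_node_link(rows)
print(json.dumps(graph, indent=2))
```

Generating the JSON from the rows, rather than editing it by hand, is what makes it painless to fix a number or a name spelling later.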
Our D3 visualization was a failure.
Here’s a link to the interactive version (yes, it’s in the “failures” folder). As you can see, there were too many races to fit on the screen and the dollar amounts in some races were so high that they dwarfed everything else. So showing each race in the Sankey was out.
This led to Sankey Idea #2.
We connected photos of the donors to the PACs, showed the percentage of successful funds, and then put the races in a table down below. The photos were very useful because you can quickly understand that money is coming from a person and going somewhere. If we just had text, I think, without photos, it would be less clear and have less of a personality. Someone remarked that the lines almost form bodies and arms that reach out to touch Super PACs. It’s interesting to see visualized data combined with photography work out to tell a story like that.
Under the hood
We used Raphael to draw the lines, which was an improvement from D3 since we do indeed support Internet Explorer 8. We tweaked Al Shaw’s Sankey line from Tom Counsell’s Sankey library to make them span vertically instead of left to right. Here’s a jsFiddle of the code to draw the line.
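To give a sense of the geometry, here is a rough sketch of a vertical S-curve expressed as an SVG path string, which is the kind of string Raphael's `paper.path()` accepts. This is not the code from the fiddle or from either library; the function name and coordinates are made up, and it's in Python only to show the math.

```python
def vertical_sankey_path(x0, y0, x1, y1):
    """Return an SVG path string for an S-curve from (x0, y0) to (x1, y1).

    Both control points sit at the vertical midpoint, so the curve leaves
    the start and arrives at the end travelling straight up or down.
    """
    y_mid = (y0 + y1) / 2
    return f"M{x0},{y0} C{x0},{y_mid} {x1},{y_mid} {x1},{y1}"


# Example: a line from a donor photo at the top to a PAC further down.
path = vertical_sankey_path(100, 0, 300, 200)
print(path)
```

A left-to-right Sankey line puts the control points at the horizontal midpoint instead; swapping which axis gets the midpoint is what turns the curve on its side. The stroke width would then carry the dollar volume.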
The table uses Isotope.js for its animated sorting, which is snazzy but I also think does help make tabular data more understandable. Instead of clicking on a column header and everything resorts in a flash, you can see how dramatically different rows vary from view to view. It’s also nice because you can do filtering. So without much code you have a filterable, sortable table. It also saves a step of turning the object data into arrays for sorting. I’ve been wanting to add those ascending / descending arrows for a while to our tables so this was a good time for that.
This table will probably become our first stand-alone NewsBeast Labs plugin since we’ve been using it pretty frequently. That’s pretty cool because five months ago we didn’t have any interactive news code and now we’ve done enough projects that we can see what’s worked, what functionality we like and can wrap it all up into something more robust and reusable, which will make our future development that much faster.
-mk
Come peek behind the curtain.
This is a fascinating TAL from 1996 entitled “Politics” where Ira Glass speaks to an oddly comical right wing Mexican-American self-deportation proponent, Daniel D. Portado, (back before that became an actual law like it is in Alabama) and Michael Lewis reads a story of his on John McCain, portraying him as one of the only people in Washington with integrity, reinstilling Lewis’s faith in politicians.
Listen here: http://www.thisamericanlife.org/radio-archives/episode/41/politics
I’ll come right out and say it: Taxes are awesome.
Yes, awesome. If you care about national values, or the relationship of citizens to their government, or the way we choose to award and discourage behavior, there is nowhere better to start than the gnarled and fascinating world of levies and tax breaks. Tax week gives American families a reason to consider moving to Bermuda, but it also gives me an excuse to spend the day finding my favorite, most controversial, and most illuminating graphs about taxes. Here they are. If you think I’ve picked the wrong ones, or if you’ve got a better chart yourself, leave it in the comment section. I’m rounding up your favorite tax graphs tomorrow.
(via theatlantic)
—Ira Glass on Google Glasses
I like wam’s idea
1. What do you propose to do? [20 words]
Buy Washington City Paper and transform it into a non-profit membership organization.
2. Is anyone doing something like this now and how is your project different? [30 words]
While many people are exploring non-profit models for journalism, this is a…
(Source: newschallenge1)
Earlier this month, Petraeus mused about the emergence of an “Internet of Things” — that is, wired devices — at a summit for In-Q-Tel, the CIA’s venture capital firm. “‘Transformational’ is an overused word, but I do believe it properly applies to these technologies,” Petraeus enthused, “particularly to their effect on clandestine tradecraft.”
Isn’t this plus artificial intelligence pretty much the whole backstory to Battlestar Galactica?
The other day I was thinking about hashtag line graphs — charts that show traffic on a particular topic over time — and how to make them more interesting. Visualizing traffic around a hashtag over time usually tells the story that everyone already knows i.e. some huge event happened and people started tweeting about it. Not terribly surprising.
Other experiments into semantic analysis of tweets have tried to characterize what this conversation is about. The Guardian’s riot rumor visualization is one great example but it has some high barriers to entry even if you have the crazy datavis chops. First, you need a huge sample of tweets to analyze and you’ll have trouble getting that unless you’re Twitter white-listed or want to pay a company like DataSift. Second, to ensure accuracy, you need to either build a semantic tagger more advanced than what’s currently out there, or get a bunch of people to make sure your semantic analysis coded each tweet correctly and correct mistakes. So you need manpower.
So what story could you tell if you’re not a huge paper?
Clearly identifying the meaning behind a sentence has some barriers but what about individual tags? What if you could chart how the audience around an event shifted by looking at the evolution of tags around a single topic? Surely, sheer tweet volume will tell you something about how popular an event is, but it could be confounded if a spike in traffic is from a small group tweeting a hundred times as fast as opposed to a hundred times as many people tweeting at the same rate. (Yes, you could use network analysis to get a picture of the audience but again you need all that data.)
Like with most data visualizations, stories in data start to come out when you can mash together different datasets.
One experiment:
When Occupy Wall Street started, I remember the hashtag began as the cumbersome #occupywallstreet because no one knew about it. I briefly saw an #occupywallst and now #ows is the clear choice. The question is, when did Occupy Wall Street become commonplace enough that people were comfortable just referring to it by #ows?
In other words, by looking at hashtag evolution could you see the moment when an obscure march became part of the national discourse?
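If you did have per-day counts for each tag, finding that moment could be as simple as the sketch below. The dates and counts are invented purely to show the idea; they are not real Twitter or Trendistic data.

```python
# Hypothetical daily tweet counts per tag; every number here is invented.
daily_counts = [
    ("2011-09-17", {"#occupywallstreet": 900, "#occupywallst": 50, "#ows": 50}),
    ("2011-09-24", {"#occupywallstreet": 700, "#occupywallst": 100, "#ows": 200}),
    ("2011-10-01", {"#occupywallstreet": 400, "#occupywallst": 100, "#ows": 500}),
    ("2011-10-08", {"#occupywallstreet": 200, "#occupywallst": 50, "#ows": 750}),
]


def share(by_tag, tag):
    """Fraction of that day's tagged tweets that used `tag`."""
    total = sum(by_tag.values())
    return by_tag.get(tag, 0) / total if total else 0.0


def first_day_tag_leads(counts, tag):
    """Return the first date on which `tag` has the top count (ties count), or None."""
    for date, by_tag in counts:
        count = by_tag.get(tag, 0)
        if count > 0 and count == max(by_tag.values()):
            return date
    return None


for date, by_tag in daily_counts:
    print(date, f"#ows share: {share(by_tag, '#ows'):.0%}")
print("First day #ows leads:", first_day_tag_leads(daily_counts, "#ows"))
```

Looking at each tag's share of the day's total, rather than its raw count, keeps a general spike in traffic from being mistaken for a shift in which tag people prefer.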
Let’s use Trendistic. Tumblr won’t let me embed the graph so click on the image to see the interactive version, or click here (Trendistic doesn’t display this data forever so depending on when you’re reading this, the data from fall 2011 may be gone. But the image below remains!)
Read more
I got my start in journalism designing covers for my college newsmagazine the Georgetown Voice. It was a swell time, reading the first draft of a cover story hopefully the day before production night — usually at 6pm the night of, eight hours before we had to Cyberduck our PDFs — and designing an image that would capture the story and make college kids want to pick up newsprint.
I haven’t designed a cover since then so it was delightful when Columbia Journalism Review EIC and my old professor Mike Hoyt asked me to help them launch CJR’s experiment in digital longform publishing. I even got to name it “CJR Longreads,” until the folks at Columbia University Press decided they wanted “Columbia Journalism Review Books” instead. Can’t win em all. For the 50th anniversary issue, CJR ran two ~10,000 word cover stories and sold them as Kindle Singles.
They needed covers:
This was the first piece, “Confidence Game: The Limited Vision of the News Gurus” by Dean Starkman, responding to the so-called “Future of News” thinkers such as Jay Rosen and Clay Shirky. CJR has a round-up of the debate the piece generated here.
The second story was the first part in a series of profiles on newspapers confronting the digital age by my other former professor Michael Shapiro. “The Newspaper that Almost Seized the Future” chronicles the downfall of the San Jose Mercury News, which was ahead of everyone until it wasn’t. A really great read.
Liner notes: For the Starkman cover I licensed a Lonely Planet image and altered it in Photoshop, adding texture, tone, and the puzzle pattern you see. I had originally thought to do some type of God Save the Queen ripped paper texture but it looked too negative. This odd antique store in my neighborhood had an assembled puzzle in its window of some Chinese calendar type design, which gave me the idea to convey the “limited vision” as a half finished puzzle.
For the Shapiro piece, the light was good on my fire escape and I had bought a 1928 Underwood for 70 euro in a sidewalk sale when I was living in France so I typed out the title and took a few photos. I also had an old Macintosh SE from the 80s back home that my brother took a photo of and sent to me. I overlaid the title in the retro Mac Techno font over an empty MS Word document but the typewriter version came out better.
I’ve been wanting to make a Google map with workable mouseovers for quite a while. Fusion Tables are super useful for drawing shapes on a map and clicking for more info. But the excessive clicking can be tedious and hide a great deal of the data.
I decided to play around with Albert Sun’s GMap Features that draws shapes based on a KML file. Link: https://github.com/albertsun/gmap-features. And since the payroll tax extension was in the news, I took the median income by census tract dataset from the American Community Survey through American Fact Finder2 and calculated how much people would save for the approved tax cut extension, which is 3.1% of income under $110,100. I got the shapefiles from the census (link: http://www.census.gov/geo/www/tiger/tgrshp2010/tgrshp2010.html) and merged the data and the shapes into one Fusion Table. Why a Fusion Table? FT seemed like the easiest way I knew of to quickly merge a table of polygons with a table of data as long as they shared a common column. Also, FT lets you export all of this as a single KML file, which I could plug into Sun’s template.
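The per-tract calculation is simple enough to sketch. The rate and cap below are just the figures stated in this post, and reading "income under $110,100" as a cap on the income that counts is my assumption; the function name and tract IDs are made up too.

```python
def payroll_tax_savings(income, rate=0.031, cap=110_100):
    """Estimated savings: `rate` applied to income, ignoring income above `cap`.

    The defaults are the figures quoted in the post, not an authoritative
    statement of the tax law.
    """
    return rate * min(income, cap)


# Hypothetical median incomes by census tract (values invented for illustration).
tract_median_income = {"tract_0001": 50_000, "tract_0002": 200_000}

savings_by_tract = {
    tract: payroll_tax_savings(income)
    for tract, income in tract_median_income.items()
}
for tract, savings in savings_by_tract.items():
    print(f"{tract}: ${savings:,.2f}")
```

Since the map keys everything off a shared column, a dictionary keyed by tract ID like this one mirrors how the savings figure would be joined back onto the shapes.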
I did all that but it wasn’t working and wasn’t giving me an error. After comparing the sample KML files with my FT export, I noticed that my data was wrapped in <Polygon> tags but the sample files used <MultiGeometry>. I wrote a simple macro in TextMate that added the tag and it worked! I looked deeper into Sun’s code and it looks like it searches specifically for the <MultiGeometry> tag when it draws the shapes, but I’m no KML expert so perhaps there are other important differences for drawing polygons.
UPDATE: After learning some more about KML shapes, I found that you need to clean up the data a bit more to render complex shapes correctly. Sometimes FT outputs some KML polygons as MultiGeometry shapes by default. If you just run a find-and-replace to change “<Polygon>” to “<MultiGeometry><Polygon>” and “</Polygon>” to “</Polygon></MultiGeometry>” you’ll get duplicates on those shapes that are already wrapped in MG tags.
The (simplified) structure should look something like:
<MultiGeometry><Polygon>otherTagsAndNumbers</Polygon><Polygon>otherTagsAndNumbers</Polygon></MultiGeometry>
as opposed to with duplicates:
<MultiGeometry><MultiGeometry><Polygon>otherTagsAndNumbers</Polygon></MultiGeometry><MultiGeometry><Polygon>otherTagsAndNumbers</Polygon></MultiGeometry></MultiGeometry>
The macro I recorded is just a series of find/replaces but if you’d like a copy, send me a message at @mhkeller.
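If you'd rather script it than record a macro, here's a rough sketch of the same cleanup that leaves already-wrapped shapes alone. It is not the macro itself. It assumes the tags carry no attributes and that MultiGeometry blocks are not nested in the export, and it matches `<Polygon>` and `<MultiGeometry>` with that exact capitalization because KML tag names are case-sensitive.

```python
import re

MULTI = re.compile(r"<MultiGeometry>.*?</MultiGeometry>", re.DOTALL)
POLY = re.compile(r"<Polygon>.*?</Polygon>", re.DOTALL)


def _wrap(match):
    """Wrap one matched <Polygon>...</Polygon> in MultiGeometry tags."""
    return f"<MultiGeometry>{match.group(0)}</MultiGeometry>"


def wrap_bare_polygons(kml):
    """Wrap each Polygon that is not already inside a MultiGeometry block."""
    out = []
    pos = 0
    for multi in MULTI.finditer(kml):
        # Text before this MultiGeometry block: wrap any bare polygons in it.
        out.append(POLY.sub(_wrap, kml[pos:multi.start()]))
        # The existing MultiGeometry block is copied through untouched.
        out.append(multi.group(0))
        pos = multi.end()
    out.append(POLY.sub(_wrap, kml[pos:]))
    return "".join(out)


sample = (
    "<Placemark><Polygon>a</Polygon></Placemark>"
    "<Placemark><MultiGeometry><Polygon>b</Polygon><Polygon>c</Polygon>"
    "</MultiGeometry></Placemark>"
)
fixed = wrap_bare_polygons(sample)
print(fixed)
```

Because existing MultiGeometry blocks are skipped, running it twice gives the same result as running it once, which is exactly what the plain find/replace can't promise.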
Sun included callbacks, which let me add the hoverbox that updates on mouseover, which was awesome and what I’ve been wanting to do for ages now. Here’s the map — more code talk after the jump:
Read more
On Christmas Eve, around 50 OWS protesters marched from Zuccotti Park to the New York Stock Exchange to hold a candlelight vigil with “Fuck you” candles