If you’re familiar with Microsoft Excel or Access, you might like Fusion Tables. It’s a free tool that allows you to create interactive maps and charts with data. For journalists, this is fantastic. Fusion Tables unlocks the data stuck in your hard drive and lets you easily share it with readers in a compelling format. Check out some great examples at Matt Stiles’ blog, the Daily Viz.
I stumbled across this story by using Google’s advanced search options. Google lets you search specific websites for specific files and specific terms. So one way to find little-known databases and interesting stories is to search a government website for spreadsheets, PDFs, and other types of documents.
For example, let’s say you want to focus on the city of San Antonio. In Google’s search box, you’d type site:sanantonio.gov to limit the results to pages from the city’s website. Then use “filetype” to focus on specific types of files. The term filetype:xls searches for spreadsheets. The term filetype:doc searches for Microsoft Word documents. And filetype:pdf searches for … you guessed it, PDF files.
You can do broad searches or get creative and add words you think might lead to interesting stuff. Check out this search with the term “injuries.”
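If you run these searches often, you can build them with a few lines of code. Here is a minimal Python sketch that combines the site: and filetype: operators with a keyword and turns the result into a search URL. The function name build_search_url is my own invention for illustration, and the sketch assumes Google's standard search address with the query passed in the "q" parameter.

```python
from urllib.parse import urlencode


def build_search_url(site, filetype, keywords=""):
    """Combine the site: and filetype: operators into one search URL."""
    parts = [f"site:{site}", f"filetype:{filetype}"]
    if keywords:
        parts.append(keywords)
    query = " ".join(parts)
    # urlencode escapes the colons and spaces so the query survives in a URL.
    return "https://www.google.com/search?" + urlencode({"q": query})


# One URL per file type mentioned above, all limited to the city's website.
for filetype in ("xls", "doc", "pdf"):
    print(build_search_url("sanantonio.gov", filetype, "injuries"))
```

Paste any of the printed links into a browser and you get the same results as typing the operators into the search box by hand.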
One of the top results is a form for a vehicle accident report that is filled out by city employees whenever they’re involved in an accident. All the entries and check boxes in the form suggest this information is typed into a database of some kind. And if that’s the case, that means you can request the data, analyze it yourself, and see if there’s a story lurking in those numbers.
Using the Texas Public Information Act, I asked for any database the city had that tracked insurance claims from vehicle accidents. The process took a while and there was a lot of back and forth. At first, the city’s Risk Management Office only sent me a PDF with two categories of information: case numbers and dates. The format and info were worthless.
But eventually they sent more complete spreadsheets that tracked the dollar amount of the claim, whether it was denied, and a brief description about what happened. It was interesting reading.
No one outside City Hall had ever looked at this data before. Thanks to a nifty Google search, now everybody can.
Local weather watchers have been dutifully documenting San Antonio’s temperature, precipitation, and other climate data for 140 years. If you’re curious how this year’s drought compares to past dry spells, meteorologist Robert Blaha with the National Weather Service has done you a huge favor.
“We were able to find the records,” Blaha told me. “In the 1800s, they hand wrote (the climate data) in ink. It was in a paperback book. When I came here in 1975, they were in notebook format. In 2050, they’ll be in the format of that day.”
Blaha said the rainfall gauge in San Antonio has changed locations over the years. In the early days it was at a co-op station and then moved to Fort Sam Houston. In 1891 it moved to a downtown office building. Somewhere along the line it was at Stinson Field. In the 1940s it moved to the San Antonio International Airport and has stayed there ever since.
All that work helps us compare this year’s drought to past dry spells. This year, we’ve received 5.6 inches of rain so far in San Antonio. That’s about half the total for the driest year on record since 1900, which was 1917, when it rained 10 inches.
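The "about half" comparison is simple arithmetic, and it is worth checking before you publish it. Here is a quick Python sketch using only the two figures quoted above. The 10-inch total for 1917 is the rounded number from this post, not a value pulled from the weather service's records, and the function name is my own.

```python
def share_of_record(current_total, record_total):
    """Return the current rainfall as a fraction of a record year's total."""
    return current_total / record_total


rain_so_far = 5.6    # inches this year, as quoted in the post
driest_year = 10.0   # inches in 1917, rounded as quoted in the post

share = share_of_record(rain_so_far, driest_year)
print(f"{share:.0%} of the driest year's total")
```

Keep in mind the year isn't over, so this compares a partial year to a full one.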
In 2010 it looks like we got quite a bit of rain: 37.4 inches. But click on the monthly figures for 2010 and 2011. The data show that September 2010 was our last significant taste of rain.
In the nine months since then, we’ve barely gotten anything.
Check out this amazing presentation at Google I/O 2011 about Google Fusion Tables. The whole video is interesting. But for a journalist’s perspective on the importance of making data accessible to readers, at the 34:50 mark Simon Rogers of the Guardian’s Data Blog offers some interesting examples of how journalists can bring “data to life” with Fusion Tables, a free online tool.
Until recently, I had no idea this DPS database existed. But I stumbled across it a few months ago when I was working on this article about pursuits in San Antonio. SAPD keeps a database packed with details about each chase: the weather and road conditions, the pursuit speeds and durations, the injuries and fatalities. Since SAPD had this data, I figured other law enforcement agencies in Texas probably kept similar records. I asked around and sure enough, DPS was one of the agencies that collect details about pursuits.
Why is that a big deal? Well, when you find a previously unknown database with information about an important public safety issue and analyze those digital records, you’ll probably discover fresh, interesting information for your readers. Public databases empower journalists to do their own research and find surprising answers.
Brandi asked for a copy of the data and we received it from DPS with little trouble. It was a big spreadsheet documenting nearly 5,000 pursuits from 2005 to July 2010.
One detail jumped out at us: Hidalgo County, by far, had the most pursuits over the past five years — 656. Several other border counties also ranked high, suggesting smugglers were often fleeing DPS troopers. The database told us all kinds of things about these pursuits — how often people were injured, how often motorists escaped, and how they got away.
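A county-by-county tally like this takes only a few lines once the spreadsheet is saved as a CSV file. Here is a rough Python sketch of the idea. The rows below are made-up placeholders, not the real DPS records, and the column names "county" and "outcome" are my assumptions about how such a file might be laid out.

```python
import csv
import io
from collections import Counter

# Placeholder rows for illustration only. This is NOT the real DPS data,
# and the column names ("county", "outcome") are assumptions.
SAMPLE_CSV = """county,outcome
Hidalgo,escaped
Hidalgo,arrested
Cameron,arrested
McLennan,escaped
Hidalgo,escaped
"""


def count_by(rows, column):
    """Tally how many rows share each value in the given column."""
    return Counter(row[column].strip() for row in rows)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
by_county = count_by(rows, "county")

# most_common() orders counties from the most pursuits to the fewest.
for county, total in by_county.most_common():
    print(county, total)
```

With a real file, you would swap the sample text for open("your_file.csv", newline="") and use the actual column headers. The same count_by helper works for any other column, such as how each pursuit ended.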
When reporters dive into data-heavy topics, it’s important to find the real people behind the numbers. We asked DPS early in the reporting process to go on a ride-along with a trooper in Hidalgo County. Brandi and photographer Callie Richmond visited McAllen and went on a ride-along with DPS Trooper Johnny Hernandez. Their experience became the lede of our story. Brandi had some great interviews with Hernandez and other troopers in Hidalgo County, who openly talked about their continual struggles to catch smugglers from Mexico. The visit provided rich material for photos and an awesome online video that Callie produced.
Brandi wrote a big chunk of the article on the drive back from McAllen. We finished writing and editing the story in a Google Document, which really beats sending e-mails back and forth and losing track of differing versions of the story. Google Docs lets you see what each collaborator is adding to the document as they write. It’s like the Big Brother version of Microsoft Word, but less evil. It’s a useful tool for collaborating with people, especially if they work in a different organization in a different city. Plus, Google gives you a chat window in the document, which is nice if you want to mock the typing skills of your colleagues.
There were some interesting reactions to the story. Scott Henson at Grits for Breakfast was surprised so many suspects got away: “I would not have guessed that the number of chases ending with the suspect successfully eluding troopers on foot would have been so high, nor that the proportion who stop and surrender would be so low.”
KXXV TV localized the story by looking at the high number of pursuits in McLennan County.
That’s the great thing about news stories based on public data — people can take the information you found, talk about it, and look at the data themselves.