DIY data analysis: three fun examples

by Andrew Gelman on December 14, 2009 · 4 comments

in Methodology

I recently came across some links showing readers how to make their own data analysis and graphics from scratch. This is great stuff—spreading power tools to the masses and all that.

From Nathan Yau: How to Make a US County Thematic Map Using Free Tools and How to Make an Interactive Area Graph with Flare. I don’t actually think the interactive area graphs are so great (they work with the Baby Name Wizard, but to me they don’t do much in the example that Nathan shows), but that doesn’t really matter; what’s cool here is that he’s showing us all exactly how to do it. This stuff is gonna put us statistical graphics experts out of business.

And Chris Masse points me to these instructions from blogger Iowahawk on downloading and analyzing a historical climate dataset. Good stuff.

{ 4 comments }

anon December 14, 2009 at 3:33 pm

This is interesting and possibly helpful… but… the first link says in step 0: “Just as a heads up, you’ll need Python installed on your computer. Python comes pre-installed on the Mac. I’m not sure about Windows.”

Oh, great. More tools that only people that write code can use and maybe not easily on a PC. So much for the “masses.” Sorry, but tech-oriented people always assume everybody can just learn this in an afternoon (if you have the right computer to start with and this patch and that patch, etc., etc.). I’m not sure they really understand what user-friendly means. :-( I’ll keep checking these other sites, though. Thanks!

Sebastian December 15, 2009 at 1:56 pm

anon – I disagree. I highly value such instructions, because they enable people who are willing to put a little work in to not just be able to do something cool, but also learn something on the way.
Does that mean any dumb person can do this at home, same as, say, make a pie chart in Excel?
No, certainly it doesn’t. But a reasonably intelligent person who isn’t super-easily intimidated can do this _and_ learn something useful on the way.

Anonymous Coward December 16, 2009 at 9:13 am

“Does that mean any dumb person can do this at home, same as, say, make a pie chart in Excel?”

The ironic thing is that in Excel 97, all you had to do to make a choropleth of US States within the US or countries within the world or counties within a US state was:

(0) Install the map bolt-on from the cd.
(1) Have a spreadsheet with names of geographic regions in one column and some data in another.
(2) Click through to the map option in the menu.

And Excel would generate the appropriate map. There were right-click options to deal with how many category levels there were and how they were organized. It’s one of those weird three-steps-backwards that this functionality got stripped out.

Anyway, it’s nice that all of the tools in the example are free, but I think it would be easier to go through the workflow for map or tmap in Stata.

anon December 18, 2009 at 12:22 am

Sebastian:

Read what I wrote and respond to that; there’s no need to just vent your anger at people who are not as free as you are to learn new code or programming and put up with half-complete freeware projects that need you to patch this and jerry-rig that to get it to work. A.C. is right: it used to be that you could make decent maps in Excel (state maps only, however, as I recall). That this can be done in Stata is useful to know, as things made for Stata usually work fairly well.

