# Choices in graphing parallel time series

I saw this graph posted by Tyler Cowen:

and my first thought was that the bar plot should be replaced by a line plot: six lines, one for each income category, with each line being a time series of these changes. With a line plot, you can more easily see each time series (these are hard to see in the bar plot because you have to follow each color and jump from decade to decade) and also compare the patterns across categories. The line plot pretty much dominates the bar plot.

At least that was the theory. Now here’s what actually happened.

I downloaded the data as Excel files, saved them as csv, then read them into R. In all, it took close to an hour to get the data set up in the format that was needed to make the graphs. At this point it was pretty easy to make the line plot. But the result was disappointing:

The six lines are hard to untangle (sure, a better color scheme might help, but it wouldn’t really solve the problem) and the graph as a whole is much less clear than the original bar plot.
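For readers curious about the data-wrangling step, here is a minimal sketch of the kind of reshaping involved, written in Python rather than the R that was actually used. It assumes a hypothetical csv layout with one row per year and one column per income group; the column names and numbers are placeholders for illustration, not the real data.

```python
import csv
import io

# Hypothetical wide-format export: one row per year, one column per group.
# These values are made-up placeholders, not the actual income data.
WIDE_CSV = """year,lowest_fifth,middle_fifth,top_5_percent
2000,100,200,400
2010,90,190,420
"""


def wide_to_long(text):
    """Turn one-row-per-year data into one record per (year, group)."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        year = int(row["year"])
        for group, value in row.items():
            if group == "year":
                continue  # the year is the key, not a measured series
            records.append(
                {"year": year, "group": group, "income": float(value)}
            )
    return records


long_rows = wide_to_long(WIDE_CSV)
for rec in long_rows:
    print(rec["year"], rec["group"], rec["income"])
```

The long layout (one record per year and group) is what most plotting tools want when drawing one line per category or one panel per category.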

My next try was small multiples: six little graphs, each with its own time series. That didn’t work so well either (although, on the plus side, it only took a few minutes to make that graph).

Then I thought of plotting the incomes over time (all these income values are inflation-adjusted, of course):

I like this one a lot. In particular, it shows that the drop from 2000-2010 is really a drop since 2007. (Although I suppose Cowen would argue that the drop was really happening earlier and it was just that the economy was doing a Wile E. Coyote, standing in midair and not actually going into freefall until people realized they had gone off the edge of the cliff.)

Still, even the time-trends graph is not quite a replacement for the original bar plot, which shows so much drama. I think my recommended solution is to give the bar plot for the initial impression and then follow up immediately with the time-trends graph, which shows the big picture much more clearly.
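The two displays are linked by simple arithmetic: each bar summarizes a percent change between two points on a time-trends line. Here is a minimal Python sketch of that link. It assumes the bars show total percent change between the start and end of each period, which is my reading of the original chart rather than something I have confirmed, and the numbers are placeholders chosen for easy arithmetic, not real incomes.

```python
def pct_change(series, start, end):
    """Percent change in income from year `start` to year `end`.

    `series` maps year -> inflation-adjusted income for one group.
    """
    return 100.0 * (series[end] - series[start]) / series[start]


# Placeholder values for one hypothetical income group (not real data).
toy = {1990: 100.0, 2000: 120.0, 2007: 125.0, 2010: 108.0}

whole_decade = pct_change(toy, 2000, 2010)  # what a single bar would show
before_2007 = pct_change(toy, 2000, 2007)   # first part of the decade
after_2007 = pct_change(toy, 2007, 2010)    # last part of the decade

print(whole_decade, before_2007, after_2007)
```

With these made-up numbers the decade as a whole shows a decline, but splitting it at 2007 shows a small rise followed by a sharper fall. That is the kind of detail a single bar per decade hides and the time-trends graph reveals.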

### 7 Responses to Choices in graphing parallel time series

1. Manoel Galdino August 28, 2012 at 1:11 pm #

I’d think that this kind of post is better suited to the other blog than here. And the same is true of other posts you post there instead of here. Any reason for that?

• Andrew Gelman August 28, 2012 at 6:55 pm #

Manoel:

The other blog has a backlog so I thought I’d post this one here right away as it is somewhat topical.

2. andrew long August 28, 2012 at 10:03 pm #

Huh. For me, the line graph doesn’t give me as much information as quickly.

The original bar graph is powerful for two reasons: it deals in percent change of income, and it immediately compares that change across all quintiles. Percent change of income is much easier to internalize and understand than trying to estimate the actual value represented by any given point on the lines in the line graph. And I can’t gauge percent increase or decrease at all. You’re probably a lot better at seeing those than the average reader. (Also: the asymmetric scale of income masks the actual distances between the income cohorts.)

What I immediately glean from the bar graph are 3 things:
1. That the economic life of the US was inestimably healthier and fairer because of the New Deal and the Great Society;
2. That whatever the merits of the Reagan Revolution, it stomped on the poor;
3. That the current Republican Party is responsible for everything wrong with the economy today.

Now, those are all debatable. But I really don’t get much of anything I can hope to argue about from the line graph.

• andrew long August 28, 2012 at 10:09 pm #

oh sorry I meant to refer to the time-trends graph, not the line graph.

• Andrew Gelman August 30, 2012 at 7:07 am #

Andrew Long:

I agree with you. That’s why I said the bar plot should be shown first. Once the reader sees the bar plot and gets the key message, I think it’s useful to see the time series plot to get the full perspective.

3. John Carson August 30, 2012 at 3:16 pm #

How about a time-trends graph on a ratio basis?

Graph the ratio of each quintile’s income to that of the bottom quintile’s. This is actual income, not change in income. Perhaps also, the ratio to the 2nd quintile would be of interest, and throw in the top 5% too.

And add horizontal lines at key levels to assist seeing how the ratio changes over time, showing a greater, or lesser, difference over the decades. (The difference as a ratio is more informative, in my opinion, than the “actual distance” that andrew long referred to.)

Would this be informative, or misleading? If misleading, how so? I think this would show the change in the change in income more readily.
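As a rough illustration of the ratio idea, the calculation could be sketched in Python as follows. The group names and income values are hypothetical placeholders, not the actual data, and the function name is made up for this example.

```python
def ratios_to_baseline(incomes, baseline):
    """Divide each group's income by the baseline group's, year by year.

    `incomes` maps group -> {year: income}. Years missing from the
    baseline group are skipped. Returns group -> {year: ratio}.
    """
    base = incomes[baseline]
    return {
        group: {
            year: value / base[year]
            for year, value in series.items()
            if year in base
        }
        for group, series in incomes.items()
    }


# Placeholder values for illustration only (not real incomes).
toy = {
    "bottom_fifth": {1970: 10.0, 2010: 10.0},
    "middle_fifth": {1970: 30.0, 2010: 35.0},
    "top_fifth": {1970: 80.0, 2010: 120.0},
}

ratios = ratios_to_baseline(toy, "bottom_fifth")
for group, series in ratios.items():
    print(group, series)
```

The baseline group is a flat line at 1 by construction, so a plot of these ratios shows only how the other groups moved relative to it, not whether the baseline itself rose or fell.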

Since many people want to ascribe political causes (or blame) to such income changes, perhaps plotting over 4-year or 8-year time intervals tied to the presidential cycle would be useful.

My key gripe is that virtually all such graphs do not show how individuals’ and families’ income changed over the course of a lifetime, but rather show the changes in certain statistical categories — indicative, yes, but often misleading. Related to that, the statistical categories themselves, based solely on family income at a point in time, do not take into account all the other demographic changes (family size, divorce, baby boomer population, immigrants, etc) and how those changes affect income distribution. Not to mention the change in typical working patterns, from high school direct to working in earlier decades, versus today’s pattern with college and starting work later in life for many people.

My experience, not everyone’s but certainly many’s, is little or no income in my 20’s, rising in my 30’s and 40’s, and peaking in my 50’s, going from the lowest quintile to the 4th during my working life. I’m not claiming that’s typical, but rather: Are such demographic and personal changes irrelevant? My concern is with those who are “stuck” in some sense in the lower quintiles, not the part-time working or non-working college student (that was me, many years ago) who will likely make a decent income starting in their late 20’s or early 30’s.

4. Noah Yetter August 31, 2012 at 3:07 pm #

“Still, even the time-trends graph is not quite a replacement for the original bar plot which shows so much drama.”

No, the bar graph shows drama where there is none. The time-trends graph you settled on tells the truth much more objectively.