PPP’s baffling discard process

by Andrew Gelman on September 10, 2013 · 5 comments

in Methodology, Public opinion

B. J. Martino writes:

Earlier this summer I went on a bit of a rant about PPP and their process of discarding interviews rather than simply weighting the data. Mark Blumenthal mentioned your response to the discussion in one of his posts, where you said you were a bit “baffled” by it.

While they claim to engage in the discard process as a kind of retroactive quota to account for more older, white women in their sample, it was the discards among the non-“older white women” that made me curious. That is, any respondent who was not meeting all criteria of being age 46+, white and female.

I downloaded the data from all their 2012 surveys for Daily Kos/SEIU, and compared the sample of non-“older white women” within the unweighted released data as well as the discarded data.

At least from the first six surveys I have looked at, there appears to be a consistent difference in the partisan composition of the released data and the discarded data for this group. In every case, the released data for this group was net Democratic in Party ID (Unw D-R), and the discarded data was net Republican (Dis D-R).

Party ID in PPP Polls for Daily Kos/SEIU: Non-“older white women”
(raw unweighted data and discarded data)

| Date | Unw Sample | Discarded Sample | Unw D-R | Dis D-R | Unw - Disc |
|---|---|---|---|---|---|
| 25-Oct | 968 | 335 | 6.5% | -1.5% | 8.0% |
| 12-Oct | 1100 | 302 | 6.5% | -8.6% | 15.1% |
| 4-Oct | 922 | 356 | 5.1% | -19.7% | 24.8% |
| 27-Sep | 763 | 277 | 10.2% | -8.6% | 18.8% |
| 20-Sep | 829 | 236 | 12.0% | -18.2% | 30.2% |
| 13-Sep | 699 | 267 | 8.7% | -7.9% | 16.6% |
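The last column of the Party ID table is just the released margin minus the discarded margin. As a minimal sketch in Python, the figures above can be re-derived, along with one extra quantity that is not in the original table: the pooled D-R margin this group would have shown if no interviews had been discarded. That pooled figure rests on an assumption, namely that each margin is a percentage of all respondents in its subsample, so it should be read as a rough illustration rather than a reconstruction of PPP's data. The names `margin_gap` and `pooled_margin` are made up for this example.

```python
# Rows copied from the Party ID table:
# (date, released n, discarded n, released D-R %, discarded D-R %)
rows = [
    ("25-Oct", 968, 335, 6.5, -1.5),
    ("12-Oct", 1100, 302, 6.5, -8.6),
    ("4-Oct", 922, 356, 5.1, -19.7),
    ("27-Sep", 763, 277, 10.2, -8.6),
    ("20-Sep", 829, 236, 12.0, -18.2),
    ("13-Sep", 699, 267, 8.7, -7.9),
]


def margin_gap(released_margin, discarded_margin):
    """Released margin minus discarded margin, in percentage points."""
    return round(released_margin - discarded_margin, 1)


def pooled_margin(n_released, n_discarded, released_margin, discarded_margin):
    """Size-weighted average of the two margins.

    Assumption: each margin is a share of all respondents in its subsample,
    so the pooled value approximates the margin before any discarding.
    """
    total = n_released + n_discarded
    return (n_released * released_margin + n_discarded * discarded_margin) / total


gaps = {date: margin_gap(unw, dis) for date, _, _, unw, dis in rows}
pooled = {
    date: pooled_margin(n_unw, n_dis, unw, dis)
    for date, n_unw, n_dis, unw, dis in rows
}

for date, _, _, unw, _ in rows:
    print(f"{date}: released {unw:+.1f}, pooled {pooled[date]:+.1f}, gap {gaps[date]:.1f}")
```

Under that assumption, the pooled margin is always less Democratic than the released one, which is the direction of the shift described above.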

What this suggests to me is that the discard process is both a way to apply a retroactive quota to older white women and a way to fix the partisanship of another group (assuming this is primarily younger voters). My thought is that they are getting too Republican a sample in this group because they never dial cell phones.

I found it interesting that despite hanging their hat on being the most accurate of 2012, they announced today that they would be working to find a way to include cell phones in the future.

Martino continues:

As I told Mark, I’m not really interested in getting into a shouting match with PPP. It has always just kind of dumbfounded me how they work. The fact that Daily Kos/SEIU published all the raw data from PPP’s 2012 polling at least gave me some opportunity to figure it out.

I guess the troubling part is how they have repeatedly stood by the statement that they do not weight for Party ID, when this discard process would seem to indicate a de facto weight on Party ID for at least a portion of the sample. What they say is strictly true, but the effect is the same. Seems to be arguing semantics.

I also took a look at the Presidential ballot for this same group of non-“older white women.” Same effect, perhaps even a bit more pronounced.

Presidential Ballot in PPP Polls for Daily Kos/SEIU: Non-“older white women”
(raw unweighted data and discarded data)

| Date | Unw Sample | Discarded Sample | Unw O-R | Dis O-R | Unw - Disc |
|---|---|---|---|---|---|
| 25-Oct | 968 | 335 | 4.2% | -11.6% | 15.8% |
| 12-Oct | 1100 | 302 | -0.4% | -19.2% | 19.6% |
| 4-Oct | 922 | 356 | -0.4% | -30.4% | 30.8% |
| 27-Sep | 763 | 277 | 10.3% | -11.2% | 21.5% |
| 20-Sep | 829 | 236 | 9.7% | -22.1% | 31.8% |
| 13-Sep | 699 | 267 | 6.3% | -15.0% | 21.3% |

 

I don’t really have anything to add here; it’s just an interesting story. I remain amazed that anyone would think it’s a good idea to throw away survey interviews that have already been conducted.
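To put a rough number on the cost of discarding, here is a minimal sketch with made-up figures (not PPP's data). It compares two ways of correcting an over-represented group: deleting interviews at random until the mix matches the target, versus keeping every interview and weighting. It uses Kish's approximation for effective sample size, (sum of weights)² / (sum of squared weights), which assumes the outcome is about equally variable in both groups. The group sizes and the 50/50 target are assumptions chosen only for illustration.

```python
def kish_effective_n(weights):
    """Kish's approximate effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


# Hypothetical raw sample: group A is over-represented relative to a 50/50 target.
n_a, n_b = 700, 300
target_a, target_b = 0.5, 0.5
n = n_a + n_b

# Option 1: random deletion. Keep only as many group-A interviews as needed
# for A's share of the retained sample to equal target_a.
kept_a = int(n_b * target_a / target_b)
n_after_deletion = kept_a + n_b

# Option 2: weighting. Each respondent's weight is target share / sample share.
w_a = target_a / (n_a / n)
w_b = target_b / (n_b / n)
weights = [w_a] * n_a + [w_b] * n_b
n_eff_weighting = kish_effective_n(weights)

print(f"interviews kept after deletion: {n_after_deletion}")
print(f"effective sample size with weighting: {n_eff_weighting:.0f}")
```

In this toy setup, both approaches hit the same 50/50 target, but weighting retains the information of roughly 840 interviews while deletion keeps only 600.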

{ 5 comments }

Nate Cohn September 10, 2013 at 1:13 pm

Almost all of the discards you’re considering are white men.

David Nir September 10, 2013 at 1:29 pm

This statement is incorrect in a subtle but important way:

While they claim to engage in the discard process as a kind of retroactive quota to account for more older, white women in their sample, it was the discards among the non-“older white women” that made me curious. That is, any respondent who was not meeting all criteria of being age 46+, white and female.

This is PPP’s methodology statement:

After contacting a sufficient number of respondents, PPP uses a random deletion process to achieve an appropriate gender and race balance, which generally involves removing excess cases of white and female voters. PPP then uses a statistical formula to adjust for age imbalances, which creates the final results.

Note the word “generally” there. It makes all the difference. PPP does not say they only weed out excess older, white women.

Nate Cohn September 10, 2013 at 1:44 pm

Yes, and in all of these instances, the discards are ~90% white, of which a big chunk are white men. So only looking at the non-old-white-women = mainly younger white women, all white men, a few non-whites.

BJ Martino September 11, 2013 at 9:28 am

PPP has repeatedly and strongly denied that they weight by Party Identification. This is pretty clear evidence that the discard process among non-“older white women” acts as a de facto weight on Party ID, which is necessary given that they do not contact respondents on cell phones. Otherwise the younger voters in their sample would look too Republican.

They have to “fix” the partisan imbalances caused by only contacting voters via landline. Technically, they did not apply a Party ID weight, but if the discard process looks, swims and quacks like a duck…

This is the polling equivalent of President Clinton saying, “it depends upon what the meaning of the word ‘is’ is.”

But they are already applying weights to other variables after the discard. Why so strongly deny Party ID weighting, then engage in a practice that effectively acts as one?

I’ll take a stab at answering my own question: because they are already applying such a massive weight to the 18-45 year cohorts (more than 2x by what I have seen), that to also apply a Party ID weight to those voters as well would be making some interviews (like Democratic younger voters) perhaps worth 3x or 4x in the final weighting mix.

Sebastian September 13, 2013 at 1:37 pm

I think Nate Cohn’s (presumably the same as the commenter above) article in TNR on this is terrific. One of the best pieces on polling I’ve ever read in a non-academic outlet, though I’m afraid it will go over people’s heads. But great piece, great use of graphs, major kudos. There don’t seem to be many journalists who even understand statistics at this level, let alone are willing and able to write about it.
http://www.newrepublic.com/article/114682/ppp-polling-methodology-opaque-flawed
