Using Voter Registration to Predict Votes for President

Updated March 17th, 2017

In mid-2014, Mark Blumenthal and Ariel Edwards-Levy helpfully published a listing of all then-available data on state voter registration:

I was curious to what degree this info could be used to retroactively predict the outcome of the 2016 US presidential election.

If you quickly glance at their list, you might wonder where Alabama is. Well, Alabama doesn’t require party registration (as the chart title suggests). The same is true of Mississippi and of Georgia. Given these omissions, you might already think this undertaking doesn’t make much sense. There are still 31 states listed, though, so it could still be an interesting exercise.

So, how do we measure how predictive registration is? My solution was to first compute the difference between the percentages of registered Republicans and Democrats in each state. For example, here in Arizona registration is 29.5% Democrat vs 34.8% Republican, a roughly 5 percentage point Republican advantage. Ultimately, Trump won the state by 3.6 percentage points. So, not too bad for Arizona.
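The Arizona arithmetic can be sketched in a few lines of Python. This is only an illustration of the comparison described above: the function names are made up for this example, and the inputs are the rounded figures quoted in the text.

```python
def registration_margin(gop_pct, dem_pct):
    """Republican registration lead in percentage points (negative = Democratic lead)."""
    return gop_pct - dem_pct


def prediction_gap(reg_margin, vote_margin):
    """Absolute distance, in percentage points, between the two margins."""
    return abs(reg_margin - vote_margin)


# Arizona, using the rounded figures quoted above.
az_reg = registration_margin(gop_pct=34.8, dem_pct=29.5)
az_gap = prediction_gap(az_reg, vote_margin=3.6)  # 3.6 = Trump's margin of victory

print(f"AZ registration margin: R+{az_reg:.1f}")
print(f"AZ gap vs. result: {az_gap:.1f} ppd")
```

With these rounded inputs the gap comes out at about 1.7, a hair off the 1.6 listed for Arizona in the results, presumably because the original calculation used unrounded figures.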

I then performed that calculation for all 31 states that register voters by party and compared each result to the Clinton/Trump outcome in that state. The results are here:

The states where registration is most predictive:
NJ — 1.0 percentage point difference (ppd)
AZ — 1.6 ppd
AK — 1.7 ppd
ME — 1.9 ppd

And where it’s least predictive:
OK — 38.0 ppd
LA — 39.4 ppd
KY — 45.1 ppd
WV — 63.7 ppd

Some of these gaps are strikingly large. Why? Well, look at Democratic registration in some of these clearly red states:

State — Dem — GOP
OK — 44.7% — 43.2%
LA — 47.4% — 27.8%
KY — 53.9% — 38.5%
WV — 50.3% — 28.8%
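To make the mismatch concrete, here is a small Python sketch that turns the table above into registration leads. The dictionary simply restates the four rows; the variable names are my own, for illustration.

```python
# (Democratic %, Republican %) registration, copied from the table above.
registration = {
    "OK": (44.7, 43.2),
    "LA": (47.4, 27.8),
    "KY": (53.9, 38.5),
    "WV": (50.3, 28.8),
}

# Democratic registration lead in percentage points for each state.
dem_lead = {
    state: round(dem - gop, 1)
    for state, (dem, gop) in registration.items()
}

for state, lead in dem_lead.items():
    print(f"{state}: D+{lead}")
```

Democrats lead registration in all four states, by anywhere from 1.5 to 21.5 points, yet Trump carried every one of them. That is why the gaps are so large.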

I’m tempted to think that this has something to do with the Dixiecrats.

But if we look at states Dixiecrat chief Strom Thurmond won in 1948, only Louisiana tracks with that hypothesis:
State — Percentage of Vote Won
MS — 87.2%
AL — 79.8%
SC — 72%
LA — 49.1%.

Of course, it doesn’t help that Louisiana is the only state of those four that requires party registration.

On the other hand, Thurmond hugely lost those other states with unexpectedly high Democratic registration, getting…
under 1% of the vote in Oklahoma (Truman won the state with 63%),
under 1% in West Virginia (Truman won the state with 57%), and
1.3% in Kentucky (Truman also won with 57%).

Putting aside for now an explanation for why some of these states have unexpectedly high Democratic registration, what are we left with?

Well, even with the outliers, this method is 71% predictive with a mean state predictive score of 12.8. Not bad but not great. And if we remove those four biggest outliers, we get a more respectable 81.5% predictiveness with a mean state predictive score of 7.8. A little better but of questionable usefulness.
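As a rough consistency check on those figures, here is a short Python sketch. It rests on two assumptions that the text does not spell out: that the "mean state predictive score" is a plain average of the per-state ppd gaps, and that "71% predictive" is the share of states where the party leading in registration also won the state. The count of 22 correct calls is inferred from the percentages, not taken from the underlying data.

```python
# Gaps (ppd) for the four biggest outliers, from the list above.
outlier_gaps = {"OK": 38.0, "LA": 39.4, "KY": 45.1, "WV": 63.7}

n_states = 31
mean_gap_all = 12.8  # reported mean across all 31 states

# Assumption: the reported mean is a plain average of per-state gaps.
total_gap = n_states * mean_gap_all
n_trimmed = n_states - len(outlier_gaps)
mean_gap_trimmed = (total_gap - sum(outlier_gaps.values())) / n_trimmed

# Assumption: 22 states were called correctly (inferred, not from the data).
correct_calls = 22
hit_rate_all = 100 * correct_calls / n_states
hit_rate_trimmed = 100 * correct_calls / n_trimmed

print(f"mean gap without outliers: {mean_gap_trimmed:.1f}")
print(f"hit rate, all states: {hit_rate_all:.0f}%")
print(f"hit rate, without outliers: {hit_rate_trimmed:.1f}%")
```

Under those assumptions the numbers hang together: dropping the four outliers moves the mean gap from 12.8 to 7.8, and 22 correct calls works out to 71% of 31 states and 81.5% of 27. The second figure only holds if all four outliers were misses, which fits the registration table, since Democrats lead registration in each of them and Trump won all four.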

Regardless of that dubious utility, I maintain that the exercise was interesting in itself. Perhaps residents of some of these outlier states will stumble across this info and offer their two cents on an explanation. Is it that they all have a lot of old-timers who never bothered to change their registration because cross-party voting isn’t a problem in those states? Is there some other reason?

Keep in mind too that California’s result was 15 percentage points different from expectation based on voter registration. How come? I don’t know. Feel free to chime in if you think you know.

