Date Published: 2020-04-20
Author(s):
Seo-young Silvia Kim, California Institute of Technology
R. Michael Alvarez, California Institute of Technology
Abstract:
Objective: What can machine learning tell us about who voted in 2016? There are numerous competing theories of voter turnout, and a large number of covariates are required to assess which theory best explains turnout. This paper is a proof of concept that machine learning can help overcome this curse of dimensionality and reveal important insights in studies of political phenomena.
Methods: We use Fuzzy Forests, an extension of Random Forests, to screen variables for a parsimonious but accurate prediction. Fuzzy Forests yield accurate variable importance measures even when the data are high-dimensional and highly correlated. Our data come from the 2016 Cooperative Congressional Election Study.
Results: Fuzzy Forests selected only a small number of covariates as major correlates of 2016 turnout while still achieving high predictive performance.