
    Inferring Population Preferences via Mixtures of Spatial Voting Models

    Understanding political phenomena requires measuring the political preferences of society. We introduce a model based on mixtures of spatial voting models that infers the underlying distribution of political preferences of voters with only voting records of the population and political positions of candidates in an election. Beyond offering a cost-effective alternative to surveys, this method projects the political preferences of voters and candidates into a shared latent preference space. This projection allows us to directly compare the preferences of the two groups, which is desirable for political science but difficult with traditional survey methods. After validating the aggregate-level inferences of this model against results of related work and on simple prediction tasks, we apply the model to better understand the phenomenon of political polarization in the Texas, New York, and Ohio electorates. Taken at face value, inferences drawn from our model indicate that the electorates in these states may be less bimodal than the distribution of candidates, but that the electorates are comparatively more extreme in their variance. We conclude with a discussion of limitations of our method and potential future directions for research.
    Comment: To be published in the 8th International Conference on Social Informatics (SocInfo) 201

    Bayesian inference on group differences in multivariate categorical data

    Multivariate categorical data are common in many fields. We are motivated by election poll studies assessing evidence of changes in voters' opinions with their candidate preferences in the 2016 United States Presidential primaries or caucuses. Similar goals arise routinely in several applications, but current literature lacks a general methodology which combines flexibility, efficiency, and tractability in testing for group differences in multivariate categorical data at different, potentially complex, scales. We address this goal by leveraging a Bayesian representation which factorizes the joint probability mass function for the group variable and the multivariate categorical data as the product of the marginal probabilities for the groups and the conditional probability mass function of the multivariate categorical data, given the group membership. To enhance flexibility, we define the conditional probability mass function of the multivariate categorical data via a group-dependent mixture of tensor factorizations, thus facilitating dimensionality reduction and borrowing of information, while providing tractable procedures for computation and accurate tests assessing global and local group differences. We compare our methods with popular competitors, and discuss improved performance in simulations and in American election poll studies.
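The factorization described in this abstract can be written out as a short sketch. The notation is assumed here for illustration and is not taken from the paper: $X$ is the group variable, $Y = (Y_1, \dots, Y_p)$ the categorical variables, $H$ the number of mixture components, $\nu_{xh}$ the group-specific mixture weights, and $\psi^{(j)}_{h}$ the component-specific probability vector for variable $j$.

```latex
\begin{align}
\Pr(Y = y,\, X = x) &= \Pr(X = x)\,\Pr(Y = y \mid X = x), \\
\Pr(Y = y \mid X = x) &= \sum_{h=1}^{H} \nu_{xh} \prod_{j=1}^{p} \psi^{(j)}_{h y_j},
\qquad \sum_{h=1}^{H} \nu_{xh} = 1 .
\end{align}
```

Under this assumed form, only the weights $\nu_{xh}$ vary with the group, so testing for group differences amounts to asking whether the weights differ across values of $x$, while the shared kernels $\psi^{(j)}_{h}$ let the groups borrow information from one another.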

    Marginal and simultaneous predictive classification using stratified graphical models

    An inductive probabilistic classification rule must generally obey the principles of Bayesian predictive inference, such that all observed and unobserved stochastic quantities are jointly modeled and the parameter uncertainty is fully acknowledged through the posterior predictive distribution. Several such rules have been recently considered and their asymptotic behavior has been characterized under the assumption that the observed features or variables used for building a classifier are conditionally independent given a simultaneous labeling of both the training samples and those from an unknown origin. Here we extend the theoretical results to predictive classifiers acknowledging feature dependencies either through graphical models or sparser alternatives defined as stratified graphical models. We also show through experimentation with both synthetic and real data that the predictive classifiers based on stratified graphical models consistently achieve the best accuracy compared with predictive classifiers based on either conditionally independent features or ordinary graphical models.
    Comment: 18 pages, 5 figures

    Polling bias and undecided voter allocations: US Presidential elections, 2004 - 2016

    Accounting for undecided and uncertain voters is a challenging issue for predicting election results from public opinion polls. Undecided voters typify the uncertainty of swing voters in polls but are often ignored or allocated to each candidate in a simple, deterministic manner. Historically this may have been adequate because the number of undecided voters was small enough to assume that they did not affect the relative proportions of the decided voters. However, in the presence of high numbers of undecided voters, these static rules may in fact bias election predictions from election poll authors and meta-poll analysts. In this paper, we compare the effect of undecided voters in the 2016 US presidential election with that in the previous three presidential elections. We show there was a relatively high number of undecided voters over the campaign and on election day, and that the allocation of undecided voters in this election was not consistent with two-party proportional (or even) allocations. We find evidence that static allocation regimes are inadequate for election prediction models and that probabilistic allocations may be superior. We also estimate the bias attributable to polling agencies, often referred to as "house effects".
    Comment: 32 pages, 9 figures, 6 tables
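The two static rules named in this abstract, proportional and even allocation, can be illustrated with a minimal sketch. The function name `allocate_undecided` and the poll numbers are hypothetical and chosen only for illustration; the paper's own models are not reproduced here.

```python
def allocate_undecided(decided, undecided, rule="proportional"):
    """Distribute the undecided share among candidates using a static rule.

    decided   -- dict mapping candidate name to decided vote share
    undecided -- share of respondents who are undecided
    rule      -- "proportional" (split by decided shares) or "even" (split equally)
    """
    total = sum(decided.values())
    if rule == "proportional":
        return {c: s + undecided * s / total for c, s in decided.items()}
    if rule == "even":
        return {c: s + undecided / len(decided) for c, s in decided.items()}
    raise ValueError(f"unknown rule: {rule}")


# Hypothetical poll: 45% for A, 40% for B, 15% undecided.
poll = {"A": 0.45, "B": 0.40}
proportional = allocate_undecided(poll, 0.15, rule="proportional")
even = allocate_undecided(poll, 0.15, rule="even")
```

Both rules are deterministic: the same poll always yields the same final shares. The abstract's argument is that this can bias predictions when the undecided share is large, which is why it suggests probabilistic allocations instead.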

    Induction and Deduction in Bayesian Data Analysis

    The classical or frequentist approach to statistics (in which inference is centered on significance testing) is associated with a philosophy in which science is deductive and follows Popper's doctrine of falsification. In contrast, Bayesian inference is commonly associated with inductive reasoning and the idea that a model can be dethroned by a competing model but can never be directly falsified by a significance test. The purpose of this article is to break these associations, which I think are incorrect and have been detrimental to statistical practice, in that they have steered falsificationists away from the very useful tools of Bayesian inference and have discouraged Bayesians from checking the fit of their models. From my experience using and developing Bayesian methods in social and environmental science, I have found model checking and falsification to be central in the modeling process.
    Keywords: philosophy of statistics, decision theory, subjective probability, Bayesianism, falsification, induction, frequentism

    The Information of Spam

    This paper explores the value of information contained in spam tweets as it pertains to prediction accuracy. As a case study, tweets discussing Bitcoin were collected and used to predict the rise and fall of Bitcoin value. Precision of prediction both with and without spam tweets, as identified by a naive Bayesian spam filter, was measured. Results showed a minor increase in accuracy when spam tweets were included, indicating that spam messages likely contain information valuable for prediction of market fluctuations.
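For readers unfamiliar with the filtering step mentioned in this abstract, the following is a minimal sketch of a multinomial naive Bayes text classifier with add-one (Laplace) smoothing. It is a generic illustration and not the paper's filter: the function names `train` and `predict`, the labels, and the toy messages are all invented here.

```python
import math
from collections import Counter


def train(docs):
    """Collect per-label document and word counts from (tokens, label) pairs."""
    label_counts = Counter()
    word_counts = {}
    vocab = set()
    for tokens, label in docs:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def predict(model, tokens):
    """Return the label with the highest log posterior under naive Bayes."""
    label_counts, word_counts, vocab = model
    n_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        total = sum(word_counts[label].values())
        score = math.log(n / n_docs)  # log prior
        for t in tokens:
            # Add-one smoothing keeps unseen words from zeroing the likelihood.
            score += math.log((word_counts[label][t] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy messages, for illustration only.
toy = [
    ("free bitcoin click now".split(), "spam"),
    ("win free coins now".split(), "spam"),
    ("bitcoin price rises after report".split(), "ham"),
    ("analysts discuss bitcoin price".split(), "ham"),
]
model = train(toy)
label_a = predict(model, "free coins now".split())
label_b = predict(model, "bitcoin price report".split())
```

A filter of this kind would separate the tweets into spam and non-spam sets, after which prediction accuracy could be compared with and without the spam set, as the abstract describes.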