Inferring Population Preferences via Mixtures of Spatial Voting Models
Understanding political phenomena requires measuring the political
preferences of society. We introduce a model based on mixtures of spatial
voting models that infers the underlying distribution of political preferences
of voters with only voting records of the population and political positions of
candidates in an election. Beyond offering a cost-effective alternative to
surveys, this method projects the political preferences of voters and
candidates into a shared latent preference space. This projection allows us to
directly compare the preferences of the two groups, which is desirable for
political science but difficult with traditional survey methods. After
validating the aggregate-level inferences of this model against results of
related work and on simple prediction tasks, we apply the model to better
understand the phenomenon of political polarization in the Texas, New York, and
Ohio electorates. Taken at face value, inferences drawn from our model indicate
that the electorates in these states may be less bimodal than the distribution
of candidates, but that the electorates are comparatively more extreme in their
variance. We conclude with a discussion of limitations of our method and
potential future directions for research.
Comment: To be published in the 8th International Conference on Social
Informatics (SocInfo) 201
Bayesian inference on group differences in multivariate categorical data
Multivariate categorical data are common in many fields. We are motivated by
election poll studies assessing evidence of changes in voters' opinions with
their candidate preferences in the 2016 United States Presidential primaries
or caucuses. Similar goals arise routinely in several applications, but current
literature lacks a general methodology which combines flexibility, efficiency,
and tractability in testing for group differences in multivariate categorical
data at different---potentially complex---scales. We address this goal by
leveraging a Bayesian representation which factorizes the joint probability
mass function for the group variable and the multivariate categorical data as
the product of the marginal probabilities for the groups, and the conditional
probability mass function of the multivariate categorical data, given the group
membership. To enhance flexibility, we define the conditional probability mass
function of the multivariate categorical data via a group-dependent mixture of
tensor factorizations, thus facilitating dimensionality reduction and borrowing
of information, while providing tractable procedures for computation, and
accurate tests assessing global and local group differences. We compare our
methods with popular competitors, and discuss improved performance in
simulations and in American election poll studies.
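The factorization described in this abstract can be written out as a short sketch. The notation is an assumption made here for illustration and is not taken from the abstract: $y$ is the group variable, $x_1, \dots, x_p$ are the categorical variables, $\nu_{gh}$ are group-dependent mixture weights, and $\psi_{hj}$ are component-specific probability mass functions. The second line shows one common form of a group-dependent mixture of product kernels; the paper's exact specification may differ.

```latex
\[
\Pr(y = g,\, x_1 = c_1, \dots, x_p = c_p)
  = \Pr(y = g)\,\Pr(x_1 = c_1, \dots, x_p = c_p \mid y = g),
\]
\[
\Pr(x_1 = c_1, \dots, x_p = c_p \mid y = g)
  = \sum_{h=1}^{H} \nu_{gh} \prod_{j=1}^{p} \psi_{hj}(c_j),
\qquad \sum_{h=1}^{H} \nu_{gh} = 1 .
\]
```

Under this assumed form, the groups differ only through the weights $\nu_{gh}$, so testing for group differences would amount to asking whether the weights vary with $g$.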
Marginal and simultaneous predictive classification using stratified graphical models
An inductive probabilistic classification rule must generally obey the
principles of Bayesian predictive inference, such that all observed and
unobserved stochastic quantities are jointly modeled and the parameter
uncertainty is fully acknowledged through the posterior predictive
distribution. Several such rules have been recently considered and their
asymptotic behavior has been characterized under the assumption that the
observed features or variables used for building a classifier are conditionally
independent given a simultaneous labeling of both the training samples and
those from an unknown origin. Here we extend the theoretical results to
predictive classifiers acknowledging feature dependencies either through
graphical models or sparser alternatives defined as stratified graphical
models. We also show through experimentation with both synthetic and real data
that the predictive classifiers based on stratified graphical models have
consistently achieve the best accuracy compared with the predictive classifiers based on
either conditionally independent features or on ordinary graphical models.
Comment: 18 pages, 5 figures
Polling bias and undecided voter allocations: US Presidential elections, 2004 - 2016
Accounting for undecided and uncertain voters is a challenging issue for
predicting election results from public opinion polls. Undecided voters typify
the uncertainty of swing voters in polls but are often ignored or allocated to
each candidate in a simple, deterministic manner. Historically this may have
been adequate because the number of undecided voters was small enough to assume
that they do not affect the relative proportions of the decided voters.
However, in the presence of high numbers of undecided voters, these static
rules may in fact bias election predictions from election poll authors and
meta-poll analysts. In this paper, we compare the effect of undecided voters in
the 2016 US presidential election with that in the previous three presidential elections.
We show there was a relatively high number of undecided voters over the
campaign and on election day, and that the allocation of undecided voters in
this election was not consistent with two-party proportional (or even)
allocations. We find evidence that static allocation regimes are inadequate for
election prediction models and that probabilistic allocations may be superior.
We also estimate the bias attributable to polling agencies, often referred to
as "house effects".
Comment: 32 pages, 9 figures, 6 tables
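The two static rules named in this abstract, proportional and even allocation, can be illustrated with a minimal sketch. The function name `allocate_undecided` and the poll numbers are hypothetical and chosen only for illustration; they do not come from the paper, and the probabilistic allocations the authors favor are not shown.

```python
def allocate_undecided(decided, undecided, rule="proportional"):
    """Distribute the undecided share among candidates with a static rule.

    decided:   dict mapping candidate -> decided share (percent)
    undecided: undecided share (percent)
    rule:      "proportional" splits by each candidate's decided share,
               "even" splits equally among candidates
    """
    if rule == "proportional":
        total = sum(decided.values())
        return {c: s + undecided * s / total for c, s in decided.items()}
    if rule == "even":
        split = undecided / len(decided)
        return {c: s + split for c, s in decided.items()}
    raise ValueError(f"unknown rule: {rule}")


# Hypothetical two-candidate poll with a large undecided share.
poll = {"A": 44.0, "B": 40.0}
undecided = 16.0

proportional = allocate_undecided(poll, undecided, "proportional")
even = allocate_undecided(poll, undecided, "even")
print(proportional)
print(even)
```

With a large undecided share the two rules give visibly different margins, which is the situation in which the abstract argues that static rules can bias predictions.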
Induction and Deduction in Bayesian Data Analysis
The classical or frequentist approach to statistics (in which inference is centered on significance testing) is associated with a philosophy in which science is deductive and follows Popper's doctrine of falsification. In contrast, Bayesian inference is commonly associated with inductive reasoning and the idea that a model can be dethroned by a competing model but can never be directly falsified by a significance test. The purpose of this article is to break these associations, which I think are incorrect and have been detrimental to statistical practice, in that they have steered falsificationists away from the very useful tools of Bayesian inference and have discouraged Bayesians from checking the fit of their models. From my experience using and developing Bayesian methods in social and environmental science, I have found model checking and falsification to be central in the modeling process.
Keywords: philosophy of statistics, decision theory, subjective probability, Bayesianism, falsification, induction, frequentism
The Information of Spam
This paper explores the value of information contained in spam tweets as it pertains to prediction accuracy. As a case study, tweets discussing Bitcoin were collected and used to predict the rise and fall of Bitcoin value. Precision of prediction both with and without spam tweets, as identified by a naive Bayesian spam filter, was measured. Results showed a minor increase in accuracy when spam tweets were included, indicating that spam messages likely contain information valuable for prediction of market fluctuations.
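For readers unfamiliar with the filtering step, a naive Bayesian spam filter can be sketched as follows. This is a generic multinomial naive Bayes classifier with add-one smoothing, not the authors' implementation; the abstract does not describe their features or training data, and the example tweets below are invented.

```python
import math
from collections import Counter


def train(messages):
    """messages: list of (text, label) pairs, label in {"spam", "ham"}."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the highest log posterior score."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label in word_counts:
        n_words = sum(word_counts[label].values())
        # Log prior plus summed log likelihoods with add-one smoothing.
        score = math.log(label_counts[label] / total)
        for w in text.lower().split():
            score += math.log((word_counts[label][w] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Invented training tweets, for illustration only.
model = train([
    ("free bitcoin giveaway click now", "spam"),
    ("win free bitcoin now", "spam"),
    ("bitcoin price rose after the announcement", "ham"),
    ("analysts expect bitcoin volatility", "ham"),
])
print(classify("free bitcoin click", model))
print(classify("analysts expect price volatility", model))
```

In a study like the one described, the labels produced by such a filter would decide which tweets are kept or dropped before the price-prediction step.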