Validation of Twitter opinion trends with national polling aggregates: Hillary Clinton vs Donald Trump
Measuring and forecasting opinion trends from real-time social media is a
long-standing goal of big-data analytics. Despite its importance, there has
been no conclusive scientific evidence so far that social media activity can
capture the opinion of the general population. Here we develop a method to
infer the opinion of Twitter users regarding the candidates of the 2016 US
Presidential Election by using a combination of statistical physics of complex
networks and machine learning based on hashtag co-occurrence to develop an
in-domain training set approaching 1 million tweets. We investigate the social
networks formed by the interactions among millions of Twitter users and infer
each user's support for the presidential candidates. The resulting Twitter
trends follow the New York Times National Polling Average, which represents an
aggregate of hundreds of independent traditional polls, with remarkable
accuracy. Moreover, the Twitter opinion trend precedes the aggregated NYT polls
by 10 days, showing that Twitter can be an early signal of global opinion
trends. Our analytics unleash the power of Twitter to uncover social trends,
from elections and brands to political movements, at a fraction of the cost of
national polls.
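The lead of one trend over another, such as the 10-day lead reported above, can be illustrated with a lagged correlation. The sketch below is not the authors' procedure: it uses synthetic data, and the names `pearson`, `best_lead`, `twitter`, and `polls` are hypothetical. It only shows how a lead of a known size could be recovered from two daily series, assuming both series have the same length.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def best_lead(leader, follower, max_lag):
    """Return the lag (in samples) at which follower[t + lag] correlates best with leader[t]."""
    scores = {}
    for lag in range(max_lag + 1):
        n = len(leader) - lag
        scores[lag] = pearson(leader[:n], follower[lag:lag + n])
    return max(scores, key=scores.get)


# Synthetic stand-ins (assumption): the "poll" series is the "Twitter" series delayed by 10 days.
rng = random.Random(1)
twitter = [rng.gauss(0.0, 1.0) for _ in range(130)]
polls = [0.0] * 10 + twitter[:-10]

lead = best_lead(twitter, polls, max_lag=30)
print("estimated lead in days:", lead)
```

On real data the two series would be noisy and only partially correlated, so the maximum would be less sharp than in this constructed case.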
Geotagging One Hundred Million Twitter Accounts with Total Variation Minimization
Geographically annotated social media is extremely valuable for modern
information retrieval. However, when researchers can only access
publicly-visible data, one quickly finds that social media users rarely publish
location information. In this work, we provide a method which can geolocate the
overwhelming majority of active Twitter users, independent of their location
sharing preferences, using only publicly-visible Twitter data.
Our method infers an unknown user's location by examining their friends'
locations. We frame the geotagging problem as an optimization over a social
network with a total variation-based objective and provide a scalable and
distributed algorithm for its solution. Furthermore, we show how a robust
estimate of the geographic dispersion of each user's ego network can be used as
a per-user accuracy measure which is effective at removing outlying errors.
Leave-many-out evaluation shows that our method is able to infer location for
101,846,236 Twitter users at a median error of 6.38 km, allowing us to geotag
over 80% of public tweets.
Comment: 9 pages, 8 figures, accepted to IEEE BigData 2014. Compton, Ryan,
David Jurgens, and David Allen. "Geotagging one hundred million twitter
accounts with total variation minimization." Big Data (Big Data), 2014 IEEE
International Conference on. IEEE, 201
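As a rough illustration of inferring a location from friends' locations, the sketch below moves a user to the geometric median of their friends' coordinates, which minimizes the summed distance to them and is less sensitive to a few distant friends than the mean. This is an assumption made for illustration, not the paper's stated update rule: it works in a flat 2-D plane rather than on the globe, handles a single user rather than a whole network, and the function name `geometric_median` and the sample coordinates are hypothetical.

```python
import math


def geometric_median(points, iterations=100, eps=1e-9):
    """Weiszfeld iteration: approximate the point minimizing summed Euclidean distance to `points`."""
    # Start from the centroid.
    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for px, py in points:
            # Clamp the distance so a point that coincides with the iterate
            # does not cause a division by zero (a common simplification).
            d = max(math.hypot(px - x, py - y), eps)
            weight = 1.0 / d
            num_x += weight * px
            num_y += weight * py
            denom += weight
        x, y = num_x / denom, num_y / denom
    return (x, y)


# Hypothetical friend locations in planar coordinates: three nearby, one far away.
friends = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (100.0, 100.0)]
print(geometric_median(friends))
```

The spread of the friends' locations around the returned point could serve as a crude confidence signal, in the spirit of the per-user dispersion measure the abstract mentions.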
Approximating the Spectrum of a Graph
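The spectrum of a network or graph $G=(V,E)$ with adjacency matrix $A$
consists of the eigenvalues of the normalized Laplacian $L = I - D^{-1/2} A D^{-1/2}$. This set of eigenvalues encapsulates many aspects of the structure
of the graph, including the extent to which the graph possesses community
structures at multiple scales. We study the problem of approximating the
spectrum $\lambda = (\lambda_1, \dots, \lambda_{|V|})$, $0 \le \lambda_1 \le \dots \le \lambda_{|V|} \le 2$, of $G$ in the regime where the graph is too
large to explicitly calculate the spectrum. We present a sublinear time
algorithm that, given the ability to query a random node in the graph and
select a random neighbor of a given node, computes a succinct representation of
an approximation $\widetilde{\lambda} = (\widetilde{\lambda}_1, \dots, \widetilde{\lambda}_{|V|})$, $0 \le \widetilde{\lambda}_1 \le \dots \le \widetilde{\lambda}_{|V|} \le 2$, such that $\|\widetilde{\lambda} - \lambda\|_1 \le \epsilon |V|$. Our algorithm has query complexity and running time $\exp(O(1/\epsilon))$,
independent of the size of the graph, $|V|$. We demonstrate the practical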
viability of our algorithm on 15 different real-world graphs from the Stanford
Large Network Dataset Collection, including social networks, academic
collaboration graphs, and road networks. For the smallest of these graphs, we
are able to validate the accuracy of our algorithm by explicitly calculating
the true spectrum; for the larger graphs, such a calculation is computationally
prohibitive.
In addition, we study the implications of our algorithm for property testing in
the bounded-degree graph model.
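The two queries named in the abstract (sample a random node, step to a random neighbor) are enough to estimate moments of the spectrum. Writing λ_i for the eigenvalues of the normalized Laplacian of a graph on n nodes, the average of (1 - λ_i)^k over all i equals the probability that a k-step random walk started at a uniformly random node ends where it began. The sketch below estimates that quantity by sampling. It is a minimal illustration of the query model on a toy graph, not the paper's full algorithm; the function name `estimate_moment` and the 6-cycle input are assumptions.

```python
import random


def estimate_moment(adj, length, samples, rng):
    """Estimate (1/n) * sum_i (1 - lambda_i)**length via random-walk return frequencies."""
    nodes = list(adj)
    returns = 0
    for _sample in range(samples):
        start = rng.choice(nodes)           # query 1: a uniformly random node
        node = start
        for _step in range(length):
            node = rng.choice(adj[node])    # query 2: a random neighbor of the current node
        if node == start:
            returns += 1
    return returns / samples


# Hypothetical toy input: a cycle on 6 nodes, stored as adjacency lists.
n = 6
cycle = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

rng = random.Random(0)
m2 = estimate_moment(cycle, 2, 20000, rng)  # true value is 0.5 for this cycle
m3 = estimate_moment(cycle, 3, 20000, rng)  # true value is 0: an odd walk cannot return on an even cycle
print(m2, m3)
```

The number of samples needed for a given accuracy does not depend on the number of nodes, which is what makes this kind of estimate attractive for very large graphs; recovering an approximate spectrum from such moments is a separate step not shown here.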