Political Homophily in Independence Movements: Analysing and Classifying Social Media Users by National Identity
Social media and data mining are increasingly being used to analyse political
and societal issues. Here we undertake the classification of social media users
as supporting or opposing ongoing independence movements in their territories.
Independence movements occur in territories whose citizens have conflicting
national identities; users with opposing national identities will then support
or oppose the sense of being part of an independent nation that differs from
the officially recognised country. We describe a methodology that relies on
users' self-reported location to build large-scale datasets for three
territories -- Catalonia, the Basque Country and Scotland. An analysis of these
datasets shows that homophily plays an important role in determining who people
connect with, as users predominantly choose to follow and interact with others
from the same national identity. We show that a classifier relying on users'
follow networks can achieve accurate, language-independent classification
performance ranging from 85% to 97% across the three territories. Comment: Accepted for publication in IEEE Intelligent Systems
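The homophily finding above suggests a very simple baseline: label a user by the most common label among the accounts they follow. The sketch below illustrates only that idea. It is not the classifier described in the abstract, whose details are not given here; the function name `classify_by_follows`, the label strings, and the toy data are all hypothetical.

```python
from collections import Counter


def classify_by_follows(follows, known_labels, default=None):
    """Give each unlabeled user the majority label of the accounts they follow.

    follows: dict mapping a user to the list of accounts that user follows.
    known_labels: dict mapping some accounts to a label (hypothetical seed set).
    Users who follow no labeled account receive `default`.
    """
    predictions = {}
    for user, followed in follows.items():
        if user in known_labels:
            continue  # already labeled, nothing to predict
        votes = Counter(known_labels[f] for f in followed if f in known_labels)
        predictions[user] = votes.most_common(1)[0][0] if votes else default
    return predictions


# Toy data, invented purely for illustration.
follows = {
    "u1": ["a", "b", "c"],
    "u2": ["c", "d"],
    "u3": [],
}
known_labels = {"a": "pro", "b": "pro", "c": "anti", "d": "anti"}
predictions = classify_by_follows(follows, known_labels)
```

Because the rule looks only at who follows whom, it never inspects tweet text, which is one plausible reading of why a network-based approach can be language-independent.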
Topic-centric Classification of Twitter User's Political Orientation
In the recent Scottish Independence Referendum (hereafter, IndyRef), Twitter offered a broad platform for people to express their opinions, with millions of IndyRef tweets posted over the campaign period. In this paper, we aim to classify people's voting intentions by the content of their tweets---their short messages communicated on Twitter. By observing tweets related to the IndyRef, we find that people not only discussed the vote, but raised topics related to an independent Scotland including oil reserves, currency, nuclear weapons, and national debt. We show that the views communicated on these topics can inform us of the individuals' voting intentions ("Yes"--in favour of Independence vs. "No"--Opposed). In particular, we argue that an accurate classifier can be designed by leveraging the differences in the features' usage across different topics related to voting intentions. We demonstrate improvements upon a Naive Bayesian classifier using the topics enrichment method. Our new classifier identifies the closest topic for each unseen tweet, based on those topics identified in the training data. Our experiments show that our Topics-Based Naive Bayesian classifier improves accuracy by 7.8% over the classical Naive Bayesian baseline
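A minimal sketch of the topic-then-classify idea described above: train one Naive Bayes model per topic, route each unseen tweet to its closest topic, and classify it with that topic's model. Several pieces are assumptions made for illustration and are not taken from the paper: topics are supplied as training labels rather than discovered from the data, "closest" is measured by token overlap with each topic's vocabulary, and the class names and toy tweets are hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, docs, labels):
        for words, label in zip(docs, labels):
            self.word_counts[label].update(words)
            self.doc_counts[label] += 1
            self.vocab.update(words)
        return self

    def predict(self, words):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # log prior plus smoothed log likelihood of each token
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for w in words:
                score += math.log((self.word_counts[label][w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class TopicNaiveBayes:
    """One NaiveBayes model per topic; unseen tweets go to the closest topic."""

    def fit(self, docs, labels, topics):
        grouped = defaultdict(lambda: ([], []))
        for words, label, topic in zip(docs, labels, topics):
            grouped[topic][0].append(words)
            grouped[topic][1].append(label)
        self.models = {t: NaiveBayes().fit(d, l) for t, (d, l) in grouped.items()}
        return self

    def closest_topic(self, words):
        # Assumption: closeness = number of tweet tokens in the topic vocabulary.
        return max(self.models, key=lambda t: len(set(words) & self.models[t].vocab))

    def predict(self, words):
        return self.models[self.closest_topic(words)].predict(words)


# Toy training tweets (tokens, voting intention, topic), invented for illustration.
train = [
    (["keep", "the", "pound"], "No", "currency"),
    (["new", "scottish", "currency"], "Yes", "currency"),
    (["oil", "will", "run", "out"], "No", "oil"),
    (["oil", "wealth", "for", "scotland"], "Yes", "oil"),
]
docs, labels, topics = zip(*train)
model = TopicNaiveBayes().fit(docs, labels, topics)
```

Calling `model.predict(["oil", "wealth"])` routes the tweet to the `oil` model before scoring it, so the same word can carry different evidence in different topics.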
Semi-supervised Text Regression with Conditional Generative Adversarial Networks
The enormous volume of online text offers intriguing opportunities for
understanding social and economic semantics. In this paper, we propose a
novel text regression model based on a conditional generative adversarial
network (GAN) that attempts to associate textual data with social outcomes in
a semi-supervised manner. Beyond its promising predictive capability, the
model has two advantages: (i) it works with unbalanced datasets of limited
labelled data, which matches real-world scenarios; and (ii) predictions are
obtained by an end-to-end framework, without explicitly selecting high-level
representations. Finally, we point out related datasets for experiments and
future research directions
Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement and Retrieval
Where previous reviews of content-based image retrieval emphasize what can
be seen in an image to bridge the semantic gap, this survey considers what
people tag about an image. A comprehensive treatise of three closely linked
problems, i.e., image tag assignment, refinement, and tag-based image retrieval
is presented. While existing works vary in terms of their targeted tasks and
methodology, they rely on the key functionality of tag relevance, i.e.
estimating the relevance of a specific tag with respect to the visual content
of a given image and its social context. By analyzing what information a
specific method exploits to construct its tag relevance function and how such
information is exploited, this paper introduces a taxonomy to structure the
growing literature, understand the ingredients of the main works, clarify their
connections and differences, and recognize their merits and limitations. For a
head-to-head comparison between the state-of-the-art, a new experimental
protocol is presented, with training sets containing 10k, 100k and 1m images
and an evaluation on three test sets, contributed by various research groups.
Eleven representative works are implemented and evaluated. Putting all this
together, the survey aims to provide an overview of the past and foster
progress for the near future. Comment: to appear in ACM Computing Surveys
Data exploitation and privacy protection in the era of data sharing
As the amount, complexity, and value of data available in both private and public sectors have risen sharply, the competing goals of data privacy and data utility have challenged both organizations and individuals. This dissertation addresses both goals. First, we consider the task of interorganizational data sharing, in which data owners, data clients, and data subjects have different and sometimes competing privacy concerns. A key challenge in this type of scenario is that each organization uses its own set of proprietary, intraorganizational attributes to describe the shared data; such attributes cannot be shared with other organizations. Moreover, data-access policies are determined by multiple parties and may be specified using attributes that are not directly comparable with the ones used by the owner to specify the data. We propose a system architecture and a suite of protocols that facilitate dynamic and efficient interorganizational data sharing, while allowing each party to use its own set of proprietary attributes to describe the shared data and preserving confidentiality of both data records and attributes. We introduce the novel technique of attribute-based encryption with oblivious attribute translation (OTABE), which plays a crucial role in our solution and may prove useful in other applications. This extension of attribute-based encryption uses semi-trusted proxies to enable dynamic and oblivious translation between proprietary attributes that belong to different organizations. We prove that our OTABE-based framework is secure in the standard model and provide two real-world use cases. Next, we turn our attention to utility that can be derived from the vast and growing amount of data about individuals that is available on social media. As social networks (SNs) continue to grow in popularity, it is essential to understand what can be learned about personal attributes of SN users by mining SN data.
The first SN-mining problem we consider is how best to predict the voting behavior of SN users. Prior work only considered users who generate politically oriented content or voluntarily disclose their political preferences online. We avoid this bias by using a novel type of Bayesian-network (BN) model that combines demographic, behavioral, and social features. We test our method in a predictive analysis of the 2016 U.S. Presidential election. Our work is the first to take a semi-supervised approach in this setting. Using the Expectation-Maximization (EM) algorithm, we combine labeled survey data with unlabeled Facebook data, thus obtaining larger datasets and addressing self-selection bias. The second SN-mining challenge we address is the extent to which Dynamic Bayesian Networks (DBNs) can infer dynamic behavioral intentions such as the intention to get a vaccine or to apply for a loan. Knowledge of such intentions has great potential to improve the design of recommendation systems, ad-targeting mechanisms, public-health campaigns, and other social and commercial endeavors. We focus on the question of how to infer an SN user's offline decisions and intentions using only the public portions of her online SN accounts. Our contribution is twofold. First, we use BNs and several behavioral-psychology techniques to model decision making as a complex process that both influences and is influenced by static factors (such as personality traits and demographic categories) and dynamic factors (such as triggering events, interests, and emotions). Second, we explore the extent to which temporal models may assist in the inference task by representing SN users as sets of DBNs that are built using our modeling techniques. The use of DBNs, together with data gathered in multiple waves, has the potential to improve both inference accuracy and prediction accuracy in future time slots.
It may also shed light on the extent to which different factors influence the decision-making process
Citizen adoption of e-government services – Evidence from Hungary
In a citizen-centric approach, which has become increasingly popular over the last decade, e-government success begins with citizens starting to use e-government systems, solutions, and services. In line with this, our paper investigates the factors identified in the technology acceptance literature that influence e-government service usage, using a large representative Hungarian sample covering a wide range of B2C public administration services. Our results imply that the Hungarian government can further increase the usage of e-government services by influencing effort expectancy, trust in the internet, facilitating conditions, user experience, or habits
An analysis of the user occupational class through Twitter content
Social media content can be used as a complementary source to the traditional
methods for extracting and studying collective social attributes. This study focuses on the prediction of the occupational class for a public user profile. Our analysis is conducted on a new annotated corpus of Twitter users, their respective job titles, posted textual content and platform-related attributes. We frame our task as classification using latent feature representations such as word clusters and embeddings. The employed linear and, especially, non-linear methods can predict a user’s occupational class with strong accuracy for the coarsest level of a standard occupation taxonomy which includes nine classes. Combined with a qualitative assessment, the derived results confirm the feasibility of our approach in inferring a new user attribute that can be embedded in a multitude of downstream applications
e-Government awareness among the techno-disadvantaged in the United States
This exploratory research focuses on e-government awareness among techno-disadvantaged citizens in the United States. Specifically, we address whether awareness is associated with visitation and whether there are differences between those who are aware and those who are not aware. Following up on a theory-based community initiative designed to improve computer literacy and access to information and communication technologies (ICT) for members of an underserved public housing community, a survey was undertaken. The results indicate that awareness is associated with visitation. Differences in demographic characteristics, perceived ease of use (PEOU), and perceived access barriers between those who are aware of e-government websites, and those who are not, were found. While nearly half of the respondents are neither aware of nor have visited e-government websites, a slight majority is partaking of e-government services. We identify directions for future research and conclude by emphasizing the value of a theory-based community initiative to improve computer literacy, provide access to ICT, and advance e-government inclusion