69,620 research outputs found
Unsupervised User Stance Detection on Twitter
We present a highly effective unsupervised framework for detecting the stance
of prolific Twitter users with respect to controversial topics. In particular,
we use dimensionality reduction to project users onto a low-dimensional space,
followed by clustering, which allows us to find core users that are
representative of the different stances. Our framework has three major
advantages over pre-existing methods, which are based on supervised or
semi-supervised classification. First, we do not require any prior labeling of
users: instead, we create clusters, which are much easier to label manually
afterwards, e.g., in a matter of seconds or minutes instead of hours. Second,
there is no need for domain- or topic-level knowledge either to specify the
relevant stances (labels) or to conduct the actual labeling. Third, our
framework is robust in the face of data skewness, e.g., when some users or some
stances have greater representation in the data. We experiment with different
combinations of user similarity features, dataset sizes, dimensionality
reduction methods, and clustering algorithms to ascertain the most effective
and most computationally efficient combinations across three different datasets
(in English and Turkish). We further verified our results on additional tweet
sets covering six different controversial topics. Our best combination in terms
of effectiveness and efficiency uses retweeted accounts as features, UMAP for
dimensionality reduction, and Mean Shift for clustering, and yields a small
number of high-quality user clusters, typically just 2--3, with more than 98\%
purity. The resulting user clusters can be used to train downstream
classifiers. Moreover, our framework is robust to variations in the
hyper-parameter values and also with respect to random initialization
Feature discovery and visualization of robot mission data using convolutional autoencoders and Bayesian nonparametric topic models
The gap between our ability to collect interesting data and our ability to
analyze these data is growing at an unprecedented rate. Recent algorithmic
attempts to fill this gap have employed unsupervised tools to discover
structure in data. Some of the most successful approaches have used
probabilistic models to uncover latent thematic structure in discrete data.
Despite the success of these models on textual data, they have not generalized
as well to image data, in part because of the spatial and temporal structure
that may exist in an image stream.
We introduce a novel unsupervised machine learning framework that
incorporates the ability of convolutional autoencoders to discover features
from images that directly encode spatial information, within a Bayesian
nonparametric topic model that discovers meaningful latent patterns within
discrete data. By using this hybrid framework, we overcome the fundamental
dependency of traditional topic models on rigidly hand-coded data
representations, while simultaneously encoding spatial dependency in our topics
without adding model complexity. We apply this model to the motivating
application of high-level scene understanding and mission summarization for
exploratory marine robots. Our experiments on a seafloor dataset collected by a
marine robot show that the proposed hybrid framework outperforms current
state-of-the-art approaches on the task of unsupervised seafloor terrain
characterization.Comment: 8 page
From the User to the Medium: Neural Profiling Across Web Communities
Online communities provide a unique way for individuals to access information
from those in similar circumstances, which can be critical for health
conditions that require daily and personalized management. As these groups and
topics often arise organically, identifying the types of topics discussed is
necessary to understand their needs. As well, these communities and people in
them can be quite diverse, and existing community detection methods have not
been extended towards evaluating these heterogeneities. This has been limited
as community detection methodologies have not focused on community detection
based on semantic relations between textual features of the user-generated
content. Thus here we develop an approach, NeuroCom, that optimally finds dense
groups of users as communities in a latent space inferred by neural
representation of published contents of users. By embedding of words and
messages, we show that NeuroCom demonstrates improved clustering and identifies
more nuanced discussion topics in contrast to other common unsupervised
learning approaches
Multiple Moving Object Recognitions in video based on Log Gabor-PCA Approach
Object recognition in the video sequence or images is one of the sub-field of
computer vision. Moving object recognition from a video sequence is an
appealing topic with applications in various areas such as airport safety,
intrusion surveillance, video monitoring, intelligent highway, etc. Moving
object recognition is the most challenging task in intelligent video
surveillance system. In this regard, many techniques have been proposed based
on different methods. Despite of its importance, moving object recognition in
complex environments is still far from being completely solved for low
resolution videos, foggy videos, and also dim video sequences. All in all,
these make it necessary to develop exceedingly robust techniques. This paper
introduces multiple moving object recognition in the video sequence based on
LoG Gabor-PCA approach and Angle based distance Similarity measures techniques
used to recognize the object as a human, vehicle etc. Number of experiments are
conducted for indoor and outdoor video sequences of standard datasets and also
our own collection of video sequences comprising of partial night vision video
sequences. Experimental results show that our proposed approach achieves an
excellent recognition rate. Results obtained are satisfactory and competent.Comment: 8,26,conferenc
Basic tasks of sentiment analysis
Subjectivity detection is the task of identifying objective and subjective
sentences. Objective sentences are those which do not exhibit any sentiment.
So, it is desired for a sentiment analysis engine to find and separate the
objective sentences for further analysis, e.g., polarity detection. In
subjective sentences, opinions can often be expressed on one or multiple
topics. Aspect extraction is a subtask of sentiment analysis that consists in
identifying opinion targets in opinionated text, i.e., in detecting the
specific aspects of a product or service the opinion holder is either praising
or complaining about
- …