Hierarchical Subquery Evaluation for Active Learning on a Graph
To train good supervised and semi-supervised object classifiers, it is
critical that we not waste the time of the human experts who are providing the
training labels. Existing active learning strategies can have uneven
performance, being efficient on some datasets but wasteful on others, or
inconsistent just between runs on the same dataset. We propose perplexity based
graph construction and a new hierarchical subquery evaluation algorithm to
combat this variability, and to release the potential of Expected Error
Reduction.
Under some specific circumstances, Expected Error Reduction has been one of
the strongest-performing informativeness criteria for active learning. Until
now, it has also been prohibitively costly to compute for sizeable datasets. We
demonstrate our highly practical algorithm, comparing it to other active
learning measures on classification datasets that vary in sparsity,
dimensionality, and size. Our algorithm is consistent over multiple runs and
achieves high accuracy, while querying the human expert for labels at a
frequency that matches their desired time budget.
Comment: CVPR 201
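To make the cost of Expected Error Reduction concrete, the following is a minimal sketch of the plain, exhaustive criterion on a graph. It is not the hierarchical subquery algorithm from the abstract. It assumes a harmonic-function label propagation classifier, a connected graph with symmetric weight matrix `W`, and at least one labeled node; the function names are hypothetical.

```python
import numpy as np

def harmonic_posteriors(W, labeled, labels, n_classes):
    """Class posteriors on unlabeled nodes via the harmonic solution (assumed classifier)."""
    n = W.shape[0]
    labeled_set = set(labeled)
    unlabeled = [i for i in range(n) if i not in labeled_set]
    if not unlabeled:
        return unlabeled, np.zeros((0, n_classes))
    laplacian = np.diag(W.sum(axis=1)) - W
    Y = np.eye(n_classes)[list(labels)]          # one-hot labels of labeled nodes
    L_uu = laplacian[np.ix_(unlabeled, unlabeled)]
    W_ul = W[np.ix_(unlabeled, list(labeled))]
    return unlabeled, np.linalg.solve(L_uu, W_ul @ Y)

def expected_error_reduction_query(W, labeled, labels, n_classes):
    """Pick the unlabeled node whose labeling minimizes the expected remaining risk."""
    labeled, labels = list(labeled), list(labels)
    unlabeled, F_u = harmonic_posteriors(W, labeled, labels, n_classes)
    best_node, best_risk = None, np.inf
    for row, q in enumerate(unlabeled):
        expected_risk = 0.0
        for c in range(n_classes):
            p = F_u[row, c]                      # current belief that q has class c
            if p <= 0.0:
                continue
            # Retrain as if q had been labeled c, then sum the 0/1 risk estimates.
            _, F_new = harmonic_posteriors(W, labeled + [q], labels + [c], n_classes)
            expected_risk += p * np.sum(1.0 - F_new.max(axis=1))
        if expected_risk < best_risk:
            best_node, best_risk = q, expected_risk
    return best_node, best_risk
```

Each candidate requires one retraining per class, so a single query costs roughly (number of unlabeled nodes) × (number of classes) linear solves. This is the expense that makes the exhaustive form impractical on sizeable datasets.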
A Machine Learning Based Analytical Framework for Semantic Annotation Requirements
The Semantic Web is an extension of the current web in which information is
given well-defined meaning. The aim of the Semantic Web is to improve the
quality and intelligence of the current web by converting its contents into
machine-understandable form. Therefore, semantic-level information is one of
the cornerstones of the Semantic Web. The process of adding semantic metadata
to web resources is called Semantic Annotation. Semantic Annotation faces many
obstacles, such as multilinguality, scalability, and the diversity and
inconsistency of content across different web pages. Because Semantic
Annotation systems must operate across a wide range of domains and in dynamic
environments, automating the annotation process is one of the significant
challenges in this domain. To overcome this problem, different machine learning
approaches have been utilized, such as supervised learning, unsupervised
learning, and more recent ones like semi-supervised learning and active
learning. In this paper we present an
inclusive layered classification of Semantic Annotation challenges and discuss
the most important issues in this field. We also review and analyze machine
learning applications for solving Semantic Annotation problems. To this end,
the article closely studies and categorizes related research to arrive at a
framework that maps machine learning techniques onto Semantic Annotation
challenges and requirements.
Graph-based Semi-Supervised & Active Learning for Edge Flows
We present a graph-based semi-supervised learning (SSL) method for learning
edge flows defined on a graph. Specifically, given flow measurements on a
subset of edges, we want to predict the flows on the remaining edges. To this
end, we develop a computational framework that imposes certain constraints on
the overall flows, such as (approximate) flow conservation. These constraints
render our approach different from classical graph-based SSL for vertex labels,
which posits that tightly connected nodes share similar labels and leverages
the graph structure accordingly to extrapolate from a few vertex labels to the
unlabeled vertices. We derive bounds for our method's reconstruction error and
demonstrate its strong performance on synthetic and real-world flow networks
from transportation, physical infrastructure, and the Web. Furthermore, we
provide two active learning algorithms for selecting informative edges on which
to measure flow, which has applications for optimal sensor deployment. The
first strategy selects edges to minimize the reconstruction error bound and
works well on flows that are approximately divergence-free. The second approach
clusters the graph and selects bottleneck edges that cross cluster-boundaries,
which works well on flows with global trends.
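The idea of inferring unmeasured edge flows under approximate flow conservation can be sketched as a regularized least-squares problem. This is a minimal illustration, not necessarily the formulation used in the paper. It assumes each edge has a fixed orientation and penalizes the squared divergence at every node plus a small ridge term on the unknown flows; the function names and the parameter `lam` are hypothetical.

```python
import numpy as np

def incidence_matrix(n_nodes, edges):
    """Node-by-edge incidence matrix; edge (u, v) is oriented from u to v."""
    B = np.zeros((n_nodes, len(edges)))
    for j, (u, v) in enumerate(edges):
        B[u, j] = -1.0   # flow leaves u
        B[v, j] = 1.0    # flow enters v
    return B

def infer_edge_flows(n_nodes, edges, measured, lam=0.1):
    """Fill in unmeasured flows by minimizing ||B f||^2 + lam^2 ||f_unknown||^2.

    `measured` maps edge index -> observed flow; those values are kept fixed.
    """
    B = incidence_matrix(n_nodes, edges)
    m = len(edges)
    known = sorted(measured)
    unknown = [j for j in range(m) if j not in measured]
    f = np.zeros(m)
    f[known] = [measured[j] for j in known]
    if unknown:
        B_u, B_k = B[:, unknown], B[:, known]
        # Normal equations of the regularized least-squares problem.
        A = B_u.T @ B_u + lam ** 2 * np.eye(len(unknown))
        rhs = -B_u.T @ (B_k @ f[known])
        f[unknown] = np.linalg.solve(A, rhs)
    return f

# Usage: a 4-cycle with one measured edge carrying one unit of flow.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
flows = infer_edge_flows(4, edges, {0: 1.0}, lam=1e-3)
```

On the 4-cycle, the only nearly divergence-free completion is a circulation, so the inferred flows on the three unmeasured edges come out close to 1. The ridge term keeps the system well-posed when the measured edges do not determine the flow uniquely.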
Tripartite Graph Clustering for Dynamic Sentiment Analysis on Social Media
The growing popularity of social media (e.g., Twitter) allows users to easily
share information with each other and influence others by expressing their own
sentiments on various subjects. In this work, we propose an unsupervised
tri-clustering framework, which analyzes both user-level and tweet-level
sentiments through co-clustering of a tripartite graph. A compelling feature of
the proposed framework is that the quality of sentiment clustering of tweets,
users, and features can be mutually improved by joint clustering. We further
investigate the evolution of user-level sentiments and latent feature vectors
in an online framework and devise an efficient online algorithm to sequentially
update the clustering of tweets, users and features with newly arrived data.
The online framework not only provides better quality of both dynamic
user-level and tweet-level sentiment analysis, but also improves the
computational and storage efficiency. We verified the effectiveness and
efficiency of the proposed approaches on the November 2012 California ballot
Twitter data.
Comment: A short version is in Proceedings of the 2014 ACM SIGMOD International
Conference on Management of dat