30,091 research outputs found
Basic tasks of sentiment analysis
Subjectivity detection is the task of identifying objective and subjective
sentences. Objective sentences are those which do not exhibit any sentiment.
So, it is desired for a sentiment analysis engine to find and separate the
objective sentences for further analysis, e.g., polarity detection. In
subjective sentences, opinions can often be expressed on one or multiple
topics. Aspect extraction is a subtask of sentiment analysis that consists in
identifying opinion targets in opinionated text, i.e., in detecting the
specific aspects of a product or service the opinion holder is either praising
or complaining about
Symbol Emergence in Robotics: A Survey
Humans can learn the use of language through physical interaction with their
environment and semiotic communication with other people. It is very important
to obtain a computational understanding of how humans can form a symbol system
and obtain semiotic skills through their autonomous mental development.
Recently, many studies have been conducted on the construction of robotic
systems and machine-learning methods that can learn the use of language through
embodied multimodal interaction with their environment and other systems.
Understanding human social interactions and developing a robot that can
smoothly communicate with human users in the long term, requires an
understanding of the dynamics of symbol systems and is crucially important. The
embodied cognition and social interaction of participants gradually change a
symbol system in a constructive manner. In this paper, we introduce a field of
research called symbol emergence in robotics (SER). SER is a constructive
approach towards an emergent symbol system. The emergent symbol system is
socially self-organized through both semiotic communications and physical
interactions with autonomous cognitive developmental agents, i.e., humans and
developmental robots. Specifically, we describe some state-of-art research
topics concerning SER, e.g., multimodal categorization, word discovery, and a
double articulation analysis, that enable a robot to obtain words and their
embodied meanings from raw sensory--motor information, including visual
information, haptic information, auditory information, and acoustic speech
signals, in a totally unsupervised manner. Finally, we suggest future
directions of research in SER.Comment: submitted to Advanced Robotic
A Deep Embedding Model for Co-occurrence Learning
Co-occurrence Data is a common and important information source in many
areas, such as the word co-occurrence in the sentences, friends co-occurrence
in social networks and products co-occurrence in commercial transaction data,
etc, which contains rich correlation and clustering information about the
items. In this paper, we study co-occurrence data using a general energy-based
probabilistic model, and we analyze three different categories of energy-based
model, namely, the , and models, which are able to capture
different levels of dependency in the co-occurrence data. We also discuss how
several typical existing models are related to these three types of energy
models, including the Fully Visible Boltzmann Machine (FVBM) (), Matrix
Factorization (), Log-BiLinear (LBL) models (), and the Restricted
Boltzmann Machine (RBM) model (). Then, we propose a Deep Embedding Model
(DEM) (an model) from the energy model in a \emph{principled} manner.
Furthermore, motivated by the observation that the partition function in the
energy model is intractable and the fact that the major objective of modeling
the co-occurrence data is to predict using the conditional probability, we
apply the \emph{maximum pseudo-likelihood} method to learn DEM. In consequence,
the developed model and its learning method naturally avoid the above
difficulties and can be easily used to compute the conditional probability in
prediction. Interestingly, our method is equivalent to learning a special
structured deep neural network using back-propagation and a special sampling
strategy, which makes it scalable on large-scale datasets. Finally, in the
experiments, we show that the DEM can achieve comparable or better results than
state-of-the-art methods on datasets across several application domains
Bayesian Information Extraction Network
Dynamic Bayesian networks (DBNs) offer an elegant way to integrate various
aspects of language in one model. Many existing algorithms developed for
learning and inference in DBNs are applicable to probabilistic language
modeling. To demonstrate the potential of DBNs for natural language processing,
we employ a DBN in an information extraction task. We show how to assemble
wealth of emerging linguistic instruments for shallow parsing, syntactic and
semantic tagging, morphological decomposition, named entity recognition etc. in
order to incrementally build a robust information extraction system. Our method
outperforms previously published results on an established benchmark domain.Comment: 6 page
A Survey on Bayesian Deep Learning
A comprehensive artificial intelligence system needs to not only perceive the
environment with different `senses' (e.g., seeing and hearing) but also infer
the world's conditional (or even causal) relations and corresponding
uncertainty. The past decade has seen major advances in many perception tasks
such as visual object recognition and speech recognition using deep learning
models. For higher-level inference, however, probabilistic graphical models
with their Bayesian nature are still more powerful and flexible. In recent
years, Bayesian deep learning has emerged as a unified probabilistic framework
to tightly integrate deep learning and Bayesian models. In this general
framework, the perception of text or images using deep learning can boost the
performance of higher-level inference and in turn, the feedback from the
inference process is able to enhance the perception of text or images. This
survey provides a comprehensive introduction to Bayesian deep learning and
reviews its recent applications on recommender systems, topic models, control,
etc. Besides, we also discuss the relationship and differences between Bayesian
deep learning and other related topics such as Bayesian treatment of neural
networks.Comment: To appear in ACM Computing Surveys (CSUR) 202
The Importance of Being Clustered: Uncluttering the Trends of Statistics from 1970 to 2015
In this paper we retrace the recent history of statistics by analyzing all
the papers published in five prestigious statistical journals since 1970,
namely: Annals of Statistics, Biometrika, Journal of the American Statistical
Association, Journal of the Royal Statistical Society, series B and Statistical
Science. The aim is to construct a kind of "taxonomy" of the statistical papers
by organizing and by clustering them in main themes. In this sense being
identified in a cluster means being important enough to be uncluttered in the
vast and interconnected world of the statistical research. Since the main
statistical research topics naturally born, evolve or die during time, we will
also develop a dynamic clustering strategy, where a group in a time period is
allowed to migrate or to merge into different groups in the following one.
Results show that statistics is a very dynamic and evolving science, stimulated
by the rise of new research questions and types of data
- …