12,681 research outputs found
Inducing Probabilistic Grammars by Bayesian Model Merging
We describe a framework for inducing probabilistic grammars from corpora of
positive samples. First, samples are {\em incorporated} by adding ad-hoc rules
to a working grammar; subsequently, elements of the model (such as states or
nonterminals) are {\em merged} to achieve generalization and a more compact
representation. The choice of what to merge and when to stop is governed by the
Bayesian posterior probability of the grammar given the data, which formalizes
a trade-off between a close fit to the data and a default preference for
simpler models (`Occam's Razor'). The general scheme is illustrated using three
types of probabilistic grammars: Hidden Markov models, class-based -grams,
and stochastic context-free grammars.Comment: To appear in Grammatical Inference and Applications, Second
International Colloquium on Grammatical Inference; Springer Verlag, 1994. 13
page
Modeling Topic and Role Information in Meetings using the Hierarchical Dirichlet Process
Abstract. In this paper, we address the modeling of topic and role information in multiparty meetings, via a nonparametric Bayesian model called the hierarchical Dirichlet process. This model provides a powerful solution to topic modeling and a flexible framework for the incorporation of other cues such as speaker role information. We present our modeling framework for topic and role on the AMI Meeting Corpus, and illustrate the effectiveness of the approach in the context of adapting a baseline language model in a large-vocabulary automatic speech recognition system for multiparty meetings. The adapted LM produces significant improvements in terms of both perplexity and word error rate.
Language Modeling with Power Low Rank Ensembles
We present power low rank ensembles (PLRE), a flexible framework for n-gram
language modeling where ensembles of low rank matrices and tensors are used to
obtain smoothed probability estimates of words in context. Our method can be
understood as a generalization of n-gram modeling to non-integer n, and
includes standard techniques such as absolute discounting and Kneser-Ney
smoothing as special cases. PLRE training is efficient and our approach
outperforms state-of-the-art modified Kneser Ney baselines in terms of
perplexity on large corpora as well as on BLEU score in a downstream machine
translation task
Symbol Emergence in Robotics: A Survey
Humans can learn the use of language through physical interaction with their
environment and semiotic communication with other people. It is very important
to obtain a computational understanding of how humans can form a symbol system
and obtain semiotic skills through their autonomous mental development.
Recently, many studies have been conducted on the construction of robotic
systems and machine-learning methods that can learn the use of language through
embodied multimodal interaction with their environment and other systems.
Understanding human social interactions and developing a robot that can
smoothly communicate with human users in the long term, requires an
understanding of the dynamics of symbol systems and is crucially important. The
embodied cognition and social interaction of participants gradually change a
symbol system in a constructive manner. In this paper, we introduce a field of
research called symbol emergence in robotics (SER). SER is a constructive
approach towards an emergent symbol system. The emergent symbol system is
socially self-organized through both semiotic communications and physical
interactions with autonomous cognitive developmental agents, i.e., humans and
developmental robots. Specifically, we describe some state-of-art research
topics concerning SER, e.g., multimodal categorization, word discovery, and a
double articulation analysis, that enable a robot to obtain words and their
embodied meanings from raw sensory--motor information, including visual
information, haptic information, auditory information, and acoustic speech
signals, in a totally unsupervised manner. Finally, we suggest future
directions of research in SER.Comment: submitted to Advanced Robotic
- ā¦