13,450 research outputs found
Computing phonological generalization over real speech exemplars
Though it has attracted growing attention from phonologists and phoneticians Exemplar Theory (e g Bybee 2001) has hitherto lacked an explicit production model that can apply to speech signals An adequate model must be able to generalize but this presents the problem of how to generate an output that generalizes over a collection of unique variable-length signals Rather than resorting to a priori phonological units such as phones we adopt a dynamic programming approach using an optimization criterion that is sensitive to the frequency of similar subsequences within other exemplars the Phonological Exemplar-Based Learning System We show that PEBLS displays pattern-entrenchment behaviour central to Exemplar Theory s account of phonologization (C) 2010 Elsevier Ltd All rights reserve
Optimizing expected word error rate via sampling for speech recognition
State-level minimum Bayes risk (sMBR) training has become the de facto
standard for sequence-level training of speech recognition acoustic models. It
has an elegant formulation using the expectation semiring, and gives large
improvements in word error rate (WER) over models trained solely using
cross-entropy (CE) or connectionist temporal classification (CTC). sMBR
training optimizes the expected number of frames at which the reference and
hypothesized acoustic states differ. It may be preferable to optimize the
expected WER, but WER does not interact well with the expectation semiring, and
previous approaches based on computing expected WER exactly involve expanding
the lattices used during training. In this paper we show how to perform
optimization of the expected WER by sampling paths from the lattices used
during conventional sMBR training. The gradient of the expected WER is itself
an expectation, and so may be approximated using Monte Carlo sampling. We show
experimentally that optimizing WER during acoustic model training gives 5%
relative improvement in WER over a well-tuned sMBR baseline on a 2-channel
query recognition task (Google Home)
Hierarchical Decomposition of Nonlinear Dynamics and Control for System Identification and Policy Distillation
The control of nonlinear dynamical systems remains a major challenge for
autonomous agents. Current trends in reinforcement learning (RL) focus on
complex representations of dynamics and policies, which have yielded impressive
results in solving a variety of hard control tasks. However, this new
sophistication and extremely over-parameterized models have come with the cost
of an overall reduction in our ability to interpret the resulting policies. In
this paper, we take inspiration from the control community and apply the
principles of hybrid switching systems in order to break down complex dynamics
into simpler components. We exploit the rich representational power of
probabilistic graphical models and derive an expectation-maximization (EM)
algorithm for learning a sequence model to capture the temporal structure of
the data and automatically decompose nonlinear dynamics into stochastic
switching linear dynamical systems. Moreover, we show how this framework of
switching models enables extracting hierarchies of Markovian and
auto-regressive locally linear controllers from nonlinear experts in an
imitation learning scenario.Comment: 2nd Annual Conference on Learning for Dynamics and Contro
Substructure and Boundary Modeling for Continuous Action Recognition
This paper introduces a probabilistic graphical model for continuous action
recognition with two novel components: substructure transition model and
discriminative boundary model. The first component encodes the sparse and
global temporal transition prior between action primitives in state-space model
to handle the large spatial-temporal variations within an action class. The
second component enforces the action duration constraint in a discriminative
way to locate the transition boundaries between actions more accurately. The
two components are integrated into a unified graphical structure to enable
effective training and inference. Our comprehensive experimental results on
both public and in-house datasets show that, with the capability to incorporate
additional information that had not been explicitly or efficiently modeled by
previous methods, our proposed algorithm achieved significantly improved
performance for continuous action recognition.Comment: Detailed version of the CVPR 2012 paper. 15 pages, 6 figure
- …