Nonparametric Bayesian Double Articulation Analyzer for Direct Language Acquisition from Continuous Speech Signals
Human infants can discover words directly from unsegmented speech signals
without any explicitly labeled data. In this paper, we develop a novel machine
learning method called nonparametric Bayesian double articulation analyzer
(NPB-DAA) that can directly acquire language and acoustic models from observed
continuous speech signals. For this purpose, we propose an integrative
generative model that combines a language model and an acoustic model into a
single generative model called the "hierarchical Dirichlet process hidden
language model" (HDP-HLM). The HDP-HLM is obtained by extending the
hierarchical Dirichlet process hidden semi-Markov model (HDP-HSMM) proposed by
Johnson et al. An inference procedure for the HDP-HLM is derived using the
blocked Gibbs sampler originally proposed for the HDP-HSMM. This procedure
enables the simultaneous and direct inference of language and acoustic models
from continuous speech signals. Based on the HDP-HLM and its inference
procedure, we developed a novel double articulation analyzer. By assuming
the HDP-HLM as the generative model of observed time-series data and
inferring the latent variables of the model, the method can analyze the
latent double articulation structure of the data, i.e., hierarchically
organized latent words and phonemes, in an unsupervised manner. This novel
unsupervised double articulation analyzer is called the NPB-DAA.
The NPB-DAA can automatically estimate double articulation structure embedded
in speech signals. We also carried out two evaluation experiments using
synthetic data and actual human continuous speech signals representing Japanese
vowel sequences. In the word acquisition and phoneme categorization tasks, the
NPB-DAA outperformed a conventional double articulation analyzer (DAA) and a
baseline automatic speech recognition system whose acoustic model was trained
in a supervised manner.

Comment: 15 pages, 7 figures. Draft submitted to IEEE Transactions on
Autonomous Mental Development (TAMD).
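The two-layer structure the abstract describes (continuous signal → phoneme-like segments → word-like units) can be illustrated with a deliberately minimal toy pipeline. This is not the NPB-DAA's blocked Gibbs inference; it is a hand-rolled sketch in which simple change-point cutting and bigram counting stand in for the acoustic and language models, and all function names are illustrative:

```python
import numpy as np
from collections import Counter

def segment_signal(x, threshold=0.5):
    """Level 1: cut a 1-D signal at large jumps (stand-in for phoneme boundaries)."""
    cuts = [0]
    for t in range(1, len(x)):
        if abs(x[t] - x[t - 1]) > threshold:
            cuts.append(t)
    cuts.append(len(x))
    return [x[a:b] for a, b in zip(cuts, cuts[1:])]

def label_segments(segments):
    """Assign a discrete 'phoneme' label to each segment by its mean level."""
    return [int(round(float(np.mean(s)))) for s in segments]

def find_words(labels, min_count=2):
    """Level 2: treat label bigrams that recur as candidate 'words'."""
    counts = Counter(zip(labels, labels[1:]))
    return {bigram for bigram, c in counts.items() if c >= min_count}

# Toy signal: the 'phoneme' sequence 0, 2, 0, 2, 1 with 10 samples each.
signal = np.repeat([0.0, 2.0, 0.0, 2.0, 1.0], 10)
labels = label_segments(segment_signal(signal))
print(labels)              # [0, 2, 0, 2, 1]
print(find_words(labels))  # {(0, 2)} -- the only bigram seen twice
```

The point of the sketch is only the hierarchy: the same data pass yields both a segment labeling and a unit inventory, which is what the joint language/acoustic inference in the HDP-HLM does in a principled, fully Bayesian way.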
Shape-Based Models for Interactive Segmentation of Medical Images
Accurate image segmentation is one of the key problems in computer vision. In domains such as radiation treatment planning, dosimetrists must manually trace the outlines of a few critical structures on large numbers of images. Considerable similarity can be seen in the shape of these regions, both between adjacent slices in a particular patient and across the spectrum of patients. Consequently, we should be able to model this similarity and use it to assist in the process of segmentation. Previous work has demonstrated that a constraint-based 2D radial model can capture generic shape information for certain shape classes, and can reduce user interaction by a factor of three over purely manual segmentation. Additional simulation studies have shown that a probabilistic version of the model has the potential to further reduce user interaction. This paper describes an implementation of both models in a general-purpose imaging and graphics framework and compares the usefulness of the models on several shape classes.
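The 2D radial model mentioned above can be sketched in a few lines: each training contour is resampled as radii at fixed angles around a common center, and the model is the per-angle mean and spread across shapes. The function names, nearest-angle sampling, and circle test data are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def radial_model(contours, center, n_angles=16):
    """Fit a simple 2-D radial shape model: for each fixed angle around a
    common center, record the mean and spread of the contour radius
    across training shapes."""
    grid = np.linspace(-np.pi, np.pi, n_angles, endpoint=False)
    samples = []
    for pts in contours:
        dx, dy = pts[:, 0] - center[0], pts[:, 1] - center[1]
        ang, rad = np.arctan2(dy, dx), np.hypot(dx, dy)
        # Nearest-angle sampling; a full implementation would interpolate.
        diff = np.abs(np.angle(np.exp(1j * (ang[None, :] - grid[:, None]))))
        samples.append(rad[diff.argmin(axis=1)])
    samples = np.array(samples)
    return grid, samples.mean(axis=0), samples.std(axis=0)

def circle(r, n=200):
    t = np.linspace(0, 2 * np.pi, n, endpoint=False)
    return np.stack([r * np.cos(t), r * np.sin(t)], axis=1)

# Two training contours: circles of radius 3 and 5 around the origin.
grid, mean_r, std_r = radial_model([circle(3.0), circle(5.0)], center=(0.0, 0.0))
print(mean_r)  # ~4.0 at every angle: the mean shape is the radius-4 circle
```

The per-angle spread (`std_r`) is what a probabilistic variant of the model would use to constrain how far a new contour may deviate from the mean shape at each angle.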
Anatomical Priors in Convolutional Networks for Unsupervised Biomedical Segmentation
We consider the problem of segmenting a biomedical image into anatomical
regions of interest. We specifically address the frequent scenario where we
have no paired training data that contains images and their manual
segmentations. Instead, we employ unpaired segmentation images to build an
anatomical prior. Critically, these segmentations can be derived from imaging
data from a different dataset and imaging modality than the current task. We
introduce a generative probabilistic model that employs the learned prior
through a convolutional neural network to compute segmentations in an
unsupervised setting. We conducted an empirical analysis of the proposed
approach in the context of structural brain MRI segmentation, using a
multi-study dataset of more than 14,000 scans. Our results show that an
anatomical prior can enable fast unsupervised segmentation, which is typically
not possible using standard convolutional networks. The integration of
anatomical priors can facilitate CNN-based anatomical segmentation in a range
of novel clinical problems, where few or no annotations are available and thus
standard networks are not trainable. The code is freely available at
http://github.com/adalca/neuron.

Comment: Presented at CVPR 2018. IEEE CVPR proceedings pp. 9290-929
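The role of a spatial prior can be illustrated with the classical atlas-style formulation that such generative models build on: per-voxel label posteriors combine a prior probability map with an intensity likelihood. The Gaussian intensity models and toy numbers below are assumptions for illustration, not the paper's CNN-based prior:

```python
import numpy as np

def posterior_segmentation(image, prior, means, stds):
    """Per-voxel MAP labels: posterior ∝ spatial prior × Gaussian intensity
    likelihood. prior has shape (H, W, K); means/stds are per-class models."""
    x = image[..., None]  # (H, W, 1), broadcast against the K classes
    lik = np.exp(-0.5 * ((x - means) / stds) ** 2) / stds
    post = prior * lik
    post /= post.sum(axis=-1, keepdims=True)  # normalize over classes
    return post.argmax(axis=-1)

# Toy 1x4 "image" with two classes: background (~0) and a structure (~1).
image = np.array([[0.1, 0.9, 0.9, 0.1]])
# The anatomical prior says the structure occupies the middle two voxels.
prior = np.zeros((1, 4, 2))
prior[..., 0] = [0.9, 0.3, 0.3, 0.9]  # background prior map
prior[..., 1] = [0.1, 0.7, 0.7, 0.1]  # structure prior map
labels = posterior_segmentation(image, prior,
                                means=np.array([0.0, 1.0]),
                                stds=np.array([0.2, 0.2]))
print(labels)  # [[0 1 1 0]]
```

In the paper's setting the prior is not a fixed probability map but is learned by a convolutional network from unpaired segmentations; the sketch only shows why a prior lets segmentation proceed without paired annotations.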
Learning to Segment and Represent Motion Primitives from Driving Data for Motion Planning Applications
Developing an intelligent vehicle which can perform human-like actions
requires the ability to learn basic driving skills from a large amount of
naturalistic driving data. These algorithms become more efficient if we
decompose the complex driving tasks into motion primitives that represent the
elementary compositions of driving skills. Therefore, the purpose of this paper
is to segment unlabeled trajectory data into a library of motion primitives. By
applying a probabilistic inference based on an iterative
Expectation-Maximization algorithm, our method segments the collected
trajectories while learning a set of motion primitives represented by
dynamic movement primitives. The proposed method exploits the mutual
dependency between the segmentation and the representation of motion
primitives, together with a driving-specific initial segmentation. This
paper shows how exploiting this dependency and this initialization enhances
both the segmentation and the construction of the motion primitive library.
We also evaluate the applicability of the primitive
representation method to imitation learning and motion planning algorithms. The
model is trained and validated by using the driving data collected from the
Beijing Institute of Technology intelligent vehicle platform. The results show
that the proposed approach can find the proper segmentation and establish the
motion primitive library simultaneously.
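A dynamic movement primitive, the representation used above, can be sketched as a spring-damper "transformation system" plus a forcing term recovered from one demonstration. A full DMP parameterizes the forcing term with basis functions of a phase variable so it generalizes to new goals; the minimal sketch below simply replays the recovered term (function names and gains are illustrative assumptions):

```python
import numpy as np

def fit_dmp(y_demo, dt, alpha=25.0, beta=6.25):
    """Recover the forcing term that makes a critically damped spring-damper
    system reproduce the demonstration (the core idea of a DMP)."""
    yd = np.gradient(y_demo, dt)
    ydd = np.gradient(yd, dt)
    g = y_demo[-1]  # goal = final demonstrated position
    f = ydd - alpha * (beta * (g - y_demo) - yd)
    return f, g

def rollout(f, g, y0, yd0, dt, alpha=25.0, beta=6.25):
    """Euler-integrate the transformation system with the recovered forcing term."""
    y, yd, traj = y0, yd0, [y0]
    for ft in f[:-1]:
        ydd = alpha * (beta * (g - y) - yd) + ft
        yd += ydd * dt
        y += yd * dt
        traj.append(y)
    return np.array(traj)

dt = 0.01
t = np.arange(0.0, 1.0, dt)
demo = np.sin(np.pi * t / 2)  # a smooth reach from 0 toward 1
f, g = fit_dmp(demo, dt)
repro = rollout(f, g, y0=demo[0], yd0=(demo[1] - demo[0]) / dt, dt=dt)
print(float(np.max(np.abs(repro - demo))))  # small tracking error
```

Segmenting naturalistic trajectories then amounts to finding cut points such that each segment is well explained by one such primitive, which is the mutual dependency between segmentation and representation that the paper exploits.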