Multilingual Training and Cross-lingual Adaptation on CTC-based Acoustic Model
Multilingual models for Automatic Speech Recognition (ASR) are attractive as
they have been shown to benefit from more training data, and better lend
themselves to adaptation to under-resourced languages. However, initialisation
from monolingual context-dependent models leads to an explosion of
context-dependent states. Connectionist Temporal Classification (CTC) is a
potential solution to this as it performs well with monophone labels.
We investigate multilingual CTC in the context of adaptation and
regularisation techniques that have been shown to be beneficial in more
conventional contexts. The multilingual model is trained to model a universal
International Phonetic Alphabet (IPA)-based phone set using the CTC loss
function. Learning Hidden Unit Contribution (LHUC) is investigated to perform
language adaptive training. In addition, dropout during cross-lingual
adaptation is studied as a way to mitigate overfitting.
Experiments show that applying LHUC improves the performance of the universal
phoneme-based CTC system, and that the system is extensible to new phonemes
during cross-lingual adaptation. Updating all the parameters shows consistent
improvement on limited data. Applying dropout during adaptation further
improves the system and achieves performance competitive with Deep Neural Network
/ Hidden Markov Model (DNN/HMM) systems on limited data.
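A minimal sketch of the LHUC idea mentioned in the abstract above, assuming the commonly used formulation in which each hidden unit's output is rescaled by an amplitude 2*sigmoid(r), with one learnable r per unit and per language. The function name, the toy values, and the plain-list representation are illustrative assumptions, not details taken from the paper.

```python
import math


def lhuc_scale(hidden, r):
    """Rescale hidden activations with per-unit LHUC amplitudes.

    Assumed formulation: amplitude_k = 2 * sigmoid(r_k), which lies in (0, 2).
    With r_k = 0 the amplitude is 1, so the layer output is unchanged.
    """
    if len(hidden) != len(r):
        raise ValueError("hidden and r must have the same length")
    return [h * 2.0 / (1.0 + math.exp(-rk)) for h, rk in zip(hidden, r)]


# Toy hidden-layer output (hypothetical values).
hidden = [0.5, -1.0, 2.0]

# r = 0 for every unit leaves the activations as they are.
identity = lhuc_scale(hidden, [0.0, 0.0, 0.0])

# Language-specific r values damp the first unit and amplify the third.
adapted = lhuc_scale(hidden, [-2.0, 0.0, 2.0])
```

In language adaptive training under this assumption, only the r vectors would differ between languages while the rest of the network is shared.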
User-centered visual analysis using a hybrid reasoning architecture for intensive care units
One problem with Intensive Care Unit information systems is that, in some cases, the display of data becomes very dense. To preserve the overview and readability of increasing volumes of data, special features are required (e.g., data prioritization, clustering, and selection mechanisms), together with analytical methods (e.g., temporal data abstraction, principal component analysis, and detection of events). This paper addresses the problem of improving the integration of the visual and analytical methods applied to medical monitoring systems. We present a knowledge- and machine learning-based approach to support the knowledge discovery process with appropriate analytical and visual methods. Its potential benefit lies in the development of user interfaces for intelligent monitors that can assist with the detection and explanation of new, potentially threatening medical events. The proposed hybrid reasoning architecture provides an interactive graphical user interface for adjusting the parameters of the analytical methods based on the user's task at hand. The action sequences performed by the user on the graphical user interface are consolidated in a dynamic knowledge base with specific hybrid reasoning that integrates symbolic and connectionist approaches. These sequences of expert knowledge acquisition can ease knowledge emergence during similar experiences and positively affect the monitoring of critical situations. The graphical user interface, which incorporates user-centered visual analysis, is exploited to facilitate the natural and effective representation of clinical information for patient care.
Multilingual Adaptation of RNN Based ASR Systems
In this work, we focus on multilingual systems based on recurrent neural
networks (RNNs), trained using the Connectionist Temporal Classification (CTC)
loss function. Using a multilingual set of acoustic units poses difficulties.
To address this issue, we proposed Language Feature Vectors (LFVs) to train
language adaptive multilingual systems. Language adaptation, in contrast to
speaker adaptation, needs to be applied not only on the feature level, but also
to deeper layers of the network. In this work, we therefore extended our
previous approach by introducing a novel technique which we call "modulation".
Based on this method, we modulated the hidden layers of RNNs using LFVs. We
evaluated this approach in both full and low resource conditions, as well as
for grapheme and phone based systems. Modulation achieved lower error rates
across the different conditions.
Comment: 5 pages, 1 figure, to appear in 2018 IEEE International Conference on
Acoustics, Speech and Signal Processing (ICASSP 2018)
Learning a world model and planning with a self-organizing, dynamic neural system
We present a connectionist architecture that can learn a model of the
relations between perceptions and actions and use this model for behavior
planning. State representations are learned with a growing self-organizing
layer which is directly coupled to a perception and a motor layer. Knowledge
about possible state transitions is encoded in the lateral connectivity. Motor
signals modulate this lateral connectivity and a dynamic field on the layer
organizes a planning process. All mechanisms are local and adaptation is based
on Hebbian ideas. The model is continuous in the action, perception, and time
domains.
Comment: 9 pages, see http://www.marc-toussaint.net
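The abstract above says only that adaptation is local and based on Hebbian ideas. As a rough illustration, the textbook Hebbian rule strengthens a connection in proportion to the co-activation of the two units it links. The sketch below applies that generic rule to a small lateral weight matrix; the function name, learning rate, and activation values are hypothetical, and this is not the specific update used in the paper.

```python
def hebbian_update(weights, pre, post, lr=0.1):
    """Return a new weight matrix after one generic Hebbian step.

    w[i][j] grows by lr * pre[i] * post[j], so a connection is strengthened
    only when the unit it leaves and the unit it enters are both active.
    The rule is local: each weight depends only on its own two units.
    """
    return [
        [w + lr * pre[i] * post[j] for j, w in enumerate(row)]
        for i, row in enumerate(weights)
    ]


# Two state units with no lateral connectivity yet (hypothetical setup).
weights = [[0.0, 0.0], [0.0, 0.0]]

# Unit 0 was active before a transition and unit 1 after it,
# so only the 0 -> 1 connection is strengthened.
updated = hebbian_update(weights, pre=[1.0, 0.0], post=[0.0, 1.0])
```

Read this way, repeated transitions from one state to another would accumulate weight on the corresponding lateral connection, which is one simple way transition knowledge could end up encoded in lateral connectivity.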