Biologically inspired speaker verification
Speaker verification is an active research problem that has been addressed using a variety of classification techniques. In general, however, methods inspired by the human auditory system tend to show better verification performance than other methods. In this thesis, three biologically inspired speaker verification algorithms are presented.
Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding with Sequence-to-Sequence Architecture
We propose a novel neural speaker diarization system using memory-aware
multi-speaker embedding with sequence-to-sequence architecture (NSD-MS2S),
which integrates the strengths of memory-aware multi-speaker embedding (MA-MSE)
and sequence-to-sequence (Seq2Seq) architecture, leading to improvement in both
efficiency and performance. Next, we further decrease the memory occupation of
decoding by incorporating input features fusion and then employ a multi-head
attention mechanism to capture features at different levels. NSD-MS2S achieved
a macro diarization error rate (DER) of 15.9% on the CHiME-7 EVAL set, which
signifies a relative improvement of 49% over the official baseline system, and
is the key technique for us to achieve the best performance for the main track
of CHiME-7 DASR Challenge. Additionally, we introduce a deep interactive module
(DIM) in MA-MSE module to better retrieve a cleaner and more discriminative
multi-speaker embedding, enabling the current model to outperform the system we
used in the CHiME-7 DASR Challenge. Our code will be available at
https://github.com/liyunlongaaa/NSD-MS2S.
Comment: Submitted to ICASSP 202
A detection-based pattern recognition framework and its applications
The objective of this dissertation is to present a detection-based pattern recognition framework and demonstrate its applications in automatic speech recognition and broadcast news video story segmentation.
Inspired by the studies of modern cognitive psychology and real-world pattern recognition systems, a detection-based pattern recognition framework is proposed to provide an alternative solution for some complicated pattern recognition problems. The primitive features are first detected and the task-specific knowledge hierarchy is constructed level by level; then a variety of heterogeneous information sources are combined together and the high-level context is incorporated as additional information at certain stages.
A detection-based framework is a "divide-and-conquer" design paradigm for pattern recognition problems, which will decompose a conceptually difficult problem into many elementary sub-problems that can be handled directly and reliably. Some information fusion strategies will be employed to integrate the evidence from a lower level to form the evidence at a higher level. Such a fusion procedure continues until reaching the top level. Generally, a detection-based framework has many advantages: (1) more flexibility in both detector design and fusion strategies, as these two parts can be optimized separately; (2) parallel and distributed computational components in primitive feature detection. In such a component-based framework, any primitive component can be replaced by a new one while other components remain unchanged; (3) incremental information integration; (4) high level context information as additional information sources, which can be combined with bottom-up processing at any stage.
This dissertation presents the basic principles, criteria, and techniques for detector design and hypothesis verification based on statistical detection and decision theory. In addition, evidence fusion strategies are investigated. Several novel detection algorithms and evidence fusion methods are proposed, and their effectiveness is demonstrated in automatic speech recognition and broadcast news video segmentation systems. We believe such a detection-based framework can be employed in more applications in the future.
Ph.D.
Committee Chair: Lee, Chin-Hui; Committee Member: Clements, Mark; Committee Member: Ghovanloo, Maysam; Committee Member: Romberg, Justin; Committee Member: Yuan, Min