1,856 research outputs found
Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments
Eliminating the negative effect of non-stationary environmental noise is a
long-standing research topic for automatic speech recognition that still
remains an important challenge. Data-driven supervised approaches, including
ones based on deep neural networks, have recently emerged as potential
alternatives to traditional unsupervised approaches and, with sufficient
training, can alleviate the shortcomings of the unsupervised methods in various
real-life acoustic environments. In this light, we review recently developed,
representative deep learning approaches for tackling non-stationary additive
and convolutional degradation of speech with the aim of providing guidelines
for those involved in the development of environmentally robust speech
recognition systems. We separately discuss single- and multi-channel techniques
developed for the front-end and back-end of speech recognition systems, as well
as joint front-end and back-end training frameworks.
Interpretable and Efficient Beamforming-Based Deep Learning for Single Snapshot DOA Estimation
We introduce an interpretable deep learning approach for direction of arrival
(DOA) estimation with a single snapshot. Classical subspace-based methods like
MUSIC and ESPRIT use spatial smoothing on uniform linear arrays for single
snapshot DOA estimation but face drawbacks in reduced array aperture and
inapplicability to sparse arrays. Single-snapshot methods such as compressive
sensing and the iterative adaptive approach (IAA) encounter challenges with high
computational costs and slow convergence, hampering real-time use. Recent deep
learning DOA methods offer promising accuracy and speed. However, the practical
deployment of deep networks is hindered by their black-box nature. To address
this, we propose a deep-MPDR network translating minimum power distortionless
response (MPDR)-type beamformer into deep learning, enhancing generalization
and efficiency. Comprehensive experiments conducted using both simulated and
real-world datasets substantiate its dominance in terms of inference time and
accuracy in comparison to conventional methods. Moreover, it excels in terms of
efficiency, generalizability, and interpretability when contrasted with other
deep learning DOA estimation networks.
Comment: 10 pages, 10 figures
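For readers unfamiliar with the MPDR beamformer named in the abstract above, the following is a minimal sketch of the classical MPDR spatial spectrum computed from a single snapshot. It is not the deep-MPDR network proposed in the paper. The half-wavelength uniform linear array, the diagonal loading value, the angle grid, and all function names are assumptions made for illustration.

```python
import numpy as np


def steering_vector(theta_deg, n_sensors, spacing=0.5):
    """Steering vector of a uniform linear array.

    `spacing` is the element spacing in wavelengths (assumed 0.5 here).
    """
    k = np.arange(n_sensors)
    phase = 2j * np.pi * spacing * k * np.sin(np.deg2rad(theta_deg))
    return np.exp(phase)


def mpdr_spectrum(snapshot, grid_deg, loading=1e-2):
    """Classical MPDR spatial spectrum P(theta) = 1 / (a^H R^-1 a).

    With one snapshot the sample covariance x x^H has rank one, so
    diagonal loading (an assumption of this sketch) makes it invertible.
    """
    n = snapshot.shape[0]
    cov = np.outer(snapshot, snapshot.conj()) + loading * np.eye(n)
    cov_inv = np.linalg.inv(cov)
    power = np.empty(len(grid_deg))
    for i, theta in enumerate(grid_deg):
        a = steering_vector(theta, n)
        power[i] = 1.0 / np.real(a.conj() @ cov_inv @ a)
    return power


# Hypothetical scenario: one noiseless source at 20 degrees, 16 sensors.
grid = np.arange(-90.0, 90.5, 0.5)
x = steering_vector(20.0, 16)
spectrum = mpdr_spectrum(x, grid)
estimate = grid[np.argmax(spectrum)]
```

In this noiseless single-source case the spectrum peaks at the true angle. With noise or several closely spaced sources, the rank-one covariance limits resolution, which is part of what motivates the single-snapshot methods discussed in the abstract.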
Generalized DOA and Source Number Estimation Techniques for Acoustics and Radar
The purpose of this thesis is to highlight gaps in the field of direction of arrival (DOA) estimation and to propose building blocks for continued solution development in the area. Current methods are reviewed and their pitfalls are emphasized. DOA estimators are compared to each other for use on a conformal microphone array that receives impulsive, wideband signals. Further, many DOA estimators rely on knowing the number of source signals prior to DOA estimation. Though techniques exist to estimate this number, they lack robustness for certain signal types, particularly in the case where multiple radar targets exist in the same range bin. A deep neural network approach is proposed and evaluated for this particular case. The studies detailed in this thesis are specific to acoustic and radar applications for DOA estimation.
- …