Band-pass filtering of the time sequences of spectral parameters for robust wireless speech recognition
In this paper we address the problem of automatic speech recognition (ASR) when wireless speech communication systems are involved. In this context, three main sources of distortion should be considered: acoustic environment, speech coding and transmission errors. Whilst the first has already received a lot of attention, the last two deserve further investigation in our opinion. We have found that band-pass filtering of the recognition features improves ASR performance when distortions due to these particular communication systems are present. Furthermore, we have evaluated two alternative configurations at different bit error rates (BER) typical of these channels: band-pass filtering the LP-MFCC parameters, and a modification of RASTA-PLP using a sharper low-pass section. These perform consistently better than LP-MFCC and RASTA-PLP, respectively.
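As an illustration of what band-pass filtering the time sequence of a spectral parameter can look like, here is a minimal sketch. It uses the classic RASTA-style filter shape as a stand-in, which is an assumption: the abstract does not give the filter coefficients used in the paper, and the names `rasta_filter` and `filter_features` and the pole value of 0.98 are illustrative only.

```python
def rasta_filter(trajectory, pole=0.98):
    """Band-pass filter one feature trajectory (one coefficient over time).

    Assumed filter: y[t] = sum_k b[k] * x[t-k] + pole * y[t-1],
    with b = 0.1 * [2, 1, 0, -1, -2]. The taps sum to zero, so constant
    (DC) offsets in the trajectory are suppressed over time.
    """
    numer = [0.2, 0.1, 0.0, -0.1, -0.2]
    out = []
    prev_y = 0.0
    for t in range(len(trajectory)):
        # FIR part: weighted slope over the last five frames.
        fir = 0.0
        for k, b in enumerate(numer):
            if t - k >= 0:
                fir += b * trajectory[t - k]
        # IIR part: leaky integration, which sets the low cut-off.
        y = fir + pole * prev_y
        out.append(y)
        prev_y = y
    return out


def filter_features(frames, pole=0.98):
    """Filter every coefficient of a list of per-frame feature vectors."""
    n_coeffs = len(frames[0])
    columns = [
        rasta_filter([frame[c] for frame in frames], pole)
        for c in range(n_coeffs)
    ]
    # Transpose back to one vector per frame.
    return [[columns[c][t] for c in range(n_coeffs)] for t in range(len(frames))]
```

Each coefficient is filtered independently along the time axis, so a slowly varying offset in a trajectory is attenuated while faster modulations pass through.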
Robust Adaptive Median Binary Pattern for noisy texture classification and retrieval
Texture is an important cue for different computer vision tasks and
applications. Local Binary Pattern (LBP) is considered one of the best yet
efficient texture descriptors. However, LBP has some notable limitations,
chiefly its sensitivity to noise. In this paper, we address these limitations by
introducing a novel texture descriptor, Robust Adaptive Median Binary Pattern
(RAMBP). RAMBP is based on a classification process for noisy pixels, an adaptive
analysis window, scale analysis and comparison of image region medians. The
proposed method handles highly noisy textures, and increases the
discriminative properties by capturing microstructure and macrostructure
texture information. The proposed method has been evaluated on popular texture
datasets for classification and retrieval tasks, and under different high noise
conditions. Without any training or prior knowledge of noise type, RAMBP achieved
the best classification compared to state-of-the-art techniques. It scored more
than under impulse noise densities, more than under
Gaussian noised textures with standard deviation , and more than
under Gaussian blurred textures with standard deviation .
The proposed method yielded competitive results and high performance as one of
the best descriptors in noise-free texture classification. Furthermore, RAMBP
also showed high performance in noisy texture retrieval, providing high recall
and precision for textures with high levels of noise.
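To make the idea of thresholding against a median concrete, the sketch below computes a plain 3x3 median-thresholded binary code. This is a simplified illustration and not RAMBP itself: the noisy-pixel classification, adaptive window and scale analysis described in the abstract are omitted, and the function name `median_binary_pattern` is hypothetical.

```python
from statistics import median


def median_binary_pattern(image, row, col):
    """Return a 9-bit code for the 3x3 patch centred at (row, col).

    Each pixel of the patch (centre included) is compared with the patch
    median; pixels at or above the median set their bit. Assumes (row, col)
    is not on the image border.
    """
    patch = [
        image[r][c]
        for r in range(row - 1, row + 2)
        for c in range(col - 1, col + 2)
    ]
    med = median(patch)
    code = 0
    for bit, value in enumerate(patch):
        if value >= med:
            code |= 1 << bit
    return code


clean = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
noisy = [[1, 2, 3], [4, 255, 6], [7, 8, 9]]  # impulse on the centre pixel
print(median_binary_pattern(clean, 1, 1))  # 496
print(median_binary_pattern(noisy, 1, 1))  # 496
```

In this small example the impulse on the centre pixel leaves the code unchanged, because the median moves only slightly. A threshold taken from the centre pixel value would instead follow the corrupted value.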
Learning An Invariant Speech Representation
Recognition of speech, and in particular the ability to generalize and learn
from small sets of labelled examples like humans do, depends on an appropriate
representation of the acoustic input. We formulate the problem of finding
robust speech features for supervised learning with small sample complexity as
a problem of learning representations of the signal that are maximally
invariant to intraclass transformations and deformations. We propose an
extension of a theory for unsupervised learning of invariant visual
representations to the auditory domain and empirically evaluate its validity
for voiced speech sound classification. Our version of the theory requires the
memory-based, unsupervised storage of acoustic templates -- such as specific
phones or words -- together with all the transformations of each that normally
occur. A quasi-invariant representation for a speech segment can be obtained by
projecting it to each template orbit, i.e., the set of transformed signals, and
computing the associated one-dimensional empirical probability distributions.
The computations can be performed by modules of filtering and pooling, and
extended to hierarchical architectures. In this paper, we apply a single-layer,
multicomponent representation for phonemes and demonstrate improved accuracy
and decreased sample complexity for vowel classification compared to standard
spectral, cepstral and perceptual features.
Comment: CBMM Memo No. 022, 5 pages, 2 figures
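The projection onto a template orbit followed by an empirical distribution can be sketched as follows. This is a toy illustration under stated assumptions: the transformation set is taken to be circular shifts, the distribution is a fixed-bin histogram, and the names `circular_shifts` and `orbit_signature` are hypothetical. The abstract does not specify the transformations or templates used in the memo.

```python
def circular_shifts(template):
    """Return the orbit of a template under circular shifts (assumed group)."""
    n = len(template)
    return [template[k:] + template[:k] for k in range(n)]


def orbit_signature(signal, template, n_bins=8, lo=-1.0, hi=1.0):
    """Histogram of the dot products between a signal and a template orbit.

    The normalised histogram is a one-dimensional empirical distribution of
    the projections. Values outside [lo, hi) are clipped to the end bins.
    """
    projections = [
        sum(s * t for s, t in zip(signal, shifted))
        for shifted in circular_shifts(template)
    ]
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for p in projections:
        idx = int((p - lo) / width)
        idx = min(max(idx, 0), n_bins - 1)
        counts[idx] += 1
    return [c / len(projections) for c in counts]


signal = [1, 0, 2, 0, 1, 3]
template = [1, -1, 0, 2, 0, -1]
shifted_signal = signal[2:] + signal[:2]
same = (
    orbit_signature(signal, template, lo=-10, hi=10)
    == orbit_signature(shifted_signal, template, lo=-10, hi=10)
)
print(same)  # True
```

Shifting the signal only permutes the set of projections onto the orbit, so the histogram is unchanged. Integer inputs are used here so that the comparison is exact.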
Focusing on the Big Picture: Insights into a Systems Approach to Deep Learning for Satellite Imagery
Deep learning tasks are often complicated and require a variety of components
working together efficiently to perform well. Due to the often large scale of
these tasks, there is a necessity to iterate quickly in order to attempt a
variety of methods and to find and fix bugs. While participating in IARPA's
Functional Map of the World challenge, we identified challenges along the
entire deep learning pipeline and found various solutions to these challenges.
In this paper, we present the performance, engineering, and deep learning
considerations with processing and modeling data, as well as underlying
infrastructure considerations that support large-scale deep learning tasks. We
also discuss insights and observations with regard to satellite imagery and
deep learning for image classification.
Comment: Accepted to IEEE Big Data 201