Incremental ART: A Neural Network System for Recognition by Incremental Feature Extraction
Incremental ART extends adaptive resonance theory (ART) by incorporating mechanisms for efficient recognition through incremental feature extraction. The system achieves efficient, confident prediction by acquiring only those features necessary to discriminate an input pattern. These capabilities are achieved through three modifications to the fuzzy ART system: (1) a partial feature vector complement coding rule extends fuzzy ART logic to allow recognition based on partial feature vectors; (2) an F2 decision criterion measures ART predictive confidence; (3) an incremental feature extraction layer computes the next feature to extract based on a measure of predictive value. Our system is demonstrated on a face recognition problem but has general applicability as a machine vision solution and as a model for studying scanning patterns. Supported by: Office of Naval Research (N00014-92-J-4015, N00014-92-J-1309, N00014-91-4100); Air Force Office of Scientific Research (90-0083); National Science Foundation (IRI 90-00530
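As a rough illustration of modification (1), the sketch below complement-codes a feature vector in which some features have not been extracted yet. The abstract does not state the rule itself, so the treatment of unknown features here is an assumption: each one is coded as 1 in both the on and off channels, so that it cannot lower the fuzzy ART choice value of any category. The names `complement_code`, `choice` and `alpha` are hypothetical.

```python
def complement_code(features):
    """Complement-code a partial feature vector.

    `features` holds floats in [0, 1], with None for features that have
    not been extracted yet. Assumption (not taken from the abstract): an
    unknown feature is coded as (1, 1), so min(input, weight) equals the
    weight on that feature and it never penalizes a category.
    """
    on = [1.0 if f is None else f for f in features]
    off = [1.0 if f is None else 1.0 - f for f in features]
    return on + off


def choice(coded_input, weights, alpha=0.001):
    """Fuzzy ART choice function |I ^ w| / (alpha + |w|).

    `^` is the component-wise minimum and |.| is the L1 norm.
    """
    overlap = sum(min(i, w) for i, w in zip(coded_input, weights))
    return overlap / (alpha + sum(weights))


# Two features, only the first one extracted so far.
coded = complement_code([0.25, None])
score = choice(coded, [0.5, 0.5, 0.5, 0.5], alpha=0.0)
```

Under this assumption, a category is scored only on the features seen so far, which is what would let a system defer extracting further features until they are needed.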
Adversarial Network Bottleneck Features for Noise Robust Speaker Verification
In this paper, we propose a noise robust bottleneck feature representation
which is generated by an adversarial network (AN). The AN consists of two
cascade-connected networks: an encoding network (EN) and a discriminative network (DN).
Mel-frequency cepstral coefficients (MFCCs) of clean and noisy speech are used
as input to the EN and the output of the EN is used as the noise robust
feature. The EN and DN are trained in turn: when training the DN, the noise
types serve as the training labels; when training the EN, all labels are set
to the same value, the clean speech label. This is intended to make the AN
features invariant to noise and thus noise robust. We evaluate the
proposed feature on a Gaussian Mixture Model-Universal Background Model
based speaker verification system and compare it with MFCC
features of speech enhanced by short-time spectral amplitude minimum mean
square error (STSA-MMSE) and deep neural network-based speech enhancement
(DNN-SE) methods. Experimental results on the RSR2015 database show that the
proposed AN bottleneck feature (AN-BN) dramatically outperforms the STSA-MMSE
and DNN-SE based MFCCs for different noise types and signal-to-noise ratios.
Furthermore, the AN-BN feature is able to improve the speaker verification
performance under the clean condition.
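The alternating label scheme described in this abstract can be sketched as follows. This is a minimal, framework-free illustration and not the authors' implementation: it shows only which target labels each training phase would use, and the names `training_labels`, `alternating_phases` and `CLEAN` are hypothetical.

```python
CLEAN = "clean"  # hypothetical name for the clean speech label


def training_labels(phase, conditions):
    """Return the target labels for one batch in the given phase.

    `conditions` lists the true condition of each utterance in the
    batch, e.g. "clean" or a noise type.
    """
    if phase == "DN":
        # The discriminative network learns to identify the noise type.
        return list(conditions)
    if phase == "EN":
        # The encoding network is trained with every target set to the
        # clean label, pushing its output toward noise invariance.
        return [CLEAN] * len(conditions)
    raise ValueError(f"unknown phase: {phase!r}")


def alternating_phases(rounds):
    """Yield the phases in turn: DN, EN, DN, EN, ..."""
    for _ in range(rounds):
        yield "DN"
        yield "EN"


batch = ["clean", "babble", "car"]
schedule = [(p, training_labels(p, batch)) for p in alternating_phases(1)]
```

In a real system each phase would also update only the corresponding network's parameters; that part depends on the training framework and is omitted here.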
SEGAN: Speech Enhancement Generative Adversarial Network
Current speech enhancement techniques operate in the spectral domain and/or
exploit some higher-level feature. The majority of them tackle a limited number
of noise conditions and rely on first-order statistics. To circumvent these
issues, deep networks are being increasingly used, thanks to their ability to
learn complex functions from large example sets. In this work, we propose the
use of generative adversarial networks for speech enhancement. In contrast to
current techniques, we operate at the waveform level, training the model
end-to-end, and incorporate 28 speakers and 40 different noise conditions into
the same model, such that model parameters are shared across them. We evaluate
the proposed model using an independent, unseen test set with two speakers and
20 alternative noise conditions. The enhanced samples confirm the viability of
the proposed model, and both objective and subjective evaluations confirm its
effectiveness. With that, we open the exploration of generative
architectures for speech enhancement, which may progressively incorporate
further speech-centric design choices to improve their performance. Comment: 5 pages, 4 figures, accepted in INTERSPEECH 201