
21,704 research outputs found

A Subband-Based SVM Front-End for Robust ASR

Author: Ager Matthew
Cvetkovic Zoran
Sollich Peter
Yousafzai Jibran
Publication date: 24/12/2013

This work proposes a novel support vector machine (SVM) based robust automatic speech recognition (ASR) front-end that operates on an ensemble of the subband components of high-dimensional acoustic waveforms. The key issues of selecting the appropriate SVM kernels for classification in frequency subbands and the combination of individual subband classifiers using ensemble methods are addressed. The proposed front-end is compared with state-of-the-art ASR front-ends in terms of robustness to additive noise and linear filtering. Experiments performed on the TIMIT phoneme classification task demonstrate the benefits of the proposed subband-based SVM front-end: it outperforms the standard cepstral front-end in the presence of noise and linear filtering for signal-to-noise ratios (SNR) below 12 dB. A combination of the proposed front-end with a conventional front-end such as MFCC yields further improvements over the individual front-ends across the full range of noise levels.

arXiv.org e-Print Archive

King's Research Portal

Unmasking Clever Hans Predictors and Assessing What Machines Really Learn

Author: Binder Alexander
Lapuschkin Sebastian
Montavon Grégoire
Müller Klaus-Robert
Samek Wojciech
Wäldchen Stephan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 26/02/2019

Current learning machines have successfully solved hard application problems, reaching high accuracy and displaying seemingly "intelligent" behavior. Here we apply recent techniques for explaining decisions of state-of-the-art learning machines and analyze various tasks from computer vision and arcade games. This showcases a spectrum of problem-solving behaviors ranging from naive and short-sighted to well-informed and strategic. We observe that standard performance evaluation metrics can be oblivious to distinguishing these diverse problem-solving behaviors. Furthermore, we propose our semi-automated Spectral Relevance Analysis that provides a practically effective way of characterizing and validating the behavior of nonlinear learning machines. This helps to assess whether a learned model indeed delivers reliably for the problem that it was conceived for. Furthermore, our work intends to add a voice of caution to the ongoing excitement about machine intelligence and pledges to evaluate and judge some of these recent successes in a more nuanced manner. Comment: Accepted for publication in Nature Communications

arXiv.org e-Print Archive

Directory of Open Access Journals

Fraunhofer-ePrints

Inferring Room Semantics Using Acoustic Monitoring

Author: Harras Khaled A.
Raj Bhiksha
Shah Muhammad A.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 24/10/2017

Knowledge of the user's environmental context, i.e., the user's indoor location and the semantics of their environment, can facilitate the development of many location-aware applications. In this paper, we propose an acoustic monitoring technique that infers semantic knowledge about an indoor space over time, using audio recordings from it. Our technique uses the impulse response of these spaces as well as the ambient sounds produced in them in order to determine a semantic label for them. As we process more recordings, we update our confidence in the assigned label. We evaluate our technique on a dataset of single-speaker human speech recordings obtained in different types of rooms at three university buildings. In our evaluation, the confidence for the true label generally outstripped the confidence for all other labels and in some cases converged to 100% with fewer than 30 samples. Comment: 2017 IEEE International Workshop on Machine Learning for Signal Processing, Sept. 25-28, 2017, Tokyo, Japan

arXiv.org e-Print Archive

Crossref

Deep Learning for Audio Signal Processing

Author: Chang Shuo-yiin
Li Bo
Purwins Hendrik
Sainath Tara
Schlüter Jan
Virtanen Tuomas
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/05/2019

Given the recent surge in developments of deep learning, this article provides a review of the state-of-the-art deep learning techniques for audio signal processing. Speech, music, and environmental sound processing are considered side-by-side, in order to point out similarities and differences between the domains, highlighting general methods, problems, key references, and potential for cross-fertilization between areas. The dominant feature representations (in particular, log-mel spectra and raw waveform) and deep learning models are reviewed, including convolutional neural networks, variants of the long short-term memory architecture, as well as more audio-specific neural network models. Subsequently, prominent deep learning application areas are covered, i.e., audio recognition (automatic speech recognition, music information retrieval, environmental sound detection, localization and tracking) and synthesis and transformation (source separation, audio enhancement, generative models for speech, sound, and music synthesis). Finally, key issues and future questions regarding deep learning applied to audio signal processing are identified. Comment: 15 pages, 2 pdf figures

arXiv.org e-Print Archive

VBN

Probabilistic Modeling Paradigms for Audio Source Separation

Author: A. P. Dempster
A. Gelman
D. L. Wang
D. FitzGerald
J. Nocedal
J. Winn
M. I. Mandel
R. J. Weiss
R. Mukai
S. T. Roweis
S. Makino
Publication venue: 'IGI Global'
Publication date: 01/01/2010

This is the author's final version of the article, first published as E. Vincent, M. G. Jafari, S. A. Abdallah, M. D. Plumbley, M. E. Davies. Probabilistic Modeling Paradigms for Audio Source Separation. In W. Wang (Ed.), Machine Audition: Principles, Algorithms and Systems. Chapter 7, pp. 162-185. IGI Global, 2011. ISBN 978-1-61520-919-4. DOI: 10.4018/978-1-61520-919-4.ch007

Most sound scenes result from the superposition of several sources, which can be separately perceived and analyzed by human listeners. Source separation aims to provide machine listeners with similar skills by extracting the sounds of individual sources from a given scene. Existing separation systems operate either by emulating the human auditory system or by inferring the parameters of probabilistic sound models. In this chapter, the authors focus on the latter approach and provide a joint overview of established and recent models, including independent component analysis, local time-frequency models and spectral template-based models. They show that most models are instances of one of the following two general paradigms: linear modeling or variance modeling. They compare the merits of either paradigm and report objective performance figures. They also conclude by discussing promising combinations of probabilistic priors and inference algorithms that could form the basis of future state-of-the-art systems.

HAL-CentraleSupelec

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

Queen Mary Research Online

Surrey Research Insight

HAL-Rennes 1

The SED Machine: a robotic spectrograph for fast transient classification

Author: Ben-Ami Sagi
Blagorodnova Nadejda
Dekany Richard G.
Fremling Christoffer
Fucik Jason R.
Konidaris Nick
Kulkarni Shrinivas R.
Nash Reston
Neill James D.
Ngeow Chow-Choong
Ofek Eran O.
Quimby Robert
Ritter Andreas
O'Sullivan Donal
Vyhmeister Karl E.
Walters Richard
Publication venue: 'IOP Publishing'
Publication date: 11/10/2017

Current time-domain facilities are finding several hundred transient astronomical events a year. The discovery rate is expected to increase in the future as soon as new surveys such as the Zwicky Transient Facility (ZTF) and the Large Synoptic Survey Telescope (LSST) come online. At the present time, the rate at which transients are classified is approximately one order of magnitude lower than the discovery rate, leading to an increasing "follow-up drought". Existing telescopes with moderate aperture can help address this deficit when equipped with spectrographs optimized for spectral classification. Here, we provide an overview of the design, operations and first results of the Spectral Energy Distribution Machine (SEDM), operating on the Palomar 60-inch telescope (P60). The instrument is optimized for classification and high observing efficiency. It combines a low-resolution (R ~ 100) integral field unit (IFU) spectrograph with the "Rainbow Camera" (RC), a multi-band field acquisition camera which also serves as a multi-band (ugri) photometer. The SEDM was commissioned during the operation of the intermediate Palomar Transient Factory (iPTF) and has already lived up to its promise. The success of the SEDM demonstrates the value of spectrographs optimized for spectral classification. Introduction of similar spectrographs on existing telescopes will help alleviate the follow-up drought and thereby accelerate the rate of discoveries. Comment: 21 pages, 20 figures

arXiv.org e-Print Archive

Caltech Authors

Astronomical Spectroscopy

Author: A Dressler
A Hoag
A Sota
AB Meinel
AI Sheinis
AK Pierce
AV Filippenko
B Atwood
B Campbell
BE Woodgate
CD Mackay
CE Moore
CG Wynne
D Fabricant
DE Osterbrock
DE Turnshek
DL Depoy
E Oliva
EG Loewen
EH Richardson
EM Levesque
F Pepe
F Schweizer
G Walker
GH Jacoby
GH Rieke
IS Bowen
J Allington-Smith
J Tonry
JB Oke
JL Bean
JL Marshall
K Horne
KF Neugent
L Valdivielso
M Hamuy
MM Hanson
MR Drout
NR Walborn
P Massey
PS Conti
R Coluzzi
RPS Stone
TR Marsh
VD Ivanov
WP Bidelman
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 09/05/2011

Spectroscopy is one of the most important tools that an astronomer has for studying the universe. This chapter begins by discussing the basics, including the different types of optical spectrographs, with extension to the ultraviolet and the near-infrared. Emphasis is given to the fundamentals of how spectrographs are used, and the trade-offs involved in designing an observational experiment. It then covers observing and reduction techniques, noting that some of the standard practices of flat-fielding often actually degrade the quality of the data rather than improve it. Although the focus is on point sources, spatially resolved spectroscopy of extended sources is also briefly discussed. Discussion of differential extinction, the impact of crowding, multi-object techniques, optimal extractions, flat-fielding considerations, and determining radial velocities and velocity dispersions provides the spectroscopist with the fundamentals needed to obtain the best data. Finally, the chapter combines the previous material by providing some examples of real-life observing experiences with several typical instruments. Comment: An abridged version of a chapter to appear in Planets, Stars and Stellar Systems, to be published in 2011 by Springer. Slightly revised

arXiv.org e-Print Archive

Crossref