Model-based analysis of noisy musical recordings with application to audio restoration

Author: Esquef Paulo A. A.
Publication venue: Teknillinen korkeakoulu
Publication date: 02/04/2004

This thesis proposes digital signal processing algorithms for noise reduction and enhancement of audio signals. Approximately half of the work concerns signal modeling techniques for suppression of localized disturbances in audio signals, such as impulsive noise and low-frequency pulses. In this regard, novel algorithms and modifications to previous propositions are introduced with the aim of achieving a better balance between computational complexity and qualitative performance, in comparison with other schemes presented in the literature. The main contributions related to this set of articles are: an efficient algorithm for suppression of low-frequency pulses in audio signals; a scheme for impulsive noise detection that uses frequency-warped linear prediction; and two methods for reconstruction of audio signals within long gaps of missing samples. The remaining part of the work discusses applications of sound source modeling (SSM) techniques to audio restoration. It comprises application examples, such as a method for bandwidth extension of guitar tones, and discusses the challenge of model calibration based on noisy recorded sources. Regarding this matter, a frequency-selective spectral analysis technique called frequency-zooming ARMA (FZ-ARMA) modeling is proposed as an effective way to estimate the frequency and decay time of resonance modes associated with the partials of a given tone, despite the presence of corrupting noise in the observable signal.

Aaltodoc Publication Archive

Multimodal person recognition for human-vehicle interaction

Author: Abut Hüseyin
Erçil Aytül
Erdoğan Hakan
Erzin Engin
Tekalp A. Murat
Yemez Yücel
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/04/2006

Next-generation vehicles will undoubtedly feature biometric person recognition as part of an effort to improve the driving experience. Today's technology prevents such systems from operating satisfactorily under adverse conditions. A proposed framework for achieving person recognition successfully combines different biometric modalities, as borne out in two case studies.

Sabanci University Research Database

End to End Deep Neural Network Frequency Demodulation of Speech Signals

Author: DE Rumelhart
Indranil Hatai
M Amini
M Schuster
M Önder
N Srivastava
RE Turner
S Hochreiter
T Goehring
Y Xu
Publication date: 07/10/2017

Frequency modulation (FM) is a form of radio broadcasting that has been in wide use for almost a century. We suggest a software-defined-radio (SDR) receiver for FM demodulation that adopts an end-to-end learning-based approach and utilizes prior information about the transmitted speech message in the demodulation process. The receiver detects and enhances speech from the in-phase and quadrature components of its baseband version. The new system yields high-performance detection under both acoustical disturbances and communication channel noise, and is foreseen to outperform the established methods in low signal-to-noise ratio (SNR) conditions in both mean square error and perceptual evaluation of speech quality score.

arXiv.org e-Print Archive

Crossref
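
For orientation, the abstract above contrasts its learned receiver with established demodulation methods that work on the in-phase and quadrature (I/Q) components of the baseband signal. The following is a minimal sketch of one such conventional baseline, a phase-difference discriminator, and not the neural receiver the paper proposes. The function names, the frequency deviation and the sample rate are assumptions chosen for illustration, and the channel is taken to be noise-free.

```python
import cmath
import math


def fm_modulate(message, freq_dev_hz, sample_rate_hz):
    """Build complex baseband I/Q samples whose phase integrates the message.

    Hypothetical helper for the demo; assumes message values lie in [-1, 1].
    """
    phase = 0.0
    iq = []
    for m in message:
        phase += 2.0 * math.pi * freq_dev_hz * m / sample_rate_hz
        iq.append(cmath.exp(1j * phase))  # real part = I, imaginary part = Q
    return iq


def fm_demodulate(iq, freq_dev_hz, sample_rate_hz):
    """Recover the message from the phase step between consecutive I/Q samples.

    Valid while the per-sample phase step stays below pi, i.e. when
    freq_dev_hz * max|message| is less than sample_rate_hz / 2.
    """
    scale = sample_rate_hz / (2.0 * math.pi * freq_dev_hz)
    out = []
    for prev, cur in zip(iq, iq[1:]):
        # The angle of cur * conj(prev) is the phase advance over one sample.
        out.append(cmath.phase(cur * prev.conjugate()) * scale)
    return out


# Demo values (assumed): 440 Hz tone, 5 kHz deviation, 48 kHz sampling.
SAMPLE_RATE_HZ = 48000.0
FREQ_DEV_HZ = 5000.0
message = [math.sin(2.0 * math.pi * 440.0 * n / SAMPLE_RATE_HZ) for n in range(200)]
iq = fm_modulate(message, FREQ_DEV_HZ, SAMPLE_RATE_HZ)
recovered = fm_demodulate(iq, FREQ_DEV_HZ, SAMPLE_RATE_HZ)
```

Because the discriminator differences adjacent samples, `recovered` is one sample shorter than `message` and lines up with `message[1:]`. Differencing the phase also amplifies additive noise, which is one reason conventional discriminators tend to degrade in the low-SNR conditions the abstract targets.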

"Set phasors to stun": an algorithm to improve phase coherence on transients in multi-microphone recordings

Author: Paterson Justin
Publication date: 02/09/2007

Ever since the advent of multi-microphone recording, sound engineers have wrestled with the colouration of sound by phasing issues. For some this was anathema; for others this colouration was a crucial ingredient of the finished product. Traditionally, delicate microphone placement was essential, with subtle movements and tilts allowing the producer/engineer to determine when a sound was “in phase” based on perception alone. More recently, DAWs have allowed us to view multiple waveforms and nudge them into coherence, with visual feedback now supporting the aural, although this is still a manual process. This paper will present an algorithm that allows automatic correction of phase via a unique Max/MSP patch operating on multiple audio components simultaneously. With a single button push, the producer can now hear a stereo recording with maximum coherence and thus make an artistic judgment as to whether the “ideal” is ideal, or whether it is better to pursue naturally occurring phase colouration in preference. In addition, the patch allows zoning in to spatially separated sound sources, e.g. tuning drum kit overheads to phase lock with the snare drum or hi-hat microphone. Audio examples will be played and the patch demonstrated in action. Limiting factors, contexts and applications will also be discussed.

UWL Repository
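
The abstract above does not describe how its Max/MSP patch computes the phase correction. As a loosely related illustration only, the sketch below automates the manual "nudging" of one microphone signal against another by searching for the integer sample lag that maximizes their cross-correlation. This is a generic time-alignment technique swapped in for illustration, not the paper's algorithm, and the function names and test signals are assumptions.

```python
def best_lag(reference, other, max_lag):
    """Return the lag, in samples, by which `other` trails `reference`.

    Tries every integer lag in [-max_lag, max_lag] and keeps the one with
    the largest cross-correlation score.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, r in enumerate(reference):
            k = n + lag
            if 0 <= k < len(other):
                score += r * other[k]
        if score > best_score:
            best, best_score = lag, score
    return best


def align(other, lag):
    """Shift `other` earlier by `lag` samples, zero-padding to keep its length."""
    if lag >= 0:
        return other[lag:] + [0.0] * lag
    return [0.0] * (-lag) + other[:lag]


# Demo (assumed data): a short transient, and the same transient 3 samples late,
# as a more distant microphone might capture it.
close_mic = [0.0, 0.0, 0.0, 1.0, -0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
far_mic = [0.0] * 3 + close_mic[:-3]

lag = best_lag(close_mic, far_mic, max_lag=5)
far_mic_aligned = align(far_mic, lag)
```

A whole-sample shift of this kind only removes the bulk arrival-time difference between microphones. Sub-sample delays and frequency-dependent phase differences would need a finer method, and whether the aligned result sounds better remains the artistic judgment the abstract describes.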