4,559 research outputs found

Scalable and perceptual audio compression

Author: Raad Mohammed
Publication venue: School of Electrical, Computer and Telecommunications Engineering
Publication date: 01/01/2003

This thesis deals with scalable perceptual audio compression. Two scalable perceptual solutions and a scalable-to-lossless solution are proposed and investigated. One of the scalable perceptual solutions is built around sinusoidal modelling of the audio signal, whilst the other is built on a transform coding paradigm. The scalable coders are shown to scale in both a waveform-matching manner and a psychoacoustic manner. To measure the psychoacoustic scalability of the systems investigated in this thesis, the psychoacoustic parameters of the original signal are compared with those of the synthesized signal. The psychoacoustic parameters used are loudness, sharpness, tonality and roughness. This analysis technique is novel to this thesis, and it gives insight into the perceptual distortion introduced by any coder analyzed in this manner.

Research Online

Adaptive RD Optimized Hybrid Sound Coding

Author: A. Niamut Omar
Bensa Julien
Christensen Mads Græsbøll
Colomes Catherine
Edler Bernd
H. Plasberg Jan
H. van Schijndel Nicolle
Heusdens Richard
Jensen Jesper
Jensen Søren Holdt
Kleijn W. Bastiaan
Kot Valery
Kovesi Bala Zs
Lindblom Jonas
Massaloux Dominique
Nordén Fredrik
Vafin Renat
Van De Par Steven
Virette David
Wübbolt Oliver
Publication date: 01/01/2008

VBN

The DESAM toolbox: spectral analysis of musical audio

Author: Badeau Roland
Bertin Nancy
Daudet Laurent
David Bertrand
Derrien Olivier
Echeveste Jose
Lagrange Mathieu
Marchand Sylvain
Publication venue: HAL CCSD
Publication date: 01/09/2010

This paper presents the DESAM Toolbox, a set of Matlab functions dedicated to the estimation of widely used spectral models for music signals. Although those models can be used in Music Information Retrieval (MIR) tasks, the core functions of the toolbox do not focus on any specific application. Rather, the toolbox aims to provide a range of state-of-the-art signal processing tools that decompose music files according to different signal models, giving rise to different "mid-level" representations. After motivating the need for such a toolbox, this paper offers an overview of its overall organization and describes all available functionalities.

HAL-CentraleSupelec

HAL AMU

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Rennes 1

An adaptive perception-based image preprocessing method

Author: Bruni Vittoria
Selesnick Ivan William
Tarchi L.
Vitulano D.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2015

The aim of this paper is to introduce an adaptive preprocessing procedure based on human perception in order to increase the performance of some standard image processing techniques. Specifically, the image frequency content is weighted by the corresponding value of the contrast sensitivity function, in agreement with the sensitivity of the human eye to different image frequencies and contrasts. The 2D rational-dilation wavelet transform is employed to represent image frequencies, since it provides an adaptive and flexible multiresolution framework that adapts easily to the image frequency content. Preliminary experimental results show that the proposed preprocessing increases the performance of some standard image enhancement algorithms in terms of visual quality and often also in terms of PSNR.

Crossref

Archivio della ricerca- Università di Roma La Sapienza

Audio Analysis/synthesis System

A method and apparatus for the automatic analysis, synthesis and modification of audio signals, based on an overlap-add sinusoidal model, is disclosed. Automatic analysis of amplitude, frequency and phase parameters of the model is achieved using an analysis-by-synthesis procedure which incorporates successive approximation, yielding synthetic waveforms which are very good approximations to the original waveforms and are perceptually identical to the original sounds. A generalized overlap-add sinusoidal model is introduced which can modify audio signals without objectionable artifacts. In addition, a new approach to pitch-scale modification allows for the use of arbitrary spectral envelope estimates and addresses the problems of high-frequency loss and noise amplification encountered with prior art methods. The overlap-add synthesis method provides the ability to synthesize sounds with computational efficiency rivaling that of synthesis using the discrete short-time Fourier transform (DSTFT) while eliminating the modification artifacts associated with that method.
Georgia Tech Research Corporation

Scholarly Materials And Research @ Georgia Tech
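
As a rough illustration of the overlap-add sinusoidal model named in the abstract above, the sketch below synthesizes each frame as a sum of constant-parameter sinusoids, applies a Hann window, and adds the frames at a hop of half the frame length. It is a generic textbook construction, not the analysis-by-synthesis procedure of the listed work; the function names, the 50% overlap, and the choice of window are all assumptions made for this example.

```python
import math

def hann(n):
    # Periodic Hann window: copies shifted by n/2 sum to exactly 1.
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def synthesize_frame(partials, n, sample_rate):
    # One frame as a sum of sinusoids with fixed parameters.
    # partials: list of (amplitude, frequency_hz, phase_rad at frame start).
    frame = [0.0] * n
    for amp, freq, phase in partials:
        w = 2.0 * math.pi * freq / sample_rate
        for i in range(n):
            frame[i] += amp * math.cos(w * i + phase)
    return frame

def overlap_add(frames_partials, n, sample_rate):
    # Window each synthetic frame and add it at a hop of n/2 samples
    # (n is assumed to be even).
    hop = n // 2
    window = hann(n)
    out = [0.0] * (hop * (len(frames_partials) + 1))
    for k, partials in enumerate(frames_partials):
        frame = synthesize_frame(partials, n, sample_rate)
        start = k * hop
        for i in range(n):
            out[start + i] += window[i] * frame[i]
    return out

# Usage: a stationary 440 Hz tone whose phase is advanced frame by frame,
# so that overlapping frames agree with each other.
sample_rate, n, freq = 8000, 64, 440.0
w = 2.0 * math.pi * freq / sample_rate
frames = [[(1.0, freq, w * k * (n // 2))] for k in range(4)]
signal = overlap_add(frames, n, sample_rate)
```

Because the shifted Hann windows sum to one, the fully overlapped part of `signal` reproduces the tone exactly; when parameters differ between frames, the same windows cross-fade from one parameter set to the next.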

Single-Channel and Multi-Channel Sinusoidal Audio Coding Using Compressed Sensing

Author: Anthony Griffin
Athanasios Mouchtaris
Christos Tzagkarakis
Panagiotis Tsakalides
Toni Hirvonen
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

Multichannel high resolution NMF for modelling convolutive mixtures of non-stationary signals in the time-frequency domain

Author: Badeau Roland
Plumbley Mark
Publication date: 01/01/2013

Several probabilistic models involving latent components have been proposed for modeling time-frequency (TF) representations of audio signals such as spectrograms, notably in the nonnegative matrix factorization (NMF) literature. Among them, the recent high-resolution NMF (HR-NMF) model is able to take both phases and local correlations in each frequency band into account, and its potential has been illustrated in applications such as source separation and audio inpainting. In this paper, HR-NMF is extended to multichannel signals and to convolutive mixtures. The new model can represent a variety of stationary and non-stationary signals, including autoregressive moving average (ARMA) processes and mixtures of damped sinusoids. A fast variational expectation-maximization (EM) algorithm is proposed to estimate the enhanced model. This algorithm is applied to piano signals, and proves capable of accurately modeling reverberation, restoring missing observations, and separating pure tones with close frequencies.

Crossref

Queen Mary Research Online

Surrey Research Insight

Superposition frames for adaptive time-frequency analysis and fast reconstruction

Author: Basu Prabahan
Rudoy Daniel
Wolfe Patrick J.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 03/11/2009

In this article we introduce a broad family of adaptive, linear time-frequency representations termed superposition frames, and show that they admit desirable fast overlap-add reconstruction properties akin to standard short-time Fourier techniques. This approach stands in contrast to many adaptive time-frequency representations in the extant literature, which, while more flexible than standard fixed-resolution approaches, typically fail to provide efficient reconstruction and often lack the regular structure necessary for precise frame-theoretic analysis. Our main technical contributions come through the development of properties which ensure that this construction provides for a numerically stable, invertible signal representation. Our primary algorithmic contributions come via the introduction and discussion of specific signal adaptation criteria in deterministic and stochastic settings, based respectively on time-frequency concentration and nonstationarity detection. We conclude with a short speech enhancement example that serves to highlight potential applications of our approach.

arXiv.org e-Print Archive

Crossref