A Phase Vocoder based on Nonstationary Gabor Frames
We propose a new algorithm for time stretching music signals based on the
theory of nonstationary Gabor frames (NSGFs). The algorithm extends the
techniques of the classical phase vocoder (PV) by incorporating adaptive
time-frequency (TF) representations and adaptive phase locking. The adaptive TF
representations imply good time resolution for the onsets of attack transients
and good frequency resolution for the sinusoidal components. We estimate the
phase values only at peak channels and the remaining phases are then locked to
the values of the peaks in an adaptive manner. During attack transients we keep
the stretch factor equal to one and we propose a new strategy for determining
which channels are relevant for reinitializing the corresponding phase values.
In contrast to previously published algorithms we use a non-uniform NSGF to
obtain a low redundancy of the corresponding TF representation. We show that
with just three times as many TF coefficients as signal samples, artifacts such
as phasiness and transient smearing can be greatly reduced compared to the
classical PV. The proposed algorithm is tested on both synthetic and real-world
signals and compared with state-of-the-art algorithms in a reproducible manner.
Comment: 10 pages, 6 figures
Frame Theory for Signal Processing in Psychoacoustics
This review chapter aims to strengthen the link between frame theory and
signal processing tasks in psychoacoustics. On the one hand, the basic concepts
of frame theory are presented and some proofs are provided to explain those
concepts in some detail. The goal is to reveal to hearing scientists how this
mathematical theory could be relevant for their research. In particular, we
focus on frame theory in a filter bank approach, which is probably the most
relevant viewpoint for audio signal processing. On the other hand, basic
psychoacoustic concepts are presented to stimulate mathematicians to apply
their knowledge in this field.
An Entropy Based Method for Local Time-Adaptation of the Spectrogram
We propose a method for automatic local time-adaptation of the spectrogram of
audio signals: it is based on the decomposition of a signal within a Gabor
multi-frame through the STFT operator. The sparsity of the analysis in every
individual frame of the multi-frame is evaluated through Rényi entropy
measures: the best local resolution is determined by minimizing the entropy
values. The overall spectrogram of the signal we obtain thus provides locally
optimal resolution that evolves adaptively over time. We give examples of the
performance of our algorithm with an instrumental sound and a synthetic one,
showing the improvement in spectrogram display obtained with an automatic
adaptation of the resolution. The analysis operator is invertible, thus leading
to a perfect reconstruction of the original signal through the analysis
coefficients.
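The entropy criterion described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it picks one window length for a whole signal rather than adapting locally, and the function names, the Hann window, the fixed FFT size, the hop size, and the Rényi order `alpha=3` are all assumptions made for illustration.

```python
import numpy as np

def stft_power(x, win_len, n_fft=1024, hop=64):
    """Power spectrogram on a fixed time-frequency grid (hypothetical helper).

    The Hann window of length win_len is centred inside an n_fft-sample
    frame, so every candidate window yields the same number of cells.
    """
    win = np.zeros(n_fft)
    start = (n_fft - win_len) // 2
    win[start:start + win_len] = np.hanning(win_len)
    xp = np.pad(np.asarray(x, dtype=float), n_fft // 2)
    frames = np.array([xp[i:i + n_fft] * win
                       for i in range(0, len(xp) - n_fft + 1, hop)])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def renyi_entropy(power, alpha=3.0):
    """Renyi entropy (in bits) of a normalised power distribution."""
    p = power / power.sum()
    return np.log2(np.sum(p ** alpha)) / (1.0 - alpha)

def best_window(x, win_lens, alpha=3.0):
    """Return the candidate window length with the lowest entropy."""
    entropies = {w: renyi_entropy(stft_power(x, w), alpha) for w in win_lens}
    return min(entropies, key=entropies.get), entropies

# A steady sinusoid is sparser with a long window, an impulse with a short one.
n = np.arange(4096)
tone = np.sin(2 * np.pi * 440 / 8000 * n)
click = np.zeros(4096)
click[2048] = 1.0
tone_choice, _ = best_window(tone, [64, 1024])
click_choice, _ = best_window(click, [64, 1024])
```

Lower entropy means the energy is concentrated in fewer time-frequency cells, which is why the minimum is taken as the preferred resolution.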
Algorithms and architectures for the multirate additive synthesis of musical tones
In classical Additive Synthesis (AS), the output signal is the sum of a large number of independently controllable sinusoidal partials. The advantages of AS for music synthesis are well known, as is the high computational cost. This thesis is concerned with the computational optimisation of AS by multirate DSP techniques. In note-based music synthesis, the expected bounds of the frequency trajectory of each partial in a finite lifecycle tone determine critical time-invariant partial-specific sample rates which are lower than the conventional rate (in excess of 40 kHz), resulting in computational savings. Scheduling and interpolation (to suppress quantisation noise) for many sample rates is required, leading to the concept of Multirate Additive Synthesis (MAS), where these overheads are minimised by synthesis filterbanks which quantise the set of available sample rates. Alternative AS optimisations are also appraised. It is shown that a hierarchical interpretation of the QMF filterbank preserves AS generality and permits efficient context-specific adaptation of computation to required note dynamics. Practical QMF implementation and the modifications necessary for MAS are discussed. QMF transition widths can be logically excluded from the MAS paradigm, at a cost. Therefore a novel filterbank is evaluated where transition widths are physically excluded. Benchmarking of a hypothetical orchestral synthesis application provides a tentative quantitative analysis of the performance improvement of MAS over AS. The mapping of MAS into VLSI is opened by a review of sine computation techniques. Then the functional specification and high-level design of a conceptual MAS Coprocessor (MASC) is developed which functions with high autonomy in a loosely-coupled master-slave configuration with a Host CPU which executes filterbanks in software. Standard hardware optimisation techniques are used, such as pipelining, based upon the principle of an application-specific memory hierarchy which maximises MASC throughput.
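As a rough illustration of the idea in this abstract, the sketch below sums sinusoidal partials at a single rate and then estimates how much oscillator work could be saved if each partial ran at a reduced rate. It is a simplification, not the thesis's design: the function names are hypothetical, the reduced rates are assumed to be successive halvings of the base rate, each partial is treated as a lowpass signal bounded by its highest frequency, and the filterbank and interpolation overheads are ignored.

```python
import numpy as np

def additive_synth(freqs, amps, duration, sample_rate=44100):
    """Classical additive synthesis: a sum of fixed-frequency sinusoids."""
    t = np.arange(int(duration * sample_rate)) / sample_rate
    out = np.zeros_like(t)
    for f, a in zip(freqs, amps):
        out += a * np.sin(2 * np.pi * f * t)
    return out

def quantised_rate(f_max, base_rate=44100.0, max_halvings=8):
    """Lowest rate of the form base_rate / 2**k that still exceeds 2 * f_max.

    Assumption: the available rates are octave-spaced halvings.
    """
    rate = base_rate
    for _ in range(max_halvings):
        if rate / 2 > 2 * f_max:
            rate /= 2
        else:
            break
    return rate

def relative_cost(partial_f_max, base_rate=44100.0):
    """Oscillator samples per second relative to running every partial at base_rate."""
    rates = [quantised_rate(f, base_rate) for f in partial_f_max]
    return sum(rates) / (base_rate * len(rates))

# Ten harmonics of a 220 Hz tone, amplitudes falling off as 1/k.
harmonics = [220.0 * k for k in range(1, 11)]
tone = additive_synth(harmonics, [1.0 / k for k in range(1, 11)], duration=0.5)
cost = relative_cost(harmonics)
```

Low partials dominate the saving here because their frequency bounds allow the most halvings; a realistic estimate would also need to charge for the interpolation stages.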
2-D Prony-Huang Transform: A New Tool for 2-D Spectral Analysis
This work proposes an extension of the 1-D Hilbert Huang transform for the
analysis of images. The proposed method consists in (i) adaptively decomposing
an image into oscillating parts called intrinsic mode functions (IMFs) using a
mode decomposition procedure, and (ii) providing a local spectral analysis of
the obtained IMFs in order to get the local amplitudes, frequencies, and
orientations. For the decomposition step, we propose two robust 2-D mode
decompositions based on non-smooth convex optimization: a "Genuine 2-D"
approach, that constrains the local extrema of the IMFs, and a "Pseudo 2-D"
approach, which constrains separately the extrema of lines, columns, and
diagonals. The spectral analysis step is based on the Prony annihilation property
that is applied on small square patches of the IMFs. The resulting 2-D
Prony-Huang transform is validated on simulated and real data.
Comment: 24 pages, 7 figures
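The annihilation property mentioned in this abstract can be shown in its simplest 1-D form. This is only a sketch under stated assumptions, not the paper's 2-D patch-based estimator: the function name is hypothetical, the signal is assumed to be a noiseless sum of `order` complex exponentials, and the filter is found by ordinary least squares.

```python
import numpy as np

def prony_frequencies(x, order):
    """Estimate normalised frequencies (cycles/sample) via an annihilating filter.

    Solves x[n] + h[1] x[n-1] + ... + h[order] x[n-order] = 0 in the
    least-squares sense, then reads the frequencies off the roots of
    z**order + h[1] z**(order-1) + ... + h[order].
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Each row holds the `order` samples preceding x[k + order], newest first.
    A = np.array([x[k:k + order][::-1] for k in range(n - order)])
    b = -x[order:]
    h, *_ = np.linalg.lstsq(A, b, rcond=None)
    roots = np.roots(np.concatenate(([1.0], h)))
    return np.angle(roots) / (2 * np.pi)

# A real cosine is two complex exponentials, so order = 2.
samples = np.cos(2 * np.pi * 0.1 * np.arange(64))
freqs = prony_frequencies(samples, order=2)
```

For the cosine above the two roots lie on the unit circle at conjugate angles, so the estimates come out as a positive and a negative copy of the same frequency.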