5,602 research outputs found
High-resolution sinusoidal analysis for resolving harmonic collisions in music audio signal processing
Many music signals can largely be considered an additive combination of
multiple sources, such as musical instruments or voice. If the musical sources
are pitched instruments, the spectra they produce are predominantly harmonic,
and are thus well suited to an additive sinusoidal model. However,
due to resolution limits inherent in time-frequency analyses, when the harmonics
of multiple sources occupy equivalent time-frequency regions, their
individual properties are additively combined in the time-frequency representation
of the mixed signal. Any such time-frequency point in a mixture
where multiple harmonics overlap produces a single observation from which
the contributions owed to each of the individual harmonics cannot be trivially
deduced. These overlaps are referred to as overlapping partials or harmonic
collisions. If one wishes to infer some information about individual sources in
music mixtures, the information carried in regions where collided harmonics
exist becomes unreliable due to interference from other sources. This interference
has ramifications in a variety of music signal processing applications
such as multiple fundamental frequency estimation, source separation, and
instrumentation identification.
This thesis addresses harmonic collisions in music signal processing applications.
As a solution to the harmonic collision problem, a class of signal
subspace-based high-resolution sinusoidal parameter estimators is explored.
Specifically, the direct matrix pencil method, or equivalently, the Estimation
of Signal Parameters via Rotational Invariance Techniques (ESPRIT)
method, is used with the goal of producing estimates of the salient parameters
of individual harmonics that occupy equivalent time-frequency regions. This
estimation method is adapted here to be applicable to time-varying signals
such as musical audio. While high-resolution methods have been previously
explored in the context of music signal processing, previous work has not
addressed whether or not such methods truly produce high-resolution sinusoidal parameter estimates in real-world music audio signals. Therefore, this
thesis answers the question of whether high-resolution sinusoidal parameter
estimators are really high-resolution for real music signals.
This work directly explores the capabilities of this form of sinusoidal parameter
estimation to resolve collided harmonics. The capabilities of this
analysis method are also explored in the context of music signal processing
applications. Potential benefits of high-resolution sinusoidal analysis are
examined in experiments involving multiple fundamental frequency estimation
and audio source separation. This work shows that there are indeed
benefits to high-resolution sinusoidal analysis in music signal processing applications,
especially when compared to methods that produce sinusoidal
parameter estimates based on more traditional time-frequency representations.
The benefits of this form of sinusoidal analysis are made most evident
in multiple fundamental frequency estimation applications, where substantial
performance gains are seen. High-resolution analysis in the context of
computational auditory scene analysis-based source separation shows similar
performance to existing comparable methods
FRIDA: FRI-Based DOA Estimation for Arbitrary Array Layouts
In this paper we present FRIDA---an algorithm for estimating directions of
arrival of multiple wideband sound sources. FRIDA combines multi-band
information coherently and achieves state-of-the-art resolution at extremely
low signal-to-noise ratios. It works for arbitrary array layouts, but unlike
the various steered response power and subspace methods, it does not require a
grid search. FRIDA leverages recent advances in sampling signals with a finite
rate of innovation. It is based on the insight that for any array layout, the
entries of the spatial covariance matrix can be linearly transformed into a
uniformly sampled sum of sinusoids.Comment: Submitted to ICASSP201
Direct torque control for dual three-phase induction motor drives
A direct torque control (DTC) strategy for dual three-phase induction motor drives is discussed in this paper. The induction machine has two sets of stator three-phase windings spatially shifted by 30 electrical degrees. The DTC strategy is based on a predictive algorithm and is implemented in a synchronous reference frame aligned with the machine stator flux vector. The advantages of the discussed control strategy are constant inverter switching frequency, good transient and steady-state performance, and low distortion of machine currents with respect to direct self-control (DSC) and other DTC schemes with variable switching frequency. Experimental results are presented for a 10-kW DTC dual three-phase induction motor drive prototype
A Detailed Investigation into Low-Level Feature Detection in Spectrogram Images
Being the first stage of analysis within an image, low-level feature detection is a crucial step in the image analysis process and, as such, deserves suitable attention. This paper presents a systematic investigation into low-level feature detection in spectrogram images. The result of which is the identification of frequency tracks. Analysis of the literature identifies different strategies for accomplishing low-level feature detection. Nevertheless, the advantages and disadvantages of each are not explicitly investigated. Three model-based detection strategies are outlined, each extracting an increasing amount of information from the spectrogram, and, through ROC analysis, it is shown that at increasing levels of extraction the detection rates increase. Nevertheless, further investigation suggests that model-based detection has a limitation—it is not computationally feasible to fully evaluate the model of even a simple sinusoidal track. Therefore, alternative approaches, such as dimensionality reduction, are investigated to reduce the complex search space. It is shown that, if carefully selected, these techniques can approach the detection rates of model-based strategies that perform the same level of information extraction. The implementations used to derive the results presented within this paper are available online from http://stdetect.googlecode.com
- …