The DiTME Project: interdisciplinary research in music technology
This paper profiles the emergence of a significant body of research in audio engineering within the Faculties of Engineering and Applied Arts at Dublin Institute of Technology. Over a period of five years, the group has had significant success in completing a Strand 3 research project entitled Digital Tools for Music Education (DiTME).
AVISARME: Audio Visual Synchronization Algorithm for a Robotic Musician Ensemble
This thesis presents a beat detection algorithm that combines audio and visual inputs to synchronize a robotic musician with its human counterpart. Although considerable work has been done on sophisticated methods for audio beat detection, the visual aspect of musicianship has been largely ignored. With advances in image processing techniques, as well as in computer and imaging technologies, it has recently become feasible to integrate visual inputs into beat detection algorithms. Additionally, the proposed method for audio tempo detection attempts to solve many issues that are present in current algorithms. Current audio-only algorithms have imperfections: they are inaccurate, too computationally expensive, or suffer from poor resolution. In experimental testing on both a popular music database and simulated music signals, the proposed algorithm performed statistically better in both accuracy and robustness than the baseline approaches. Furthermore, the proposed approach is extremely efficient, taking only 45 ms to compute on a 2.5 s signal, and maintains an extremely high temporal resolution of 0.125 BPM. The visual integration also relies on Full Scene Tracking, allowing it to be utilized for live beat detection for practically all musicians and instruments. Numerous optimization techniques have been implemented, such as pyramidal optimization (PO) and clustering techniques, which are presented in this thesis. A Temporal Difference Learning approach to sensor fusion and beat synchronization is also proposed and tested thoroughly. This TD learning algorithm implements a novel policy switching criterion which provides a stable, yet quickly reacting estimation of tempo. The proposed algorithm has been implemented and tested on a robotic drummer to verify the validity of the approach. The results from testing are documented in great detail and compared with previously proposed approaches.
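For readers unfamiliar with audio tempo estimation, the following is a minimal sketch of a generic autocorrelation-based estimator. It is not the algorithm proposed in the thesis, which the abstract does not specify. The function names, the 100 Hz frame rate, the 60 to 200 BPM search range, and the synthetic onset envelope are all assumptions made for illustration.

```python
def autocorrelation(x, lag):
    """Unnormalized autocorrelation of sequence x at a non-negative integer lag."""
    return sum(x[i] * x[i + lag] for i in range(len(x) - lag))


def estimate_tempo(envelope, frame_rate, bpm_min=60.0, bpm_max=200.0):
    """Return the tempo (BPM) whose beat period maximizes the autocorrelation.

    envelope   : onset-strength values, one per analysis frame (hypothetical input)
    frame_rate : frames per second of the envelope
    """
    # Convert the BPM search range into a range of lags measured in frames.
    lag_min = int(round(60.0 * frame_rate / bpm_max))
    lag_max = int(round(60.0 * frame_rate / bpm_min))
    lag_max = min(lag_max, len(envelope) - 1)
    best_lag = max(range(lag_min, lag_max + 1),
                   key=lambda lag: autocorrelation(envelope, lag))
    return 60.0 * frame_rate / best_lag


# Synthetic 2.5 s onset envelope at 100 frames/s with one onset every 0.5 s.
frame_rate = 100
envelope = [0.0] * 250
for frame in range(0, 250, 50):
    envelope[frame] = 1.0

tempo = estimate_tempo(envelope, frame_rate)
```

Because this sketch searches integer lags only, the tempo grid is coarse: at 100 frames/s, the lags adjacent to the 120 BPM lag correspond to roughly 117.6 and 122.4 BPM. That kind of limited resolution is one of the shortcomings of simple estimators that the abstract alludes to.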
Structured Sub-Nyquist Sampling with Applications in Compressive Toeplitz Covariance Estimation, Super-Resolution and Phase Retrieval
Sub-Nyquist sampling has received a huge amount of interest in the past decade. In classical compressed sensing theory, if the measurement procedure satisfies a particular condition known as the Restricted Isometry Property (RIP), we can achieve stable recovery of signals with low-dimensional intrinsic structures using an order-wise optimal sample size. Such low-dimensional structures include sparsity and low rank in both the vector and matrix cases. The main drawback of conventional compressed sensing theory is that random measurements are required to ensure the RIP. However, in many applications such as imaging and array signal processing, applying independent random measurements may not be practical, as the systems are deterministic. Moreover, compressed sensing based on random measurements always relies on convex programs for signal recovery, even in the noiseless case, and solving those programs is computationally intensive if the ambient dimension is large, especially in the matrix case. The main contribution of this dissertation is that we propose a deterministic sub-Nyquist sampling framework for compressing structured signals and develop computationally efficient algorithms. Besides the widely studied sparse and low-rank structures, we particularly focus on the cases where the signals of interest are stationary or the measurements are of Fourier type. The key difference between our work and classical compressed sensing theory is that we explicitly exploit the second-order statistics of the signals and study the equivalent quadratic measurement model in the correlation domain. The essential observation made in this dissertation is that a difference/sum coarray structure arises from the quadratic model if the measurements are of Fourier type. With these observations, we are able to achieve a better compression rate for covariance estimation, identify more sources in array signal processing, or recover signals of larger sparsity.
In this dissertation, we will first study the problem of Toeplitz covariance estimation. In particular, we will show how to achieve an order-wise optimal compression rate using the idea of sparse arrays in both the general and low-rank cases. Then, an analysis framework for super-resolution with a positivity constraint is established. We will present fundamental robustness guarantees, efficient algorithms, and practical applications. Next, we will study the problem of phase retrieval, to which we successfully apply the sparse array ideas by fully exploiting the quadratic measurement model. We achieve near-optimal sample complexity for both the sparse and general cases with practical Fourier measurements and provide efficient and deterministic recovery algorithms. Finally, we will further elaborate on the essential role of the non-negative constraint in underdetermined inverse problems. In particular, we will analyze the nonlinear co-array interpolation problem and develop a universal upper bound on the interpolation error. A bilinear problem with a non-negative constraint will be considered next, and an exact characterization of the ambiguous solutions will be established for the first time in the literature. Lastly, we will show how to apply the nested array idea to solve real problems such as Kriging. Using spatial correlation information, we are able to obtain a stable estimate of the field of interest with fewer sensors than classical methodologies. Extensive numerical experiments are carried out to demonstrate our theoretical claims.
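The difference coarray mentioned in the abstract can be illustrated with a small sketch. It assumes the standard two-level nested array geometry (an inner uniform segment plus a sparser outer segment, with positions in units of the base spacing); the abstract itself gives no geometry, so the construction and the function names `nested_array` and `difference_coarray` are illustrative assumptions.

```python
def nested_array(n1, n2):
    """Sensor positions of a two-level nested array, in units of the base spacing.

    Inner level: n1 sensors at 1, 2, ..., n1.
    Outer level: n2 sensors at (n1 + 1), 2 * (n1 + 1), ..., n2 * (n1 + 1).
    """
    inner = list(range(1, n1 + 1))
    outer = [(n1 + 1) * k for k in range(1, n2 + 1)]
    return inner + outer


def difference_coarray(positions):
    """Sorted set of all pairwise position differences (the correlation lags)."""
    return sorted({p - q for p in positions for q in positions})


positions = nested_array(3, 3)        # 6 physical sensors
lags = difference_coarray(positions)  # distinct lags seen in the correlation domain
```

With `n1 = n2 = 3`, the six sensors sit at positions 1, 2, 3, 4, 8, 12, and their pairwise differences cover every integer lag from -11 to 11. This is why working with second-order statistics can expose more degrees of freedom than the number of physical sensors.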
A Sequential MUSIC algorithm for Scatterers Detection in SAR Tomography Enhanced by a Robust Covariance Estimator
Synthetic aperture radar (SAR) tomography (TomoSAR) is an appealing tool for
the extraction of height information of urban infrastructures. Due to the
widespread applications of the MUSIC algorithm in source localization, it is a
suitable solution in TomoSAR when multiple snapshots (looks) are available.
While the classical MUSIC algorithm aims to estimate the whole reflectivity
profile of scatterers, sequential MUSIC algorithms are suited for the detection
of sparse point-like scatterers. In this class of methods, successive
cancellation is performed through orthogonal complement projections on the
MUSIC power spectrum. In this work, a new sequential MUSIC algorithm, named
recursive covariance canceled MUSIC (RCC-MUSIC), is proposed. This method
brings higher accuracy in comparison with the previous sequential methods at
the price of a negligible increase in computational cost. Furthermore, to
improve the performance of RCC-MUSIC, it is combined with the recent method of
covariance matrix estimation called correlation subspace. Utilizing the
correlation subspace method results in a denoised covariance matrix which, in
turn, increases the accuracy of subspace-based methods. Several numerical
examples are presented to compare the performance of the proposed method with
the relevant state-of-the-art methods. Simulation results demonstrate the
efficiency of the proposed subspace method in terms of estimation accuracy
and computational load.
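As background for the abstract above, here is a minimal sketch of the classical MUSIC pseudo-spectrum. It is not RCC-MUSIC and it does not use the correlation subspace estimator. It assumes a generic uniform linear array with half-wavelength spacing rather than a TomoSAR acquisition geometry. All function names, the simulated source angles, and the noise level are illustrative assumptions.

```python
import numpy as np


def steering(theta_deg, m):
    """Steering vectors (m x n_angles) of a half-wavelength uniform linear array."""
    theta = np.deg2rad(np.atleast_1d(theta_deg))
    k = np.arange(m)[:, None]
    return np.exp(1j * np.pi * k * np.sin(theta)[None, :])


def music_spectrum(R, n_sources, grid_deg):
    """Classical MUSIC pseudo-spectrum from a sample covariance matrix R."""
    m = R.shape[0]
    _, eigvecs = np.linalg.eigh(R)           # eigenvalues in ascending order
    noise_subspace = eigvecs[:, : m - n_sources]
    proj = noise_subspace.conj().T @ steering(grid_deg, m)
    return 1.0 / np.sum(np.abs(proj) ** 2, axis=0)


def top_peaks(spectrum, grid_deg, n):
    """Angles of the n highest local maxima of the pseudo-spectrum, sorted."""
    idx = [i for i in range(1, len(spectrum) - 1)
           if spectrum[i] > spectrum[i - 1] and spectrum[i] >= spectrum[i + 1]]
    idx.sort(key=lambda i: spectrum[i], reverse=True)
    return sorted(float(grid_deg[i]) for i in idx[:n])


# Simulated snapshots: two uncorrelated sources plus weak white noise (assumed values).
rng = np.random.default_rng(0)
m, n_snap = 8, 200
A = steering([-20.0, 30.0], m)
S = (rng.standard_normal((2, n_snap)) + 1j * rng.standard_normal((2, n_snap))) / np.sqrt(2)
N = 0.1 * (rng.standard_normal((m, n_snap)) + 1j * rng.standard_normal((m, n_snap))) / np.sqrt(2)
X = A @ S + N
R = X @ X.conj().T / n_snap                  # sample covariance over the snapshots

grid = np.arange(-90.0, 90.5, 0.5)
estimates = top_peaks(music_spectrum(R, 2, grid), grid, 2)
```

The sample covariance `R` is the quantity a denoising step would act on before the eigendecomposition. A sequential variant would detect one scatterer at a time and cancel it before searching for the next, instead of picking all peaks at once as this sketch does.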
Efficient methods for joint estimation of multiple fundamental frequencies in music signals
This study presents efficient techniques for multiple fundamental frequency estimation in music signals. The proposed methodology can infer harmonic patterns from a mixture, considering interactions with other sources, and evaluate them in a joint estimation scheme. For this purpose, a set of fundamental frequency candidates is first selected at each frame, and several hypothetical combinations of them are generated. Combinations are independently evaluated, and the most likely one is selected, taking into account the intensity and spectral smoothness of its inferred patterns. The method is extended by considering adjacent frames in order to smooth the detection in time, and a pitch tracking stage is finally performed to increase the temporal coherence. The proposed algorithms were evaluated in MIREX contests, yielding state-of-the-art results with a very low computational burden. This study was supported by the project DRIMS (code TIN2009-14247-C02), the Consolider Ingenio 2010 research programme (project MIPRCV, CSD2007-00018), and the PASCAL2 Network of Excellence, IST-2007-216886.