129,490 research outputs found

    Joint Multi-Pitch Detection Using Harmonic Envelope Estimation for Polyphonic Music Transcription

    Get PDF
    In this paper, a method for automatic transcription of music signals based on joint multiple-F0 estimation is proposed. As a time-frequency representation, the constant-Q resonator time-frequency image is employed, while a novel noise suppression technique based on pink noise assumption is applied in a preprocessing step. In the multiple-F0 estimation stage, the optimal tuning and inharmonicity parameters are computed and a salience function is proposed in order to select pitch candidates. For each pitch candidate combination, an overlapping partial treatment procedure is used, which is based on a novel spectral envelope estimation procedure for the log-frequency domain, in order to compute the harmonic envelope of candidate pitches. In order to select the optimal pitch combination for each time frame, a score function is proposed which combines spectral and temporal characteristics of the candidate pitches and also aims to suppress harmonic errors. For postprocessing, hidden Markov models (HMMs) and conditional random fields (CRFs) trained on MIDI data are employed, in order to boost transcription accuracy. The system was trained on isolated piano sounds from the MAPS database and was tested on classic and jazz recordings from the RWC database, as well as on recordings from a Disklavier piano. A comparison with several state-of-the-art systems is provided using a variety of error metrics, where encouraging results are indicated

    Non-linear minimum variance estimation for discrete-time multi-channel systems

    Get PDF
    A nonlinear operator approach to estimation in discrete-time systems is described. It involves inferential estimation of a signal which enters a communications channel involving both nonlinearities and transport delays. The measurements are assumed to be corrupted by a colored noise signal which is correlated with the signal to be estimated. The system model may also include a communications channel involving either static or dynamic nonlinearities. The signal channel is represented in a very general nonlinear operator form. The algorithm is relatively simple to derive and to implement

    On the Enhancement of Generalized Integrator-based Adaptive Filter Dynamic Tuning Range

    Get PDF

    Multiple-F0 estimation of piano sounds exploiting spectral structure and temporal evolution

    Get PDF
    This paper proposes a system for multiple fundamental frequency estimation of piano sounds using pitch candidate selection rules which employ spectral structure and temporal evolution. As a time-frequency representation, the Resonator Time-Frequency Image of the input signal is employed, a noise suppression model is used, and a spectral whitening procedure is performed. In addition, a spectral flux-based onset detector is employed in order to select the steady-state region of the produced sound. In the multiple-F0 estimation stage, tuning and inharmonicity parameters are extracted and a pitch salience function is proposed. Pitch presence tests are performed utilizing information from the spectral structure of pitch candidates, aiming to suppress errors occurring at multiples and sub-multiples of the true pitches. A novel feature for the estimation of harmonically related pitches is proposed, based on the common amplitude modulation assumption. Experiments are performed on the MAPS database using 8784 piano samples of classical, jazz, and random chords with polyphony levels between 1 and 6. The proposed system is computationally inexpensive, being able to perform multiple-F0 estimation experiments in realtime. Experimental results indicate that the proposed system outperforms state-of-the-art approaches for the aforementioned task in a statistically significant manner. Index Terms: multiple-F0 estimation, resonator timefrequency image, common amplitude modulatio

    Universal direct tuner for loop control in industry

    Get PDF
    This paper introduces a direct universal (automatic) tuner for basic loop control in industrial applications. The direct feature refers to the fact that a first-hand model, such as a step response first-order plus dead time approximation, is not required. Instead, a point in the frequency domain and the corresponding slope of the loop frequency response is identified by single test suitable for industrial applications. The proposed method has been shown to overcome pitfalls found in other (automatic) tuning methods and has been validated in a wide range of common and exotic processes in simulation and experimental conditions. The method is very robust to noise, an important feature for real life industrial applications. Comparison is performed with other well-known methods, such as approximate M-constrained integral gain optimization (AMIGO) and Skogestad internal model controller (SIMC), which are indirect methods, i.e., they are based on a first-hand approximation of step response data. The results indicate great similarity between the results, whereas the direct method has the advantage of skipping this intermediate step of identification. The control structure is the most commonly used in industry, i.e., proportional-integral-derivative (PID) type. As the derivative action is often not used in industry due to its difficult choice, in the proposed method, we use a direct relation between the integral and derivative gains. This enables the user to have in the tuning structure the advantages of the derivative action, therefore much improving the potential of good performance in real life control applications

    Group Iterative Spectrum Thresholding for Super-Resolution Sparse Spectral Selection

    Full text link
    Recently, sparsity-based algorithms are proposed for super-resolution spectrum estimation. However, to achieve adequately high resolution in real-world signal analysis, the dictionary atoms have to be close to each other in frequency, thereby resulting in a coherent design. The popular convex compressed sensing methods break down in presence of high coherence and large noise. We propose a new regularization approach to handle model collinearity and obtain parsimonious frequency selection simultaneously. It takes advantage of the pairing structure of sine and cosine atoms in the frequency dictionary. A probabilistic spectrum screening is also developed for fast computation in high dimensions. A data-resampling version of high-dimensional Bayesian Information Criterion is used to determine the regularization parameters. Experiments show the efficacy and efficiency of the proposed algorithms in challenging situations with small sample size, high frequency resolution, and low signal-to-noise ratio

    EffiTest: Efficient Delay Test and Statistical Prediction for Configuring Post-silicon Tunable Buffers

    Full text link
    At nanometer manufacturing technology nodes, process variations significantly affect circuit performance. To combat them, post- silicon clock tuning buffers can be deployed to balance timing bud- gets of critical paths for each individual chip after manufacturing. The challenge of this method is that path delays should be mea- sured for each chip to configure the tuning buffers properly. Current methods for this delay measurement rely on path-wise frequency stepping. This strategy, however, requires too much time from ex- pensive testers. In this paper, we propose an efficient delay test framework (EffiTest) to solve the post-silicon testing problem by aligning path delays using the already-existing tuning buffers in the circuit. In addition, we only test representative paths and the delays of other paths are estimated by statistical delay prediction. Exper- imental results demonstrate that the proposed method can reduce the number of frequency stepping iterations by more than 94% with only a slight yield loss.Comment: ACM/IEEE Design Automation Conference (DAC), June 201

    Towards End-to-End Acoustic Localization using Deep Learning: from Audio Signal to Source Position Coordinates

    Full text link
    This paper presents a novel approach for indoor acoustic source localization using microphone arrays and based on a Convolutional Neural Network (CNN). The proposed solution is, to the best of our knowledge, the first published work in which the CNN is designed to directly estimate the three dimensional position of an acoustic source, using the raw audio signal as the input information avoiding the use of hand crafted audio features. Given the limited amount of available localization data, we propose in this paper a training strategy based on two steps. We first train our network using semi-synthetic data, generated from close talk speech recordings, and where we simulate the time delays and distortion suffered in the signal that propagates from the source to the array of microphones. We then fine tune this network using a small amount of real data. Our experimental results show that this strategy is able to produce networks that significantly improve existing localization methods based on \textit{SRP-PHAT} strategies. In addition, our experiments show that our CNN method exhibits better resistance against varying gender of the speaker and different window sizes compared with the other methods.Comment: 18 pages, 3 figures, 8 table
    • …
    corecore