38,958 research outputs found

A versatile pitch tracking algorithm : from human speech to killer whale vocalizations

Authors: Shapiro Ari D.; Wang Chao
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 01/01/2009

Author Posting. © Acoustical Society of America, 2009. This article is posted here by permission of Acoustical Society of America for personal use, not for redistribution. The definitive version was published in Journal of the Acoustical Society of America 126 (2009): 451-459, doi:10.1121/1.3132525. In this article, a pitch tracking algorithm [named discrete logarithmic Fourier transformation-pitch detection algorithm (DLFT-PDA)], originally designed for human telephone speech, was modified for killer whale vocalizations. The multiple frequency components of some of these vocalizations demand a spectral (rather than temporal) approach to pitch tracking. The DLFT-PDA derives reliable estimations of pitch and the temporal change of pitch from the harmonic structure of the vocal signal. Scores from both estimations are combined in a dynamic programming search to find a smooth pitch track. The algorithm is capable of tracking killer whale calls that contain simultaneous low and high frequency components and compares favorably across most signal-to-noise ratio ranges to the peak-picking and sidewinder algorithms that have been used for tracking killer whale vocalizations previously. C.W. was supported by DARPA under Contract No. N66001-96-C-8526, monitored through Naval Command, Control, and Ocean Surveillance Center and by the National Science Foundation under Grant No. IRI-9618731. A.D.S. was supported by a National Defense Science and Engineering Graduate Fellowship.
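
The abstract above mentions combining per-frame scores in a dynamic programming search to obtain a smooth pitch track. The following is only a generic sketch of that idea, not the DLFT-PDA itself: the function name `smooth_pitch_track`, the score table, and the linear `jump_penalty` are all assumptions made for illustration.

```python
def smooth_pitch_track(scores, jump_penalty):
    """Pick one candidate index per frame, trading score against smoothness.

    scores[t][c] is a hypothetical score for pitch candidate c in frame t.
    Moving between candidates p and c in adjacent frames costs
    jump_penalty * abs(c - p) (an assumed, purely illustrative penalty).
    """
    n_cand = len(scores[0])
    best = list(scores[0])          # best cumulative score ending at each candidate
    back = []                       # back-pointers, one list per later frame
    for frame in scores[1:]:
        new_best, pointers = [], []
        for c in range(n_cand):
            prev = max(range(n_cand),
                       key=lambda p: best[p] - jump_penalty * abs(c - p))
            new_best.append(best[prev] - jump_penalty * abs(c - prev) + frame[c])
            pointers.append(prev)
        best = new_best
        back.append(pointers)
    # Trace back from the best final candidate.
    c = max(range(n_cand), key=lambda i: best[i])
    path = [c]
    for pointers in reversed(back):
        c = pointers[c]
        path.append(c)
    return path[::-1]


# Frame 2 slightly favours candidate 2, as a noisy frame might.
scores = [
    [0.1, 0.9, 0.0],
    [0.0, 0.8, 0.1],
    [0.0, 0.4, 0.5],
    [0.1, 0.9, 0.0],
]
smooth = smooth_pitch_track(scores, jump_penalty=0.3)
greedy = smooth_pitch_track(scores, jump_penalty=0.0)
```

With the penalty switched off, the track follows the per-frame maximum and jumps on the noisy frame; with the penalty on, it stays on one candidate throughout.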

Crossref

Woods Hole Open Access Server

VLSI implementation of an AMDF pitch detector

Authors: Gittel Falko; Hilt E.; Schwarzbacher Andreas; Smith Tony; Timoney Joseph
Publication date: 01/07/2003

Pitch detectors are used in a variety of speech processing applications such as speech recognition systems where the pitch of the speaker is used as one parameter for identification purposes. Furthermore, pitch detectors are also used with adaptive filters to achieve high quality adaptive noise cancellation of speech signals. In voice conversion systems, pitch detection is an essential step since the pitch of the modified signal is altered to model the target voice. This paper describes a VLSI implementation of the computationally efficient and accurate pitch detection algorithm known as the Average Magnitude Difference Function (AMDF). The superior speed of a hardware pitch detector is essential, particularly for use in real-time signal processing devices such as mobile phones.
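
For readers unfamiliar with the AMDF named in the abstract above, here is a minimal software sketch, assuming a plain list of samples and a caller-supplied search range. It illustrates the general technique only and says nothing about the paper's VLSI design; the names `amdf` and `amdf_pitch` and all parameter values are illustrative assumptions.

```python
import math


def amdf(frame, lag):
    """Average magnitude difference between the frame and a copy shifted by `lag`."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n


def amdf_pitch(frame, sample_rate, f_min, f_max):
    """Estimate pitch (Hz) from the lag with the deepest AMDF valley.

    The search range [f_min, f_max] is an assumption supplied by the caller.
    A range wider than one octave below the true pitch can produce
    sub-harmonic (octave) errors in this naive version.
    """
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    best_lag = min(range(lag_min, lag_max + 1), key=lambda lag: amdf(frame, lag))
    return sample_rate / best_lag


# Synthetic 200 Hz tone sampled at 8 kHz (period of 40 samples).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(400)]
estimate = amdf_pitch(tone, sample_rate, f_min=120.0, f_max=400.0)
```

The AMDF needs only subtractions, absolute values, and additions, which is the usual reason it is described as computationally inexpensive.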

MURAL - Maynooth University Research Archive Library

Minimising latency of pitch detection algorithms for live vocals on low-cost hardware

Author: Firth Matthew
Publication venue: 'University of Huddersfield Press'
Publication date: 16/12/2015

A pitch estimation device was proposed for live vocals to output appropriate pitch data through the musical instrument digital interface (MIDI). The intention was to ideally achieve unnoticeable latency while maintaining estimation accuracy. The projected target platform was low-cost, standalone hardware based around a microcontroller such as the Microchip PIC series. This study investigated, optimised and compared the performance of suitable algorithms for this application. Performance was determined by two key factors: accuracy and latency. Many papers have been published over the past six decades assessing and comparing the accuracy of pitch detection algorithms on various signals, including vocals. However, very little information is available concerning the latency of pitch detection algorithms and methods with which this can be minimised. Real-time audio introduces a further latency challenge that is sparsely studied: minimising the length of sampled audio required by the algorithms in order to reduce overall latency. Thorough testing was undertaken in order to determine the best-performing algorithm and optimal parameter combination. Software modifications were implemented to facilitate accurate, repeatable, automated testing in order to build a comprehensive set of results encompassing a wide range of test conditions. The results revealed that the infinite-peak-clipping autocorrelation function (IACF) performed better than the other autocorrelation functions tested and also identified ideal parameter values or value ranges to provide the optimal latency/accuracy balance. Although the results were encouraging, testing highlighted some fundamental issues with vocal pitch detection. Potential solutions are proposed for further development.
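
As a rough illustration of the infinite-peak-clipping autocorrelation idea mentioned in the abstract above, the sketch below reduces each sample to its sign before autocorrelating. It is an assumed, simplified reading of the technique, not the implementation evaluated in the study; the function names, frame length, and search range are hypothetical.

```python
import math


def infinite_peak_clip(frame):
    """Keep only the sign of each sample: +1 for non-negative, -1 for negative."""
    return [1 if x >= 0 else -1 for x in frame]


def iacf_pitch(frame, sample_rate, f_min, f_max):
    """Estimate pitch (Hz) from the autocorrelation peak of the clipped frame."""
    clipped = infinite_peak_clip(frame)
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)

    def acf(lag):
        # Normalised by the overlap length so that lags are comparable.
        n = len(clipped) - lag
        return sum(clipped[i] * clipped[i + lag] for i in range(n)) / n

    best_lag = max(range(lag_min, lag_max + 1), key=acf)
    return sample_rate / best_lag


# Synthetic 200 Hz tone at 8 kHz; the phase offset keeps samples off exact zeros.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate + 0.3) for n in range(400)]
estimate = iacf_pitch(tone, sample_rate, f_min=120.0, f_max=400.0)
```

Because the clipped signal contains only +1 and -1, each product in the autocorrelation reduces to a sign comparison, which suits small microcontrollers. The frame length (400 samples here) bounds how much audio must be buffered before an estimate is available, and that buffering is the latency the abstract is concerned with.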

Crossref

Directory of Open Access Journals

University of Huddersfield Repository

Multiple-F0 estimation of piano sounds exploiting spectral structure and temporal evolution

Authors: Benetos E.; Dixon S.
Publication date: 01/01/2010

This paper proposes a system for multiple fundamental frequency estimation of piano sounds using pitch candidate selection rules which employ spectral structure and temporal evolution. As a time-frequency representation, the Resonator Time-Frequency Image of the input signal is employed, a noise suppression model is used, and a spectral whitening procedure is performed. In addition, a spectral flux-based onset detector is employed in order to select the steady-state region of the produced sound. In the multiple-F0 estimation stage, tuning and inharmonicity parameters are extracted and a pitch salience function is proposed. Pitch presence tests are performed utilizing information from the spectral structure of pitch candidates, aiming to suppress errors occurring at multiples and sub-multiples of the true pitches. A novel feature for the estimation of harmonically related pitches is proposed, based on the common amplitude modulation assumption. Experiments are performed on the MAPS database using 8784 piano samples of classical, jazz, and random chords with polyphony levels between 1 and 6. The proposed system is computationally inexpensive, being able to perform multiple-F0 estimation experiments in real time. Experimental results indicate that the proposed system outperforms state-of-the-art approaches for the aforementioned task in a statistically significant manner. Index Terms: multiple-F0 estimation, resonator time-frequency image, common amplitude modulation.

CiteSeerX

City Research Online

Auditory Spectrum-Based Pitched Instrument Onset Detection

Authors: Benetos E.; Stylianou Y.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/11/2010

In this paper, a method for onset detection of music signals using auditory spectra is proposed. The auditory spectrogram provides a time-frequency representation that employs a sound processing model resembling the human auditory system. Recent work on onset detection employs DFT-based features describing spectral energy and phase differences, as well as pitch-based features. These features are often combined for maximizing detection performance. Here, the spectral flux and phase slope features are derived in the auditory framework and a novel fundamental frequency estimation algorithm based on auditory spectra is introduced. An onset detection algorithm is proposed, which processes and combines the aforementioned features at the decision level. Experiments are conducted on a dataset covering 11 pitched instrument types, consisting of 1829 onsets in total. Results indicate that auditory representations outperform various state-of-the-art approaches, with the onset detection algorithm reaching an F-measure of 82.6%.
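
The spectral flux feature named in the abstract above can be sketched in its conventional DFT-based form, which the abstract attributes to earlier work. The paper's own auditory-spectrum variant is not reproduced here. The frame size, the non-overlapping framing, and the function names are assumptions chosen to keep the example short.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitudes of the non-negative-frequency DFT bins (naive O(n^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def spectral_flux(signal, frame_size):
    """Half-wave rectified spectral flux between consecutive non-overlapping frames."""
    frames = [
        signal[i:i + frame_size]
        for i in range(0, len(signal) - frame_size + 1, frame_size)
    ]
    spectra = [magnitude_spectrum(f) for f in frames]
    flux = [0.0]  # the first frame has no predecessor
    for prev, cur in zip(spectra, spectra[1:]):
        # Only increases in bin magnitude count towards an onset.
        flux.append(sum(max(c - p, 0.0) for p, c in zip(prev, cur)))
    return flux


# Three frames of silence followed by three frames of a steady tone.
frame_size = 64
silence = [0.0] * (3 * frame_size)
tone = [math.sin(2 * math.pi * 8 * n / frame_size) for n in range(3 * frame_size)]
flux = spectral_flux(silence + tone, frame_size)
onset_frame = max(range(len(flux)), key=lambda i: flux[i])
```

Here the flux peaks at the first frame containing the tone and stays near zero while the tone is steady. A practical detector would add overlapping windowed frames and a peak-picking threshold.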

City Research Online

Crossref

Adaptive and Online Health Monitoring System for Autonomous Aircraft

Authors: Howe Joe M.; Hussein Saed; Mokhtar Maizura; Zapatel-Bayo Sergio Z.
Publication venue: 'American Institute of Aeronautics and Astronautics (AIAA)'
Publication date: 20/12/2012

Good situation awareness is one of the key attributes required to maintain safe flight, especially for an Unmanned Aerial System (UAS). Good situation awareness can be achieved by incorporating an Adaptive Health Monitoring System (AHMS) into the aircraft. The AHMS monitors the flight outcome or flight behaviours of the aircraft based on its external environmental conditions and the behaviour of its internal systems. The AHMS does this by associating a health value with the aircraft's behaviour based on the progression of the sensory values produced by the aircraft's modules, components and/or subsystems. The AHMS indicates erroneous flight behaviour when a deviation from this health information is produced. This will be useful for a UAS because the pilot is taken out of the control loop and is unaware of how the environment and/or faults are affecting the behaviour of the aircraft. The autonomous pilot can use this health information to help produce safer and more secure flight behaviour or fault tolerance in the aircraft. This allows the aircraft to fly safely whatever the environmental conditions. This health information can also be used to help increase the endurance of the aircraft. This paper describes how the AHMS performs its capabilities.

CLoK

Crossref

Panako: a scalable acoustic fingerprinting system handling time-scale and pitch modification

Authors: Leman Marc; Six Joren
Publication date: 01/01/2014

In this paper, a scalable granular acoustic fingerprinting system robust against time and pitch scale modification is presented. The aim of acoustic fingerprinting is to identify identical, or recognize similar, audio fragments in a large set using condensed representations of audio signals, i.e. fingerprints. A robust fingerprinting system generates similar fingerprints for perceptually similar audio signals. The new system, presented here, handles a variety of distortions well. It is designed to be robust against pitch shifting, time stretching and tempo changes, while remaining scalable. After a query, the system returns the start time in the reference audio, and the amount of pitch shift and tempo change that has been applied. The design of the system that offers this unique combination of features is the main contribution of this research. The fingerprint itself consists of a combination of key points in a Constant-Q spectrogram. The system is evaluated on commodity hardware using a freely available reference database with fingerprints of over 30,000 songs. The results show that the system responds quickly and reliably to queries, while handling time and pitch scale modifications of up to ten percent.

ZENODO

Ghent University Academic Bibliography