213,814 research outputs found

Recommended from our members

Modelling and extraction of fundamental frequency in speech signals

Author: Pawi Alipah
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2014

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. One of the most important parameters of speech is the fundamental frequency of vibration of voiced sounds. The auditory sensation of the fundamental frequency is known as the pitch. Depending on the tonal/non-tonal category of language, the fundamental frequency conveys intonation, pragmatics and meaning. In addition, the fundamental frequency and intonation carry speaker gender, age, identity, speaking style and emotional state. Accurate estimation of the fundamental frequency is critically important for the functioning of speech processing applications such as speech coding, speech recognition, speech synthesis and voice morphing. This thesis contributes to accurate pitch estimation in three distinct ways: (1) an investigation of the impact of the window length on pitch estimation error, (2) an investigation of the use of higher order moments and (3) an investigation of an analysis-synthesis method for selecting the best pitch value among N proposed candidates. Experimental evaluations show that the length of the speech window has a major impact on the accuracy of pitch estimation. Depending on the similarity criteria and the order of the statistical moment, a window length of 37 to 80 ms gives the least error. To avoid the excessive delay that a longer window would cause, a method is proposed where the current short window is concatenated with the previous frames to form a longer signal window for pitch extraction. The use of second order and higher order moments, and the magnitude difference function, as the similarity criteria was explored and compared. A novel method of calculating moments is introduced where the signal is split, i.e. rectified, into positive and negative valued samples. The moments for the positive and negative parts of the signal are computed separately and combined.
The new method of calculating moments from positive and negative parts and the higher order criteria provide competitive results. A challenging issue in pitch estimation is the determination of the best candidate from N extrema of the similarity criteria. The analysis-synthesis method proposed in this thesis selects the pitch candidate that provides the best reproduction (synthesis) of the harmonic spectrum of the original speech. The synthesis method must be such that the distortion increases with increasing error in the estimate of the fundamental frequency. To this end a new method of spectral synthesis is proposed using an estimate of the spectral envelope and harmonically spaced asymmetric Gaussian pulses as excitation. The N-best method provides a consistent reduction in pitch estimation error. The methods described in this thesis result in a significant improvement in pitch accuracy and outperform the benchmark YIN method.
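
The magnitude difference function named in the abstract as one similarity criterion can be illustrated with a minimal sketch. This is not the thesis's method: the function name `amdf_pitch`, the search range parameters `f_min` and `f_max`, and the single-frame design are assumptions made for illustration only.

```python
import numpy as np

def amdf_pitch(frame, fs, f_min=60.0, f_max=400.0):
    """Estimate F0 (Hz) of one frame with the average magnitude difference function.

    Hypothetical helper: f_min and f_max bound the lag search and are
    illustrative defaults, not values taken from the thesis.
    """
    frame = np.asarray(frame, dtype=float)
    lag_min = int(fs / f_max)   # shortest period considered, in samples
    lag_max = int(fs / f_min)   # longest period considered, in samples
    if len(frame) <= lag_max:
        raise ValueError("frame is too short for the requested f_min")
    lags = np.arange(lag_min, lag_max + 1)
    # AMDF: mean absolute difference between the frame and its lagged copy.
    amdf = np.array([np.mean(np.abs(frame[lag:] - frame[:-lag])) for lag in lags])
    best_lag = lags[np.argmin(amdf)]   # the deepest valley marks the period
    return fs / best_lag

# Usage: a 50 ms frame of a 200 Hz tone sampled at 8 kHz.
fs = 8000
t = np.arange(int(0.05 * fs)) / fs
frame = np.sin(2 * np.pi * 200.0 * t)
estimate = amdf_pitch(frame, fs, f_min=120.0, f_max=400.0)
```

For a perfectly periodic frame the function has equally deep valleys at every multiple of the true period, so the usage example narrows the search range to a single period. Choosing among several such candidates is the problem the N-best selection in the abstract addresses.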

Brunel University Research Archive

Frequency estimation for low earth orbit satellites

Author: Aboutanios Elias
Publication date: 01/01/2002

University of Technology Sydney. Faculty of Engineering. Low Earth Orbit (LEO) satellites have received increased attention in recent years. They have been proposed as a viable solution for remote sensing, telemedicine, weather monitoring, search and rescue and communications, to name a few applications. LEO satellites move with respect to an earth station. Thus, the station must be capable of tracking the satellite both spatially and in frequency. In addition, as the spectrum becomes more congested, links are being designed at higher frequencies such as Ka band. These frequencies experience larger attenuations and therefore the system must be capable of operating at low signal to noise ratios. In this dissertation we report on the research conducted on the following problems. Firstly, we study the estimation of the frequency of a sinusoid for the purpose of acquiring and tracking the frequency of the received signal. Secondly, we propose the use of the frequency measurements to assist the spatial tracking of the satellite. The highly dynamic environment of a LEO system, combined with the high Ka band frequencies, results in large Doppler rates. This limits the available processing time and, consequently, the fundamental resolution of a frequency estimator. The frequency estimation strategy that is adopted in the thesis consists of a coarse estimator followed by a fine estimation stage. The coarse estimator is implemented using the maximum of the periodogram. The threshold effect is studied and the derivation of an approximate expression of the signal to noise ratio at which the threshold occurs is examined. The maximum of the periodogram produces a frequency estimate with an accuracy that is O(N⁻¹), where N is the number of data samples used in the FFT. The lower bound for the estimation of the frequency of a sinusoid, given by the Cramer-Rao bound (CRB), is O(N⁻³⁄²). This motivates the use of a second stage in order to improve the estimation resolution.
A family of new frequency estimation algorithms that interpolate on the fractional Fourier coefficients is proposed. The new estimators can be implemented iteratively to give a performance that is uniform in frequency. The iterative algorithms are analysed and their asymptotic properties derived. The asymptotic variance of the iterative estimators is only 1.0147 times the asymptotic CRB. Another method of refining the frequency estimate is the dichotomous search of the periodogram peak. This is essentially a binary search algorithm. However, the estimator must be padded with zeroes in order to achieve a performance that is comparable to the CRB. An insight into this is offered and a modified form that does not require the zero-padding is proposed. The new algorithm is referred to as the modified dichotomous search. A new hybrid technique that combines the dichotomous search with an interpolation technique in order to improve its performance is also suggested. The second research aim was to study the possibility of applying the frequency measurements to obtain spatial tracking information. This is called the frequency assisted spatial tracking (FAST) concept. A simple orbital model is presented and the resulting equations are used to show that the Doppler shift and rate uniquely specify the satellite’s position for the purpose of antenna pointing. Assuming the maximum elevation of the pass is known, the FAST concept is implemented using a scalar Extended Kalman Filter (EKF). The EKF performance was simulated at a signal to noise ratio of 0 dB. The off-boresight error was found to be better than 0.1° for elevations higher than 30°.
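
The two-stage strategy described above (periodogram maximum, then an iterative refinement on Fourier coefficients evaluated half a bin either side of the current estimate) can be sketched as follows. The abstract does not state the interpolation formula, so the update rule below is an assumption: it is one interpolator of this general kind, and the function name `estimate_frequency` and the default of two iterations are illustrative.

```python
import numpy as np

def estimate_frequency(x, fs, iterations=2):
    """Coarse-plus-fine frequency estimate of a single complex sinusoid.

    Assumed update rule (not quoted from the thesis):
        delta += 0.5 * Re{(X_p + X_m) / (X_p - X_m)}
    where X_p and X_m are Fourier coefficients at +0.5 and -0.5 bin offsets.
    """
    x = np.asarray(x, dtype=complex)
    n = len(x)
    k = np.arange(n)
    # Coarse stage: bin index of the periodogram maximum.
    m = int(np.argmax(np.abs(np.fft.fft(x)) ** 2))
    delta = 0.0
    for _ in range(iterations):
        # Fine stage: fractional Fourier coefficients around the current estimate.
        x_p = np.sum(x * np.exp(-2j * np.pi * k * (m + delta + 0.5) / n))
        x_m = np.sum(x * np.exp(-2j * np.pi * k * (m + delta - 0.5) / n))
        delta += 0.5 * np.real((x_p + x_m) / (x_p - x_m))
    f = (m + delta) * fs / n
    # Map bins above the Nyquist index to negative frequencies.
    return f - fs if f >= fs / 2 else f

# Usage: 64 noiseless samples of a 123.4 Hz complex exponential at 1 kHz.
fs = 1000.0
samples = np.exp(2j * np.pi * 123.4 * np.arange(64) / fs)
f_hat = estimate_frequency(samples, fs)
```

The coarse stage alone resolves the frequency only to within half a bin (about 7.8 Hz here). The refinement removes most of that residual in the first pass, which is why a small fixed number of iterations is used in this sketch.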

OPUS - University of Technology Sydney

Modern Methods of Time-Frequency Warping of Sound Signals

Author: Trzos Michal
Publication venue: Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií
Publication date: 01/01/2015

This thesis deals with the representation of non-stationary harmonic signals with time-varying components. Its main focus is the Harmonic Transform and its variant with subquadratic computational complexity, the Fast Harmonic Transform. Two algorithms using the Fast Harmonic Transform are presented. The first uses the gathered log-spectrum to estimate the change in fundamental frequency; the second uses an analysis-by-synthesis approach. Both algorithms are applied to a speech segment to compare their outputs. Finally, the analysis-by-synthesis algorithm is applied to several real sound signals to measure how much the Harmonic Transform improves the representation of real frequency-modulated signals.

Digital library of Brno University of Technology

National Repository of Grey Literature

A new method of accurate broken rotor bar diagnosis based on modulation signal bispectrum analysis of motor current signals

Author: Alwodai Ahmed
Ball Andrew
Gu Fengshou
Shao Yimin
Tian Xiange
Wang T.
Publication venue: 'Elsevier BV'
Publication date: 01/06/2014

Motor current signature analysis (MCSA) has been an effective way of monitoring electrical machines for many years. However, inadequate accuracy in diagnosing incipient broken rotor bars (BRB) has motivated many studies into improving this method. In this paper a modulation signal bispectrum (MSB) analysis is applied to motor currents from different broken bar cases, and a new MSB based sideband estimator (MSB-SE) and sideband amplitude estimator are introduced for obtaining the amplitude at (1±2s)fs (s is the rotor slip and fs is the fundamental supply frequency) with high accuracy. As the MSB-SE suppresses noise well, the new estimator produces more accurate results in predicting the number of BRB, compared with conventional power spectrum analysis. Moreover, the paper also develops an improved model for motor current signals under rotor fault conditions and an effective method to decouple the BRB current, which interferes with that of speed oscillations associated with BRB. These provide theoretical support for the new estimators and clarify the issues in using conventional bispectrum analysis.
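
As a worked instance of the sideband expression (1±2s)fs quoted above, the short sketch below computes the two sideband frequencies for a given slip and supply frequency. The function name `brb_sideband_frequencies` and the example values (50 Hz supply, 3% slip) are assumptions chosen for illustration and do not come from the paper.

```python
def brb_sideband_frequencies(supply_hz, slip):
    """Return the (lower, upper) sideband frequencies (1 -/+ 2s) * fs in Hz.

    Hypothetical helper; supply_hz is the fundamental supply frequency fs
    and slip is the per-unit rotor slip s.
    """
    if not 0.0 <= slip < 1.0:
        raise ValueError("slip is expected to be a per-unit value in [0, 1)")
    lower = (1.0 - 2.0 * slip) * supply_hz
    upper = (1.0 + 2.0 * slip) * supply_hz
    return lower, upper

# Usage: a 50 Hz supply at 3% slip puts the sidebands near 47 Hz and 53 Hz.
lower_hz, upper_hz = brb_sideband_frequencies(50.0, 0.03)
```

Because the sidebands sit only 2·s·fs away from the much larger supply component, their amplitudes are easily masked by noise and spectral leakage at low slip, which is the accuracy problem the abstract describes.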

Crossref

University of Huddersfield Repository

Huddersfield Research Portal

Auditory Spectrum-Based Pitched Instrument Onset Detection

Author: Benetos E.
Stylianou Y.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/11/2010

In this paper, a method for onset detection of music signals using auditory spectra is proposed. The auditory spectrogram provides a time-frequency representation that employs a sound processing model resembling the human auditory system. Recent work on onset detection employs DFT-based features describing spectral energy and phase differences, as well as pitch-based features. These features are often combined for maximizing detection performance. Here, the spectral flux and phase slope features are derived in the auditory framework and a novel fundamental frequency estimation algorithm based on auditory spectra is introduced. An onset detection algorithm is proposed, which processes and combines the aforementioned features at the decision level. Experiments are conducted on a dataset covering 11 pitched instrument types, consisting of 1829 onsets in total. Results indicate that auditory representations outperform various state-of-the-art approaches, with the onset detection algorithm reaching an F-measure of 82.6%.
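
The spectral flux feature mentioned above can be sketched on an ordinary DFT magnitude spectrogram. This swaps in a plain short-time Fourier transform for the paper's auditory spectrum, so it illustrates only the feature, not the proposed method. The function name `spectral_flux` and the frame and hop sizes are assumptions.

```python
import numpy as np

def spectral_flux(signal, frame_len=1024, hop=512):
    """Half-wave rectified spectral flux between consecutive STFT frames.

    Hypothetical helper using a Hann-windowed DFT, not an auditory model.
    Returns one value per pair of adjacent frames.
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len + hop:
        raise ValueError("signal must span at least two frames")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    mags = np.array([
        np.abs(np.fft.rfft(window * signal[i * hop:i * hop + frame_len]))
        for i in range(n_frames)
    ])
    # Keep only increases in magnitude, then sum across frequency bins.
    rise = np.maximum(np.diff(mags, axis=0), 0.0)
    return rise.sum(axis=1)

# Usage: silence followed by a 440 Hz tone starting at sample 4096 (fs = 8 kHz).
fs = 8000
tone = np.sin(2 * np.pi * 440.0 * np.arange(4096) / fs)
flux = spectral_flux(np.concatenate([np.zeros(4096), tone]))
onset_frame = int(np.argmax(flux))
```

The flux stays at zero while the input is silent and peaks in the frames that straddle the start of the tone. A detector would then threshold or peak-pick this curve.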

City Research Online

Crossref

Kalman tracking of linear predictor and harmonic noise models for noisy speech enhancement

Author: Ben Milner
Boll
Chen
Deller
Ephraim
Esfandiar Zavarehei
Friedman
Griffin
Hansen
Ioannis Andrianakis
Jonathan Darch
Kalman
Lim
Paul White
Qin Yan
Rentzos
Saeed Vaseghi
Sameti
Secrest
Seltzer
Stylianou
Tucker
Turunen
Vaseghi
Weber
Yan
Publication venue: 'Elsevier BV'
Publication date: 01/01/2008

This paper presents a speech enhancement method based on the tracking and denoising of the formants of a linear prediction (LP) model of the spectral envelope of speech and the parameters of a harmonic noise model (HNM) of its excitation. The main advantages of tracking and denoising the prominent energy contours of speech are the efficient use of the spectral and temporal structures of successive speech frames and the mitigation of the processing artefact known as ‘musical noise’ or ‘musical tones’. The formant-tracking linear prediction (FTLP) model estimation consists of three stages: (a) speech pre-cleaning based on a spectral amplitude estimation, (b) formant-tracking across successive speech frames using the Viterbi method, and (c) Kalman filtering of the formant trajectories across successive speech frames. The HNM parameters for the excitation signal comprise the voiced/unvoiced decision, the fundamental frequency, the harmonics’ amplitudes and the variance of the noise component of excitation. A frequency-domain pitch extraction method is proposed that searches for the peak signal to noise ratios (SNRs) at the harmonics. For each speech frame several pitch candidates are calculated. An estimate of the pitch trajectory across successive frames is obtained using a Viterbi decoder. The trajectories of the noisy excitation harmonics across successive speech frames are modeled and denoised using Kalman filters. The proposed method is used to deconstruct noisy speech, de-noise its model parameters and then reconstitute speech from its cleaned parts. Experimental evaluations show the performance gains of the formant tracking, pitch extraction and noise reduction stages.
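
Stage (c), Kalman filtering of a trajectory across successive frames, can be illustrated with a scalar filter. The abstract does not give the state model, so the random-walk model below, the function name `kalman_track` and the variance values are assumptions made for this sketch.

```python
import numpy as np

def kalman_track(observations, process_var, obs_var):
    """Filter a noisy per-frame track (e.g. a formant frequency in Hz).

    Assumed model, not taken from the paper:
        state:       x[t] = x[t-1] + w,  var(w) = process_var
        observation: y[t] = x[t] + v,    var(v) = obs_var
    """
    obs = np.asarray(observations, dtype=float)
    if obs.size == 0:
        raise ValueError("observations must not be empty")
    est = np.empty_like(obs)
    x, p = obs[0], obs_var          # initialise from the first observation
    est[0] = x
    for t in range(1, len(obs)):
        p = p + process_var         # predict: uncertainty grows
        gain = p / (p + obs_var)    # Kalman gain
        x = x + gain * (obs[t] - x) # update with the innovation
        p = (1.0 - gain) * p
        est[t] = x
    return est

# Usage: a constant 500 Hz track observed with 20 Hz standard deviation noise.
rng = np.random.default_rng(0)
noisy = 500.0 + 20.0 * rng.standard_normal(200)
smooth = kalman_track(noisy, process_var=1.0, obs_var=400.0)
```

The ratio of `process_var` to `obs_var` sets how strongly the track is smoothed: a small ratio trusts the model and suppresses frame-to-frame jitter, while a large ratio follows the raw measurements more closely.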

Crossref

Southampton (e-Prints Soton)

University of East Anglia digital repository

The information content of gravitational wave harmonics in compact binary inspiral

Author: Bender P
Blanchet L
Cutler C
Hellings R W
Peterseim M
Ronald W Hellings
Sintes A M
Thomas A Moore
Vecchio A
Publication venue: 'IOP Publishing'
Publication date: 01/01/2002

The nonlinear aspect of gravitational wave generation that produces power at harmonics of the orbital frequency, above the fundamental quadrupole frequency, is examined to see what information about the source is contained in these higher harmonics. We use an order (4/2) post-Newtonian expansion of the gravitational wave waveform of a binary system to model the signal seen in a spaceborne gravitational wave detector such as the proposed LISA detector. Covariance studies are then performed to determine the ultimate accuracy to be expected when the parameters of the source are fit to the received signal. We find three areas where the higher harmonics contribute crucial information that breaks degeneracies in the model and allows otherwise badly-correlated parameters to be separated and determined. First, we find that the position of a coalescing massive black hole binary in an ecliptic plane detector, such as OMEGA, is well-determined with the help of these harmonics. Second, we find that the individual masses of the stars in a chirping neutron star binary can be separated because of the mass dependence of the harmonic contributions to the wave. Finally, we note that supermassive black hole binaries, whose frequencies are too low to be seen in the detector sensitivity window for long, may still have their masses, distances, and positions determined since the information content of the higher harmonics compensates for the information lost when the orbit-induced modulation of the signal does not last long enough to be apparent in the data.
Comment: 13 pages, 5 figures

arXiv.org e-Print Archive

CiteSeerX

Crossref

CERN Document Server