Modelling and extraction of fundamental frequency in speech signals
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. One of the most important parameters of speech is the fundamental frequency of vibration of voiced sounds. The audio sensation of the fundamental frequency is known as the pitch. Depending on the tonal/non-tonal category of language, the fundamental frequency conveys intonation, pragmatics and meaning. In addition, the fundamental frequency and intonation carry speaker gender, age, identity, speaking style and emotional state. Accurate estimation of the fundamental frequency is critically important for the functioning of speech processing applications such as speech coding, speech recognition, speech synthesis and voice morphing. This thesis contributes to accurate pitch estimation in three distinct ways: (1) an investigation of the impact of the window length on pitch estimation error, (2) an investigation of the use of the higher order moments and (3) an investigation of an analysis-synthesis method for selection of the best pitch value among N proposed candidates. Experimental evaluations show that the length of the speech window has a major impact on the accuracy of pitch estimation. Depending on the similarity criteria and the order of the statistical moment, a window length of 37 to 80 ms gives the least error. In order to avoid excessive delay as a consequence of using a longer window, a method is proposed where the current short window is concatenated with the previous frames to form a longer signal window for pitch extraction. The use of second order and higher order moments, and the magnitude difference function, as the similarity criteria was explored and compared. A novel method of calculation of moments is introduced where the signal is split, i.e. rectified, into positive and negative valued samples. The moments for the positive and negative parts of the signal are computed separately and combined. The new method of calculation of moments from positive and negative parts and the higher order criteria provide competitive results. A challenging issue in pitch estimation is the determination of the best candidate from N extrema of the similarity criteria. The analysis-synthesis method proposed in this thesis selects the pitch candidate that provides the best reproduction (synthesis) of the harmonic spectrum of the original speech. The synthesis method must be such that the distortion increases with the increasing error in the estimate of the fundamental frequency. To this end a new method of spectral synthesis is proposed using an estimate of the spectral envelope and harmonically spaced asymmetric Gaussian pulses as excitation. The N-best method provides consistent reduction in pitch estimation error. The methods described in this thesis result in a significant improvement in the pitch accuracy and outperform the benchmark YIN method.
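The magnitude difference function named in the abstract as a similarity criterion can be illustrated with a minimal sketch. This is the textbook average magnitude difference function (AMDF), not the thesis's moment-based or N-best method. The function name `amdf_pitch` and the default search range `f_min`/`f_max` are assumptions made for the example.

```python
def amdf_pitch(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) of one frame with the AMDF.

    Hypothetical helper for illustration: it scans every integer lag in the
    search range and returns the frequency of the lag with the lowest AMDF.
    """
    lag_min = int(sample_rate / f_max)   # shortest period considered
    lag_max = int(sample_rate / f_min)   # longest period considered
    if lag_min < 1 or lag_max >= len(frame):
        raise ValueError("frame too short for the requested search range")

    best_lag, best_score = lag_min, float("inf")
    for lag in range(lag_min, lag_max + 1):
        overlap = len(frame) - lag
        # Mean absolute difference between the frame and its delayed copy.
        score = sum(abs(frame[i] - frame[i + lag]) for i in range(overlap)) / overlap
        if score < best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag
```

A perfectly periodic frame also gives near-zero AMDF values at multiples of the true period, so a wide search range can return an octave error. That ambiguity is one reason a separate candidate-selection stage, such as the one the abstract describes, is useful.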
Frequency estimation for low earth orbit satellites
University of Technology Sydney. Faculty of Engineering. Low Earth Orbit (LEO) satellites have received increased attention in recent years. They have been proposed as a viable solution for remote sensing, telemedicine, weather monitoring, search and rescue and communications, to name a few applications. LEO satellites move with respect to an earth station. Thus, the station must be capable of tracking the satellite both spatially and in frequency. In addition, as the spectrum becomes more congested, links are being designed at higher frequencies such as Ka band. These frequencies experience larger attenuations and therefore the system must be capable of operating at low signal to noise ratios.
In this dissertation we report on the research conducted on the following problems. Firstly, we study the estimation of the frequency of a sinusoid for the purpose of acquiring and tracking the frequency of the received signal. Secondly, we propose the use of the frequency measurements to assist the spatial tracking of the satellite.
The highly dynamic environment of a LEO system, combined with the high Ka band frequencies, results in large Doppler rates. This limits the available processing time and, consequently, the fundamental resolution of a frequency estimator. The frequency estimation strategy that is adopted in the thesis consists of a coarse estimator followed by a fine estimation stage. The coarse estimator is implemented using the maximum of the periodogram. The threshold effect is studied and the derivation of an approximate expression of the signal to noise ratio at which the threshold occurs is examined.
The maximum of the periodogram produces a frequency estimate with an accuracy that is O(N⁻¹), where N is the number of data samples used in the FFT. The lower bound for the estimation of the frequency of a sinusoid, given by the Cramer-Rao bound (CRB), is O(N⁻³⁄²). This motivates the use of a second stage in order to improve the estimation resolution. A family of new frequency estimation algorithms that interpolate on the fractional Fourier coefficients is proposed. The new estimators can be implemented iteratively to give a performance that is uniform in frequency. The iterative algorithms are analysed and their asymptotic properties derived. The asymptotic variance of the iterative estimators is only 1.0147 times the asymptotic CRB.
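The two-stage strategy can be sketched as follows. The abstract does not give the interpolation formula, so the update rule below is an assumption: it is one published form of interpolation on the Fourier coefficients half a bin either side of the current estimate. All function names are hypothetical. The DFT is evaluated directly rather than with an FFT so that the example needs only the standard library.

```python
import cmath
import math

def dft_coefficient(x, k):
    """DFT of x evaluated at a (possibly fractional) bin index k."""
    n_samples = len(x)
    return sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
               for n in range(n_samples))

def coarse_estimate(x):
    """Coarse stage: integer bin index of the periodogram maximum."""
    power = [abs(dft_coefficient(x, k)) ** 2 for k in range(len(x))]
    return max(range(len(x)), key=power.__getitem__)

def refine_estimate(x, peak_bin, iterations=2):
    """Fine stage (assumed form): iterate on coefficients at +/- half a bin.

    Returns the frequency in bins; divide by len(x) for cycles per sample.
    """
    delta = 0.0
    for _ in range(iterations):
        x_plus = dft_coefficient(x, peak_bin + delta + 0.5)
        x_minus = dft_coefficient(x, peak_bin + delta - 0.5)
        # Residual offset estimated from the two fractional coefficients.
        delta += 0.5 * ((x_plus + x_minus) / (x_plus - x_minus)).real
    return peak_bin + delta
```

For a noise-free complex exponential the residual offset shrinks rapidly with each pass, which is consistent with the abstract's remark that the estimators are implemented iteratively. The sketch says nothing about the noise performance quoted in the abstract.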
Another method of refining the frequency estimate is the Dichotomous search of the periodogram peak. This is essentially a binary search algorithm. However, the estimator must be padded with zeroes in order to achieve a performance that is comparable to the CRB. An insight into this is offered and a modified form that does not require the zero-padding is proposed. The new algorithm is referred to as the modified dichotomous search. A new hybrid technique that combines the dichotomous search with an interpolation technique in order to improve its performance is also suggested.
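The binary-search idea can be sketched as below, assuming a bracket one bin either side of the coarse peak and direct evaluation of the DFT at fractional bins. This is a noise-free illustration of the basic search only. It does not reproduce the zero-padded, modified or hybrid variants discussed in the abstract, and the function names are hypothetical.

```python
import cmath
import math

def dft_magnitude(x, k):
    """Magnitude of the DFT of x at a (possibly fractional) bin index k."""
    n_samples = len(x)
    return abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
                   for n in range(n_samples)))

def dichotomous_search(x, peak_bin, iterations=20):
    """Halve the bracket around the periodogram peak on every pass."""
    left, right = peak_bin - 1.0, peak_bin + 1.0
    y_left, y_right = dft_magnitude(x, left), dft_magnitude(x, right)
    centre, y_centre = float(peak_bin), dft_magnitude(x, peak_bin)
    for _ in range(iterations):
        # Keep the half of the bracket that borders the larger endpoint.
        if y_right > y_left:
            left, y_left = centre, y_centre
        else:
            right, y_right = centre, y_centre
        centre = 0.5 * (left + right)
        y_centre = dft_magnitude(x, centre)
    return centre  # frequency in bins
```

Each pass costs one new spectral evaluation and halves the bracket, so the resolution of the search improves by one bit per iteration.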
The second research aim was to study the possibility of applying the frequency measurements to obtain spatial tracking information. This is called the frequency assisted spatial tracking (FAST) concept. A simple orbital model is presented and the resulting equations are used to show that the Doppler shift and rate uniquely specify the satellite's position for the purpose of antenna pointing. Assuming the maximum elevation of the pass is known, the FAST concept is implemented using a scalar Extended Kalman Filter (EKF). The EKF performance was simulated at a signal to noise ratio of 0 dB. The off-boresight error was found to be better than 0.1° for elevations higher than 30°.
Modern Methods of Time-Frequency Warping of Sound Signals
This thesis deals with the representation of non-stationary harmonic signals with time-varying components. It focuses on the Harmonic Transform and its variant with subquadratic computational complexity, the Fast Harmonic Transform. Two algorithms using the Fast Harmonic Transform are presented. The first uses the gathered log-spectrum to estimate the change in fundamental frequency; the second uses an analysis-by-synthesis approach. Both algorithms are applied to a speech segment to compare their outputs. Finally, the analysis-by-synthesis algorithm is applied to several real sound signals to measure how much the Harmonic Transform improves the representation of frequency-modulated signals.
A new method of accurate broken rotor bar diagnosis based on modulation signal bispectrum analysis of motor current signals
Motor current signature analysis (MCSA) has been an effective way of monitoring electrical machines for many years. However, inadequate accuracy in diagnosing incipient broken rotor bars (BRB) has motivated many studies into improving this method. In this paper a modulation signal bispectrum (MSB) analysis is applied to motor currents from different broken bar cases, and a new MSB based sideband estimator (MSB-SE) and sideband amplitude estimator are introduced for obtaining the amplitude at (1±2s)fs (s is the rotor slip and fs is the fundamental supply frequency) with high accuracy. As the MSB-SE suppresses noise well, the new estimator produces more accurate results in predicting the number of BRB, compared with conventional power spectrum analysis. Moreover, the paper develops an improved model for motor current signals under rotor fault conditions and an effective method to decouple the BRB current which interferes with that of speed oscillations associated with BRB. These provide theoretical support for the new estimators and clarify the issues in using conventional bispectrum analysis.
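As a small worked example of the sideband expression (1±2s)fs, the sketch below computes the two frequencies at which the amplitude would be measured. The function names are hypothetical. The slip definition is the standard one for induction machines and is not stated in the abstract.

```python
def slip_from_speeds(synchronous_rpm, rotor_rpm):
    """Rotor slip s from synchronous and rotor speed (standard definition)."""
    return (synchronous_rpm - rotor_rpm) / synchronous_rpm

def brb_sideband_frequencies(supply_hz, slip):
    """Lower and upper sideband frequencies (1 - 2s)fs and (1 + 2s)fs in Hz."""
    return (1.0 - 2.0 * slip) * supply_hz, (1.0 + 2.0 * slip) * supply_hz
```

For illustration, a motor with a 1500 rpm synchronous speed running at 1470 rpm on a 50 Hz supply has s = 0.02, so the sidebands sit at about 48 Hz and 52 Hz. Their closeness to the supply frequency is why leakage and noise matter when estimating their amplitude.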
Auditory Spectrum-Based Pitched Instrument Onset Detection
In this paper, a method for onset detection of music signals using auditory spectra is proposed. The auditory spectrogram provides a time-frequency representation that employs a sound processing model resembling the human auditory system. Recent work on onset detection employs DFT-based features describing spectral energy and phase differences, as well as pitch-based features. These features are often combined for maximizing detection performance. Here, the spectral flux and phase slope features are derived in the auditory framework and a novel fundamental frequency estimation algorithm based on auditory spectra is introduced. An onset detection algorithm is proposed, which processes and combines the aforementioned features at the decision level. Experiments are conducted on a dataset covering 11 pitched instrument types, consisting of 1829 onsets in total. Results indicate that auditory representations outperform various state-of-the-art approaches, with the onset detection algorithm reaching an F-measure of 82.6%
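The spectral flux feature mentioned above can be sketched in its common DFT-magnitude form: the sum of the positive magnitude increases between successive frames. The abstract derives the feature from an auditory spectrogram, which is not specified here, so the input format (a list of per-frame magnitude lists) and the function name are assumptions.

```python
def spectral_flux(magnitude_frames):
    """Half-wave rectified spectral flux, one value per frame.

    magnitude_frames: list of equal-length lists of spectral magnitudes.
    The first frame has no predecessor, so its flux is set to 0.
    """
    flux = [0.0]
    for previous, current in zip(magnitude_frames, magnitude_frames[1:]):
        # Only increases in energy count; decreases are clipped to zero.
        flux.append(sum(max(c - p, 0.0) for p, c in zip(previous, current)))
    return flux
```

Peaks in the returned sequence mark candidate onsets. How those peaks are picked and combined with other features at the decision level is left to the paper.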
Kalman tracking of linear predictor and harmonic noise models for noisy speech enhancement
This paper presents a speech enhancement method based on the tracking and denoising of the formants of a linear prediction (LP) model of the spectral envelope of speech and the parameters of a harmonic noise model (HNM) of its excitation. The main advantages of tracking and denoising the prominent energy contours of speech are the efficient use of the spectral and temporal structures of successive speech frames and a mitigation of the processing artefact known as "musical noise" or "musical tones". The formant-tracking linear prediction (FTLP) model estimation consists of three stages: (a) speech pre-cleaning based on a spectral amplitude estimation, (b) formant-tracking across successive speech frames using the Viterbi method, and (c) Kalman filtering of the formant trajectories across successive speech frames. The HNM parameters for the excitation signal comprise the voiced/unvoiced decision, the fundamental frequency, the harmonics' amplitudes and the variance of the noise component of excitation. A frequency-domain pitch extraction method is proposed that searches for the peak signal to noise ratios (SNRs) at the harmonics. For each speech frame several pitch candidates are calculated. An estimate of the pitch trajectory across successive frames is obtained using a Viterbi decoder. The trajectories of the noisy excitation harmonics across successive speech frames are modeled and denoised using Kalman filters. The proposed method is used to deconstruct noisy speech, de-noise its model parameters and then reconstitute speech from its cleaned parts. Experimental evaluations show the performance gains of the formant tracking, pitch extraction and noise reduction stages.
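Kalman filtering of a parameter trajectory across frames can be illustrated with a scalar filter. The abstract does not state the state model, so the random-walk model, the noise variances and the function name below are all assumptions chosen for the sketch.

```python
def kalman_smooth_track(measurements, process_var=1e-3, measurement_var=1.0,
                        initial_estimate=0.0, initial_var=1.0):
    """Scalar Kalman filter with an assumed random-walk state model.

    measurements: noisy per-frame values (e.g. a formant or harmonic track).
    Returns the filtered estimate for every frame.
    """
    estimate, variance = initial_estimate, initial_var
    filtered = []
    for z in measurements:
        # Predict: the state is assumed to drift slowly between frames.
        variance += process_var
        # Update: blend the prediction with the new measurement.
        gain = variance / (variance + measurement_var)
        estimate += gain * (z - estimate)
        variance *= (1.0 - gain)
        filtered.append(estimate)
    return filtered
```

A larger `process_var` lets the track follow fast changes at the cost of passing more noise, while a larger `measurement_var` smooths more heavily.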
The information content of gravitational wave harmonics in compact binary inspiral
The nonlinear aspect of gravitational wave generation that produces power at
harmonics of the orbital frequency, above the fundamental quadrupole frequency,
is examined to see what information about the source is contained in these
higher harmonics. We use an order (4/2) post-Newtonian expansion of the
gravitational wave waveform of a binary system to model the signal seen in a
spaceborne gravitational wave detector such as the proposed LISA detector.
Covariance studies are then performed to determine the ultimate accuracy to be
expected when the parameters of the source are fit to the received signal. We
find three areas where the higher harmonics contribute crucial information that
breaks degeneracies in the model and allows otherwise badly-correlated
parameters to be separated and determined. First, we find that the position of
a coalescing massive black hole binary in an ecliptic plane detector, such as
OMEGA, is well-determined with the help of these harmonics. Second, we find
that the individual masses of the stars in a chirping neutron star binary can
be separated because of the mass dependence of the harmonic contributions to
the wave. Finally, we note that supermassive black hole binaries, whose
frequencies are too low to be seen in the detector sensitivity window for long,
may still have their masses, distances, and positions determined since the
information content of the higher harmonics compensates for the information
lost when the orbit-induced modulation of the signal does not last long enough
to be apparent in the data.
Comment: 13 pages, 5 figures