30 research outputs found

    Application of the bispectrum to glottal pulse analysis

    Higher-order spectral (HOS) techniques, such as the bispectrum, offer robustness to Gaussian noise and the ability to recover phase information. However, their drawbacks, such as the high variance of estimates and the need for long data records, have limited their use in conventional speech processing applications. Since all existing inverse filtering approaches to glottal pulse estimation use second-order statistics, it is of interest to explore the potential of HOS in this area. Using the theory of HOS factorization and the linear bispectrum, it is shown how voiced speech can be modelled as a non-Gaussian coloured-noise-driven system. The linear bispectrum approach can be used to obtain alternative glottal pulse and vocal tract estimates in hybrid Iterative Adaptive Inverse Filtering (hIAIF), and the results are compared with traditional IAIF. Finally, a new technique that involves joint estimation of the glottal pulse and vocal tract followed by inverse filtering is presented. This new technique shows good preliminary results and is much simpler than previous techniques.
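
    The robustness to Gaussian noise mentioned above can be illustrated with a generic segment-averaged ("direct") bispectrum estimate. This is a minimal sketch and not the linear bispectrum method of the paper: the function name `bispectrum`, the FFT length, the Hann window and the synthetic test signals are all assumptions chosen for illustration.

```python
import numpy as np

def bispectrum(x, nfft=64):
    """Direct bispectrum estimate B[k1, k2], averaged over non-overlapping segments."""
    x = np.asarray(x, dtype=float)
    nseg = len(x) // nfft
    segs = x[:nseg * nfft].reshape(nseg, nfft)
    segs = segs - segs.mean(axis=1, keepdims=True)   # remove per-segment mean
    X = np.fft.fft(segs * np.hanning(nfft), axis=1)
    k = np.arange(nfft)
    k_sum = (k[:, None] + k[None, :]) % nfft         # index of bin k1 + k2
    # Triple product X(k1) X(k2) conj(X(k1 + k2)), averaged over segments
    return np.mean(X[:, :, None] * X[:, None, :] * np.conj(X[:, k_sum]), axis=0)

# Illustrative data (assumed, not from the paper): three tones at bins 8, 12
# and 20 whose phases satisfy p3 = p1 + p2 (quadratic phase coupling),
# buried in unit-variance Gaussian noise.
rng = np.random.default_rng(0)
nfft, nseg = 64, 200
n = np.arange(nfft)
segments = []
for _ in range(nseg):
    p1, p2 = rng.uniform(0.0, 2.0 * np.pi, size=2)
    segments.append(np.cos(2 * np.pi * 8 * n / nfft + p1)
                    + np.cos(2 * np.pi * 12 * n / nfft + p2)
                    + np.cos(2 * np.pi * 20 * n / nfft + p1 + p2))
coupled = np.concatenate(segments) + rng.standard_normal(nfft * nseg)
noise_only = rng.standard_normal(nfft * nseg)

B_coupled = bispectrum(coupled, nfft)
B_noise = bispectrum(noise_only, nfft)
```

    In this sketch the magnitude of `B_coupled[8, 12]` should stand well above `B_noise[8, 12]`, because the triple product of Gaussian noise averages towards zero while the phase-coupled tones add coherently. The need to average over many segments reflects the high estimator variance noted in the abstract.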

    Statistical, spectral and stochastic characteristics of music

    The goal of this paper is to explore some of the principal spectral and statistical characteristics of music that are relevant to the problem of Blind Source Separation (BSS). Some BSS algorithms require a priori knowledge of the statistical characteristics of the mixtures under examination, if only to establish initial estimates more accurately. Furthermore, theoretical investigations depend upon the assumption of wide-sense stationarity; the extent to which this assumption holds is investigated.

    A method of morphing spectral envelopes of the singing voice for use with backing vocals

    The voice morphing process presented in this paper is based on the observation that, in many styles of music, it is often desirable for a backing vocalist to blend his or her timbre with that of the lead vocalist when the two voices are singing the same phonetic material concurrently. This paper proposes a novel application of recent morphing research for use with a source backing vocal and a target lead vocal. The function of the process is to alter the timbre of the backing vocal using spectral envelope information extracted from both vocal signals to achieve varying degrees of blending. Several original features are proposed for this usage context, including the use of line spectral frequencies (LSFs) as voice morphing parameters, and an original control algorithm that performs crossfades between synthesized and unsynthesized audio on the basis of a voiced/unvoiced decision.

    Application of bispectrum based signal reconstruction to sEMG signal

    The surface electromyogram (sEMG) conveys information about the physiological properties of muscles. Unlike the power spectrum, the bispectrum can suppress noise when characterizing non-Gaussian random signals. In this paper we establish a bispectrum-based method to estimate a motor unit action potential from a simulated sEMG signal, improving on an earlier approach that combined the bispectrum and the power spectrum.

    A review of glottal waveform analysis

    Glottal inverse filtering is of potential use in a wide range of speech processing applications. Because voice production is, to a first-order approximation, a source-filter process, obtaining the source and filter components provides a flexible representation of the speech signal for use in processing applications. In certain applications the need for accurate inverse filtering is more immediately obvious; for example, in the assessment of laryngeal aspects of voice quality and for correlations between acoustics and vocal fold dynamics, the resonances of the vocal tract should first be removed. Similarly, for assessment of vocal performance, trained singers may wish to obtain quantitative data or feedback regarding their voice at the level of the larynx.
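
    The source-filter idea can be sketched with plain linear-prediction (LPC) inverse filtering: estimate an all-pole filter from the signal, then apply its inverse to recover an estimate of the excitation. This is a generic textbook illustration, not one of the glottal inverse filtering methods covered by the review; the helper names `lpc` and `inverse_filter`, the autocorrelation method, and the synthetic second-order test signal are assumptions.

```python
import numpy as np

def lpc(x, order):
    """Autocorrelation-method LPC: returns a with x[n] ~ sum_k a[k] * x[n-1-k]."""
    x = np.asarray(x, dtype=float)
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])      # solve the normal (Yule-Walker) equations

def inverse_filter(x, a):
    """Apply A(z) = 1 - sum_k a[k] z^-(k+1) to x, giving the prediction residual."""
    fir = np.concatenate(([1.0], -np.asarray(a, dtype=float)))
    return np.convolve(x, fir)[:len(x)]

# Synthetic check (assumed example): white-noise "source" through a known
# all-pole "filter" x[n] = 1.3 x[n-1] - 0.8 x[n-2] + e[n].
rng = np.random.default_rng(0)
e = rng.standard_normal(5000)
x = np.zeros_like(e)
for n in range(len(e)):
    x[n] = e[n]
    if n >= 1:
        x[n] += 1.3 * x[n - 1]
    if n >= 2:
        x[n] -= 0.8 * x[n - 2]

a_hat = lpc(x, order=2)                   # expected to be close to [1.3, -0.8]
residual = inverse_filter(x, a_hat)       # estimate of the source e
```

    Real glottal excitation is not white, which is one reason practical methods go beyond this sketch.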

    Simulation study for commercial time transfer service over geostationary satellite

    Over the last twenty years, many technologies and services have come to rely on the GPS for precise timing. Concern is increasing about the wisdom of relying on a single timing solution provided by a single country, and about the susceptibility of the GPS signal to unintentional interference, jamming and spoofing. In this paper, we report on further development of our system for timing signal transfer from a precision reference clock using commercial satellite links. The system will have master stations tracking the satellite position and using TWSTFT measurements to synchronize their clocks, transmitting data with the reference timing signal to allow slave stations to adjust the PPS timing signal, compensating for the satellite motion and other uncertainties in the path delay. We will report on a simulation of the full system, including models for the master station clocks and TWTT measurements, using a Kalman filter to track the satellite position.
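
    The dependence of path delay on satellite position can be made concrete with the geometric (vacuum) propagation delay between two points. This is a minimal sketch under stated assumptions: positions are Earth-centred Cartesian coordinates in metres, atmospheric and equipment delays are ignored, and the function names and example coordinates are hypothetical rather than taken from the paper.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def path_delay(sat_pos, station_pos):
    """One-way geometric propagation delay in seconds (vacuum, no atmosphere)."""
    return math.dist(sat_pos, station_pos) / C

def link_delay(master_pos, sat_pos, slave_pos):
    """Master -> satellite -> slave delay: uplink plus downlink, transponder delay ignored."""
    return path_delay(sat_pos, master_pos) + path_delay(sat_pos, slave_pos)

# Illustrative geometry (assumed): a station on the equator directly beneath
# a geostationary satellite about 35,786 km above it.
station = (6_378_137.0, 0.0, 0.0)
satellite = (42_164_137.0, 0.0, 0.0)
delay = path_delay(satellite, station)            # roughly 0.119 s

# A 1 km radial displacement of the satellite changes the one-way delay by
# about 3.3 microseconds, far above nanosecond-level timing targets.
shifted = (satellite[0] + 1000.0, 0.0, 0.0)
delay_change = path_delay(shifted, station) - delay
```

    This shows why the satellite position must be tracked or corrected for: position errors of only a few metres already correspond to delay errors of several nanoseconds.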

    Automatic transcription of polyphonic piano music using a note masking technique

    This paper describes a polyphonic note detection system incorporating a simple masking technique that can accurately transcribe chords and polyphonic piano music. The system, developed in MATLAB, takes input files in .wav format. Onsets are detected using Note Average Energy (NAE) onset detection and are used to segment the music into note windows, which are then analysed using the FFT. Following compilation of the frequency peaks in each note window, an iterative masking procedure is used to detect and successively extract the notes. The masking procedure uses a database of note masks compiled from multiple monophonic and polyphonic note examples.
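
    The per-window FFT analysis step can be sketched as follows. This covers only the simplest case, picking the strongest spectral peak in one note window and mapping it to the nearest MIDI note; it does not implement the NAE onset detector or the iterative note-mask procedure described above. The function name, the Hann window and the test tones are assumptions, and Python is used here rather than the MATLAB of the original system.

```python
import numpy as np

def dominant_midi_note(window, sample_rate):
    """Return the MIDI note number nearest to the strongest FFT peak in the window."""
    window = np.asarray(window, dtype=float)
    spectrum = np.abs(np.fft.rfft(window * np.hanning(len(window))))
    freqs = np.fft.rfftfreq(len(window), d=1.0 / sample_rate)
    peak = np.argmax(spectrum[1:]) + 1            # skip the DC bin
    # Equal temperament with A4 = 440 Hz = MIDI note 69
    return int(round(69 + 12 * np.log2(freqs[peak] / 440.0)))

# Illustrative note windows (assumed): pure tones at A4 and C4.
sample_rate = 44100
t = np.arange(8192) / sample_rate
note_a4 = dominant_midi_note(np.sin(2 * np.pi * 440.0 * t), sample_rate)
note_c4 = dominant_midi_note(np.sin(2 * np.pi * 261.63 * t), sample_rate)
```

    A real piano note has strong harmonics and chords contain several fundamentals, which is the situation the masking procedure is designed to handle and this sketch is not.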

    An analytical spectral formulation of glottal flow

    Accurate voice source characterisation is an established goal in speech processing research. Practical limitations prohibit the wide-scale use of a glottal source/vocal tract filter implementation for many speech processing applications. In coding applications, for example, the speech signal is transduced with non-specialist microphones under diverse and often adverse conditions. In addition, the transmission path and decoding process introduce further phase distortion. In the case of synthesis, the accurate recording of a phase-sensitive database is not overly problematic; however, the extraction of the flow waveform from such a database is still a non-trivial task, and as yet no automatic inverse filtering technique is readily available. One possible solution to the problem of extracting the timing events of the glottal flow is to implement a frequency-domain representation and parameterization of the glottal flow waveform. An analytical spectral formulation of an existing time-domain glottal model is presented.

    A preliminary model for the synthesis of source spaciousness

    We present here a basic model for the synthesis of source spaciousness over loudspeaker arrays. This model is based on two experiments carried out to quantify the contribution of early reflections and reverberation to the perception of source spaciousness.

    Experimental and simulation study for commercial time transfer service over geostationary satellite

    Time transfer over satellite links has been explored since the satellite era began. Currently, Two Way Satellite Time and Frequency Transfer (TWSTFT) is routinely used between national timing laboratories to align national timing standards, and the Global Positioning System (GPS) provides accurate timing signals in addition to its more familiar navigation solution. This paper reports on a method for transferring time from a reference clock over commercial geostationary satellite links with a specified low level of uncertainty at the receiving stations, using only the ephemeris information provided by the satellite operator. An initial experiment, reported here, showed that one master station, measuring aggregate extraneous delays and transmitting positioning and delay data plus a correction factor to the slave stations, allowed transfer of a 1 pps (pulse per second) timing signal with a standard deviation of 72 to 98 ns and peak-to-peak variations of 500 to 600 ns when measured against a GPS reference. Subsequent analysis of the experiment uncovered some issues with the implementation, suggesting that these results could be substantially improved upon. Furthermore, a simulation of the system that modeled the extraneous delays produced results similar to those obtained in the experiment.