57 research outputs found

    The Exponential Model for the Spectrum of a Time Series: Extensions and Applications

    The exponential model for the spectrum of a time series and its fractional extensions are based on the Fourier series expansion of the logarithm of the spectral density. The coefficients of the expansion form the cepstrum of the time series. After deriving the cepstrum of important classes of time series processes, also featuring long memory, we discuss likelihood inferences based on the periodogram, for which the estimation of the cepstrum yields a generalized linear model for exponential data with logarithmic link, focusing on the issue of separating the contribution of the long memory component to the log-spectrum. We then propose two extensions. The first deals with replacing the logarithmic link with a more general Box-Cox link, which encompasses also the identity and the inverse links: this enables nesting alternative spectral estimation methods (autoregressive, exponential, etc.) under the same likelihood-based framework. Secondly, we propose a gradient boosting algorithm for the estimation of the log-spectrum and illustrate its potential for distilling the long memory component of the log-spectrum
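
As an illustration of the cepstrum described in this abstract, the sketch below recovers the Fourier coefficients of a log-spectrum numerically. It assumes the convention log f(w) = c_0 + 2 * sum_k c_k cos(k w) and uses an AR(1) process as the test case, for which c_k = phi^k / k under that convention. The function names, the grid size, and the choice of AR(1) are illustrative assumptions, not taken from the paper.

```python
import math


def ar1_log_spectrum(omega, phi, sigma2=1.0):
    """Log spectral density of x_t = phi * x_{t-1} + e_t, Var(e_t) = sigma2."""
    # |1 - phi * exp(-i * omega)|^2 = 1 - 2 * phi * cos(omega) + phi^2
    denom = 1.0 - 2.0 * phi * math.cos(omega) + phi * phi
    return math.log(sigma2 / (2.0 * math.pi)) - math.log(denom)


def cepstrum(log_spec, n_coef, n_grid=4096):
    """Approximate c_k = (1 / 2 pi) * integral of log f(w) cos(k w) dw.

    The integral over one period is replaced by an equally spaced
    Riemann sum, which is very accurate for smooth periodic functions.
    """
    grid = [2.0 * math.pi * j / n_grid for j in range(n_grid)]
    values = [log_spec(w) for w in grid]
    return [
        sum(v * math.cos(k * w) for v, w in zip(values, grid)) / n_grid
        for k in range(n_coef)
    ]


phi = 0.6
coefs = cepstrum(lambda w: ar1_log_spectrum(w, phi), n_coef=5)
expected = [math.log(1.0 / (2.0 * math.pi))] + [phi ** k / k for k in range(1, 5)]
```

Here `coefs` should agree with `expected` up to numerical error. With real data the log-spectrum is unknown, and the abstract describes estimating the cepstrum from the periodogram by likelihood methods instead.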

    Charge and heat transport in ionic conductors

    Transport coefficients relate the off-equilibrium flow of locally conserved quantities, such as charge, energy, and momentum, to gradients of intensive thermodynamic variables in the linear regime. Despite their mathematical formalization dating back to the middle of the last century, when Green and Kubo developed linear response theory, some conceptual subtleties were only recently understood through the formulation of the gauge-invariance and convective-invariance principles. In a nutshell, these invariance principles suggest that transport coefficients are mostly independent of the microscopic definition of the densities and currents. In this thesis, we analyze the consequences of gauge and convective invariances on the charge and heat-transport properties of ionic conductors. The combination of gauge invariance with Thouless' theorem on charge quantization reconciles Faraday's picture of ionic charge transport---whereby each atom carries a well-defined integer charge---with a rigorous quantum-mechanical definition of atomic oxidation states. The latter are topological invariants depending on the paths traced by the coordinates of nuclei in the atomic configuration space. When some general topological conditions are relaxed, we show that oxidation states lose their meaning, and charge can be adiabatically transported across macroscopic distances without a net ionic displacement. This allows for a classification of the different regimes of ionic transport in terms of the topological properties of the electronic structure of the conducting material. Invariance principles also allow one to compute thermal conductivity in multicomponent materials such as ionic conductors through equilibrium molecular dynamics simulations. In particular, heat management is of paramount importance in solid-state electrolytes, solid materials relevant for the production of next-generation batteries, where ionic conduction is mediated by diffusing vacancies and defects. 
The aforementioned conceptual difficulties in the theory of thermal transport are the root cause of a lack of systematic exploration of such properties in solid-state electrolytes. We showcase the ability of the invariance principles to overcome these issues together with state-of-the-art data analysis techniques in the paradigmatic example of the Li-ion conductor Li3ClO. We provide a simple rationale to explain the temperature and vacancy-concentration dependence of its thermal conductivity, which can be interpreted as the result of the interplay of a crystalline component and a contribution from the effective disorder generated by ionic diffusion

    Robust Phase-based Speech Signal Processing From Source-Filter Separation to Model-Based Robust ASR

    The Fourier analysis plays a key role in speech signal processing. As a complex quantity, it can be expressed in the polar form using the magnitude and phase spectra. The magnitude spectrum is widely used in almost every corner of speech processing. However, the phase spectrum is not an obviously appealing start point for processing the speech signal. In contrast to the magnitude spectrum whose fine and coarse structures have a clear relation to speech perception, the phase spectrum is difficult to interpret and manipulate. In fact, there is not a meaningful trend or extrema which may facilitate the modelling process. Nonetheless, the speech phase spectrum has recently gained renewed attention. An expanding body of work is showing that it can be usefully employed in a multitude of speech processing applications. Now that the potential for the phase-based speech processing has been established, there is a need for a fundamental model to help understand the way in which phase encodes speech information. In this thesis a novel phase-domain source-filter model is proposed that allows for deconvolution of the speech vocal tract (filter) and excitation (source) components through phase processing. This model utilises the Hilbert transform, shows how the excitation and vocal tract elements mix in the phase domain and provides a framework for efficiently segregating the source and filter components through phase manipulation. To investigate the efficacy of the suggested approach, a set of features is extracted from the phase filter part for automatic speech recognition (ASR) and the source part of the phase is utilised for fundamental frequency estimation. Accuracy and robustness in both cases are illustrated and discussed. In addition, the proposed approach is improved by replacing the log with the generalised logarithmic function in the Hilbert transform and also by computing the group delay via regression filter. 
Furthermore, statistical distribution of the phase spectrum and its representations along the feature extraction pipeline are studied. It is illustrated that the phase spectrum has a bell-shaped distribution. Some statistical normalisation methods such as mean-variance normalisation, Laplacianisation, Gaussianisation and Histogram equalisation are successfully applied to the phase-based features and lead to a significant robustness improvement. The robustness gain achieved through using statistical normalisation and generalised logarithmic function encouraged the use of more advanced model-based statistical techniques such as vector Taylor Series (VTS). VTS in its original formulation assumes usage of the log function for compression. In order to simultaneously take advantage of the VTS and generalised logarithmic function, a new formulation is first developed to merge both into a unified framework called generalised VTS (gVTS). Also in order to leverage the gVTS framework, a novel channel noise estimation method is developed. The extensions of the gVTS framework and the proposed channel estimation to the group delay domain are then explored. The problems it presents are analysed and discussed, some solutions are proposed and finally the corresponding formulae are derived. Moreover, the effect of additive noise and channel distortion in the phase and group delay domains are scrutinised and the results are utilised in deriving the gVTS equations. Experimental results in the Aurora-4 ASR task in an HMM/GMM set up along with a DNN-based bottleneck system in the clean and multi-style training modes confirmed the efficacy of the proposed approach in dealing with both additive and channel noise
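
The group delay mentioned in this abstract is the negative derivative of the phase spectrum with respect to frequency. The sketch below computes it with the textbook identity tau(w) = Re(Y(w) * conj(X(w))) / |X(w)|^2, where Y is the transform of n * x[n], so no phase unwrapping is needed. This is not the regression-filter method of the thesis; the function names and the test signal are assumptions made for illustration.

```python
import cmath
import math


def dft(x):
    """Plain O(n^2) discrete Fourier transform of a real or complex list."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def group_delay(x):
    """Group delay in samples at each DFT bin, without phase unwrapping.

    Bins where the spectrum is exactly zero would divide by zero; this
    sketch does not guard against that case.
    """
    spectrum = dft(x)
    ramped = dft([t * v for t, v in enumerate(x)])  # transform of n * x[n]
    return [
        (yk * xk.conjugate()).real / abs(xk) ** 2
        for xk, yk in zip(spectrum, ramped)
    ]


# A unit impulse delayed by 3 samples has a constant group delay of 3.
delayed_impulse = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
delays = group_delay(delayed_impulse)
```

Every entry of `delays` should equal 3 up to rounding, since a pure delay has a linear phase whose slope is the delay.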

    Some Advances in Nonlinear Speech Modeling Using Modulations, Fractals, and Chaos

    In this paper we briefly summarize our on-going work on modeling nonlinear structures in speech signals, caused by modulation and turbulence phenomena, using the theories of modulation, fractals, and chaos as well as suitable nonlinear signal analysis methods. Further, we focus on two advances: i) AM-FM modeling of fricative sounds with random modulation signals of the 1/f-noise type and ii) improved methods for speech analysis and prediction on reconstructed multidimensional attractors.

    Playing Technique and Violin Timbre: Detecting Bad Playing

    For centuries, luthiers have committed to working towards better understanding and improving the sound characteristics and playability of violins. With advances in technology and signal processing, studies attempting to define a violin’s sound quality via physical characteristics and resonance patterns have ensued. Existing work has primarily focused on physical aspects reflecting an instrument’s sound quality. In the music information retrieval domain, advances have been made in areas such as instrument identification tasks. Although much research has been completed on finding suitable features from which musical instruments can be represented, little work has focused on the violin’s complete timbre space and the effect a player has on the sound produced. This thesis specifically focuses on representing violin timbre such that a computer can distinguish the sound associated with a beginner from that of a professional standard player and detect typical beginner playing faults based on analysis of the waveform signal only. Work has been limited to nine playing faults considered by professional musicians to be typical of beginner violinists. In order to achieve these goals, it was necessary to create a suitable dataset consisting of an equal number of beginner and professional standard legato note samples. Feature extraction was then carried out by taking features from the time, spectral and cepstral domains. Selected features were then used to represent the samples in a classifier based on their efficacy at reflecting change within the violin’s timbre space. The dataset underwent the scrutiny of professional standard stringed instrument players via listening tests from which the target audience’s perception was captured. This information was verified and normalised before use as a priori labels in the classifier. Based on different feature representations, classification of violin notes reflecting perceived sound quality is presented in this thesis. 
The results show that it is possible to get a computer to distinguish between beginner and professional standard player legato notes and to detect playing faults. This thesis involves a thorough understanding of violin playing, its perception, suitable analysis methods, feature extraction, representation and classification

    Models and analysis of vocal emissions for biomedical applications

    This book of Proceedings collects the papers presented at the 3rd International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2003, held 10-12 December 2003, Firenze, Italy. The workshop is organised every two years, and aims to stimulate contacts between specialists active in research and industrial developments, in the area of voice analysis for biomedical applications. The scope of the Workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies

    Radar target classification by micro-Doppler contributions

    This thesis studies non-cooperative automatic radar target classification. Recent developments in silicon-germanium and monolithic microwave integrated circuit technologies make it possible to build cheap and powerful continuous wave radars. The availability of such radars opens new applications in different areas. One of these applications is security. Radars could be used for surveillance of huge areas and to detect unwanted moving objects. Determination of the type of the target is essential for such systems. Microwave radars use high frequencies that reflect from objects of millimetre size. The micro-Doppler signature of a target is a time-varying frequency modulated contribution that arises in radar backscattering and is caused by the relative movement of separate parts of the target. The micro-Doppler phenomenon makes it possible to classify non-rigid moving objects by analysing their signatures. This thesis focuses on the design of automatic target classification systems based on analysis of micro-Doppler signatures. Analysis of micro-Doppler radar signatures is usually performed with second-order statistics, i.e. common energy-based power spectra and spectrograms. However, the information about phase coupling content in backscattering is totally lost in these energy-based statistics. This useful phase coupling content can be extracted by higher-order spectral techniques. We show that this content is useful for radar target classification in terms of improved robustness to various corruption factors. The problem of unmanned aerial vehicle (UAV) classification using continuous wave radar is covered in the thesis. All steps of processing required to make a decision out of the raw radar data are considered. A novel feature extraction method is introduced. It is based on eigenpairs extracted from the correlation matrix of the signature. Different classes of UAVs are successfully separated in feature space by a support vector machine. 
In experiments on real radar data, the high classification accuracy achieved demonstrates the effectiveness of the proposed solutions. The thesis also covers several applications of automotive radar, motivated by the very high growth in technologies for intelligent vehicle radar systems. Such radars are already built into vehicles and ready for new applications. We consider two novel applications. The first application is a multi-sensor fusion of video camera and radar for more efficient vehicle-to-vehicle video transmission. The second application is a frequency band invariant pedestrian classification by an automotive radar. This system allows us to use the same signal processing hardware/software for different countries where regulations vary and radars with different operating frequencies are required. We consider different radar applications: ground moving target classification, aerial target classification, unmanned aerial vehicle classification, pedestrian classification. The highest priority is given to verification of proposed methods on real radar data collected with frequencies equal to 9.5, 10, 16.8, 24 and 33 GHz

    Early abductive reasoning for blind signal separation

    We demonstrate that explicit and systematic incorporation of abductive reasoning capabilities into algorithms for blind signal separation can yield significant performance improvements. Our formulated mechanisms apply to the output data of signal processing modules in order to conjecture the structure of time-frequency interactions between the signal components that are to be separated. The conjectured interactions are used to drive subsequent signal separation processes that are as a result less blind to the interacting signal components and, therefore, more effective. We refer to this type of process as early abductive reasoning (EAR); the “early” refers to the fact that in contrast to classical Artificial Intelligence paradigms, the reasoning process here is utilized before the signal processing transformations are completed. We have used our EAR approach to formulate a practical algorithm that is more effective in realistically noisy conditions than reference algorithms that are representative of the current state of the art in two-speaker pitch tracking. Our algorithm uses the Blackboard architecture from Artificial Intelligence to control EAR and advanced signal processing modules. The algorithm has been implemented in MATLAB and successfully tested on a database of 570 mixture signals representing simultaneous speakers in a variety of real-world, noisy environments. With 0 dB Target-to-Masking Ratio (TMR) and no noise, the Gross Error Rate (GER) for our algorithm is 5% in comparison to the best GER performance of 11% among the reference algorithms. In diffuse noisy environments (such as street or restaurant environments), we find that our algorithm on the average outperforms the best reference algorithm by 9.4%. With directional noise, our algorithm also outperforms the best reference algorithm by 29%. 
The extracted pitch tracks from our algorithm were also used to carry out comb filtering for separating the harmonics of the two speakers from each other and from the other sound sources in the environment. The separated signals were evaluated subjectively by a set of 20 listeners to be of reasonable quality
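
The comb-filtering step mentioned above can be illustrated with a minimal sketch. The abstract does not specify the filter design, so the code assumes the simplest feedforward comb, y[n] = 0.5 * (x[n] + x[n - T]), where T is the pitch period in samples taken from a pitch track. The signals, the period, and the function name are hypothetical. The interfering tone is deliberately placed on a null of the comb, which real speech would not do.

```python
import math


def comb_filter(x, period):
    """Feedforward comb: average each sample with the one a period earlier.

    Components that repeat every `period` samples pass unchanged, while
    components at odd multiples of half the fundamental are cancelled.
    Samples before the first full period are treated as preceded by zeros.
    """
    return [
        0.5 * (x[n] + (x[n - period] if n >= period else 0.0))
        for n in range(len(x))
    ]


period = 20          # assumed pitch period of the target, in samples
n_samples = 200

# Target repeats every 20 samples; interferer repeats every 40 samples.
target = [math.sin(2.0 * math.pi * n / period) for n in range(n_samples)]
interferer = [math.sin(2.0 * math.pi * n / (2 * period)) for n in range(n_samples)]
mixture = [a + b for a, b in zip(target, interferer)]

separated = comb_filter(mixture, period)

# After the first period the output should match the target closely.
residual = max(abs(separated[n] - target[n]) for n in range(period, n_samples))
```

In this idealised case `residual` is at the level of floating-point rounding. With two real speakers the pitch periods vary over time and their harmonics overlap, so the separation is only approximate.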