351 research outputs found

    Joint Bayesian Gaussian discriminant analysis for speaker verification

    State-of-the-art i-vector based speaker verification relies on variants of Probabilistic Linear Discriminant Analysis (PLDA) for discriminant analysis. We are mainly motivated by the recent work on the joint Bayesian (JB) method, which was originally proposed for discriminant analysis in face verification. We apply JB to speaker verification and make three contributions beyond the original JB. 1) In contrast to the original JB, which uses EM iterations with approximated statistics, we employ EM iterations with exact statistics and obtain better performance. 2) We propose simultaneous diagonalization (SD) of the within-class and between-class covariance matrices to achieve efficient testing, which has a broader application scope than the SVD-based efficient testing method in the original JB. 3) We scrutinize the similarities and differences between various Gaussian PLDAs and JB, complementing the previous analysis, which compared JB only with Prince-Elder PLDA. Extensive experiments are conducted on NIST SRE10 core condition 5, empirically validating the superiority of JB, which converges faster and reduces EER by 9-13% compared with state-of-the-art PLDA. Comment: accepted by ICASSP201
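
    As an illustration of the simultaneous diagonalization step described above (a minimal sketch, not the authors' code), one common construction whitens the within-class covariance and then eigendecomposes the whitened between-class covariance; the resulting single transform diagonalizes both matrices, which is what makes per-trial scoring cheap.

```python
import numpy as np

def simultaneous_diagonalization(S_w, S_b):
    """Find W with W.T @ S_w @ W = I and W.T @ S_b @ W diagonal.

    Two-step construction: whiten S_w, then eigendecompose the whitened S_b.
    Both inputs are assumed symmetric positive definite.
    """
    # Step 1: whitening transform for the within-class covariance.
    eigval_w, eigvec_w = np.linalg.eigh(S_w)
    whitener = eigvec_w @ np.diag(eigval_w ** -0.5)

    # Step 2: diagonalize the between-class covariance in the whitened space.
    S_b_white = whitener.T @ S_b @ whitener
    eigval_b, eigvec_b = np.linalg.eigh(S_b_white)

    W = whitener @ eigvec_b   # joint transform
    return W, eigval_b        # eigval_b = diagonal of W.T @ S_b @ W
```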

    Representation Learning With Convolutional Neural Networks

    Deep learning methods have achieved great success in Computer Vision and Natural Language Processing. Much of the rapidly developing field of deep learning is concerned with how to learn meaningful and effective representations of data, because the performance of machine learning approaches depends heavily on the choice and quality of the data representation, and different representations entangle and hide, to different degrees, the explanatory factors of variation behind the data. In this dissertation, we focus on representation learning with deep neural networks for three data formats: text, 3D polygon shapes, and brain fiber tracts. First, we propose a topic-based word representation learning approach for text classification. The proposed approach takes the global semantic relationships between words over the whole corpus into consideration and encodes them into distributed vector representations with the continuous Skip-gram model. The learned representations, which capture a large number of precise syntactic and semantic word relationships, are taken as input to Convolutional Neural Networks for classification. Our experimental results show the effectiveness of the proposed method on indexing of biomedical articles, behavior code annotation of clinical text fragments, and classification of newsgroups. Second, we present a 3D polygon shape representation learning framework for shape segmentation. We propose the Directionally Convolutional Network (DCN), which extends convolution operations from images to the polygon mesh surface with a rotation-invariant property. Based on the proposed DCN, we learn effective shape representations from raw geometric features and then classify each face of a given polygon mesh into predefined semantic parts. Through extensive experiments, we demonstrate that our framework outperforms the current state of the art. Third, we propose to learn effective and meaningful representations for brain fiber tracts using deep learning frameworks. We handle the highly unbalanced dataset by introducing an asymmetrical loss function that treats easily classified and hard samples differently, so that the training loss is not dominated by the easy samples and training is more efficient. In addition, we learn more effective and meaningful representations by introducing deeper networks and metric learning approaches. Furthermore, we improve the interpretability of our framework by introducing an attention mechanism. Our experimental results show that the proposed framework significantly outperforms the current gold standard on a real-world dataset.
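
    The dissertation abstract does not spell out its asymmetrical loss, but the idea of down-weighting easily classified samples so they do not dominate training can be sketched with a focal-style modulation; the function name, exponents, and default values below are illustrative assumptions, not the author's exact formulation.

```python
import numpy as np

def asymmetric_binary_loss(p, y, gamma_pos=2.0, gamma_neg=4.0, eps=1e-7):
    """Focal-style loss that down-weights easily classified samples.

    p: predicted probabilities of the positive class, shape (n,)
    y: binary labels in {0, 1}, shape (n,)
    gamma_pos / gamma_neg: focusing exponents for positive / negative samples;
    a larger exponent shrinks the contribution of confident predictions more.
    Names and default values are illustrative assumptions only.
    """
    p = np.clip(p, eps, 1.0 - eps)
    # Cross-entropy terms, modulated so easy (confident) samples contribute little.
    loss_pos = -y * (1.0 - p) ** gamma_pos * np.log(p)
    loss_neg = -(1.0 - y) * p ** gamma_neg * np.log(1.0 - p)
    return np.mean(loss_pos + loss_neg)
```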

    A Study of the Allan Variance for Constant-Mean Non-Stationary Processes

    The Allan Variance (AV) is a widely used quantity in areas focusing on error measurement, as well as in the general analysis of variance for autocorrelated processes, in domains such as engineering and, more specifically, metrology. It is widely used to detect noise patterns and indications of stability within signals. However, its properties are not known for commonly occurring processes whose covariance structure is non-stationary, and in these cases an erroneous interpretation of the AV could lead to misleading conclusions. This paper generalizes the theoretical form of the AV to some non-stationary processes, while remaining valid for weakly stationary processes. Simulation examples show how this new form helps to understand which non-stationary processes the AV can distinguish from the stationary cases, allowing for a better interpretation of this quantity in applied settings.
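
    For reference, the classical estimator that the paper generalises can be sketched as follows; this is the standard non-overlapping Allan variance for a regularly sampled signal, not the paper's generalised form, and the helper name is just for illustration.

```python
import numpy as np

def allan_variance(x, m):
    """Non-overlapping Allan variance of a regularly sampled signal x
    at averaging factor m (block length in samples).

    AVAR(m) = 0.5 * E[(ybar_{k+1} - ybar_k)^2], where ybar_k are the means
    of consecutive, non-overlapping blocks of length m.
    """
    x = np.asarray(x, dtype=float)
    n_blocks = len(x) // m
    if n_blocks < 2:
        raise ValueError("signal too short for this averaging factor")
    block_means = x[: n_blocks * m].reshape(n_blocks, m).mean(axis=1)
    return 0.5 * np.mean(np.diff(block_means) ** 2)

# Example: for white noise the AV decays roughly as 1/m.
rng = np.random.default_rng(0)
noise = rng.standard_normal(100_000)
print([allan_variance(noise, m) for m in (1, 10, 100)])
```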

    Azimuthal Anisotropy From Multimode Waveform Modeling Reveals Layering Within the Antarctica Craton

    The isotropic structure of the crust and upper mantle under Antarctica has been constrained by many studies. However, the depth dependence of seismic anisotropy, a powerful tool to characterize deformation and flow, is still poorly known. Here, we modeled three-dimensional (3-D) variations in azimuthal anisotropy under Antarctica using a multimode Rayleigh waveform fitting technique. We first searched the model space with a reversible-jump Markov Chain Monte Carlo approach to find path-averaged vertically polarized shear wave velocity profiles that fit fundamental and higher mode Rayleigh waveforms. We then inverted them to obtain a 3-D velocity and azimuthal anisotropy model across the region down to 600 km depth. Our results reveal that the east-west dichotomy found in other studies is characterized not only by different wave velocities but also by different anisotropy directions, likely reflecting the different deformation histories of the two blocks. Azimuthal anisotropy is present only in the top 300 km and peaks at 100-200 km depth under the East Antarctica craton. Additionally, changes in the fast direction with depth were observed within the craton between 75 km and 150 km, suggesting that layering is present. We speculate this layering relates to the formation history of the craton. Submitted to Seismic
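
    The abstract does not state the anisotropy parameterization used, but a common convention for Rayleigh waves keeps only the 2-psi terms of the azimuthal expansion, c(psi) ~ c0 + A cos 2psi + B sin 2psi; the minimal least-squares sketch below, written under that assumption, shows how a fast direction and anisotropy strength are recovered from phase-velocity measurements at different azimuths.

```python
import numpy as np

def fit_2psi_anisotropy(azimuths_deg, velocities):
    """Least-squares fit of the 2-psi azimuthal dependence
        c(psi) ~ c0 + A*cos(2*psi) + B*sin(2*psi)
    commonly used for Rayleigh-wave azimuthal anisotropy.

    Returns the isotropic velocity c0, the fast direction in degrees
    (0-180), and the peak-to-peak anisotropy as a fraction of c0.
    """
    psi = np.radians(np.asarray(azimuths_deg, dtype=float))
    # Design matrix for the constant, cos(2*psi) and sin(2*psi) terms.
    G = np.column_stack([np.ones_like(psi), np.cos(2 * psi), np.sin(2 * psi)])
    c0, A, B = np.linalg.lstsq(G, np.asarray(velocities, dtype=float), rcond=None)[0]
    fast_dir = 0.5 * np.degrees(np.arctan2(B, A)) % 180.0
    peak_to_peak = 2.0 * np.hypot(A, B) / c0
    return c0, fast_dir, peak_to_peak
```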

    Localising change points in piecewise polynomials of general degrees

    In this paper we are concerned with a sequence of univariate random variables with piecewise polynomial means and independent sub-Gaussian noise. The underlying polynomials are allowed to be of arbitrary but fixed degrees, and all other model parameters are allowed to vary with the sample size. We propose a two-step estimation procedure based on ℓ0-penalisation and provide upper bounds on the localisation error. We complement these results by deriving a global information-theoretic lower bound, which shows that our two-step estimators are nearly minimax rate-optimal. We also show that our estimator enjoys near-optimally adaptive performance, attaining individual localisation errors that depend on the level of smoothness at individual change points of the underlying signal. In addition, under a special smoothness constraint, we provide a minimax lower bound on the localisation errors; this lower bound is independent of the polynomial orders and is sharper than the global minimax lower bound.
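
    The ℓ0-penalisation idea can be illustrated with a minimal dynamic-programming sketch that segments a signal into piecewise polynomial pieces by trading residual fit against a per-change-point penalty; this is a generic O(n^2) illustration of ℓ0 segmentation, not the authors' two-step estimator, and the default penalty value is an assumption.

```python
import numpy as np

def l0_changepoints(y, degree=1, gamma=None):
    """Exact l0-penalised segmentation of a signal with piecewise polynomial
    mean, via O(n^2) dynamic programming.

    Minimises: sum over segments of RSS(polynomial fit of given degree)
               + gamma * (number of change points).
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    if gamma is None:
        gamma = 2.0 * (degree + 1) * np.log(n)  # crude default penalty (assumption)

    def seg_cost(i, j):
        # Residual sum of squares of a polynomial fit on y[i:j].
        m = j - i
        if m <= degree + 1:
            return 0.0  # segment fits exactly
        t = np.arange(m)
        coeffs = np.polyfit(t, y[i:j], degree)
        return float(np.sum((y[i:j] - np.polyval(coeffs, t)) ** 2))

    best = np.full(n + 1, np.inf)
    best[0] = -gamma  # the first segment pays no change-point penalty
    back = np.zeros(n + 1, dtype=int)
    for j in range(1, n + 1):
        for i in range(j):
            cost = best[i] + gamma + seg_cost(i, j)
            if cost < best[j]:
                best[j], back[j] = cost, i

    # Backtrack the segment boundaries (change point locations).
    cps, j = [], n
    while j > 0:
        i = back[j]
        if i > 0:
            cps.append(i)
        j = i
    return sorted(cps)

# Example: a flat piece followed by a linear ramp, change point near index 50.
signal = np.r_[np.zeros(50), 0.5 * np.arange(50)] + 0.1 * np.random.default_rng(1).standard_normal(100)
print(l0_changepoints(signal, degree=1))
```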