Computationally Efficient and Robust BIC-Based Speaker Segmentation
An algorithm for automatic speaker segmentation based on the Bayesian information criterion (BIC) is presented. BIC tests are not performed at every window shift, as in previous work, but only when a speaker change is most likely to occur. This is done by estimating the next probable change point with a model of utterance durations. The inverse Gaussian is found to fit the distribution of utterance durations best. As a result, fewer BIC tests are needed, making the proposed system less computationally demanding in time and memory, and considerably more efficient with respect to missed speaker change points. A feature selection algorithm based on a branch and bound search strategy is applied to identify the most efficient features for speaker segmentation. Furthermore, a new theoretical formulation of BIC is derived by applying centering and simultaneous diagonalization. This formulation is considerably more computationally efficient than the standard BIC when the covariance matrices are estimated by estimators other than the usual maximum-likelihood ones. Two commonly used pairs of figures of merit are employed and their relationship is established. Computational efficiency is achieved through the speaker utterance modeling, whereas robustness is achieved by feature selection and application of BIC tests at appropriately selected time instants. Experimental results indicate that the proposed modifications yield superior performance compared to existing approaches.
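To make the idea of a BIC test for a speaker change concrete, here is a minimal sketch of the widely used textbook ΔBIC comparison between "one Gaussian for the whole window" and "one Gaussian per side of a candidate change point". It assumes one-dimensional features for brevity and is not the reformulated BIC or the duration model described in the abstract. The names `ml_variance`, `delta_bic` and `penalty_weight` are illustrative only.

```python
import math


def ml_variance(samples):
    """Maximum-likelihood (biased) variance of a list of floats."""
    mean = sum(samples) / len(samples)
    return sum((x - mean) ** 2 for x in samples) / len(samples)


def delta_bic(left, right, penalty_weight=1.0):
    """Delta-BIC for a candidate change point between `left` and `right`.

    Positive values favour two separate Gaussians (a change point);
    negative values favour a single Gaussian (no change).
    Assumes 1-D features with non-zero variance in each segment.
    """
    n1, n2 = len(left), len(right)
    n = n1 + n2
    # Log-likelihood gain from modelling the two segments separately.
    gain = 0.5 * (
        n * math.log(ml_variance(left + right))
        - n1 * math.log(ml_variance(left))
        - n2 * math.log(ml_variance(right))
    )
    # Penalty for the extra parameters: d means + d(d+1)/2 covariance terms.
    d = 1
    extra_params = d + d * (d + 1) / 2
    penalty = 0.5 * penalty_weight * extra_params * math.log(n)
    return gain - penalty


if __name__ == "__main__":
    same = [0.0, 1.0] * 50
    shifted = [10.0, 11.0] * 50
    print("same source:", delta_bic(same, same))
    print("different source:", delta_bic(same, shifted))
```

In a segmentation loop, a test like this would be evaluated only at the candidate instants suggested by a duration model, rather than at every window shift.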
Decorrelation of Neutral Vector Variables: Theory and Applications
In this paper, we propose novel strategies for neutral vector variable
decorrelation. Two fundamental invertible transformations, namely serial
nonlinear transformation and parallel nonlinear transformation, are proposed to
carry out the decorrelation. For a neutral vector variable, which is not
multivariate Gaussian distributed, the conventional principal component
analysis (PCA) cannot yield mutually independent scalar variables. With the two
proposed transformations, a highly negatively correlated neutral vector can be
transformed to a set of mutually independent scalar variables with the same
degrees of freedom. We also evaluate the decorrelation performance for
vectors generated from a single Dirichlet distribution and from a mixture of
Dirichlet distributions. The mutual independence is verified with the distance
correlation measurement. The advantages of the proposed decorrelation
strategies are studied in depth and demonstrated with synthesized data and
practical application evaluations.
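As an illustration of how neutrality can be exploited, the sketch below applies the classical sequential normalization to a Dirichlet-distributed vector: each component is divided by the mass not yet consumed by the earlier components. For a Dirichlet vector with K components this is known to give K-1 mutually independent Beta-distributed scalars. It is an assumption that this resembles the serial transformation named in the abstract; the paper's exact definitions are not given here, and the function names are hypothetical.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma variates."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]


def serial_transform(x):
    """Map a K-dim vector summing to 1 onto K-1 ratios in (0, 1).

    u_k = x_k / (1 - x_1 - ... - x_{k-1}). Assumes every component of x is
    strictly positive, so the remaining mass never reaches zero.
    """
    u = []
    remaining = 1.0
    for xk in x[:-1]:
        u.append(xk / remaining)
        remaining -= xk
    return u


def inverse_serial_transform(u):
    """Recover the K-dim vector from its K-1 ratios (the map is invertible)."""
    x = []
    remaining = 1.0
    for uk in u:
        xk = uk * remaining
        x.append(xk)
        remaining -= xk
    x.append(remaining)  # the last component takes whatever mass is left
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    x = sample_dirichlet([2.0, 3.0, 4.0], rng)
    u = serial_transform(x)
    print("sample:", x)
    print("ratios:", u)
    print("round trip:", inverse_serial_transform(u))
```

The number of output variables matches the degrees of freedom of the input, since the sum-to-one constraint removes one dimension.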
Modeling Individual Cyclic Variation in Human Behavior
Cycles are fundamental to human health and behavior. However, modeling cycles
in time series data is challenging because in most cases the cycles are not
labeled or directly observed and need to be inferred from multidimensional
measurements taken over time. Here, we present CyHMMs, a cyclic hidden Markov
model method for detecting and modeling cycles in a collection of
multidimensional heterogeneous time series data. In contrast to previous cycle
modeling methods, CyHMMs deal with a number of challenges encountered in
modeling real-world cycles: they can model multivariate data with discrete and
continuous dimensions; they explicitly model and are robust to missing data;
and they can share information across individuals to model variation both
within and between individual time series. Experiments on synthetic and
real-world health-tracking data demonstrate that CyHMMs infer cycle lengths
more accurately than existing methods, with 58% lower error on simulated data
and 63% lower error on real-world data compared to the best-performing
baseline. CyHMMs can also perform functions which baselines cannot: they can
model the progression of individual features/symptoms over the course of the
cycle, identify the most variable features, and cluster individual time series
into groups with distinct characteristics. Applying CyHMMs to two real-world
health-tracking datasets -- of menstrual cycle symptoms and physical activity
tracking data -- yields important insights including which symptoms to expect
at each point during the cycle. We also find that people fall into several
groups with distinct cycle patterns, and that these groups differ along
dimensions not provided to the model. For example, by modeling missing data in
the menstrual cycles dataset, we are able to discover a medically relevant
group of birth control users even though information on birth control is not
given to the model. Comment: Accepted at WWW 201
Probabilistic Modeling Paradigms for Audio Source Separation
This is the author's final version of the article, first published as E. Vincent, M. G. Jafari, S. A. Abdallah, M. D. Plumbley, M. E. Davies. Probabilistic Modeling Paradigms for Audio Source Separation. In W. Wang (Ed), Machine Audition: Principles, Algorithms and Systems. Chapter 7, pp. 162-185. IGI Global, 2011. ISBN 978-1-61520-919-4. DOI: 10.4018/978-1-61520-919-4.ch007. Most sound scenes result from the superposition of several sources, which can be separately perceived and analyzed by human listeners. Source separation aims to provide machine listeners with similar skills by extracting the sounds of individual sources from a given scene. Existing separation systems operate either by emulating the human auditory system or by inferring the parameters of probabilistic sound models. In this chapter, the authors focus on the latter approach and provide a joint overview of established and recent models, including independent component analysis, local time-frequency models and spectral template-based models. They show that most models are instances of one of two general paradigms: linear modeling or variance modeling. They compare the merits of each paradigm and report objective performance figures. They conclude by discussing promising combinations of probabilistic priors and inference algorithms that could form the basis of future state-of-the-art systems.
Regularized Covariance Matrix Estimation in Complex Elliptically Symmetric Distributions Using the Expected Likelihood Approach - Part 2: The Under-Sampled Case
In the first part of this series of two papers, we extended the expected likelihood (EL) approach, originally developed in the Gaussian case, to the broader class of complex elliptically symmetric (CES) distributions and complex angular central Gaussian (ACG) distributions. More precisely, we demonstrated that the probability density function (p.d.f.) of the likelihood ratio (LR) for the (unknown) actual scatter matrix $\Sigma_{0}$ does not depend on the latter: it only depends on the density generator for the CES distribution and is distribution-free in the case of ACG distributed data, i.e., it only depends on the matrix dimension and the number of independent training samples, assuming that the number of samples is at least the matrix dimension. Additionally, regularized scatter matrix estimates based on the EL methodology were derived. In this second part, we consider the under-sampled scenario, which deserves a specific treatment since conventional maximum likelihood estimates do not exist. Indeed, inference about the scatter matrix can only be made in the subspace spanned by the columns of the data matrix. We extend the results derived under the Gaussian assumption to the CES and ACG classes of distributions. Invariance properties of the under-sampled likelihood ratio evaluated at $\Sigma_{0}$ are presented. Remarkably, in the ACG case, the p.d.f. of this LR can be written in a rather simple form as a product of beta distributed random variables. The regularized schemes derived in the first part, based on the EL principle, are extended to the under-sampled scenario and assessed through numerical simulations.