
    Robust Speaker Recognition Based on Latent Variable Models

    Automatic speaker recognition in uncontrolled environments is a very challenging task due to channel distortions, additive noise, and reverberation. To address these issues, this thesis studies probabilistic latent variable models of short-term spectral information that leverage large amounts of data to achieve robustness in challenging conditions. Current speaker recognition systems represent an entire speech utterance as a single point in a high-dimensional space, a representation known as a "supervector". This thesis starts by analyzing the properties of this representation. A novel visualization procedure for supervectors is presented, providing qualitative insight into the information they capture.

    We then propose the use of an overcomplete dictionary to explicitly decompose a supervector into a speaker-specific component and an undesired variability component. An algorithm to learn the dictionary from a large collection of data is discussed and analyzed. One subset of the dictionary entries is learned to represent speaker-specific information and another subset to represent distortions. After encoding the supervector as a linear combination of the dictionary entries, the undesired variability is removed by discarding the contribution of the distortion components. This paradigm is closely related to the previously proposed Joint Factor Analysis modeling of supervectors. We establish a connection between the two approaches and show how our proposed method reduces computational cost and improves recognition accuracy.

    An alternative way to handle undesired variability in supervector representations is to first project them into a lower-dimensional space and then model them in the reduced subspace. This low-dimensional projection is known as an "i-vector". Unfortunately, i-vectors exhibit non-Gaussian behavior, and direct statistical modeling requires heavy-tailed distributions for optimal performance. These approaches lack closed-form solutions and are therefore hard to analyze; moreover, they do not scale well to large datasets. Instead of modeling i-vectors directly, we propose to first apply a non-linear transformation and then use a linear-Gaussian model. We present two alternative transformations and show experimentally that the transformed i-vectors can be optimally modeled by a simple linear-Gaussian model (factor analysis). We evaluate our method on a benchmark dataset with a large amount of channel variability and show that the results compare favorably with competing approaches. Moreover, our approach has closed-form solutions and scales gracefully to large datasets.

    Finally, a multi-classifier architecture trained in a multicondition fashion is proposed to address speaker recognition in the presence of additive noise. A large number of experiments are conducted to analyze the proposed architecture and to obtain guidelines for optimal performance in noisy environments. Overall, it is shown that multicondition training of multi-classifier architectures not only produces great robustness in the anticipated conditions, but also generalizes well to unseen conditions.
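
    The dictionary-based decomposition can be sketched in a few lines of Python. In the toy example below, the dimensions, the random stand-in dictionaries D_spk and D_dist, and the plain least-squares encoder are all illustrative assumptions rather than the thesis's learned dictionary or encoding algorithm; the sketch only shows the structure of encoding over speaker and distortion entries and then discarding the distortion contribution.

        import numpy as np

        rng = np.random.default_rng(0)

        # Hypothetical sizes: supervector dimension and the number of
        # speaker-specific and distortion entries in the dictionary.
        d, k_spk, k_dist = 1000, 50, 30

        # Stand-ins for the learned dictionary: one subset of entries models
        # speaker-specific information, the other models channel/noise
        # distortions. Here they are random; the thesis learns them from data.
        D_spk = rng.standard_normal((d, k_spk))
        D_dist = rng.standard_normal((d, k_dist))
        D = np.hstack([D_spk, D_dist])   # overcomplete dictionary

        s = rng.standard_normal(d)       # an observed supervector

        # Encode the supervector as a linear combination of all dictionary
        # entries; a least-squares solve stands in for the actual encoder.
        w, *_ = np.linalg.lstsq(D, s, rcond=None)

        # Remove the undesired variability by discarding the contribution
        # of the distortion entries.
        s_clean = D_spk @ w[:k_spk]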

    Generating structured non-smooth priors and associated primal-dual methods

    The purpose of the present chapter is to bind together and extend some recent developments regarding data-driven non-smooth regularization techniques in image processing by means of a bilevel minimization scheme. The scheme, considered in function space, takes advantage of a dualization framework and is designed to produce spatially varying regularization parameters adapted to the data for well-known regularizers, e.g. Total Variation and Total Generalized Variation, leading to automated (monolithic) image reconstruction workflows. The inclusion of the theory of bilevel optimization and the theoretical background of the dualization framework, together with a brief review of the aforementioned regularizers and their parameterization, makes the chapter self-contained. Aspects of the numerical implementation of the scheme are discussed and numerical examples are provided.
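
    As a rough illustration of the kind of lower-level problem such a scheme involves, the sketch below runs a standard primal-dual (Chambolle-Pock) iteration for Total Variation denoising with a spatially varying regularization map alpha(x). This is a minimal sketch under assumed step sizes and boundary handling, and it only solves the weighted-TV problem for a given alpha; the chapter's bilevel scheme would additionally learn alpha from the data rather than take it as fixed.

        import numpy as np

        def grad(u):
            # Forward differences with Neumann boundary conditions.
            gx = np.zeros_like(u)
            gy = np.zeros_like(u)
            gx[:-1, :] = u[1:, :] - u[:-1, :]
            gy[:, :-1] = u[:, 1:] - u[:, :-1]
            return gx, gy

        def div(px, py):
            # Discrete divergence, the negative adjoint of grad.
            dx = np.zeros_like(px)
            dy = np.zeros_like(py)
            dx[0, :] = px[0, :]
            dx[1:-1, :] = px[1:-1, :] - px[:-2, :]
            dx[-1, :] = -px[-2, :]
            dy[:, 0] = py[:, 0]
            dy[:, 1:-1] = py[:, 1:-1] - py[:, :-2]
            dy[:, -1] = -py[:, -2]
            return dx + dy

        def weighted_tv_denoise(f, alpha, n_iter=200, tau=0.25, sigma=0.25):
            # Primal-dual iteration for
            #   min_u 0.5 * ||u - f||^2 + ||alpha(x) * grad u||_1
            # with a spatially varying regularization map alpha >= 0.
            # tau * sigma * ||grad||^2 <= 0.5 here, ensuring convergence.
            u = f.copy()
            u_bar = f.copy()
            px = np.zeros_like(f)
            py = np.zeros_like(f)
            for _ in range(n_iter):
                gx, gy = grad(u_bar)
                px += sigma * gx
                py += sigma * gy
                # Project the dual variable onto the pointwise ball of
                # radius alpha(x).
                scale = np.maximum(1.0, np.hypot(px, py) / np.maximum(alpha, 1e-12))
                px /= scale
                py /= scale
                u_old = u
                u = (u + tau * div(px, py) + tau * f) / (1.0 + tau)
                u_bar = 2.0 * u - u_old
            return u

        # Example: stronger smoothing on the left half of a noisy image.
        f = np.random.default_rng(1).standard_normal((64, 64))
        alpha = np.full_like(f, 0.1)
        alpha[:, :32] = 0.5
        u = weighted_tv_denoise(f, alpha)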

    Analysis of Allocation Rules for Heart Transplantation

    In this dissertation, we utilize several mathematical, optimization, and simulation techniques to improve the outcomes of organ transplantation allocation systems. Specifically, in Chapter 1, we build a Monte Carlo simulation model of the heart transplantation system in the United States that can be used to compare the performance of different allocation policies and to predict the future performance of the allocation system. In Chapter 2, we develop a constrained Markov decision process model of the transplant queuing system and investigate optimal allocation rules for heart transplantation in the presence of certain fairness constraints. In Chapter 3, we introduce a new measure of fairness for organ transplantation queuing systems. We show that this measure helps reduce the performance loss incurred by incorporating fairness considerations into organ transplantation systems, thereby decreasing the price of fairness.
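
    A toy version of the Chapter 1 simulation idea is sketched below: patients and donor organs arrive to a waiting list, an allocation rule picks a recipient, and waiting patients face a severity-dependent mortality risk. Every rate and distribution here is an illustrative assumption, not the dissertation's calibrated model, but the structure shows how different allocation rules can be compared on the same simulated stream of events.

        import random

        def simulate(allocation_rule, n_days=3650, seed=0):
            # Toy Monte Carlo model of a transplant waiting list. All rates
            # and distributions are illustrative assumptions, not calibrated
            # to US heart-transplant data.
            rng = random.Random(seed)
            waitlist = []                 # patients: {"severity", "wait"}
            transplanted = deaths = 0
            for _ in range(n_days):
                if rng.random() < 0.8:    # assumed patient arrival rate
                    waitlist.append({"severity": rng.random(), "wait": 0})
                if waitlist and rng.random() < 0.5:   # assumed organ arrivals
                    recipient = allocation_rule(waitlist)
                    waitlist.remove(recipient)
                    transplanted += 1
                survivors = []
                for p in waitlist:
                    # Daily mortality risk grows with severity (assumption).
                    if rng.random() < 0.002 * (1.0 + 4.0 * p["severity"]):
                        deaths += 1
                    else:
                        p["wait"] += 1
                        survivors.append(p)
                waitlist = survivors
            return transplanted, deaths

        # Two candidate allocation rules to compare.
        fcfs = lambda wl: max(wl, key=lambda p: p["wait"])        # longest wait first
        sickest = lambda wl: max(wl, key=lambda p: p["severity"]) # urgency first

        for name, rule in [("FCFS", fcfs), ("sickest-first", sickest)]:
            print(name, simulate(rule))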

    Deep Belief Networks Based Voice Activity Detection


    Robust Risk Aggregation Techniques and Applications

    Risk aggregation, which concerns the statistical behavior of an aggregation position S(X) associated with a random vector X = (X1, . . . , Xn), is an important research topic in risk management, economics, and statistics. The distribution of S(X) is determined by both the marginal behaviors and the joint dependence structure of X. In general, accurately estimating the dependence structure of X is more challenging than estimating the marginal distributions. Given the marginal distributions of X, this thesis studies the aggregation position S(X) under different dependence assumptions in different contexts. We either assume that X has a specific dependence structure (e.g., independence) or treat its dependence structure as (partially) unknown. In the latter case, we are particularly interested in the worst-case and best-case scenarios of S(X).

    In Chapter 2, we show the surprising inequality that the weighted average of iid ultra heavy-tailed Pareto losses (with infinite mean) is larger than a standalone loss in the sense of first-order stochastic dominance. This result is further generalized to allow for a random total number and random weights of the Pareto losses, and for losses triggered by catastrophic events. We discuss several important implications of these results via an equilibrium analysis of a risk exchange market. First, diversification of ultra heavy-tailed Pareto losses leads to an increase in portfolio risk, and thus a diversification penalty exists. Second, agents with ultra heavy-tailed Pareto losses will not share risks in a market equilibrium. Third, transferring losses from agents bearing Pareto losses to external parties without any losses may lead to an equilibrium that benefits every party involved.

    In Chapter 3, we focus on aggregation sets, which represent model uncertainty due to the unknown dependence structure of random vectors. We investigate ordering relations between two aggregation sets whose sets of marginals are related by two simple operations: distribution mixtures and quantile mixtures. Intuitively, these operations “homogenize” the marginal distributions by making them more similar. As a general conclusion from our results, more “homogeneous” marginals lead to a larger aggregation set, and thus more severe model uncertainty, although the situation for quantile mixtures is considerably more complicated than that for distribution mixtures. We proceed to study inequalities on the worst-case values of risk measures in risk aggregation, which represent the conservative calculation of regulatory capital. Among other results, we obtain an order relation on VaR under quantile mixture for marginal distributions with monotone densities. Numerical results are presented to visualize the theoretical results. Finally, we provide applications to portfolio diversification under dependence uncertainty and to merging p-values in multiple hypothesis testing, and discuss the connection of our results to joint mixability.

    In Chapter 4, we study the aggregation of two risks when the marginal distributions are known and the dependence structure is unknown, with the additional constraint that one risk is smaller than or equal to the other. Risk aggregation problems with this order constraint are closely related to the recently introduced notion of the directional lower (DL) coupling. The largest aggregate risk in concave order (and thus the smallest aggregate risk in convex order) is attained by the DL coupling. These results are further generalized to the best-case and worst-case values of tail risk measures. In particular, we obtain analytical formulas for bounds on Value-at-Risk. Our numerical results suggest that the new bounds on risk measures with the extra order constraint can greatly improve those obtained under full dependence uncertainty.

    In Chapter 5, we study various methods for combining p-values from multiple hypothesis tests into one p-value under different dependence assumptions on the p-values. We say that a combining method is valid for arbitrary dependence if it requires no assumption on the dependence structure of the p-values, whereas it is valid for some dependence if it requires specific, perhaps realistic but unjustifiable, dependence structures. The trade-off between the validity and efficiency of these methods is studied by analyzing the choice of critical values under different dependence assumptions. We introduce the notions of independence-comonotonicity balance (IC-balance) and the price for validity. In particular, IC-balanced methods always produce an identical critical value for independent and for perfectly positively dependent p-values, a specific type of insensitivity to a family of dependence assumptions. We show that, among two very general classes of merging methods commonly used in practice, the Cauchy combination method and the Simes method are the only IC-balanced ones. Simulation studies and a real-data analysis are conducted to assess the size and power of various combining methods in the presence of weak and strong dependence.
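
    The two IC-balanced merging methods named in the abstract are simple to state, and a minimal sketch of their textbook definitions follows. The Cauchy combination method maps each p-value to a standard Cauchy quantile, averages, and maps back; the Simes method takes the minimum of n * p_(i) / i over the sorted p-values. The weights and example p-values below are illustrative.

        import numpy as np
        from scipy import stats

        def cauchy_combination(pvals, weights=None):
            # Cauchy combination method: transform each p-value to a
            # standard Cauchy quantile, take a weighted average, and
            # convert the statistic back to a p-value.
            p = np.asarray(pvals, dtype=float)
            w = (np.full(p.size, 1.0 / p.size) if weights is None
                 else np.asarray(weights, dtype=float))
            t = np.sum(w * np.tan((0.5 - p) * np.pi))
            return stats.cauchy.sf(t)

        def simes(pvals):
            # Simes method: min over i of n * p_(i) / i for the sorted
            # p-values p_(1) <= ... <= p_(n), capped at 1.
            p = np.sort(np.asarray(pvals, dtype=float))
            n = p.size
            return min(1.0, float(np.min(n * p / np.arange(1, n + 1))))

        p = [0.01, 0.04, 0.30, 0.50]     # illustrative p-values
        print(cauchy_combination(p), simes(p))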