26 research outputs found

    Latent Factor Analysis of Gaussian Distributions under Graphical Constraints

    We explore the algebraic structure of the solution space of the convex optimization problem Constrained Minimum Trace Factor Analysis (CMTFA), when the population covariance matrix $\Sigma_x$ has an additional latent graphical constraint, namely, a latent star topology. In particular, we have shown that CMTFA can have either a rank $1$ or a rank $n-1$ solution and nothing in between. The special case of a rank $1$ solution corresponds to the case where just one latent variable captures all the dependencies among the observables, giving rise to a star topology. We found explicit conditions for both the rank $1$ and the rank $n-1$ CMTFA solutions of $\Sigma_x$. As a basic attempt towards building a more general Gaussian tree, we have found a necessary and sufficient condition for multiple clusters, each having a rank $1$ CMTFA solution, to satisfy a minimum probability required to combine together into a Gaussian tree. To support our analytical findings we have presented some numerical results demonstrating the usefulness of the contributions of our work. Comment: 9 pages, 4 figures
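
    The CMTFA program itself is a standard convex (semidefinite) problem, so a minimal sketch may help fix ideas. The snippet below assumes the usual trace-minimization formulation $\Sigma_x = L + D$ with $L$ PSD and $D$ diagonal nonnegative; it does not encode the latent star constraint analyzed in the paper, and the function name and example covariance are illustrative only.

```python
# Minimal sketch of minimum trace factor analysis with cvxpy (assumed
# formulation; the paper's latent star constraint is not encoded here).
import numpy as np
import cvxpy as cp

def min_trace_factor_analysis(sigma_x: np.ndarray):
    n = sigma_x.shape[0]
    L = cp.Variable((n, n), PSD=True)      # low-rank (latent) part
    d = cp.Variable(n, nonneg=True)        # diagonal (noise) variances
    problem = cp.Problem(cp.Minimize(cp.trace(L)),
                         [sigma_x == L + cp.diag(d)])
    problem.solve()
    return L.value, np.diag(d.value)

# A covariance with an exact one-factor (star) structure and unit variances.
a = np.array([0.9, 0.8, 0.7, 0.6])
sigma_x = np.outer(a, a) + np.diag(1.0 - a ** 2)
L_hat, _ = min_trace_factor_analysis(sigma_x)
# For a non-dominant loading vector such as this one, L_hat is expected to be
# (numerically) rank 1, matching the star case discussed in the abstract.
print(np.linalg.matrix_rank(L_hat, tol=1e-5))
```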

    Robustly Learning Mixtures of $k$ Arbitrary Gaussians

    We give a polynomial-time algorithm for the problem of robustly estimating a mixture of $k$ arbitrary Gaussians in $\mathbb{R}^d$, for any fixed $k$, in the presence of a constant fraction of arbitrary corruptions. This resolves the main open problem in several previous works on algorithmic robust statistics, which addressed the special cases of robustly estimating (a) a single Gaussian, (b) a mixture of TV-distance separated Gaussians, and (c) a uniform mixture of two Gaussians. Our main tools are an efficient \emph{partial clustering} algorithm that relies on the sum-of-squares method, and a novel \emph{tensor decomposition} algorithm that allows errors in both Frobenius norm and low-rank terms. Comment: This version extends the previous one to yield 1) a robust proper learning algorithm with poly(eps) error and 2) an information-theoretic argument proving that the same algorithms in fact also yield parameter recovery guarantees. The updates are included in Sections 7, 8, and 9, and the main result from the previous version (Thm 1.4) is presented and proved in Section
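
    The corruption model here is the standard strong-contamination setting. The toy sketch below is not the paper's algorithm (which relies on sum-of-squares partial clustering and tensor decomposition); it only illustrates how a constant fraction of arbitrary outliers already breaks naive estimates, with made-up dimensions and constants.

```python
# Toy illustration of the corruption model only -- NOT the paper's algorithm.
# An eps-fraction of samples drawn from a mixture of k = 2 Gaussians in R^d is
# replaced by arbitrary points, which ruins non-robust estimates.
import numpy as np

rng = np.random.default_rng(0)
d, n, eps = 2, 10_000, 0.1

# Clean samples from an equal-weight mixture of two spherical Gaussians.
means = np.array([[-3.0, 0.0], [3.0, 0.0]])
components = rng.integers(0, 2, size=n)
x = means[components] + rng.standard_normal((n, d))

# Adversarial corruption: overwrite an eps-fraction with a far-away point mass.
outliers = rng.choice(n, size=int(eps * n), replace=False)
x[outliers] = np.array([100.0, 100.0])

# The clean mixture mean is ~0, but the corrupted empirical mean is shifted by
# roughly eps * 100 in each coordinate -- hence the need for robust estimation.
print(x.mean(axis=0))
```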

    Learning Binary Decision Trees by Argmin Differentiation

    We address the problem of learning binary decision trees that partition data for some downstream task. We propose to learn discrete parameters (i.e., for tree traversals and node pruning) and continuous parameters (i.e., for tree split functions and prediction functions) simultaneously using argmin differentiation. We do so by sparsely relaxing a mixed-integer program for the discrete parameters, to allow gradients to pass through the program to continuous parameters. We derive customized algorithms to efficiently compute the forward and backward passes. This means that our tree learning procedure can be used as an (implicit) layer in arbitrary deep networks, and can be optimized with arbitrary loss functions. We demonstrate that our approach produces binary trees that are competitive with existing single tree and ensemble approaches, in both supervised and unsupervised settings. Further, apart from greedy approaches (which do not have competitive accuracies), our method is faster to train than all other tree-learning baselines we compare with. The code for reproducing the results is available at https://github.com/vzantedeschi/LatentTrees
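
    The key point is that hard tree decisions block gradients, so the discrete program must be relaxed to let them through. The sketch below uses a generic sigmoid relaxation of a depth-2 tree in PyTorch purely to illustrate that idea; the paper's actual method is a sparse relaxation of a mixed-integer program solved by argmin differentiation (see the linked repository), and nothing here reproduces it.

```python
# Generic illustration of relaxing discrete tree routing so gradients flow.
# NOT the paper's sparse mixed-integer relaxation.
import torch

class SoftDepth2Tree(torch.nn.Module):
    def __init__(self, in_dim: int):
        super().__init__()
        self.splits = torch.nn.Linear(in_dim, 3)           # 3 internal nodes
        self.leaves = torch.nn.Parameter(torch.randn(4))    # 4 leaf predictions

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        p = torch.sigmoid(self.splits(x))   # p[:, j] = prob. of going right at node j
        root_l, root_r = 1 - p[:, 0], p[:, 0]
        # Leaf reach probabilities: left-left, left-right, right-left, right-right.
        reach = torch.stack([
            root_l * (1 - p[:, 1]), root_l * p[:, 1],
            root_r * (1 - p[:, 2]), root_r * p[:, 2],
        ], dim=1)
        return reach @ self.leaves          # expected leaf value per sample

# Usage: train with an arbitrary loss; gradients reach splits and leaf values.
tree = SoftDepth2Tree(in_dim=5)
x, y = torch.randn(32, 5), torch.randn(32)
loss = torch.nn.functional.mse_loss(tree(x), y)
loss.backward()
```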

    Efficient Low Dimensional Representation of Vector Gaussian Distributions

    This dissertation seeks to find an optimal graphical tree model for low-dimensional representation of vector Gaussian distributions. For a special case, we assumed that the population covariance matrix $\Sigma_x$ has an additional latent graphical constraint, namely, a latent star topology. We have found the Constrained Minimum Determinant Factor Analysis (CMDFA) and Constrained Minimum Trace Factor Analysis (CMTFA) decompositions of this special $\Sigma_x$ in connection with the operational meanings of the respective solutions. Characterizing the CMDFA solution of the special $\Sigma_x$, according to the second interpretation of Wyner's common information, is equivalent to solving the source coding problem of finding the minimum rate of information required to synthesize a vector whose distribution is arbitrarily close to that of the observed vector. In search of an optimal solution to the common information problem for more general population covariance matrices, where closed-form solutions are nonexistent, we have proposed a novel neural network based approach. In the theoretical segment of this dissertation, we have shown that for this special $\Sigma_x$ both CMDFA and CMTFA can have either a rank $1$ or a rank $n-1$ solution and nothing in between. For both CMDFA and CMTFA, the special case of a rank $1$ solution corresponds to the case where just one latent variable captures all the dependencies among the observables, giving rise to a star topology. We found explicit conditions for both the rank $1$ and the rank $n-1$ solutions of CMDFA as well as CMTFA. We have analytically characterized the common solution space that CMDFA and CMTFA share with each other despite working with different objective functions. In the computational segment of this dissertation, we have proposed a novel variational approach to solve the common information problem for more general data, i.e., non-star Gaussian data or even non-Gaussian data. Our approach is devoted to searching for a model that can capture the constraints of the common information problem. We studied the Variational Auto-encoder (VAE) framework as a potential candidate to capture these constraints and established some insightful connections between the VAE structure and the common information problem. So far we have designed and implemented four different neural network based models, all of which incorporate the VAE framework in their structure. We have formulated a set of metrics to justify the closeness of the results obtained by these models to the desired benchmarks. The theoretical CMDFA solution obtained for the special cases serves as the benchmark when it comes to testing the efficacy of the variational models we designed. For ease of analysis, our investigation so far has been limited to 3-dimensional data. Our investigation has revealed some interesting insights about the trade-off between model capacity and the intricacy of the data distribution. Our next plan is to design a hybrid model combining the useful properties from different models. We will keep exploring in pursuit of a variational model capable of finding an optimal common information solution for higher-dimensional data with arbitrary underlying structures.
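
    Since the computational segment builds on the VAE framework, a minimal VAE skeleton is sketched below for orientation. It assumes a generic Gaussian encoder with the reparameterization trick and a standard ELBO; it does not reproduce the dissertation's four models, their common-information constraints, or the evaluation metrics mentioned above.

```python
# Minimal VAE skeleton in PyTorch (generic sketch, not any of the
# dissertation's four models). 3-dimensional data, scalar latent.
import torch

class TinyVAE(torch.nn.Module):
    def __init__(self, x_dim: int = 3, z_dim: int = 1, hidden: int = 16):
        super().__init__()
        self.enc = torch.nn.Sequential(torch.nn.Linear(x_dim, hidden), torch.nn.ReLU())
        self.mu = torch.nn.Linear(hidden, z_dim)
        self.logvar = torch.nn.Linear(hidden, z_dim)
        self.dec = torch.nn.Sequential(
            torch.nn.Linear(z_dim, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, x_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization
        return self.dec(z), mu, logvar

def elbo_loss(x, x_hat, mu, logvar):
    recon = torch.nn.functional.mse_loss(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())  # KL(q(z|x) || N(0, I))
    return recon + kl

# Usage on toy 3-dimensional data, mirroring the star-topology setting.
vae = TinyVAE(x_dim=3, z_dim=1)
x = torch.randn(64, 3)
x_hat, mu, logvar = vae(x)
elbo_loss(x, x_hat, mu, logvar).backward()
```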
