Latent Factor Analysis of Gaussian Distributions under Graphical Constraints
We explore the algebraic structure of the solution space of the convex
optimization problem Constrained Minimum Trace Factor Analysis (CMTFA) when
the population covariance matrix has an additional latent graphical
constraint, namely a latent star topology. In particular, we have shown that
CMTFA can have either a rank $1$ or a rank $n-1$ solution and nothing in
between. The special case of a rank $1$ solution corresponds to the case
where just one latent variable captures all the dependencies among the
observables, giving rise to a star topology. We found explicit conditions for
both the rank $1$ and rank $n-1$ solutions of CMTFA for the population
covariance matrix. As a basic attempt towards building a more general Gaussian
tree, we have found a necessary and a sufficient condition for multiple
clusters, each having a rank $1$ CMTFA solution, to be combinable into a
Gaussian tree. To support our analytical findings we have presented some
numerical results demonstrating the usefulness of the contributions of our work.
Comment: 9 pages, 4 figures
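For context, the abstract does not restate the optimization it studies; the standard minimum trace factor analysis program with the positive-semidefinite constraint (the rank claims above refer to the rank of $\Sigma - D$) can be written as:

```latex
% Minimum Trace Factor Analysis: split the covariance \Sigma into a
% low-rank "common" part (\Sigma - D) and a diagonal "noise" part D,
% minimizing the trace of the common part.
\begin{aligned}
\min_{D}\quad & \operatorname{tr}(\Sigma - D) \\
\text{s.t.}\quad & \Sigma - D \succeq 0, \\
& D = \operatorname{diag}(d_1,\dots,d_n),\; d_i \ge 0 .
\end{aligned}
```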
Robustly Learning Mixtures of Arbitrary Gaussians
We give a polynomial-time algorithm for the problem of robustly estimating a
mixture of $k$ arbitrary Gaussians in $\mathbb{R}^d$, for any fixed $k$, in the
presence of a constant fraction of arbitrary corruptions. This resolves the
main open problem in several previous works on algorithmic robust statistics,
which addressed the special cases of robustly estimating (a) a single Gaussian,
(b) a mixture of TV-distance separated Gaussians, and (c) a uniform mixture of
two Gaussians. Our main tools are an efficient \emph{partial clustering}
algorithm that relies on the sum-of-squares method, and a novel \emph{tensor
decomposition} algorithm that allows errors in both Frobenius norm and low-rank
terms.
Comment: This version extends the previous one to yield 1) a robust proper
learning algorithm with poly(eps) error and 2) an information-theoretic
argument proving that the same algorithms in fact also yield parameter
recovery guarantees. The updates are included in Sections 7, 8, and 9, and the
main result from the previous version (Thm 1.4) is presented and proved in
Section
Learning Binary Decision Trees by Argmin Differentiation
We address the problem of learning binary decision trees that partition data
for some downstream task. We propose to learn discrete parameters (i.e., for
tree traversals and node pruning) and continuous parameters (i.e., for tree
split functions and prediction functions) simultaneously using argmin
differentiation. We do so by sparsely relaxing a mixed-integer program for the
discrete parameters, to allow gradients to pass through the program to
continuous parameters. We derive customized algorithms to efficiently compute
the forward and backward passes. This means that our tree learning procedure
can be used as an (implicit) layer in arbitrary deep networks, and can be
optimized with arbitrary loss functions. We demonstrate that our approach
produces binary trees that are competitive with existing single tree and
ensemble approaches, in both supervised and unsupervised settings. Further,
apart from greedy approaches (which do not have competitive accuracies), our
method is faster to train than all other tree-learning baselines we compare
with. The code for reproducing the results is available at
https://github.com/vzantedeschi/LatentTrees
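The paper's actual mechanism is a sparse relaxation of a mixed-integer program with custom forward/backward passes; the sketch below illustrates only the simpler underlying idea that relaxing hard left/right tree routing (here with sigmoids, a soft-decision-tree stand-in, not the authors' algorithm) makes predictions differentiable in the split parameters. All names and the depth-2 layout are ours.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_tree_predict(x, splits, leaves):
    """Depth-2 soft binary tree: each internal node routes a point right
    with a sigmoid probability instead of a hard argmin decision, so the
    output is a smooth function of the split parameters."""
    p_root = sigmoid(x @ splits[0])   # prob. of going right at the root
    p_left = sigmoid(x @ splits[1])   # right-prob. at the left child
    p_right = sigmoid(x @ splits[2])  # right-prob. at the right child
    # probability of reaching each of the 4 leaves (rows sum to 1)
    reach = np.stack([(1 - p_root) * (1 - p_left),
                      (1 - p_root) * p_left,
                      p_root * (1 - p_right),
                      p_root * p_right], axis=-1)
    return reach @ leaves             # expected leaf value per point

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))           # 5 points, 3 features
splits = rng.normal(size=(3, 3))      # one weight vector per internal node
leaves = np.array([0.0, 1.0, 2.0, 3.0])  # scalar value per leaf
y_hat = soft_tree_predict(x, splits, leaves)
```

Because each prediction is a convex combination of leaf values, gradients with respect to `splits` and `leaves` exist everywhere, which is what lets such a layer sit inside a deep network.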
Efficient Low Dimensional Representation of Vector Gaussian Distributions
This dissertation seeks to find an optimal graphical tree model for low dimensional representation of vector Gaussian distributions. For a special case we assumed that the population covariance matrix has an additional latent graphical constraint, namely, a latent star topology. We have found the Constrained Minimum Determinant Factor Analysis (CMDFA) and Constrained Minimum Trace Factor Analysis (CMTFA) decompositions of this special covariance matrix, in connection with the operational meanings of the respective solutions. Characterizing the CMDFA solution of the special covariance matrix, according to the second interpretation of Wyner's common information, is equivalent to solving the source coding problem of finding the minimum rate of information required to synthesize a vector whose distribution is arbitrarily close to that of the observed vector. In search of an optimal solution to the common information problem for more general population covariance matrices, where closed-form solutions are non-existent, we have proposed a novel neural network based approach.
In the theoretical segment of this dissertation, we have shown that for this special covariance matrix both CMDFA and CMTFA can have either a rank $1$ or a rank $n-1$ solution and nothing in between. For both CMDFA and CMTFA, the special case of a rank $1$ solution corresponds to the case where just one latent variable captures all the dependencies among the observables, giving rise to a star topology. We found explicit conditions for both the rank $1$ and rank $n-1$ solutions for CMDFA as well as CMTFA. We have analytically characterized the common solution space that CMDFA and CMTFA share despite working with different objective functions.
In the computational segment of this dissertation, we have proposed a novel variational approach to solve the common information problem for more general data, i.e., non-star Gaussian data or even non-Gaussian data. Our approach is devoted to searching for a model that can capture the constraints of the common information problem. We studied the Variational Auto-encoder (VAE) framework as a potential candidate to capture these constraints and established some insightful connections between the VAE structure and the common information problem. So far we have designed and implemented four different neural network based models, all of which incorporate the VAE framework in their structure. We have formulated a set of metrics to judge the closeness of the results obtained by these models to the desired benchmarks. The theoretical CMDFA solution obtained for the special cases serves as the benchmark when testing the efficacy of the variational models we designed. For ease of analysis, our investigation so far has been limited to low-dimensional data. Our investigation has revealed some interesting insights about the trade-off between model capacity and the intricacy of the data distribution. Our next plan is to design a hybrid model combining the useful properties of the different models. We will keep exploring in pursuit of a variational model capable of finding an optimal common information solution for higher dimensional data with arbitrary underlying structures.
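A minimal numerical illustration (ours, with made-up loadings, not taken from the dissertation) of the star-structured covariance studied above: when every observable loads on a single latent variable, the covariance is exactly rank-1 plus diagonal, which is the situation the rank $1$ CMDFA/CMTFA solution describes.

```python
import numpy as np

# Star-topology Gaussian: each observable x_i = a_i * z + noise_i,
# with one shared latent z. The covariance is rank-1 plus diagonal.
a = np.array([0.9, 0.6, 0.8, 0.5])   # hypothetical loadings on the latent
noise_var = 1.0 - a**2               # chosen so observables have unit variance
sigma = np.outer(a, a) + np.diag(noise_var)

# Subtracting the diagonal noise part leaves a PSD rank-1 matrix:
# a single latent variable explains all dependence among observables.
common = sigma - np.diag(noise_var)
rank = np.linalg.matrix_rank(common)
eigvals = np.linalg.eigvalsh(common)
```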