1,500 research outputs found

Detecting epistasis via Markov bases

Authors: Malaspinas Anna-Sapfo; Uhler Caroline
Publication date: 25/06/2010

Rapid progress in genotyping techniques has enabled large genome-wide association studies. Existing methods often focus on determining associations between single loci and a specific phenotype. However, a particular phenotype is usually the result of complex relationships between multiple loci and the environment. In this paper, we describe a two-stage method for detecting epistasis by combining the traditionally used single-locus search with a search for multiway interactions. Our method is based on an extended version of Fisher's exact test. To perform this test, a Markov chain is constructed on the space of multidimensional contingency tables using the elements of a Markov basis as moves. We test our method on simulated data and compare it to a two-stage logistic regression method and to a fully Bayesian method, showing that we are able to detect the interacting loci when other methods fail to do so. Finally, we apply our method to a genome-wide data set consisting of 685 dogs and identify epistasis associated with canine hair length for four pairs of SNPs.
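The Markov chain idea in this abstract can be illustrated on a much simpler case than the paper's multidimensional tables. The sketch below is not the authors' method: it assumes a two-way table with fixed row and column sums (all positive), uses the basic +1/-1 swap moves as the Markov basis, and runs a Metropolis chain targeting the hypergeometric distribution to estimate an exact-test p-value. The function names, the chi-square statistic, and the step counts are illustrative choices.

```python
import random


def chi_square(table):
    """Pearson chi-square statistic of a two-way table (margins must be positive)."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    stat = 0.0
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            expected = r * c / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


def markov_basis_p_value(table, steps=20000, burn_in=2000, seed=0):
    """Estimate an exact-test p-value by a random walk over tables with the
    same row and column sums as `table` (hypothetical helper, two-way case)."""
    rng = random.Random(seed)
    t = [list(r) for r in table]
    n_rows, n_cols = len(t), len(t[0])
    observed = chi_square(table)
    hits = 0
    for step in range(steps + burn_in):
        # Basic move: +1 at (i1, j1) and (i2, j2), -1 at (i1, j2) and (i2, j1).
        # Sampling ordered pairs covers both signs of each move.
        i1, i2 = rng.sample(range(n_rows), 2)
        j1, j2 = rng.sample(range(n_cols), 2)
        a, b = t[i1][j1], t[i2][j2]
        c, d = t[i1][j2], t[i2][j1]
        if c > 0 and d > 0:  # otherwise the move leaves the table space
            # Hypergeometric weights are proportional to 1 / prod(t_ij!),
            # so the acceptance ratio reduces to this expression.
            ratio = (c * d) / ((a + 1) * (b + 1))
            if rng.random() < min(1.0, ratio):
                t[i1][j1] += 1
                t[i2][j2] += 1
                t[i1][j2] -= 1
                t[i2][j1] -= 1
        if step >= burn_in and chi_square(t) >= observed - 1e-9:
            hits += 1
    return hits / steps


if __name__ == "__main__":
    print(markov_basis_p_value([[10, 0], [0, 10]]))
```

Every move preserves the margins, so the chain never leaves the set of tables over which the exact test conditions. For tables with more than two dimensions, a Markov basis generally contains more complicated moves than these swaps.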

arXiv.org e-Print Archive

CiteSeerX

Packing ellipsoids with overlap

Authors: Uhler Caroline; Wright Stephen J.
Publication date: 01/04/2012

The problem of packing ellipsoids of different sizes and shapes into an ellipsoidal container so as to minimize a measure of overlap between ellipsoids is considered. A bilevel optimization formulation is given, together with an algorithm for the general case and a simpler algorithm for the special case in which all ellipsoids are spheres. Convergence results are proved, and computational experience is described and illustrated. The motivating application, chromosome organization in the human cell nucleus, is discussed briefly, and some illustrative results are presented.
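For the sphere special case, one simple way to quantify overlap is the penetration depth of each pair of spheres. The sketch below assumes that measure for illustration only; the abstract does not specify the paper's overlap measure, and the function names are hypothetical.

```python
import math
from itertools import combinations


def pairwise_overlap(center_a, radius_a, center_b, radius_b):
    """Penetration depth of two spheres; 0.0 when they do not intersect."""
    return max(0.0, radius_a + radius_b - math.dist(center_a, center_b))


def max_overlap(spheres):
    """Largest pairwise penetration depth over a list of (center, radius) pairs."""
    return max(
        (pairwise_overlap(ca, ra, cb, rb)
         for (ca, ra), (cb, rb) in combinations(spheres, 2)),
        default=0.0,
    )


if __name__ == "__main__":
    spheres = [((0.0, 0.0, 0.0), 1.0), ((1.5, 0.0, 0.0), 1.0), ((5.0, 0.0, 0.0), 1.0)]
    print(max_overlap(spheres))
```

A packing algorithm would treat a quantity like `max_overlap` as the objective to minimize over the sphere centers, subject to the spheres staying inside the container.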

arXiv.org e-Print Archive

IST Austria: PubRep (Institute of Science and Technology)

Scalable Unbalanced Optimal Transport using Generative Adversarial Networks

Authors: Uhler Caroline; Yang Karren D.
Publication date: 01/05/2019

Generative adversarial networks (GANs) are an expressive class of neural generative models with tremendous success in modeling high-dimensional continuous measures. In this paper, we present a scalable method for unbalanced optimal transport (OT) based on the generative-adversarial framework. We formulate unbalanced OT as a problem of simultaneously learning a transport map and a scaling factor that push a source measure to a target measure in a cost-optimal manner. In addition, we propose an algorithm for solving this problem based on stochastic alternating gradient updates, similar in practice to GANs. We also provide theoretical justification for this formulation, showing that it is closely related to an existing static formulation by Liero et al. (2018), and perform numerical experiments demonstrating how this methodology can be applied to population modeling.

arXiv.org e-Print Archive

DSpace@MIT

Geometry of Log-Concave Density Estimation

Authors: Robeva Elina; Sturmfels Bernd; Uhler Caroline
Publication date: 06/04/2017

Shape-constrained density estimation is an important topic in mathematical statistics. We focus on densities on $\mathbb{R}^d$ that are log-concave, and we study geometric properties of the maximum likelihood estimator (MLE) for weighted samples. Cule, Samworth, and Stewart showed that the logarithm of the optimal log-concave density is piecewise linear and supported on a regular subdivision of the samples. This defines a map from the space of weights to the set of regular subdivisions of the samples, i.e. the face poset of their secondary polytope. We prove that this map is surjective. In fact, every regular subdivision arises in the MLE for some set of weights with positive probability, but coarser subdivisions appear to be more likely to arise than finer ones. To quantify these results, we introduce a continuous version of the secondary polytope, whose dual we name the Samworth body. This article establishes a new link between geometric combinatorics and nonparametric statistics, and it suggests numerous open problems. Comment: 22 pages, 3 figures
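The piecewise-linear structure mentioned in this abstract can be seen in one dimension. The sketch below is not the paper's algorithm and does not compute the MLE: it assumes sample points on the real line with distinct coordinates and given heights, and builds the smallest concave function lying on or above those heights (often called a tent function). Its breakpoints are the samples that remain vertices of the upper hull. The function names and input format are illustrative.

```python
import math


def upper_hull(points):
    """Vertices of the upper concave hull of (x, y) points with distinct x."""
    hull = []
    for p in sorted(points):
        # Drop the last vertex while it lies on or below the chord to p.
        while len(hull) >= 2:
            (ox, oy), (ax, ay) = hull[-2], hull[-1]
            cross = (ax - ox) * (p[1] - oy) - (ay - oy) * (p[0] - ox)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def tent_value(hull, x):
    """Evaluate the tent function: linear between hull vertices, -inf outside."""
    if x < hull[0][0] or x > hull[-1][0]:
        return -math.inf
    for (x0, y0), (x1, y1) in zip(hull, hull[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return hull[0][1]  # hull consists of a single point


if __name__ == "__main__":
    samples = [(0.0, 0.0), (1.0, -1.0), (2.0, 0.0), (3.0, -3.0)]
    hull = upper_hull(samples)
    print(hull, tent_value(hull, 2.5))
```

In this example the sample at x = 1 lies below the hull, so it is not a breakpoint and the induced subdivision of the interval [0, 3] has the two cells [0, 2] and [2, 3]. Changing the heights changes which samples survive as breakpoints, which is the one-dimensional analogue of different weights selecting different regular subdivisions.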

arXiv.org e-Print Archive

Faithfulness and learning hypergraphs from discrete distributions

Authors: Klimova Anna; Rudas Tamas; Uhler Caroline
Publication date: 01/01/2015

The concepts of faithfulness and strong-faithfulness are important for statistical learning of graphical models. Graphs are not sufficient for describing the association structure of a discrete distribution. Hypergraphs representing hierarchical log-linear models are considered instead, and the concept of parametric (strong-) faithfulness with respect to a hypergraph is introduced. Strong-faithfulness ensures the existence of uniformly consistent parameter estimators and enables building uniformly consistent procedures for a hypergraph search. The strength of association in a discrete distribution can be quantified with various measures, leading to different concepts of strong-faithfulness. Lower and upper bounds for the proportions of distributions that do not satisfy strong-faithfulness are computed for different parameterizations and measures of association. Comment: 23 pages, 6 figures

arXiv.org e-Print Archive

IST Austria: PubRep (Institute of Science and Technology)