947 research outputs found

    Minimum-weight triangulation is NP-hard

    Full text link
    A triangulation of a planar point set S is a maximal plane straight-line graph with vertex set S. In the minimum-weight triangulation (MWT) problem, we seek a triangulation of a given point set that minimizes the sum of the edge lengths. We prove that the decision version of this problem is NP-hard, using a reduction from PLANAR-1-IN-3-SAT. The correct working of the gadgets is established with computer assistance, using dynamic programming on polygonal faces, as well as the beta-skeleton heuristic to certify that certain edges belong to the minimum-weight triangulation.
    Comment: 45 pages (including a technical appendix of 13 pages), 28 figures. This revision contains a few improvements in the exposition.
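    The beta-skeleton test used for edge certification can be computed directly from point coordinates. Below is a minimal sketch, assuming the circle-based variant of the skeleton: for beta >= 1, an edge pq survives iff neither of the two disks of radius beta*|pq|/2 whose boundaries pass through p and q contains another input point. The default beta is only an illustrative constant from the MWT-subgraph literature, not necessarily the one used in the paper.

```python
# Minimal sketch (not the paper's code): circle-based beta-skeleton edge test.
import numpy as np

def in_beta_skeleton(points, i, j, beta=1.17682):
    # beta value is an illustrative constant from the MWT-subgraph literature
    p, q = points[i], points[j]
    d = np.linalg.norm(q - p)
    r = beta * d / 2.0                           # radius of the two certifying disks
    mid = (p + q) / 2.0
    u = (q - p) / d
    normal = np.array([-u[1], u[0]])             # unit normal to pq
    h = np.sqrt(max(r * r - (d / 2.0) ** 2, 0.0))
    c1, c2 = mid + h * normal, mid - h * normal  # centers on the bisector of pq
    for k, x in enumerate(points):
        if k in (i, j):
            continue
        if np.linalg.norm(x - c1) < r or np.linalg.norm(x - c2) < r:
            return False                         # a witness point blocks the edge
    return True

pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 2.0]])
print(in_beta_skeleton(pts, 0, 1))               # True: third point lies outside both disks
```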

    Sparse Linear Identifiable Multivariate Modeling

    Full text link
    In this paper we consider sparse and identifiable linear latent variable (factor) and linear Bayesian network models for parsimonious analysis of multivariate data. We propose a computationally efficient method for joint parameter and model inference, and model comparison. It consists of a fully Bayesian hierarchy for sparse models using slab and spike priors (two-component delta-function and continuous mixtures), non-Gaussian latent factors and a stochastic search over the ordering of the variables. The framework, which we call SLIM (Sparse Linear Identifiable Multivariate modeling), is validated and benchmarked on artificial and real biological data sets. SLIM is closest in spirit to LiNGAM (Shimizu et al., 2006), but differs substantially in inference, Bayesian network structure learning and model comparison. Experimentally, SLIM performs equally well or better than LiNGAM with comparable computational complexity. We attribute this mainly to the stochastic search strategy used, and to parsimony (sparsity and identifiability), which is an explicit part of the model. We propose two extensions to the basic i.i.d. linear framework: non-linear dependence on observed variables, called SNIM (Sparse Non-linear Identifiable Multivariate modeling), and allowing for correlations between latent variables, called CSLIM (Correlated SLIM), for temporal and/or spatial data. The source code and scripts are available from http://cogsys.imm.dtu.dk/slim/.
    Comment: 45 pages, 17 figures
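    A two-component slab-and-spike prior of the kind mentioned here mixes a point mass at zero (the spike) with a continuous slab. The sketch below draws a sparse loading matrix under assumed hyperparameters pi and tau; it illustrates the prior only and is not SLIM's actual hierarchy, which places further priors on these quantities.

```python
# Generic slab-and-spike draw for a sparse loading matrix (illustrative only;
# SLIM's hierarchy additionally places priors on pi and tau).
import numpy as np

rng = np.random.default_rng(0)

def slab_and_spike(shape, pi=0.2, tau=1.0):
    z = rng.random(shape) < pi           # inclusion indicators: True = slab
    slab = rng.normal(0.0, tau, shape)   # continuous slab component
    return np.where(z, slab, 0.0)        # spike = exact zero (delta function)

C = slab_and_spike((10, 3))              # sparse 10x3 factor loading matrix
print((C != 0).mean())                   # fraction of active loadings, about pi
```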

    Sparse Coding on Symmetric Positive Definite Manifolds using Bregman Divergences

    Full text link
    This paper introduces sparse coding and dictionary learning for Symmetric Positive Definite (SPD) matrices, which are often used in machine learning, computer vision and related areas. Unlike traditional sparse coding schemes that work in vector spaces, we discuss how SPD matrices can be described by sparse combinations of dictionary atoms, where the atoms are themselves SPD matrices. We propose to seek sparse coding by embedding the space of SPD matrices into Hilbert spaces through two types of Bregman matrix divergences. This not only leads to an efficient way of performing sparse coding, but also to an online and iterative scheme for dictionary learning. We apply the proposed methods to several computer vision tasks where images are represented by region covariance matrices. Our proposed algorithms outperform state-of-the-art methods on a wide range of classification tasks, including face recognition, action recognition, material classification and texture categorization.
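    One Bregman-type matrix divergence often used for such embeddings is the Stein (Jensen-Bregman LogDet) divergence; treating it as one of the two divergences meant in the abstract is an assumption here. A minimal sketch:

```python
# Stein (Jensen-Bregman LogDet) divergence between SPD matrices -- a sketch
# of one representative Bregman-type matrix divergence.
import numpy as np

def stein_divergence(X, Y):
    # S(X, Y) = log det((X + Y) / 2) - (1/2) log det(X Y); zero iff X == Y
    _, ld_mid = np.linalg.slogdet((X + Y) / 2.0)
    _, ld_x = np.linalg.slogdet(X)
    _, ld_y = np.linalg.slogdet(Y)
    return ld_mid - 0.5 * (ld_x + ld_y)

A = np.array([[2.0, 0.3], [0.3, 1.0]])   # toy region-covariance matrices
B = np.array([[1.0, 0.0], [0.0, 1.5]])
print(stein_divergence(A, B))
```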

    Contributions to Vine-Copula Modeling

    Get PDF
    144 p.
    Regular vine-copula models (R-vines) are a powerful statistical tool for modeling the dependence structure of multivariate distribution functions. In particular, they allow modeling different types of dependencies among random variables independently of their marginal distributions, which is deemed the most valued characteristic of these models. In this thesis, we investigate the theoretical properties of R-vines for representing dependencies and extend their use to solve supervised classification problems. We focus on three research directions. In the first line of research, the relationship between the graphical representations of R-vines and Bayesian polytree networks is analyzed in terms of how conditional pairwise independence relationships are represented by both models. To do that, we use an extended graphical representation of R-vines in which the R-vine graph is endowed with further expressiveness, making it possible to distinguish between edges representing independence and dependence relationships. Using this representation, a separation criterion in the R-vine graph, called R-separation, is defined. The proposed criterion is used in designing methods for building the graphical structure of polytrees from that of R-vines, and vice versa. Moreover, possible correspondences between the R-vine graph and the associated R-vine copula, as well as different properties of R-separation, are analyzed. In the second research line, we design methods for learning the graphical structure of R-vines from dependence lists. The main challenge of this task lies in the extremely large size of the search space of all possible R-vine structures. We provide two strategies to solve the problem of learning R-vines that represent the largest number of dependencies in a list. The first approach is a 0-1 linear programming formulation for building truncated R-vines with only two trees. The second approach is an evolutionary algorithm, which is able to learn complete and truncated R-vines. Experimental results show the success of this strategy in solving the optimization problem posed. In the third research line, we introduce a supervised classification approach where the dependence structure of the problem features is modeled through R-vines. The efficacy of these classifiers is validated in a mental decoding problem and in an image recognition task. While R-vines have been extensively applied in fields such as economics, finance and statistics, only recently have they found their place in classification tasks. This contribution represents a step forward in understanding R-vines and the prospect of extending their use to other machine learning tasks.
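    R-vines factor a multivariate copula density into bivariate pair copulas. As a hedged illustration of the object being modeled (a three-variable D-vine assembled from Gaussian pair copulas with arbitrary parameters; this is not the thesis's software, which concerns structure learning):

```python
# Three-variable D-vine density from Gaussian pair copulas -- an illustrative
# sketch with arbitrary parameters, not the thesis's methods.
import numpy as np
from scipy.stats import norm

def gauss_copula_density(u, v, rho):
    x, y = norm.ppf(u), norm.ppf(v)
    q = rho * rho * (x * x + y * y) - 2.0 * rho * x * y
    return np.exp(-q / (2.0 * (1.0 - rho * rho))) / np.sqrt(1.0 - rho * rho)

def h_func(u, v, rho):
    # conditional CDF h(u | v) of the Gaussian pair copula
    x, y = norm.ppf(u), norm.ppf(v)
    return norm.cdf((x - rho * y) / np.sqrt(1.0 - rho * rho))

def dvine3_density(u1, u2, u3, r12, r23, r13_2):
    # c(u1,u2,u3) = c12(u1,u2) * c23(u2,u3) * c13|2(h(u1|u2), h(u3|u2))
    c12 = gauss_copula_density(u1, u2, r12)
    c23 = gauss_copula_density(u2, u3, r23)
    c13_2 = gauss_copula_density(h_func(u1, u2, r12), h_func(u3, u2, r23), r13_2)
    return c12 * c23 * c13_2

print(dvine3_density(0.3, 0.6, 0.7, r12=0.5, r23=-0.4, r13_2=0.2))
```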

    Complexity analysis of Bayesian learning of high-dimensional DAG models and their equivalence classes

    Full text link
    We consider MCMC methods for learning equivalence classes of sparse Gaussian DAG models when p = e^{o(n)}. The main contribution of this work is a rapid mixing result for a random walk Metropolis-Hastings algorithm, which we prove using a canonical path method. It reveals that the complexity of Bayesian learning of sparse equivalence classes grows only polynomially in n and p, under some common high-dimensional assumptions. Further, a series of high-dimensional consistency results is obtained by the path method, including the strong selection consistency of an empirical Bayes model for structure learning and the consistency of a greedy local search on the restricted search space. Rapid mixing and slow mixing results for other structure-learning MCMC methods are also derived. Our path method and mixing time results yield crucial insights into the computational aspects of high-dimensional structure learning, which may be used to develop more efficient MCMC algorithms.
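    The paper's random walk operates on equivalence classes; as a simpler DAG-space analogue, the sketch below runs a random-walk Metropolis-Hastings over DAGs with a Gaussian BIC score and a symmetric single-edge-flip proposal. It is illustrative only, with no claim to match the analyzed sampler:

```python
# Random-walk Metropolis-Hastings over DAGs with a Gaussian BIC score --
# a bare-bones sketch, not the analyzed equivalence-class sampler.
import numpy as np

def is_dag(adj):
    n = len(adj)
    state = [0] * n                        # 0 new, 1 on stack, 2 done
    def dfs(v):
        state[v] = 1
        for w in range(n):
            if adj[v][w]:
                if state[w] == 1 or (state[w] == 0 and not dfs(w)):
                    return False
        state[v] = 2
        return True
    return all(state[v] == 2 or dfs(v) for v in range(n))

def bic(data, adj):
    n, p = data.shape
    total = 0.0
    for j in range(p):
        pa = [i for i in range(p) if adj[i][j]]
        X = np.column_stack([np.ones(n)] + [data[:, i] for i in pa])
        beta, *_ = np.linalg.lstsq(X, data[:, j], rcond=None)
        rss = np.sum((data[:, j] - X @ beta) ** 2)
        total += -0.5 * n * np.log(rss / n) - 0.5 * (len(pa) + 1) * np.log(n)
    return total

def mh_dag(data, steps=2000, seed=0):
    rng = np.random.default_rng(seed)
    p = data.shape[1]
    adj = np.zeros((p, p), dtype=int)
    cur = bic(data, adj)
    for _ in range(steps):
        i, j = rng.choice(p, size=2, replace=False)
        prop = adj.copy()
        prop[i][j] ^= 1                    # flip one directed edge
        if not is_dag(prop):
            continue                       # stay on the DAG space
        new = bic(data, prop)
        if np.log(rng.random()) < new - cur:   # symmetric proposal assumed
            adj, cur = prop, new
    return adj, cur

rng = np.random.default_rng(1)
x0 = rng.normal(size=500)
data = np.column_stack([x0, 0.8 * x0 + rng.normal(size=500), rng.normal(size=500)])
print(mh_dag(data)[0])
```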

    The RODEO Approach for Nonparametric Density Estimation

    Get PDF
    The regularization of derivative expectation operator (RODEO) approach developed by Lafferty and Wasserman (2008) is a regularization technique designed for a wide range of nonparametric kernel smoothers. The approach applies regularization by penalizing the bias reduction associated with shrinking the bandwidths along a smooth path of decreasing bandwidth parameter values, in order to avoid overfitting. Dimensions with small local variation are effectively smoothed out, thus implicitly carrying out variable selection. Under certain conditions, faster rates of convergence for the mean integrated squared error can be achieved, which makes the approach attractive for applications in high dimensions. In this paper we apply the RODEO approach to local polynomial density estimation, implemented in the R package lpderodeo. We apply the implementation to a few examples and evaluate its performance in a comparative study against eight other approaches for nonparametric density estimation. Our findings suggest that the approach performs worse than the other considered approaches with regard to the applied performance metrics. Furthermore, the implementation suffers from long computation times due to a naive evaluation scheme. Our main finding, however, is that the theoretical framework proposed by Liu, Lafferty, and Wasserman (2007) has severe shortcomings: already a simple rotation of the data makes the algorithm fail in practice.
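    The core RODEO loop tests, for each dimension, the derivative of the estimator with respect to that dimension's bandwidth and keeps shrinking the bandwidth only while that derivative is statistically significant. A simplified sketch for a Gaussian-product-kernel density estimate at one query point follows; the shrinkage factor, threshold and bandwidth floor are illustrative assumptions and differ from lpderodeo:

```python
# Simplified density RODEO at one query point x (Gaussian product kernel).
# Shrinkage factor, threshold and floor are illustrative assumptions; the
# lpderodeo package and Liu, Lafferty & Wasserman (2007) differ in detail.
import numpy as np

def rodeo_density(X, x, h0=1.0, shrink=0.9):
    n, d = X.shape
    h = np.full(d, h0)
    active = set(range(d))
    while active:
        U = (x - X) / h                          # scaled offsets, n x d
        K = np.exp(-0.5 * U ** 2) / (h * np.sqrt(2 * np.pi))
        prod = np.prod(K, axis=1)                # product kernel, one value per point
        for j in list(active):
            # Z_j = d/dh_j of the KDE; per-point factor (u_j^2 - 1) / h_j
            zi = prod * (U[:, j] ** 2 - 1.0) / h[j]
            Z = zi.mean()
            lam = zi.std(ddof=1) * np.sqrt(2.0 * np.log(n) / n)  # noise threshold
            if abs(Z) > lam and h[j] * shrink > 1e-3:
                h[j] *= shrink                   # derivative significant: shrink
            else:
                active.discard(j)                # freeze this bandwidth
    dens = np.mean(np.prod(np.exp(-0.5 * ((x - X) / h) ** 2)
                           / (h * np.sqrt(2 * np.pi)), axis=1))
    return dens, h

X = np.random.default_rng(2).normal(size=(400, 2))
print(rodeo_density(X, np.zeros(2)))
```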

    Efficient Sampling and Structure Learning of Bayesian Networks

    Full text link
    Bayesian networks are probabilistic graphical models widely employed to understand dependencies in high dimensional data, and even to facilitate causal discovery. Learning the underlying network structure, which is encoded as a directed acyclic graph (DAG), is highly challenging mainly due to the vast number of possible networks. Efforts have focussed on two fronts: constraint-based methods that perform conditional independence tests to exclude edges, and score-and-search approaches which explore the DAG space with greedy or MCMC schemes. Here we synthesise these two fields in a novel hybrid method which reduces the complexity of MCMC approaches to that of a constraint-based method. Individual steps in the MCMC scheme only require simple table lookups, so that very long chains can be obtained efficiently. Furthermore, the scheme includes an iterative procedure to correct for errors from the conditional independence tests. The algorithm offers markedly superior performance to alternatives, particularly because DAGs can also be sampled from the posterior distribution, enabling full Bayesian model averaging for much larger Bayesian networks.
    Comment: Revised version. 40 pages including 16 pages of supplement, 5 figures and 15 supplemental figures; R package BiDAG is available at https://CRAN.R-project.org/package=BiDAG
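    The "simple table lookups" can be understood as precomputing local scores for every permitted parent set once, after which each MCMC move is a dictionary access. A hedged sketch of that idea (not BiDAG's code; the candidate-parent lists would come from the constraint-based phase):

```python
# Precomputed local-score tables over restricted parent sets -- illustrates
# why each MCMC step reduces to a table lookup; not BiDAG's implementation.
from itertools import combinations
import numpy as np

def local_bic(data, j, parents):
    n = data.shape[0]
    X = np.column_stack([np.ones(n)] + [data[:, i] for i in parents])
    beta, *_ = np.linalg.lstsq(X, data[:, j], rcond=None)
    rss = np.sum((data[:, j] - X @ beta) ** 2)
    return -0.5 * n * np.log(rss / n) - 0.5 * (len(parents) + 1) * np.log(n)

def score_tables(data, candidates):
    # candidates[j]: parents permitted for node j (e.g. from CI tests)
    tables = {}
    for j, cand in candidates.items():
        for k in range(len(cand) + 1):
            for pa in combinations(cand, k):
                tables[(j, frozenset(pa))] = local_bic(data, j, list(pa))
    return tables

rng = np.random.default_rng(3)
x0 = rng.normal(size=300)
data = np.column_stack([x0, 0.7 * x0 + rng.normal(size=300)])
tables = score_tables(data, {0: [1], 1: [0]})
print(tables[(1, frozenset({0}))])   # an MCMC move now just reads this dict
```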