Minimum-weight triangulation is NP-hard
A triangulation of a planar point set S is a maximal plane straight-line
graph with vertex set S. In the minimum-weight triangulation (MWT) problem, we
are looking for a triangulation of a given point set that minimizes the sum of
the edge lengths. We prove that the decision version of this problem is
NP-hard. We use a reduction from PLANAR-1-IN-3-SAT. The correct working of the
gadgets is established with computer assistance, using dynamic programming on
polygonal faces, as well as the beta-skeleton heuristic to certify that certain
edges belong to the minimum-weight triangulation.

Comment: 45 pages (including a technical appendix of 13 pages), 28 figures.
This revision contains a few improvements in the exposition.
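The dynamic-programming step mentioned above is easiest to see on a single convex polygonal face, where the classic O(n³) recurrence over sub-polygons applies. The sketch below is illustrative only; the paper's computer-assisted gadget verification runs on its own specific polygonal faces, not on this toy routine:

```python
import math

def min_weight_triangulation(poly):
    """O(n^3) dynamic program for the minimum-weight triangulation of a
    convex polygon. poly is a list of (x, y) vertices in convex position,
    in order. Returns the total length of the interior diagonals used."""
    n = len(poly)
    dist = lambda i, j: math.dist(poly[i], poly[j])
    # cost[i][j]: minimum diagonal weight to triangulate sub-polygon i..j
    cost = [[0.0] * n for _ in range(n)]
    for gap in range(2, n):
        for i in range(n - gap):
            j = i + gap
            cost[i][j] = min(
                cost[i][k] + cost[k][j]
                # diagonals (i,k) and (k,j) are charged only if they are
                # not edges of the polygon itself
                + (dist(i, k) if k > i + 1 else 0.0)
                + (dist(k, j) if j > k + 1 else 0.0)
                for k in range(i + 1, j)
            )
    return cost[0][n - 1]
```

For a unit square, either diagonal yields the optimum, so the result is √2; a triangle needs no diagonals at all.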
Sparse Linear Identifiable Multivariate Modeling
In this paper we consider sparse and identifiable linear latent variable
(factor) and linear Bayesian network models for parsimonious analysis of
multivariate data. We propose a computationally efficient method for joint
parameter and model inference, and model comparison. It consists of a fully
Bayesian hierarchy for sparse models using slab and spike priors (two-component
delta-function and continuous mixtures), non-Gaussian latent factors and a
stochastic search over the ordering of the variables. The framework, which we
call SLIM (Sparse Linear Identifiable Multivariate modeling), is validated and
benchmarked on artificial and real biological data sets. SLIM is closest in
spirit to LiNGAM (Shimizu et al., 2006), but differs substantially in
inference, Bayesian network structure learning and model comparison.
Experimentally, SLIM performs equally well or better than LiNGAM with
comparable computational complexity. We attribute this mainly to the stochastic
search strategy used, and to parsimony (sparsity and identifiability), which is
an explicit part of the model. We propose two extensions to the basic i.i.d.
linear framework: non-linear dependence on observed variables, called SNIM
(Sparse Non-linear Identifiable Multivariate modeling) and allowing for
correlations between latent variables, called CSLIM (Correlated SLIM), for
temporal and/or spatial data. The source code and scripts are available from
http://cogsys.imm.dtu.dk/slim/.

Comment: 45 pages, 17 figures.
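The slab-and-spike prior described above mixes a point mass at zero with a continuous slab. A minimal sketch of drawing coefficients from such a two-component prior (the parameter names and defaults here are illustrative, not SLIM's actual parameterization or its inference scheme):

```python
import numpy as np

def sample_spike_and_slab(n_coef, pi=0.2, slab_sd=1.0, seed=None):
    """Draw coefficients from a two-component spike-and-slab prior:
    with probability 1 - pi a coefficient sits exactly at zero (the
    delta-function spike); otherwise it is drawn from a N(0, slab_sd^2)
    continuous slab."""
    rng = np.random.default_rng(seed)
    z = rng.random(n_coef) < pi                       # binary inclusion indicators
    beta = np.where(z, rng.normal(0.0, slab_sd, n_coef), 0.0)
    return z, beta
```

Exact zeros in `beta` give sparsity directly, which is what makes this prior family attractive for parsimonious models.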
Sparse Coding on Symmetric Positive Definite Manifolds using Bregman Divergences
This paper introduces sparse coding and dictionary learning for Symmetric
Positive Definite (SPD) matrices, which are often used in machine learning,
computer vision and related areas. Unlike traditional sparse coding schemes
that work in vector spaces, in this paper we discuss how SPD matrices can be
described by sparse combination of dictionary atoms, where the atoms are also
SPD matrices. We propose to seek sparse coding by embedding the space of SPD
matrices into Hilbert spaces through two types of Bregman matrix divergences.
This not only leads to an efficient way of performing sparse coding, but also
an online and iterative scheme for dictionary learning. We apply the proposed
methods to several computer vision tasks where images are represented by region
covariance matrices. Our proposed algorithms outperform state-of-the-art
methods on a wide range of classification tasks, including face recognition,
action recognition, material classification and texture categorization.
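One widely used Bregman-type divergence on SPD matrices is the symmetric Stein (Jensen-Bregman log-det) divergence. The sketch below shows this single ingredient only, not the paper's full sparse-coding and dictionary-learning pipeline:

```python
import numpy as np

def stein_divergence(X, Y):
    """Symmetric Stein (Jensen-Bregman log-det) divergence between two
    SPD matrices: logdet((X+Y)/2) - (logdet(X) + logdet(Y)) / 2.
    slogdet is used for numerical stability on larger matrices."""
    _, ld_mid = np.linalg.slogdet((X + Y) / 2.0)
    _, ld_x = np.linalg.slogdet(X)
    _, ld_y = np.linalg.slogdet(Y)
    return ld_mid - 0.5 * (ld_x + ld_y)
```

The divergence is zero iff the two matrices coincide and is cheap to evaluate compared with the affine-invariant Riemannian metric, which is part of its appeal for region-covariance descriptors.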
Contributions to Vine-Copula Modeling
Regular vine-copula models (R-vines) are a powerful statistical tool for modeling the dependence structure of multivariate distribution functions. In particular, they allow modeling different types of dependencies among random variables independently of their marginal distributions, which is deemed the most valued characteristic of these models. In this thesis, we investigate the theoretical properties of R-vines for representing dependencies and extend their use to solve supervised classification problems. We focus on three research directions. In the first line of research, the relationship between the graphical representations of R-vines and Bayesian polytree networks is analyzed in terms of how conditional pairwise independence relationships are represented by both models. In order to do that, we use an extended graphical representation of R-vines in which the R-vine graph is endowed with further expressiveness, making it possible to distinguish between edges representing independence and dependence relationships. Using this representation, a separation criterion in the R-vine graph, called R-separation, is defined. The proposed criterion is used in designing methods for building the graphical structure of polytrees from that of R-vines, and vice versa. Moreover, possible correspondences between the R-vine graph and the associated R-vine copula, as well as different properties of R-separation, are analyzed. In the second research line, we design methods for learning the graphical structure of R-vines from dependence lists. The main challenge of this task lies in the extremely large size of the search space of all possible R-vine structures. We provide two strategies to solve the problem of learning R-vines that represent the largest number of dependencies in a list. The first approach is a 0-1 linear programming formulation for building truncated R-vines with only two trees. The second approach is an evolutionary algorithm, which is able to learn complete and truncated R-vines. Experimental results show the success of this strategy in solving the optimization problem posed. In the third research line, we introduce a supervised classification approach where the dependence structure of the problem features is modeled through R-vines. The efficacy of these classifiers is validated in a mental decoding problem and in an image recognition task. While R-vines have been extensively applied in fields such as economics, finance and statistics, only recently have they found their place in classification tasks. This contribution represents a step forward in understanding R-vines and the prospect of extending their use to other machine learning tasks.
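The separation of dependence structure from marginal distributions that makes copulas attractive can be seen in a minimal bivariate Gaussian copula sampler. This is illustrative only; an R-vine composes many such pair-copulas over a sequence of trees:

```python
import math
import numpy as np

def gaussian_copula_sample(n, rho, seed=0):
    """Sample n pairs from a bivariate Gaussian copula with parameter rho.
    Correlated standard normals pushed through the normal CDF yield
    uniform margins whose dependence is governed solely by rho."""
    rng = np.random.default_rng(seed)
    z1 = rng.normal(size=n)
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.normal(size=n)
    # standard normal CDF via the error function
    ncdf = np.vectorize(lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0))))
    return ncdf(z1), ncdf(z2)
```

Both returned margins are uniform on [0, 1] regardless of rho; any marginal distribution can then be imposed by applying its quantile function, which is exactly the decoupling the thesis exploits.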
Complexity analysis of Bayesian learning of high-dimensional DAG models and their equivalence classes
We consider MCMC methods for learning equivalence classes of sparse Gaussian
DAG models in the high-dimensional setting. The main contribution of this work
is a rapid mixing result for a random walk Metropolis-Hastings algorithm, which
we prove using a canonical path method. It reveals that the complexity of
Bayesian learning of sparse equivalence classes grows only polynomially in the
number of variables and the sample size,
under some common high-dimensional assumptions. Further, a series of
high-dimensional consistency results is obtained by the path method, including
the strong selection consistency of an empirical Bayes model for structure
learning and the consistency of a greedy local search on the restricted search
space. Rapid mixing and slow mixing results for other structure-learning MCMC
methods are also derived. Our path method and mixing time results yield crucial
insights into the computational aspects of high-dimensional structure learning,
which may be used to develop more efficient MCMC algorithms.
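The random walk Metropolis-Hastings scheme analyzed above can be sketched generically for any discrete state space. Here `log_score` and the neighborhood map are placeholders, not the paper's equivalence-class moves or scores:

```python
import numpy as np

def rw_metropolis_hastings(log_score, neighbors, init, n_steps, seed=0):
    """Random-walk Metropolis-Hastings on a discrete state space.
    A uniformly random neighbor is proposed; the acceptance ratio includes
    the Hastings correction for unequal neighborhood sizes, so the chain's
    stationary distribution is proportional to exp(log_score)."""
    rng = np.random.default_rng(seed)
    state, chain = init, [init]
    for _ in range(n_steps):
        nbrs = neighbors(state)
        prop = nbrs[rng.integers(len(nbrs))]
        log_acc = (log_score(prop) - log_score(state)
                   + np.log(len(nbrs)) - np.log(len(neighbors(prop))))
        if np.log(rng.random()) < log_acc:
            state = prop
        chain.append(state)
    return chain
```

Rapid mixing results of the kind proved in the paper bound how many such steps are needed before the chain's distribution is close to the target.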
The RODEO Approach for Nonparametric Density Estimation
The regularization of derivative expectation operator (RODEO) approach developed by Lafferty and Wasserman (2008) is a regularization technique designed for a wide range of nonparametric kernel smoothers. The approach applies regularization by penalizing the bias reduction associated with a bandwidth reduction along a smooth path of decreasing bandwidth parameter values, in order to avoid overfitting. Dimensions with small local variation are effectively smoothed out, thus implicitly carrying out variable selection. Under certain conditions, faster rates of convergence for the mean integrated squared error can be achieved, which makes the approach attractive for applications in high dimensions. In this paper we apply the RODEO approach to local polynomial density estimation. We implemented the approach in the R package lpderodeo. We apply our implementation to a few examples and evaluate its performance in a comparative study using a sample of eight other approaches for nonparametric density estimation. Our findings suggest that the approach does not work well in comparison to the other considered approaches with regard to the applied performance metrics. Furthermore, our implementation suffers from long computation times due to a naive evaluation order. Our main finding, however, concerns the fact that the theoretical framework proposed by Liu, Lafferty, and Wasserman (2007) has severe shortcomings. In fact, we demonstrate that a simple rotation of the data makes the algorithm fail in practice.
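The bandwidth-descent idea can be sketched as a greedy loop over per-dimension bandwidths for a product-Gaussian kernel density estimate at a point x. This is a crude caricature: the threshold below is a fixed constant, whereas RODEO proper uses a variance-scaled, data-dependent threshold and an analytic derivative statistic:

```python
import numpy as np

def rodeo_bandwidths(X, x, h0=1.0, beta=0.9, lam=0.05, eps=1e-4):
    """Greedy RODEO-style bandwidth path for a product Gaussian KDE at x.
    Each bandwidth h_j is shrunk by factor beta while the (finite-difference)
    derivative of the estimate with respect to h_j is large; dimensions with
    little local variation freeze early and keep a wide bandwidth."""
    x = np.asarray(x, dtype=float)
    n, d = X.shape
    h = np.full(d, h0)

    def kde(h):
        z = (X - x) / h
        return (np.mean(np.exp(-0.5 * np.sum(z ** 2, axis=1)))
                / np.prod(h * np.sqrt(2.0 * np.pi)))

    active = set(range(d))
    while active and h.min() > 1e-3:
        for j in list(active):
            hp = h.copy()
            hp[j] += eps
            deriv = (kde(hp) - kde(h)) / eps     # finite-difference d f_hat / d h_j
            if abs(deriv) > lam:
                h[j] *= beta                     # still informative: keep shrinking
            else:
                active.discard(j)                # derivative small: freeze h_j
    return h
```

Because the product kernel treats coordinates separately, the procedure is tied to the coordinate axes, which is consistent with the rotation sensitivity reported above.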
Efficient Sampling and Structure Learning of Bayesian Networks
Bayesian networks are probabilistic graphical models widely employed to
understand dependencies in high dimensional data, and even to facilitate causal
discovery. Learning the underlying network structure, which is encoded as a
directed acyclic graph (DAG) is highly challenging mainly due to the vast
number of possible networks. Efforts have focussed on two fronts:
constraint-based methods that perform conditional independence tests to exclude
edges, and score-and-search approaches that explore the DAG space with greedy
or MCMC schemes. Here we synthesise these two fields in a novel hybrid method
which reduces the complexity of MCMC approaches to that of a constraint-based
method. Individual steps in the MCMC scheme only require simple table lookups
so that very long chains can be efficiently obtained. Furthermore, the scheme
includes an iterative procedure to correct for errors from the conditional
independence tests. The algorithm offers markedly superior performance to
alternatives, particularly because DAGs can also be sampled from the posterior
distribution, enabling full Bayesian model averaging for much larger Bayesian
networks.

Comment: Revised version. 40 pages including 16 pages of supplement, 5 figures
and 15 supplemental figures; R package BiDAG is available at
https://CRAN.R-project.org/package=BiDA
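A standard ingredient on the constraint-based side is a Gaussian conditional independence test via partial correlation with Fisher's z-transform. The sketch below shows that single ingredient only; BiDAG's actual tests, scores, and table-lookup machinery are more involved:

```python
import math
import numpy as np

def fisher_z_ci_test(data, i, j, cond, alpha=0.05):
    """Test X_i independent of X_j given X_cond under a Gaussian assumption.
    The partial correlation is read off the inverse covariance of the
    relevant submatrix; Fisher's z-transform gives an approximately normal
    test statistic. Returns True if independence is NOT rejected."""
    idx = [i, j] + list(cond)
    prec = np.linalg.inv(np.cov(data[:, idx], rowvar=False))
    r = -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])
    n = data.shape[0]
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - len(cond) - 3)
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p-value
    return p > alpha
```

On a chain X -> Y -> Z, the test rejects marginal independence of X and Z but retains their independence given Y, which is the kind of decision used to exclude edges before the MCMC phase.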