Penalized Likelihood Methods for Estimation of Sparse High Dimensional Directed Acyclic Graphs
Directed acyclic graphs (DAGs) are commonly used to represent causal
relationships among random variables in graphical models. Applications of these
models arise in the study of physical, as well as biological systems, where
directed edges between nodes represent the influence of components of the
system on each other. The general problem of estimating DAGs from observed data
is computationally NP-hard. Moreover, two directed graphs may be observationally
equivalent. When the nodes exhibit a natural ordering, the problem of
estimating directed graphs reduces to the problem of estimating the structure
of the network. In this paper, we propose a penalized likelihood approach that
directly estimates the adjacency matrix of DAGs. Both lasso and adaptive lasso
penalties are considered and an efficient algorithm is proposed for estimation
of high dimensional DAGs. We study variable selection consistency of the two
penalties when the number of variables grows to infinity with the sample size.
We show that although lasso can only consistently estimate the true network
under stringent assumptions, adaptive lasso achieves this task under mild
regularity conditions. The performance of the proposed methods is compared to
alternative methods in simulated, as well as real, data examples.
Comment: 19 pages, 8 figures
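When the variables come in a known order, each node can only depend on the nodes before it, so the adjacency matrix is strictly lower triangular. The sketch below illustrates that idea under stated assumptions: it fits one lasso regression per node on its predecessors, using a plain coordinate-descent solver. It is not the algorithm from the paper; the function names (`lasso_cd`, `estimate_dag`, `demo_data`), the penalty value, and the toy data are all hypothetical, and the adaptive lasso variant is not implemented.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator used by the lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = len(y), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, with b = 0 initially
    col_sq = [sum(row[k] ** 2 for row in X) / n for k in range(p)]
    for _ in range(n_iter):
        for k in range(p):
            if col_sq[k] == 0.0:
                continue
            # Correlation of column k with the partial residual (b_k removed).
            rho = sum(X[i][k] * resid[i] for i in range(n)) / n + col_sq[k] * beta[k]
            new = soft_threshold(rho, lam) / col_sq[k]
            delta = new - beta[k]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][k]
                beta[k] = new
    return beta


def estimate_dag(data, lam):
    """Return a strictly lower-triangular matrix A; A[j][k] != 0 means an
    estimated edge k -> j. Columns of `data` are assumed to follow the
    known ordering of the nodes."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centred = [[row[j] - means[j] for j in range(p)] for row in data]
    A = [[0.0] * p for _ in range(p)]
    for j in range(1, p):  # node 0 has no predecessors
        X = [row[:j] for row in centred]
        y = [row[j] for row in centred]
        A[j][:j] = lasso_cd(X, y, lam)
    return A


def demo_data(n=200, seed=0):
    """Toy data (an assumption for illustration): x0 -> x1, x2 independent."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x0 = rng.gauss(0, 1)
        x1 = 0.9 * x0 + 0.1 * rng.gauss(0, 1)
        x2 = rng.gauss(0, 1)
        rows.append([x0, x1, x2])
    return rows
```

Calling `estimate_dag(demo_data(), lam=0.1)` returns a matrix whose entries on and above the diagonal are exactly zero, so the estimate is acyclic by construction; a very large `lam` shrinks every coefficient to zero.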
Unification and Matching on Compressed Terms
Term unification plays an important role in many areas of computer science,
especially in those related to logic. The universal mechanism of grammar-based
compression for terms, in particular the so-called Singleton Tree Grammars
(STG), has recently drawn considerable attention. Using STGs, terms of
exponential size and height can be represented in linear space. Furthermore,
the term representation by directed acyclic graphs (dags) can be efficiently
simulated. The present paper is the result of an investigation on term
unification and matching when the terms given as input are represented using
different compression mechanisms for terms such as dags and Singleton Tree
Grammars. We describe a polynomial time algorithm for context matching with
dags, when the number of different context variables is fixed for the problem.
For the same problem, NP-completeness is obtained when the terms are
represented using the more general formalism of Singleton Tree Grammars. For
first-order unification and matching, polynomial time algorithms are presented,
each of them improving previous results for those problems.
Comment: This paper is posted at the Computing Research Repository (CoRR) as
part of the process of submission to the journal ACM Transactions on
Computational Logic (TOCL)
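The space saving from sharing subterms can be seen in a small sketch. It models only the DAG-like part of a grammar (each nonterminal has exactly one rule and stands for one term) and leaves out the context nonterminals that STGs also allow. The rule format and the names `make_grammar` and `term_size` are assumptions made for illustration.

```python
def make_grammar(n):
    """Build rules A0 -> a and Ai -> f(A(i-1), A(i-1)) for i = 1..n.

    Each nonterminal has exactly one rule, so it denotes a single term.
    """
    rules = {"A0": ("a", ())}
    for i in range(1, n + 1):
        child = f"A{i - 1}"
        rules[f"A{i}"] = ("f", (child, child))
    return rules


def term_size(rules, start):
    """Count the symbols of the term generated by `start` without expanding it."""
    memo = {}

    def size(nonterminal):
        if nonterminal not in memo:
            _symbol, children = rules[nonterminal]
            memo[nonterminal] = 1 + sum(size(c) for c in children)
        return memo[nonterminal]

    return size(start)
```

With `n = 50` the grammar has 51 rules, while `term_size(make_grammar(50), "A50")` evaluates to `2**51 - 1`: the represented term is exponentially larger than its compressed description, which is why algorithms that work directly on the compressed form matter.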
Quantum Algorithm for Dynamic Programming Approach for DAGs. Applications for Zhegalkin Polynomial Evaluation and Some Problems on DAGs
In this paper, we present a quantum algorithm for the dynamic programming
approach to problems on directed acyclic graphs (DAGs). The running time of
the algorithm is $O(\sqrt{\hat{n}m}\log \hat{n})$, and the running time of the
best known deterministic algorithm is $O(n+m)$, where $n$ is the number of
vertices, $\hat{n}$ is the number of vertices with at least one outgoing edge, and
$m$ is the number of edges. We show that we can solve problems that use OR,
AND, NAND, MAX and MIN functions as the main transition steps. The approach is
useful for several problems. One of them is evaluating a Boolean formula
that is represented by a Zhegalkin polynomial, a Boolean circuit with shared
input and non-constant depth. Two others are the single-source
longest paths search for weighted DAGs and the diameter search problem for
unweighted DAGs.
Comment: UCNC2019 Conference paper
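For comparison, the classical (non-quantum) dynamic programming baseline can be sketched as follows. It visits every vertex and edge once, so it runs in time linear in the number of vertices plus edges. The representation (successor lists, a dictionary of sink values) and the function names are assumptions for illustration, and nothing here reproduces the quantum algorithm itself.

```python
from collections import deque


def topological_order(n, succ):
    """Kahn's algorithm; succ[u] lists the heads of edges leaving u."""
    indeg = [0] * n
    for u in range(n):
        for v in succ[u]:
            indeg[v] += 1
    queue = deque(u for u in range(n) if indeg[u] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if len(order) != n:
        raise ValueError("graph has a cycle")
    return order


def evaluate_dag(n, succ, sink_values, combine):
    """Give each vertex with outgoing edges the value combine(successor values).

    `combine` can be any, all, max or min; sinks take their given values.
    """
    value = dict(sink_values)
    for u in reversed(topological_order(n, succ)):
        if succ[u]:
            value[u] = combine(value[v] for v in succ[u])
    return value


def longest_paths(n, weighted_edges, source):
    """Single-source longest path lengths in a weighted DAG."""
    succ = [[] for _ in range(n)]
    for u, v, w in weighted_edges:
        succ[u].append((v, w))
    order = topological_order(n, [[v for v, _ in out] for out in succ])
    dist = [float("-inf")] * n
    dist[source] = 0
    for u in order:
        if dist[u] == float("-inf"):
            continue  # u is not reachable from the source
        for v, w in succ[u]:
            dist[v] = max(dist[v], dist[u] + w)
    return dist
```

For example, with `succ = [[1, 2], [3], [4], [], []]` and sink values `{3: True, 4: False}`, `evaluate_dag(5, succ, {3: True, 4: False}, any)` assigns `True` to vertex 0, while passing `all` assigns `False`.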
Efficient computational strategies to learn the structure of probabilistic graphical models of cumulative phenomena
Structural learning of Bayesian Networks (BNs) is an NP-hard problem, which is
further complicated by many theoretical issues, such as the I-equivalence among
different structures. In this work, we focus on a specific subclass of BNs,
named Suppes-Bayes Causal Networks (SBCNs), which include specific structural
constraints based on Suppes' probabilistic causation to efficiently model
cumulative phenomena. Here we compare the performance, via extensive
simulations, of various state-of-the-art search strategies, such as local
search techniques and Genetic Algorithms, as well as of distinct regularization
methods. The assessment is performed on a large number of simulated datasets
from topologies with distinct levels of complexity, various sample sizes and
different rates of errors in the data. Among the main results, we show that the
introduction of Suppes' constraints dramatically improves the inference
accuracy, by reducing the solution space and providing a temporal ordering on
the variables. We also report on trade-offs among different search techniques
that can be efficiently employed in distinct experimental settings. This
manuscript is an extended version of the paper "Structural Learning of
Probabilistic Graphical Models of Cumulative Phenomena" presented at the 2018
International Conference on Computational Science.
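The abstract does not spell out the constraints, so the sketch below rests on an assumption: it uses the two conditions commonly associated with Suppes' notion of a prima facie cause, namely temporal priority, approximated here by P(c) > P(e), and probability raising, P(e | c) > P(e | not c). The function name, the binary row format, and the toy data are hypothetical. It shows how such a filter shrinks the set of candidate edges before any search is run.

```python
def suppes_candidate_edges(rows, names):
    """Return ordered pairs (cause, effect) that pass both assumed conditions.

    rows  : list of tuples of 0/1 observations, one column per variable
    names : variable names, in column order
    """
    n = len(rows)
    p = len(names)
    prob = [sum(r[j] for r in rows) / n for j in range(p)]
    edges = []
    for c in range(p):
        n_c = sum(r[c] for r in rows)
        if n_c == 0 or n_c == n:
            continue  # conditional probabilities are undefined
        for e in range(p):
            if c == e:
                continue
            p_e_given_c = sum(r[e] for r in rows if r[c]) / n_c
            p_e_given_not_c = sum(r[e] for r in rows if not r[c]) / (n - n_c)
            temporal_priority = prob[c] > prob[e]
            probability_raising = p_e_given_c > p_e_given_not_c
            if temporal_priority and probability_raising:
                edges.append((names[c], names[e]))
    return edges


# Toy data (hypothetical): b only occurs when a has occurred.
ROWS = [(1, 1, 1), (1, 1, 0), (1, 0, 1), (0, 0, 0), (0, 0, 1), (0, 0, 0)]
NAMES = ["a", "b", "c"]
```

On this toy data only the pair `("a", "b")` survives out of the six possible ordered pairs, which illustrates the reduction of the solution space mentioned in the abstract.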
On Characterizing the Data Movement Complexity of Computational DAGs for Parallel Execution
Technology trends are making the cost of data movement increasingly dominant,
both in terms of energy and time, over the cost of performing arithmetic
operations in computer systems. The fundamental ratio of aggregate data
movement bandwidth to the total computational power (also referred to as the
machine balance parameter) in parallel computer systems is decreasing. It is
therefore of considerable importance to characterize the inherent data
movement requirements of parallel algorithms, so that the minimal architectural
balance parameters required to support them on future systems can be well
understood. In this paper, we extend the well-known red-blue
pebble game to derive lower bounds on the data movement complexity for the
parallel execution of computational directed acyclic graphs (CDAGs) on parallel
systems. We model multi-node multi-core parallel systems, with the total
physical memory distributed across the nodes (that are connected through some
interconnection network) and in a multi-level shared cache hierarchy for
processors within a node. We also develop new techniques for lower bound
characterization of non-homogeneous CDAGs. We demonstrate the use of the
methodology by analyzing the CDAGs of several numerical algorithms, to develop
lower bounds on data movement for their parallel execution.
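The rules of the classical, sequential red-blue pebble game can be sketched as a small move checker. This is the well-known base game, not the parallel extension the abstract describes: red pebbles stand for values in fast memory (at most `S` at a time), blue pebbles for values in slow memory, and every load or store counts as one unit of data movement. The function name and the move encoding are assumptions for illustration.

```python
def play_red_blue(preds, inputs, outputs, moves, S):
    """Replay a move sequence, enforce the game rules, and return the I/O count.

    preds   : dict mapping each vertex to the list of its predecessors
    inputs  : vertices that start with a blue pebble
    outputs : vertices that must hold a blue pebble at the end
    moves   : list of (op, vertex), op in {"load", "store", "compute", "delete"}
    S       : maximum number of red pebbles (fast memory capacity)
    """
    red, blue = set(), set(inputs)
    io = 0
    for op, v in moves:
        if op == "load":  # blue -> red, costs one I/O
            if v not in blue:
                raise ValueError(f"cannot load {v}: no blue pebble")
            red.add(v)
            io += 1
        elif op == "store":  # red -> blue, costs one I/O
            if v not in red:
                raise ValueError(f"cannot store {v}: no red pebble")
            blue.add(v)
            io += 1
        elif op == "compute":  # all predecessors must hold red pebbles
            if v in inputs or not all(u in red for u in preds[v]):
                raise ValueError(f"cannot compute {v}")
            red.add(v)
        elif op == "delete":  # free a slot in fast memory
            red.discard(v)
        else:
            raise ValueError(f"unknown move {op}")
        if len(red) > S:
            raise ValueError("too many red pebbles")
    if not all(o in blue for o in outputs):
        raise ValueError("some output was never stored")
    return io
```

For the one-operation CDAG `c = a + b`, the sequence load `a`, load `b`, compute `c`, store `c` is legal with `S = 3` and costs 3 I/O operations; with `S = 2` the same sequence is rejected. A lower bound in this setting is a statement about the minimum of this count over all legal move sequences.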