Metabolic Network Alignments and their Applications
The accumulation of high-throughput genomic and proteomic data allows for the reconstruction of increasingly large and complex metabolic networks. In order to analyze the accumulated data and reconstructed networks, it is critical to identify network patterns and evolutionary relations between metabolic networks. But even finding similar networks becomes computationally challenging. The dissertation addresses these challenges with discrete optimization and the corresponding algorithmic techniques. Based on the properties of gene duplication and function sharing in biological networks, we have formulated the network alignment problem, which asks for the optimal vertex-to-vertex mapping allowing path contraction, vertex deletion, and vertex insertion. We have proposed the first polynomial-time algorithm for aligning an acyclic metabolic pattern pathway with an arbitrary metabolic network. We have also proposed a polynomial-time algorithm for patterns with small treewidth and implemented it for series-parallel patterns, which are commonly found among metabolic networks. We have developed a metabolic network alignment tool for free public use. We have performed pairwise mapping of all pathways among five organisms and found a set of statistically significant pathway similarities. We have also applied network alignment to identifying inconsistencies, inferring missing enzymes, and finding potential candidates
Rich time series classification using temporal logic
© 2017 MIT Press Journals. All rights reserved. Time series classification is an important task in robotics that is often solved using supervised machine learning. However, classifier models are typically not 'readable', in the sense that humans cannot intuitively learn useful information about the relationship between inputs and outputs. In this paper, we address the problem of rich time series classification and propose a novel framework for finding a temporal logic classifier specified in a human-readable form. The classifier is represented as a signal temporal logic (STL) formula that is expressive in capturing spatial, temporal and logical relations from a continuous-valued dataset over time. In the framework, we first find a set of representative logical formulas from the raw dataset, and then construct an STL classifier using a tree-based clustering algorithm. We show that the framework runs in polynomial time and validate it using simulated examples where our framework is significantly more efficient than the closest existing framework (up to 920 times faster).
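To make the idea of a human-readable STL classifier concrete, the following is a minimal sketch and not the framework described in the abstract. It assumes a uniformly sampled one-dimensional signal and evaluates the standard quantitative (robustness) semantics of two simple formulas, "always x > c" and "eventually x > c", over an index window. The function names `always_gt`, `eventually_gt`, and `classify` are hypothetical and chosen for illustration only.

```python
from typing import Sequence


def always_gt(signal: Sequence[float], start: int, end: int, threshold: float) -> float:
    """Robustness of G_[start,end](x > threshold): the worst-case margin in the window."""
    return min(x - threshold for x in signal[start:end + 1])


def eventually_gt(signal: Sequence[float], start: int, end: int, threshold: float) -> float:
    """Robustness of F_[start,end](x > threshold): the best-case margin in the window."""
    return max(x - threshold for x in signal[start:end + 1])


def classify(signal: Sequence[float], start: int, end: int, threshold: float) -> bool:
    """Label a signal positive when it satisfies the 'always' formula (robustness > 0)."""
    return always_gt(signal, start, end, threshold) > 0


if __name__ == "__main__":
    trace = [1.0, 2.0, 3.0]
    print(always_gt(trace, 0, 2, 0.5))
    print(classify(trace, 0, 2, 0.5))
```

A positive robustness value means the formula holds with that much margin, which is what lets a formula such as "x stays above 0.5 during steps 0 to 2" double as both a classifier and an explanation.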
Counting and Sampling from Markov Equivalent DAGs Using Clique Trees
A directed acyclic graph (DAG) is the most common graphical model for
representing causal relationships among a set of variables. When restricted to
using only observational data, the structure of the ground truth DAG is
identifiable only up to Markov equivalence, based on conditional independence
relations among the variables. Therefore, the number of DAGs equivalent to the
ground truth DAG is an indicator of the causal complexity of the underlying
structure--roughly speaking, it shows how many interventions or how much
additional information is further needed to recover the underlying DAG. In this
paper, we propose a new technique for counting the number of DAGs in a Markov
equivalence class. Our approach is based on the clique tree representation of
chordal graphs. We show that in the case of bounded degree graphs, the proposed
algorithm is polynomial time. We further demonstrate that this technique can be
utilized for uniform sampling from a Markov equivalence class, which provides a
stochastic way to enumerate DAGs in the equivalence class and may be needed for
finding the best DAG or for causal inference given the equivalence class as
input. We also extend our counting and sampling method to the case where prior
knowledge about the underlying DAG is available, and present applications of
this extension in causal experiment design and estimating the causal effect of
joint interventions.
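As a point of reference for what "counting the DAGs in a Markov equivalence class" means, here is a brute-force sketch. It is not the clique-tree technique proposed in the abstract: it simply tries every orientation of the skeleton, so it is exponential in the number of edges and only suitable for tiny graphs. It relies on the standard characterization that two DAGs are Markov equivalent exactly when they share the same skeleton and the same v-structures. The function names and the assumption that nodes are comparable values (for example integers) are illustrative choices.

```python
from itertools import product


def is_acyclic(nodes, edges):
    """Kahn-style check: a digraph is acyclic iff every node can be removed in order."""
    indegree = {v: 0 for v in nodes}
    for _, child in edges:
        indegree[child] += 1
    ready = [v for v in nodes if indegree[v] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for parent, child in edges:
            if parent == u:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
    return removed == len(nodes)


def v_structures(edges):
    """Return all triples (a, c, b) with a -> c <- b, a < b, and a, b non-adjacent."""
    edge_set = set(edges)
    found = set()
    for a, c in edges:
        for b, c2 in edges:
            if c2 == c and a < b:
                if (a, b) not in edge_set and (b, a) not in edge_set:
                    found.add((a, c, b))
    return found


def count_equivalent_dags(nodes, dag_edges):
    """Count DAGs with the same skeleton and v-structures as the given DAG."""
    skeleton = [tuple(sorted(e)) for e in dag_edges]
    target = v_structures(dag_edges)
    count = 0
    for flips in product([False, True], repeat=len(skeleton)):
        edges = [(b, a) if flip else (a, b) for (a, b), flip in zip(skeleton, flips)]
        if is_acyclic(nodes, edges) and v_structures(edges) == target:
            count += 1
    return count


if __name__ == "__main__":
    print(count_equivalent_dags([0, 1, 2], [(0, 1), (1, 2)]))  # chain 0 -> 1 -> 2
    print(count_equivalent_dags([0, 1, 2], [(0, 1), (2, 1)]))  # collider 0 -> 1 <- 2
```

The chain has three equivalent orientations while the collider is alone in its class, which illustrates why the class size indicates how much additional information is needed to pin down the true DAG.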
On Armstrong relations for strong dependencies
The strong dependency has been introduced and axiomatized in [2], [3], [4], [5]. The aim of this paper is to investigate Armstrong relations for strong dependencies. We give a necessary and sufficient condition for an arbitrary relation to be an Armstrong relation of a given strong scheme. We also give an effective algorithm for finding a relation r such that r is an Armstrong relation of a given strong scheme G = (U,S) (i.e. Sr = S+, where Sr is the full family of strong dependencies of r, and S+ is the set of all strong dependencies that can be derived from S by the system of axioms). We estimate the complexity of this algorithm and show that its time complexity is polynomial in |U| and |S|.
Ramsey-type theorems for lines in 3-space
We prove geometric Ramsey-type statements on collections of lines in 3-space.
These statements give guarantees on the size of a clique or an independent set
in (hyper)graphs induced by incidence relations between lines, points, and
reguli in 3-space. Among other things, we prove that: (1) The intersection
graph of n lines in R^3 has a clique or independent set of size Omega(n^{1/3}).
(2) Every set of n lines in R^3 has a subset of n^{1/2} lines that are all
stabbed by one line, or a subset of Omega((n/log n)^{1/5}) such that no
6-subset is stabbed by one line. (3) Every set of n lines in general position
in R^3 has a subset of Omega(n^{2/3}) lines that all lie on a regulus, or a
subset of Omega(n^{1/3}) lines such that no 4-subset is contained in a regulus.
The proofs of these statements all follow from geometric incidence bounds --
such as the Guth-Katz bound on point-line incidences in R^3 -- combined with
Turán-type results on independent sets in sparse graphs and hypergraphs.
Although similar Ramsey-type statements can be proved using existing generic
algebraic frameworks, the lower bounds we get are much larger than what can be
obtained with these methods. The proofs directly yield polynomial-time
algorithms for finding subsets of the claimed size. Comment: 18 pages including appendi