Language-based Abstractions for Dynamical Systems
Ordinary differential equations (ODEs) are the primary means of modelling
dynamical systems in many natural and engineering sciences. The number of
equations required to describe a system with high heterogeneity limits our
capability of effectively performing analyses. This has motivated a large body
of research, across many disciplines, into abstraction techniques that provide
smaller ODE systems while preserving the original dynamics in some appropriate
sense. In this paper we give an overview of a recently proposed
computer-science perspective on this problem, where ODE reduction is recast as
finding an appropriate equivalence relation over ODE variables, akin to
classical models of computation based on labelled transition systems.
Comment: In Proceedings QAPL 2017, arXiv:1707.0366
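As a small worked illustration of what an equivalence relation over ODE variables can buy, consider the following toy system. It is not taken from the paper; the three equations and the choice of "preserving the sum of the variables in each class" as the notion of preserved dynamics are assumptions made only for this sketch.

```latex
% Toy ODE system (hypothetical, for illustration only)
\begin{align*}
  \dot{x}_1 &= -x_1 + x_3, \\
  \dot{x}_2 &= -x_2 + x_3, \\
  \dot{x}_3 &= x_1 + x_2 - 2x_3 .
\end{align*}
% Partition of the variables: \{x_1, x_2\} and \{x_3\}.
% Define y = x_1 + x_2. Adding the first two equations gives
\begin{align*}
  \dot{y}   &= -(x_1 + x_2) + 2x_3 = -y + 2x_3, \\
  \dot{x}_3 &= y - 2x_3 .
\end{align*}
```

The reduced system has two equations instead of three and is closed in $y$ and $x_3$, so the sum $x_1 + x_2$ can be recovered exactly without solving for $x_1$ and $x_2$ individually.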
Finding Mutated Subnetworks Associated with Survival in Cancer
Next-generation sequencing technologies allow the measurement of somatic
mutations in a large number of patients from the same cancer type. One of the
main goals in analyzing these mutations is the identification of mutations
associated with clinical parameters, such as survival time. This goal is
hindered by the genetic heterogeneity of mutations in cancer, due to the fact
that genes and mutations act in the context of pathways. To identify mutations
associated with survival time it is therefore crucial to study mutations in the
context of interaction networks.
In this work we study the problem of identifying subnetworks of a large
gene-gene interaction network that have mutations associated with survival. We
formally define the associated computational problem by using a score for
subnetworks based on the test statistic of the log-rank test, a widely used
statistical test for comparing the survival of two populations. We show that
the computational problem is NP-hard and we propose a novel algorithm, called
Network of Mutations Associated with Survival (NoMAS), to solve it. NoMAS is
based on the color-coding technique, which has been previously used in other
applications to find the highest scoring subnetwork with high probability when
the subnetwork score is additive. In our case the score is not additive;
nonetheless, we prove that under a reasonable model for mutations in cancer
NoMAS does identify the optimal solution with high probability. We test NoMAS
on simulated and cancer data, comparing it to approaches based on single gene
tests and to various greedy approaches. We show that our method does indeed
find the optimal solution and performs better than the other approaches.
Moreover, on two cancer datasets our method identifies subnetworks with
significant association to survival when none of the genes has significant
association with survival when considered in isolation.
Comment: This paper was selected for oral presentation at RECOMB 2016 and an
abstract is published in the conference proceedings
Finding the True Frequent Itemsets
Frequent Itemsets (FIs) mining is a fundamental primitive in data mining. It
requires identifying all itemsets appearing in at least a fraction $\theta$ of
a transactional dataset $\mathcal{D}$. Often though, the ultimate goal of
mining $\mathcal{D}$ is not an analysis of the dataset \emph{per se}, but the
understanding of the underlying process that generated it. Specifically, in
many applications $\mathcal{D}$ is a collection of samples obtained from an
unknown probability distribution $\pi$ on transactions, and by extracting the
FIs in $\mathcal{D}$ one attempts to infer itemsets that are frequently (i.e.,
with probability at least $\theta$) generated by $\pi$, which we call the True
Frequent Itemsets (TFIs). Due to the inherently stochastic nature of the
generative process, the set of FIs is only a rough approximation of the set of
TFIs, as it often contains a huge number of \emph{false positives}, i.e.,
spurious itemsets that are not among the TFIs. In this work we design and
analyze an algorithm to identify a threshold $\hat{\theta}$ such that the
collection of itemsets with frequency at least $\hat{\theta}$ in $\mathcal{D}$
contains only TFIs with probability at least $1-\delta$, for some
user-specified $\delta$. Our method uses results from statistical learning
theory involving the (empirical) VC-dimension of the problem at hand. This
allows us to identify almost all the TFIs without including any false positive.
We also experimentally compare our method with the direct mining of
$\mathcal{D}$ at frequency $\theta$ and with techniques based on widely-used
standard bounds (i.e., the Chernoff bounds) of the binomial distribution, and
show that our algorithm outperforms these methods and achieves even better
results than what is guaranteed by the theoretical analysis.
Comment: 13 pages, Extended version of work appeared in SIAM International
Conference on Data Mining, 201
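The basic FI primitive described in this abstract can be sketched in a few lines of Python. This is a brute-force illustration only: the function name `frequent_itemsets`, the parameter name `theta` (the minimum frequency threshold), and the toy dataset are all assumptions made for the example, and the sketch does not implement the VC-dimension-based threshold selection that the paper proposes.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(dataset, theta):
    """Return {itemset: frequency} for every itemset whose frequency in
    `dataset` is at least `theta`.

    Brute-force enumeration of all subsets of each transaction; only
    suitable for tiny examples.
    """
    n = len(dataset)
    counts = Counter()
    for transaction in dataset:
        items = sorted(set(transaction))
        # Count every non-empty subset of the transaction once.
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                counts[frozenset(subset)] += 1
    return {s: c / n for s, c in counts.items() if c / n >= theta}


# Toy dataset of four transactions (hypothetical).
dataset = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]

fis_low = frequent_itemsets(dataset, 0.5)   # threshold as given
fis_high = frequent_itemsets(dataset, 0.6)  # a stricter threshold
```

Raising the threshold shrinks the output: with the toy data, the stricter call keeps only the two single-item sets that appear in three of the four transactions. Choosing how much to raise the threshold so that spurious itemsets are excluded with a stated probability is the problem the paper addresses.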
MultiVeStA: Statistical Model Checking for Discrete Event Simulators
The modeling, analysis and performance evaluation of large-scale systems are difficult tasks. Due to the size and complexity of the considered systems, an approach typically followed by engineers consists in performing simulations of system models to obtain statistical estimates of quantitative properties. Similarly, a technique used by computer scientists working on quantitative analysis is Statistical Model Checking (SMC), where rigorous mathematical languages (typically logics) are used to express system properties of interest. Such properties can then be automatically estimated by tools performing simulations of the model at hand. These property specification languages, often not popular among engineers, provide a formal, compact and elegant way to express system properties without needing to hard-code them in the model definition. This paper presents MultiVeStA, a statistical analysis tool which can be easily integrated with existing discrete event simulators, enriching them with efficient distributed statistical analysis and SMC capabilities.
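The idea of estimating a quantitative property by repeated simulation can be sketched as follows. This is a generic Monte Carlo loop, not MultiVeStA's actual interface or algorithm: the function `estimate`, its parameters `alpha`, `delta`, `batch` and `max_runs`, the normal-approximation confidence interval, and the toy model are all assumptions made for illustration.

```python
import random
import statistics
from statistics import NormalDist


def estimate(simulate, alpha=0.05, delta=0.05, batch=100,
             max_runs=100_000, seed=0):
    """Estimate the expected value returned by `simulate(rng)`.

    Runs simulations in batches until the (1 - alpha) confidence
    interval, computed with a normal approximation, has half-width at
    most `delta`, or until `max_runs` simulations have been performed.
    Returns (mean, half_width, number_of_runs).
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    samples = []
    while len(samples) < max_runs:
        samples.extend(simulate(rng) for _ in range(batch))
        mean = statistics.fmean(samples)
        half_width = z * statistics.stdev(samples) / len(samples) ** 0.5
        if half_width <= delta:
            break
    return mean, half_width, len(samples)


def toy_model(rng):
    """Hypothetical simulator: returns 1.0 if the property of interest
    holds in this run (here with probability 0.3), else 0.0."""
    return 1.0 if rng.random() < 0.3 else 0.0


mean, half_width, runs = estimate(toy_model)
```

Keeping the property evaluation (`toy_model` returning a number per run) separate from the estimation loop mirrors the separation the abstract describes between the simulator and the statistical analysis layered on top of it.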
Statistical analysis of chemical computational systems with MULTIVESTA and ALCHEMIST
The chemical-oriented approach is an emerging paradigm for programming the behaviour of densely distributed and context-aware devices (e.g. in ecosystems of displays tailored to crowd steering, or to obtain profile-based coordinated visualization). Typically, the evolution of such systems cannot be easily predicted, which makes techniques and tools supporting prior-to-deployment analysis of paramount importance. Exact analysis techniques do not scale well when the complexity of systems grows: as a consequence, approximate techniques based on simulation have assumed a relevant role. This work presents a new simulation-based distributed tool addressing the statistical analysis of such systems, which has been obtained by chaining two existing tools: MultiVeStA and Alchemist. The former is a recently proposed lightweight tool which enriches existing discrete event simulators with distributed statistical analysis capabilities, while the latter is an efficient simulator for chemical-oriented computational systems. The tool is validated against a crowd steering scenario, and insights on the performance are provided by discussing how it scales when the analysis tasks are distributed on a multi-core architecture.
Towards a Maude tool for model checking temporal graph properties
We present our prototype tool for the verification of graph transformation systems. The major novelty of our tool is that it provides a model checker for temporal graph properties based on counterpart semantics for quantified μ-calculi. Our tool can be considered as an instantiation of our approach to counterpart semantics which allows for a neat handling of creation, deletion and merging in systems with dynamic structure. Our implementation is based on the object-based machinery of Maude, which provides the basics to deal with attributed graphs. Graph transformation systems are specified with term rewrite rules. The model checker evaluates logical formulae of second-order modal μ-calculus in the automatically generated CounterpartModel (a sort of unfolded graph transition system) of the graph transformation system under study. The result of evaluating a formula is a set of assignments for each state, associating node variables to actual nodes.
Counterpart semantics for a second-order mu-calculus
We propose a novel approach to the semantics of quantified μ-calculi, considering models where states are algebras; the evolution relation is given by a counterpart relation (a family of partial homomorphisms), allowing for the creation, deletion, and merging of components; and formulas are interpreted over sets of state assignments (families of substitutions, associating formula variables to state components). Our proposal avoids the limitations of existing approaches, usually enforcing restrictions of the evolution relation: the resulting semantics is a streamlined and intuitively appealing one, yet it is general enough to cover most of the alternative proposals we are aware of.
Algebraic models for a second-order modal logic
We propose a predicative modal logic of the second order for expressing properties of the evolution of software systems. Each state of a system is specified as a unary algebra, and our logic allows us to formalize the problem of verifying the properties of system evolutions as the checking of the truth of suitable formulas. The level of abstraction guaranteed by the algebraic presentation of system states allows the unification of many proposals in the literature, at the same time obtaining a greater level of expressiveness in terms of system representability.
Due to a different handling of the so-called "trans-world identity", we consider two alternative semantics for our logic: a "Kripke-like" model and a "Counterpart-like" one. Furthermore, we instantiate our proposal by considering unary algebras representing graphs, thus showing the applicability of our approach to the graph transformation framework.
Forward and Backward Bisimulations for Chemical Reaction Networks
We present two quantitative behavioral equivalences over species of a
chemical reaction network (CRN) with semantics based on ordinary differential
equations. Forward CRN bisimulation identifies a partition where each
equivalence class represents the exact sum of the concentrations of the species
belonging to that class. Backward CRN bisimulation relates species that have
identical solutions at all time points when starting from the same initial
conditions. Both notions can be checked using only CRN syntactical information,
i.e., by inspection of the set of reactions. We provide a unified algorithm
that computes the coarsest refinement up to our bisimulations in polynomial
time. Further, we give algorithms to compute quotient CRNs induced by a
bisimulation. As an application, we find significant reductions in a number of
models of biological processes from the literature. In two cases we allow the
analysis of benchmark models which would be otherwise intractable due to their
memory requirements.
Comment: Extended version of the CONCUR 2015 paper