Efficient algorithms to discover alterations with complementary functional association in cancer
Recent large cancer studies have measured somatic alterations in an
unprecedented number of tumours. These large datasets allow the identification
of cancer-related sets of genetic alterations by identifying relevant
combinatorial patterns. Among such patterns, mutual exclusivity has been
employed by several recent methods that have shown its effectiveness in
characterizing gene sets associated with cancer. Mutual exclusivity arises
because of the complementarity, at the functional level, of alterations in
genes which are part of a group (e.g., a pathway) performing a given function.
The availability of quantitative target profiles, from genetic perturbations or
from clinical phenotypes, provides additional information that can be leveraged
to improve the identification of cancer related gene sets by discovering groups
with complementary functional associations with such targets.
In this work we study the problem of finding groups of mutually exclusive
alterations associated with a quantitative (functional) target. We propose a
combinatorial formulation for the problem, and prove that the associated
computational problem is hard. We design two algorithms to solve
the problem and implement them in our tool UNCOVER. We provide analytic
evidence of the effectiveness of UNCOVER in finding high-quality solutions and
show experimentally that UNCOVER finds sets of alterations significantly
associated with functional targets in a variety of scenarios. In addition, our
algorithms are much faster than the state-of-the-art, allowing the analysis of
large datasets of thousands of target profiles from cancer cell lines. We show
that on one such dataset from project Achilles our methods identify several
significant gene sets with complementary functional associations with targets.
Comment: Accepted at RECOMB 201
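The abstract does not state its objective function or name its two algorithms, so the following is only an illustrative sketch of the general idea, not UNCOVER itself. It assumes a hypothetical score that adds the target weight of every sample covered by a gene set and subtracts a fixed penalty for each redundant alteration in the same sample, and it uses a simple greedy heuristic to build a set. The names `exclusivity_score`, `greedy_gene_set`, and `penalty` are all assumptions made for illustration.

```python
def exclusivity_score(gene_set, alterations, target, penalty):
    """Hypothetical score: reward covered samples, penalize overlap.

    alterations maps gene -> set of altered sample ids;
    target maps sample id -> quantitative target weight.
    """
    counts = {}
    for gene in gene_set:
        for sample in alterations[gene]:
            counts[sample] = counts.get(sample, 0) + 1
    # Each covered sample contributes its weight once; every extra
    # alteration in the same sample costs `penalty`.
    return sum(target[s] - penalty * (c - 1) for s, c in counts.items())


def greedy_gene_set(alterations, target, k, penalty):
    """Add, up to k times, the gene that most improves the score."""
    chosen = []
    for _ in range(k):
        best_gene = None
        best_score = exclusivity_score(chosen, alterations, target, penalty)
        for gene in sorted(alterations):
            if gene in chosen:
                continue
            candidate = exclusivity_score(chosen + [gene], alterations, target, penalty)
            if candidate > best_score:
                best_gene, best_score = gene, candidate
        if best_gene is None:  # no gene improves the score; stop early
            break
        chosen.append(best_gene)
    return chosen


# Toy data: A and B are mutually exclusive, C duplicates A.
alterations = {"A": {1, 2}, "B": {3, 4}, "C": {1, 2}}
target = {1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0}
selected = greedy_gene_set(alterations, target, k=2, penalty=1.0)
```

A greedy pass of this kind is fast but carries no optimality guarantee under this toy score, which is consistent with the abstract's remark that the underlying problem is computationally hard.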
The context-dependence of mutations: a linkage of formalisms
Defining the extent of epistasis - the non-independence of the effects of
mutations - is essential for understanding the relationship of genotype,
phenotype, and fitness in biological systems. The applications cover many areas
of biological research, including biochemistry, genomics, protein and systems
engineering, medicine, and evolutionary biology. However, the quantitative
definitions of epistasis vary among fields, and its analysis beyond just
pairwise effects remains obscure in general. Here, we show that different
definitions of epistasis are versions of a single mathematical formalism - the
weighted Walsh-Hadamard transform. We argue that one of the definitions, the
background-averaged epistasis, is the most informative when the goal is to
uncover the general epistatic structure of a biological system, a description
that can be rather different from the local epistatic structure of specific
model systems. Key issues are choosing effective ensembles for averaging
and contending in practice with the vast combinatorial complexity of mutations.
In this regard, we discuss possible approaches for optimally learning the
epistatic structure of biological systems.
Comment: 6 pages, 3 figures, supplementary informatio
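As a rough illustration of the transform named in the abstract, the sketch below applies a plain (unweighted) Walsh-Hadamard transform to a vector of phenotypes indexed by binary genotype. The paper's specific weighting and sign conventions are not given in the abstract and are not reproduced here. The function name `walsh_hadamard`, the 1/2^n normalization, and the toy phenotype values are assumptions.

```python
def walsh_hadamard(phenotypes):
    """Unweighted Walsh-Hadamard transform, normalized by the vector length.

    phenotypes[g] is the phenotype of genotype g, where bit i of g says
    whether locus i is mutated. The set bits of an output index name the
    loci involved in that term (index 0 is the overall mean).
    """
    coeffs = [float(v) for v in phenotypes]
    size = len(coeffs)
    if size == 0 or size & (size - 1):
        raise ValueError("length must be a power of two")
    step = 1
    while step < size:
        # Butterfly pass: combine entries that differ in one bit.
        for start in range(0, size, 2 * step):
            for i in range(start, start + step):
                a, b = coeffs[i], coeffs[i + step]
                coeffs[i], coeffs[i + step] = a + b, a - b
        step *= 2
    return [c / size for c in coeffs]


# Two loci, purely additive effects: the pairwise term (index 3) vanishes.
additive = walsh_hadamard([0, 1, 2, 3])
# Same single-mutant effects, but the double mutant deviates from additivity.
epistatic = walsh_hadamard([0, 1, 2, 5])
```

In this toy case the coefficient at index 3 is zero for the additive landscape and nonzero once the double mutant departs from the sum of the single-mutant effects, which is the sense in which the transform separates interaction terms from main effects.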
Chapter 14 – Evolutionary Algorithms Applied to Electronic-Structure Informatics: Accelerated Materials Design Using Data Discovery vs. Data Searching
We exemplify and propose extending the use of genetic programs (GPs) – a genetic algorithm (GA) that evolves computer programs via mechanisms similar to genetics and natural selection – to symbolically regress key functional relationships between materials data, especially from electronic structure. GPs can extract structure–property relations or enable simulations across multiple scales of time and/or length. Uniquely, GP-based regression permits “data discovery” – finding relevant data and/or extracting correlations (data reduction/data mining) – in contrast to searching for what you know, or you think you know (intuition). First, catalysis-related materials correlations are discussed, where simple electronic-structure-based rules are revealed using well-developed intuition, and then, after introducing the concepts, GP regression is used to obtain (i) a constitutive relation between flow stress and strain rate in aluminum, and (ii) multi-time-scale kinetics for surface alloys. We close with some outlook for a range of applications (materials discovery, excited-state chemistry, and multiscaling) that could rely primarily on density functional theory results
Multiagent cooperation for solving global optimization problems: an extendible framework with example cooperation strategies
This paper proposes the use of multiagent cooperation for solving global optimization problems through the introduction of a new multiagent environment, MANGO. The strength of the environment lies in its flexible structure based on communicating software agents that attempt to solve a problem cooperatively. This structure allows the execution of a wide range of global optimization algorithms described as a set of interacting operations. At one extreme, MANGO welcomes an individual non-cooperating agent, which is basically the traditional way of solving a global optimization problem. At the other extreme, autonomous agents existing in the environment cooperate as they see fit during run time. We explain the development and communication tools provided in the environment as well as examples of agent realizations and cooperation scenarios. We also show how the multiagent structure is more effective than having a single nonlinear optimization algorithm with randomly selected initial points.