
    A framework for evolutionary systems biology

    Background: Many difficult problems in evolutionary genomics are related to mutations that have weak effects on fitness, as the consequences of mutations with large effects are often simple to predict. Current systems biology has accumulated much data on mutations with large effects and can predict the properties of knockout mutants in some systems. However, experimental methods are too insensitive to observe small effects.
    Results: Here I propose a novel framework that brings together evolutionary theory and current systems biology approaches in order to quantify small effects of mutations and their epistatic interactions in silico. Central to this approach is the definition of fitness correlates that can be computed in some current systems biology models employing the rigorous algorithms that are at the core of much work in computational systems biology. The framework exploits synergies between the realism of such models and the need to understand real systems in evolutionary theory. This framework can address many longstanding topics in evolutionary biology by defining various 'levels' of the adaptive landscape. Addressed topics include the distribution of mutational effects on fitness, as well as the nature of advantageous mutations, epistasis and robustness. Combining corresponding parameter estimates with population genetics models raises the possibility of testing evolutionary hypotheses at a new level of realism.
    Conclusion: EvoSysBio is expected to lead to a more detailed understanding of the fundamental principles of life by combining knowledge about well-known biological systems from several disciplines. This will benefit both evolutionary theory and current systems biology. Understanding robustness by analysing distributions of mutational effects and epistasis is pivotal for drug design, cancer research, responsible genetic engineering in synthetic biology and many other practical applications.
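    The core computational move of the framework, treating a quantity computed from a systems biology model as a fitness correlate and sampling many small parameter perturbations in silico to obtain a distribution of mutational effects, can be sketched roughly as follows. This is a minimal illustration only: the Michaelis-Menten flux used as the fitness correlate, the log-normal mutation model and all parameter values are assumptions chosen for the sketch, not part of the proposed framework.

```python
import math
import random

def pathway_flux(vmax, km, substrate=1.0):
    """Toy fitness correlate: steady-state Michaelis-Menten flux of a one-step pathway."""
    return vmax * substrate / (km + substrate)

def mutant_effect(vmax=1.0, km=0.5, sigma=0.02):
    """Apply one small multiplicative 'mutation' to each parameter and return
    the selection coefficient s = (w_mut - w_wt) / w_wt of the mutant."""
    w_wt = pathway_flux(vmax, km)
    vmax_mut = vmax * math.exp(random.gauss(0.0, sigma))
    km_mut = km * math.exp(random.gauss(0.0, sigma))
    w_mut = pathway_flux(vmax_mut, km_mut)
    return (w_mut - w_wt) / w_wt

# Sample a distribution of mutational effects (DFE) on the fitness correlate.
random.seed(1)
dfe = [mutant_effect() for _ in range(10_000)]
beneficial = sum(s > 0 for s in dfe) / len(dfe)
print(f"mean effect {sum(dfe) / len(dfe):+.5f}, fraction beneficial {beneficial:.2f}")
```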

    Building a finite state automaton for physical processes using queries and counterexamples on long short-term memory models

    Most neural networks (NNs) are commonly used as black-box functions: a network takes an input and produces an output, without the user knowing what rules and system dynamics produced that specific output. In some situations, such as safety-critical applications, being able to understand and validate models before applying them can be crucial. In this regard, some approaches for representing NNs in more understandable ways attempt to accurately extract symbolic knowledge from the networks in the form of interpretable and simple systems consisting of a finite set of states and transitions, known as deterministic finite-state automata (DFA). In this thesis, we consider a rule extraction approach developed by Weiss et al. that employs the exact learning method L* to extract DFA from recurrent neural networks (RNNs) trained to classify symbolic data sequences. Our aim has been to study the practicality of applying their rule extraction approach to more complex data based on physical processes consisting of continuous values. Specifically, we experimented with datasets of varying complexity, considering both the inherent complexity of the dataset itself and the complexity introduced by the different discretization intervals used to represent the continuous data values. The datasets incorporated in this thesis encompass sine wave prediction datasets, sequence value prediction datasets, and a safety-critical well-drilling pressure scenario generated using the well-drilling simulator OpenLab and the sparse identification of nonlinear dynamical systems (SINDy) algorithm. We observe that the rule extraction algorithm is able to extract simple and small DFA representations of LSTM models. On the considered datasets, the extracted DFA generally perform worse than the LSTM models used for extraction. Overall, the performance of the extracted DFA decreases both with increasing problem complexity and with more discretization intervals. However, DFA extracted from datasets discretized using few intervals yield better results, and in some cases the algorithm can extract DFA that outperform their respective LSTM models.
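    The step that makes such extraction applicable to physical processes is the discretization of continuous signals into a finite symbolic alphabet, since the L*-based procedure of Weiss et al. operates on symbol sequences. The sketch below shows one plausible equal-width binning scheme applied to a sine wave; the bin count, value range and function names are illustrative assumptions, not the thesis code.

```python
import numpy as np

def discretize(values, n_intervals, lo=-1.0, hi=1.0):
    """Map continuous values onto symbols '0'..'n_intervals-1' using equal-width
    bins over [lo, hi]; finer alphabets (more intervals) were observed in the
    thesis to make the extracted DFA larger and less accurate."""
    inner_edges = np.linspace(lo, hi, n_intervals + 1)[1:-1]
    bins = np.digitize(values, inner_edges)  # indices 0 .. n_intervals-1
    return "".join(str(b) for b in bins)

# Example: a sine wave becomes a symbolic sequence over a 4-letter alphabet,
# which can then be fed to a sequence classifier and to an L*-style extractor.
t = np.linspace(0, 4 * np.pi, 40)
print(discretize(np.sin(t), n_intervals=4))
```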

    Efficient Algorithms And Optimizations For Scientific Computing On Many-Core Processors

    Designing efficient algorithms for many-core and multicore architectures requires using different strategies to allow for the best exploitation of the hardware resources on those architectures. Researchers have ported many scientific applications to modern many-core and multicore parallel architectures, and by doing so they have achieved significant speedups over running on single CPU cores. While many applications have achieved significant speedups, some applications still require more effort to accelerate due to their inherently serial behavior. One class of applications with this serial behavior is Monte Carlo simulations. Monte Carlo simulations have been used to simulate many problems in statistical physics and statistical mechanics that were not possible to simulate using Molecular Dynamics. While there are a fair number of well-known and recognized GPU Molecular Dynamics codes, existing Monte Carlo ensemble simulations have not been ported to the GPU, so they are relatively slow and cannot run large systems in a reasonable amount of time. Due to these shortcomings of existing Monte Carlo ensemble codes and due to researchers' interest in having a fast Monte Carlo simulation framework that can simulate large systems, a new GPU framework called GOMC is implemented to simulate different particle- and molecular-based force fields and ensembles. GOMC simulates different Monte Carlo ensembles such as the canonical, grand canonical, and Gibbs ensembles. This work describes many challenges in developing a GPU Monte Carlo code for such ensembles and how I addressed these challenges. This work also describes efficient many-core and multicore large-scale energy calculations for the Monte Carlo Gibbs ensemble using cell lists. Designing Monte Carlo molecular simulations is challenging, as they have less computation and parallelism when compared to similar molecular dynamics applications. The modified cell list allows for greater speedups in energy calculations on both many-core and multicore architectures when compared to other implementations that do not use such cell lists. The work presents results and analysis of the cell list algorithms for each of the parallel architectures using top-of-the-line GPUs, CPUs, and Intel Xeon Phi coprocessors. In addition, the work evaluates the performance of the cell list algorithms for different problem sizes and different radial cutoffs. Furthermore, this work evaluates two cell list approaches, a hybrid MPI+OpenMP approach and a hybrid MPI+CUDA approach. The cell list methods are evaluated on a small cluster of multicore CPUs, Intel Xeon Phi coprocessors, and GPUs. The performance results are evaluated using different combinations of MPI processes, threads, and problem sizes. Another application presented in this dissertation involves understanding the properties of crystalline materials and their design and control. Recent developments include the introduction of new models to simulate system behavior and properties that are of great experimental and theoretical interest. One of those models is the Phase-Field Crystal (PFC) model. The PFC model has enabled researchers to simulate 2D and 3D crystal structures and study defects such as dislocations and grain boundaries. In this work, GPUs are used to accelerate the computation of various dynamic properties of polycrystals in the 2D PFC model. Some properties require very intensive computation that may involve hundreds of thousands of atoms. The GPU implementation achieves speedups of more than 46 times for some large-system simulations.
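    The cell list idea behind the large-scale energy calculations can be sketched as follows: the simulation box is partitioned into cells no smaller than the interaction cutoff, so the candidate interaction partners of a particle need only be gathered from its own cell and the 26 neighbouring cells instead of the whole system. The sketch below is a generic serial version of the conventional cell list, not GOMC's modified GPU variant; the box size, cutoff and particle count are placeholders.

```python
import itertools
import random

def build_cell_list(positions, box, rcut):
    """Assign each particle index to a cubic cell of side >= rcut (periodic box)."""
    ncell = max(1, int(box // rcut))
    side = box / ncell
    cells = {}
    for i, (x, y, z) in enumerate(positions):
        key = (int(x / side) % ncell, int(y / side) % ncell, int(z / side) % ncell)
        cells.setdefault(key, []).append(i)
    return cells, ncell

def neighbor_candidates(cells, ncell, key):
    """Particles in cell `key` and its 26 periodic neighbours: the only
    candidates that can lie within the cutoff of a particle in that cell."""
    cands = []
    for dx, dy, dz in itertools.product((-1, 0, 1), repeat=3):
        nkey = ((key[0] + dx) % ncell, (key[1] + dy) % ncell, (key[2] + dz) % ncell)
        cands.extend(cells.get(nkey, []))
    return cands

random.seed(0)
box, rcut = 20.0, 2.5
positions = [tuple(random.uniform(0, box) for _ in range(3)) for _ in range(500)]
cells, ncell = build_cell_list(positions, box, rcut)
key = next(iter(cells))
print(f"{ncell ** 3} cells; searching {len(neighbor_candidates(cells, ncell, key))} "
      f"of {len(positions)} particles for pairs in cell {key}")
```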

    Limits for Stochastic Reaction Networks

    Reaction systems were introduced in the 1970s to model biochemical systems. Nowadays their range of applications has increased and they are fruitfully used in different fields. The concept is simple: some chemical species react, the set of chemical reactions forms a graph, and a rate function is associated with each reaction. Such functions describe the speed of the different reactions, or their propensities. Two modelling regimes are then available: the evolution of the different species concentrations can be deterministically modelled through a system of ODEs, while the counts of the different species at a certain time are stochastically modelled by means of a continuous-time Markov chain. Our work primarily concerns stochastic reaction systems and their asymptotic properties. In Paper I, we consider a reaction system with intermediate species, i.e. species that are produced and quickly degraded along a path of reactions. Let the rates of degradation of the intermediate species be functions of a parameter N that tends to infinity. We consider a reduced system where the intermediate species have been eliminated, and find conditions on the degradation rates of the intermediates such that the behaviour of the reduced network tends to that of the original one. In particular, we prove uniform pointwise convergence in distribution and weak convergence of the integrals of continuous functions along the paths of the two models. Under some extra conditions, we also prove weak convergence of the two processes. The result is stated in the setting of multiscale reaction systems: the amounts of all the species and the rates of all the reactions of the original model can scale as powers of N. A similar result also holds for the deterministic case, as shown in Appendix IA. In Paper II, we focus on the stationary distributions of stochastic reaction systems. Specifically, we build a theory for stochastic reaction systems that is parallel to the deficiency zero theory for deterministic systems, which dates back to the 1970s. A deficiency theory for stochastic reaction systems was missing, and few results connecting deficiency and stochastic reaction systems were known. The theory we build connects a special form of product-form stationary distribution with structural properties of the reaction graph of the system. In Paper III, a special class of reaction systems is considered, namely systems exhibiting absolute concentration robust species. In the deterministic modelling regime, such species always assume the same value at any positive steady state. In the stochastic setting, we prove that, if the initial condition is a point in the basin of attraction of a positive steady state of the corresponding deterministic model and tends to infinity, then up to a fixed time T the counts of the species exhibiting absolute concentration robustness are, on average, close to their equilibrium value. The result is not obvious because when the counts of some species tend to infinity, so do some rate functions, and the study of the system may become hard. Moreover, the result establishes a substantial concordance between the paths of the stochastic and the deterministic models.
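    The stochastic modelling regime described here, in which species counts evolve as a continuous-time Markov chain whose jump rates are the reaction propensities, is commonly simulated with Gillespie's algorithm. The sketch below runs such a simulation for a toy network with a quickly degraded intermediate, loosely in the spirit of Paper I; the network, the mass-action rate constants and the scaling parameter N are illustrative assumptions.

```python
import random

def gillespie(x, reactions, propensities, t_end):
    """Simulate species counts as a continuous-time Markov chain: draw an
    exponential waiting time from the total propensity, then pick a reaction
    with probability proportional to its propensity and apply its change."""
    t, path = 0.0, [(0.0, tuple(x))]
    while True:
        props = propensities(x)
        total = sum(props)
        if total == 0.0:
            break
        t += random.expovariate(total)
        if t > t_end:
            break
        r, acc = random.uniform(0.0, total), 0.0
        for i, p in enumerate(props):
            acc += p
            if r <= acc:
                break
        x = [xi + d for xi, d in zip(x, reactions[i])]
        path.append((t, tuple(x)))
    return path

# Toy network with a quickly degraded intermediate:  A -> I,  I -> B,  I -> 0.
# Species order is (A, I, B); N scales how fast the intermediate is consumed.
random.seed(2)
N = 100.0
reactions = [(-1, +1, 0), (0, -1, +1), (0, -1, 0)]
propensities = lambda x: [1.0 * x[0], 2.0 * N * x[1], 1.0 * N * x[1]]
path = gillespie([200, 0, 0], reactions, propensities, t_end=10.0)
print("final time and counts (A, I, B):", round(path[-1][0], 2), path[-1][1])
```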

    Вычислительный подход к построению биологии (A Computational Approach to Building Biology)

    According to some critics, if biology is a kind of reverse engineering of nature, it is rather poorly prepared for the task; the issue most likely lies with its ontology. Multiple hypotheses and conjectures found in papers on methodological issues claim that living systems should be viewed as complex networks of signal-transmitting paths, both neural and non-neural, that feature modularity and feedback circuits and are prone to emergent properties and increasing complexity. If so, we are on the eve of a new stage in the development of computer models, where not only are computers used to emulate life, but life itself is construed as a complex network of interacting natural computers. In 2002, Yuri Lazebnik used a salient and profound metaphor to clarify the main theoretical shortcoming that keeps biology from being a unified and deductively consistent science modeled after physics. Asking whether a biologist could fix a broken radio, he revealed that what is missing is a unified formal language for describing the ultimate elements of living devices together with their typical combinations, as is commonly done in radio engineering. I specify in the paper that what Lazebnik means by a "formal language" is not a language of propositions about the world, i.e., of asserting some states of affairs, but rather a language for listing relevant types of objects and their relations. I refer to it as a domain ontology. A theory needs another language to describe actual states of affairs, which most probably has to be mathematical in order to represent complicated natural structures in detail. I then touch on the popular view according to which a domain ontology is inferred by the theory proper. The history of science shows that true theories that remain viable today were often paired with now-abandoned ontologies, like those of caloric or phlogiston. I suggest that a theory does not infer its ontology, but rather is interpreted upon it, being inferentially independent of it. I also review some historically important attempts to mathematize the knowledge of life. I mention Alan Turing's article on morphogenesis, where he used linear differential equations to explain the emergence of complexity from homogeneity. Then I briefly touch on the works of Nicolas Rashevsky, whose theories provided inspiration to the inventors of artificial neural networks and enabled the abundant use of different mathematical tools by his disciple Robert Rosen in his study of metabolism. Closer to the present day, various computational theories in biology have emerged. Some of them treat protein combinations as networks of signal-transmitting pathways that can store and process information; moreover, in unicellular organisms, protein-based circuits replace the whole of the nervous system as a behavior-controlling network. Other theories propose a view in which an organism is construed as a system of modules connected by protocols, or interfaces. A domain ontology like this may considerably simplify the task of scientific description. Special attention is paid to applications of the well-known free-energy (minimization) principle to life-science matters, as it was initially intended to explain issues in cognitive science. In general, within this view, for an organism to survive is to minimize its thermodynamic potential energy, for which purpose the living being as a whole, and all its subsystems, must constantly produce statistical models of the environment that are continually updated with incoming data. Some strong Bayesian mathematics combines with this ontology to claim the whole enterprise as the most prominent universal theory of complex developing systems today. As a general outcome of the survey, I propose a computational methodological approach to doing biology based on Marr's famous three-level view of computational systems, together with the necessity of identifying the elementary nodes of which living systems are composed. Such an approach may, I hope, generate a set of competing theories that will eventually help biologists fix their "radio".

    An analysis of nature-based treatment processes for cleaning contaminated surface water runoff from an informal settlement: a case study of the Stiebeuel River catchment, Franschhoek, South Africa

    Contaminated surface water runoff from inadequate drainage and sanitation systems in informal settlements threatens the quality of available freshwater and can negatively impact both human and environmental health. Biofiltration systems (biofilters) provide water pollution control without inputs of additional energy and chemicals, placing them in the overall context of the need for affordable and sustainable stormwater infrastructure in informal settlements. In addition, cleaned waters from biofilters may be suitable for some reuse applications if the systems are well designed and maintained. However, most research is conducted in developed countries where heavy metals are the main surface water pollutant. Consequently, little is known about the extent to which biofilters can meet water quality targets under conditions likely to be found in informal settlements. In addition, no attempts have been made to recover or reuse the surface water runoff from informal settlements, despite its high nutrient loadings. This study analyses the extent to which biofilters can be used to clean and reuse contaminated surface water runoff from informal settlements. The objectives are threefold: (i) to analyse the performance of two field-scale biofiltration cells (one vegetated and one non-vegetated) that are batch-fed with surface water runoff from an upstream informal settlement; (ii) to determine the effects of varying operating, design and environmental parameters on the performance of the cells; and (iii) to develop a model that predicts the outflow pollutant concentrations under varying conditions. Both cells effectively reduced ammonia (NH₃), total phosphate (TP) and Escherichia coli (E. coli) concentrations, but leached nitrate (NO₃⁻) and nitrite (NO₂⁻). The treated waters were suitable for irrigation reuse; however, additional disinfection was required in some cases to reduce faecal contamination. Correlation analyses showed that inflow water quality significantly influenced cell performance, with the vegetated cell outperforming the non-vegetated cell at higher inflow pollutant concentrations. Multiple regression models investigating several parameters influencing outflow NH₃ showed that inflow pH, temperature and NH₃ concentration can be used to predict the outflow NH₃ concentration of the cells. These models are important for predicting cell performance and thus can be used to improve the design and/or operation of the cells for varying inflow water quality conditions.
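    The regression model form reported above relates outflow NH₃ to inflow pH, temperature and inflow NH₃ concentration. A minimal ordinary-least-squares sketch of that form is shown below; the data values are placeholders invented for the example, not the study's measurements, so the fitted coefficients carry no physical meaning.

```python
import numpy as np

# Columns: inflow pH, water temperature (deg C), inflow NH3 (mg/L).
# These rows are illustrative placeholders, not measured data.
X = np.array([
    [7.2, 18.0, 25.0],
    [7.5, 20.5, 32.0],
    [6.9, 16.0, 18.0],
    [7.8, 22.0, 40.0],
    [7.1, 19.0, 27.0],
    [7.4, 21.0, 35.0],
])
y = np.array([3.1, 4.8, 2.0, 6.5, 3.4, 5.2])  # outflow NH3 (mg/L), placeholders

# Ordinary least squares with an intercept term:
# outflow_NH3 ~ b0 + b1*pH_in + b2*temperature + b3*NH3_in
A = np.column_stack([np.ones(len(X)), X])
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
print("intercept and coefficients:", np.round(coeffs, 3))

# Predict outflow NH3 for a new inflow sample (pH 7.3, 19.5 deg C, 30 mg/L).
new = np.array([1.0, 7.3, 19.5, 30.0])
print("predicted outflow NH3 (mg/L):", round(float(new @ coeffs), 2))
```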

    Temporal Data Modeling and Reasoning for Information Systems

    Temporal knowledge representation and reasoning is a major research field in Artificial Intelligence, in Database Systems, and in Web and Semantic Web research. The ability to model and process time and calendar data is essential for many applications like appointment scheduling, planning, Web services, temporal and active database systems, adaptive Web applications, and mobile computing applications. This article aims at three complementary goals: first, to provide a general background in temporal data modeling and reasoning approaches; second, to serve as an orientation guide for further specific reading; third, to point to new application fields and research perspectives on temporal knowledge representation and reasoning in the Web and Semantic Web.