
    Statistical mechanics approaches to optimization and inference

    Nowadays, methodologies typical of statistical physics are successfully applied to a huge variety of problems arising from different research fields. In this thesis I will propose several statistical mechanics based models able to deal with two classes of problems: optimization and inference. The intrinsic difficulty characterizing both is that, due to their hard combinatorial nature, finding exact solutions would require impractical computations: in almost all cases, the time needed for these calculations scales exponentially with the relevant parameters of the system. While combinatorial optimization addresses the problem of finding a configuration of variables that minimizes/maximizes an objective function, inference seeks a posteriori the most plausible assignment of a set of variables given partial knowledge of the system. Both problems can be re-phrased in a statistical mechanics framework in which the elementary components of a physical system interact according to the constraints of the original problem. The information at our disposal can be encoded in the Boltzmann distribution of the new variables which, if properly investigated, provides the solutions to the original problems. As a consequence, the methodologies originally adopted in statistical mechanics to study and, eventually, approximate the Boltzmann distribution can be fruitfully applied to inference and optimization problems. The structure of the thesis follows the path covered during the three years of my Ph.D. At first, I will propose a set of combinatorial optimization problems on graphs, the Prize-Collecting and the Packing of Steiner Trees problems. The tools used to tackle these hard problems rely on the zero-temperature implementation of the Belief Propagation algorithm, called the Max-Sum algorithm. The second set of problems proposed in this thesis falls under the name of linear estimation problems. One of them, the compressed sensing problem, will guide us in the modelling of these problems within a Bayesian framework, along with the introduction of a powerful algorithm known as Expectation Propagation, or Expectation Consistent in statistical physics. I will propose a similar approach to other challenging problems: the inference of metabolic fluxes, the inverse problem of electroencephalography, and the reconstruction of tomographic images.
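    To make the zero-temperature limit concrete: as the inverse temperature β grows, the Boltzmann distribution concentrates on the minimum-energy configurations, which is what turns Belief Propagation (an inference scheme) into Max-Sum (an optimizer). The following minimal sketch illustrates this limit numerically; the energy values are toy numbers chosen for illustration, not anything from the thesis.

```python
# Minimal sketch: the zero-temperature limit of the Boltzmann distribution.
# As beta -> infinity, probability mass concentrates on the energy minima,
# so sampling/inference turns into optimization (the Max-Sum regime).
import numpy as np

energies = np.array([1.0, 0.2, 0.9, 0.2, 1.5])  # hypothetical configuration energies

for beta in [0.1, 1.0, 10.0, 100.0]:                 # inverse temperature
    w = np.exp(-beta * (energies - energies.min()))  # shift minimum to 0 for stability
    p = w / w.sum()                                  # Boltzmann distribution at this beta
    print(beta, np.round(p, 3))
# As beta grows, the distribution approaches a uniform mixture over the two
# minimum-energy configurations (indices 1 and 3) and vanishes elsewhere.
```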

    Graphs in machine learning: an introduction

    Graphs are commonly used to characterise interactions between objects of interest. Because they are based on a straightforward formalism, they are used in many scientific fields, from computer science to the historical sciences. In this paper, we give an introduction to some graph-based learning methods, covering both unsupervised and supervised approaches. Unsupervised learning algorithms usually aim at visualising graphs in latent spaces and/or clustering the nodes; both focus on extracting knowledge from graph topologies. While most existing techniques are only applicable to static graphs, where edges do not evolve through time, recent developments have shown that they can be extended to deal with evolving networks. In a supervised context, one generally aims at inferring labels or numerical values attached to nodes, using both the graph and, when available, node characteristics. Balancing the two sources of information can be challenging, especially as they can disagree locally or globally. In both contexts, supervised and unsupervised, data can be relational (augmented with one or several global graphs) as described above, or graph-valued. In the latter case, each object of interest is given as a full graph (possibly completed by other characteristics), and natural tasks include graph clustering (producing clusters of graphs rather than clusters of nodes in a single graph), graph classification, etc.

    1 Real networks

    One of the first practical studies on graphs dates back to the original work of Moreno [51] in the 1930s. Since then, there has been growing interest in graph analysis, together with strong developments in the modelling and processing of these data. Graphs are now used in many scientific fields. In biology [54, 2, 7], for instance, metabolic networks can describe pathways of biochemical reactions [41], while in the social sciences networks are used to represent relational ties between actors [66, 56, 36, 34]. Other examples include power grids [71] and the web [75]. Recently, networks have also been considered in other areas such as geography [22] and history [59, 39]. In machine learning, networks are seen as powerful tools to model problems, in order to extract information from data and for prediction purposes. This is the object of this paper. For more complete surveys, we refer to [28, 62, 49, 45]. In this section, we introduce notations and highlight properties shared by most real networks. In Section 2, we then consider methods aiming at extracting information from a single network, focusing in particular on clustering methods, where the goal is to find clusters of vertices. Finally, Section 3 turns to techniques that take a series of networks into account.
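    To give a concrete taste of the supervised setting described above (inferring node labels from a graph plus a few known labels), here is a minimal label-propagation sketch. The toy graph, the two seed labels, and the fixed iteration count are assumptions made for illustration; they are not taken from the paper.

```python
# Minimal sketch: semi-supervised node labelling by label propagation.
# Known labels are clamped; the others are repeatedly averaged over neighbours.
import numpy as np

A = np.array([[0, 1, 1, 0, 0],     # adjacency matrix of a toy 5-node graph
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
seeds = {0: 0, 4: 1}                # node 0 has class 0, node 4 has class 1

F = np.zeros((5, 2))                # per-node class scores
for i, c in seeds.items():
    F[i, c] = 1.0
P = A / A.sum(axis=1, keepdims=True)  # row-normalised transition matrix

for _ in range(50):                 # propagate, re-clamping the known labels
    F = P @ F
    for i, c in seeds.items():
        F[i] = 0.0
        F[i, c] = 1.0
print(F.argmax(axis=1))             # nodes near node 0 end up in class 0,
                                    # nodes near node 4 in class 1
```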

    Inferring cellular networks – a review

    In this review we give an overview of computational and statistical methods to reconstruct cellular networks. Although this area of research is vast and fast-developing, we show that most currently used methods can be organized around a few key concepts. The first part of the review deals with conditional independence models, including Gaussian graphical models and Bayesian networks. The second part discusses probabilistic and graph-based methods for data from experimental interventions and perturbations.
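    To illustrate the conditional-independence idea behind the Gaussian graphical models reviewed here: two variables are joined by an edge exactly when the corresponding entry of the precision (inverse covariance) matrix is non-zero. The sketch below recovers a simulated chain structure by thresholding partial correlations; the simulated data, the threshold, and the use of a plain matrix inverse (instead of a sparse estimator such as the graphical lasso) are simplifying assumptions.

```python
# Minimal sketch: edges of a Gaussian graphical model from the precision matrix.
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 4
X = rng.normal(size=(n, p))
X[:, 1] += X[:, 0]                   # variable 1 depends on variable 0
X[:, 2] += X[:, 1]                   # variable 2 depends on variable 1 (a chain)

K = np.linalg.inv(np.cov(X.T))       # empirical precision matrix
d = np.sqrt(np.diag(K))
partial = -K / np.outer(d, d)        # partial correlations: -K_ij / sqrt(K_ii * K_jj)
edges = (np.abs(partial) > 0.1) & ~np.eye(p, dtype=bool)
print(edges.astype(int))             # recovers the chain 0-1-2; variable 3 stays isolated
```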

    Reactmine: a search algorithm for inferring chemical reaction networks from time series data

    Inferring chemical reaction networks (CRNs) from time series data is a challenge motivated by the growing availability of quantitative temporal data at the cellular level. This motivates the design of algorithms to infer the preponderant reactions between the molecular species observed in a given biochemical process, and to help build CRN model structure and kinetics. Existing ODE-based inference methods such as SINDy resort to least-squares regression combined with sparsity-enforcing penalization, such as the Lasso. However, when the input time series are only available in wild-type conditions in which all reactions are present, we observe that current methods fail to learn sparse models. Results: We present Reactmine, a CRN learning algorithm which enforces sparsity by inferring reactions in a sequential fashion within a search tree of bounded depth, ranking the inferred reaction candidates according to the variance of their kinetics, and re-optimizing the CRN kinetic parameters on the whole trace in a final pass to rank the inferred CRN candidates. We first evaluate its performance on simulation data from a benchmark of hidden CRNs, together with algorithmic hyperparameter sensitivity analyses, and then on two sets of real experimental data: one from protein fluorescence videomicroscopy of cell cycle and circadian clock markers, and one from biomedical measurements of systemic circadian biomarkers possibly acting on clock gene expression in peripheral organs. We show that Reactmine succeeds both on simulation data, by retrieving hidden CRNs where SINDy fails, and on the two real datasets, by inferring reactions in agreement with previous studies.
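    For context on the baseline the authors compare against, here is a minimal SINDy-style sketch: regress the numerical derivative of a trajectory onto a library of candidate terms and sparsify with sequentially thresholded least squares. The toy system (pure exponential decay), the term library, and the threshold are illustrative assumptions, not Reactmine itself.

```python
# Minimal sketch: SINDy-style sparse identification of dynamics.
import numpy as np

t = np.linspace(0, 10, 200)
x = np.exp(-0.5 * t)                      # trajectory of dx/dt = -0.5 * x
dx = np.gradient(x, t)                    # numerical derivative of the trace

theta = np.column_stack([np.ones_like(x), x, x**2])  # candidate library: 1, x, x^2
names = ["1", "x", "x^2"]

xi = np.linalg.lstsq(theta, dx, rcond=None)[0]
for _ in range(10):                       # sequentially threshold small coefficients
    small = np.abs(xi) < 0.05
    xi[small] = 0.0
    if (~small).any():                    # refit least squares on the surviving terms
        xi[~small] = np.linalg.lstsq(theta[:, ~small], dx, rcond=None)[0]
print(dict(zip(names, np.round(xi, 3))))  # expect roughly {'1': 0, 'x': -0.5, 'x^2': 0}
```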

    A hybrid algorithm for Bayesian network structure learning with application to multi-label learning

    We present a novel hybrid algorithm for Bayesian network structure learning, called H2PC. It first reconstructs the skeleton of a Bayesian network and then performs a Bayesian-scoring greedy hill-climbing search to orient the edges. The algorithm is based on divide-and-conquer constraint-based subroutines to learn the local structure around a target variable. We conduct two series of experimental comparisons of H2PC against Max-Min Hill-Climbing (MMHC), currently the most powerful state-of-the-art algorithm for Bayesian network structure learning. First, we use eight well-known Bayesian network benchmarks with various data sizes to assess the quality of the learned structure returned by the algorithms. Our extensive experiments show that H2PC outperforms MMHC in terms of goodness of fit to new data and quality of the network structure with respect to the true dependence structure of the data. Second, we investigate H2PC's ability to solve the multi-label learning problem. We provide theoretical results to characterize and identify graphically the so-called minimal label powersets that appear as irreducible factors in the joint distribution under the faithfulness condition. The multi-label learning problem is then decomposed into a series of multi-class classification problems, where each multi-class variable encodes a label powerset. H2PC is shown to compare favorably to MMHC in terms of global classification accuracy over ten multi-label data sets covering different application domains. Overall, our experiments support the conclusion that local structure learning with H2PC, in the form of local neighborhood induction, is a theoretically well-motivated and empirically effective learning framework that is well suited to multi-label learning. The source code (in R) of H2PC as well as all data sets used for the empirical tests are publicly available.
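    A rough sketch of the score-based half of such a hybrid learner: greedy hill-climbing that accepts whichever edge yields a BIC improvement, restricted to a candidate skeleton standing in for the constraint-based phase. The binary toy data, the fixed skeleton, and the scoring details are illustrative assumptions, not the H2PC implementation.

```python
# Rough sketch: greedy BIC hill-climbing restricted to a candidate skeleton,
# the scoring phase of a hybrid Bayesian-network structure learner.
import numpy as np
from itertools import product

rng = np.random.default_rng(1)
n = 1000
a = rng.integers(0, 2, n)
b = (a ^ (rng.random(n) < 0.1)).astype(int)   # B is a noisy copy of A
c = rng.integers(0, 2, n)                     # C is independent of both
data = {"A": a, "B": b, "C": c}

def bic(child, parents):
    """Decomposable BIC score of one binary node given its parent set."""
    y = data[child]
    ll, k = 0.0, 0
    for combo in product([0, 1], repeat=len(parents)):
        mask = np.ones(n, dtype=bool)
        for par, v in zip(parents, combo):
            mask &= data[par] == v
        m = mask.sum()
        if m == 0:
            continue
        p1 = np.clip(y[mask].mean(), 1e-6, 1 - 1e-6)
        ll += m * (p1 * np.log(p1) + (1 - p1) * np.log(1 - p1))
        k += 1                                # one free parameter per configuration
    return ll - 0.5 * np.log(n) * k

skeleton = [("A", "B"), ("B", "C")]           # assumed output of the constraint phase
parents = {v: [] for v in data}
improved = True
while improved:                               # greedy hill-climbing over edge additions
    improved = False
    for x, y in skeleton + [(q, p) for p, q in skeleton]:
        if x not in parents[y] and y not in parents[x]:  # avoid 2-cycles
            if bic(y, parents[y] + [x]) > bic(y, parents[y]):
                parents[y].append(x)
                improved = True
print(parents)                                # expect an edge between A and B only
```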

    Situation-appropriate Investment of Cognitive Resources

    The human brain is equipped with the ability to plan ahead, i.e. to mentally simulate the expected consequences of candidate actions and select the one with the most desirable expected long-term outcome. Insufficient planning can lead to maladaptive behaviour and may even contribute to important societal problems such as the depletion of natural resources or man-made climate change. Understanding the cognitive and neural mechanisms of forward planning and its regulation is therefore of great importance and could ultimately give us clues on how to better align our behaviour with long-term goals. Apart from its potential benefits, planning is time-consuming and therefore associated with opportunity costs. It is assumed that the brain regulates the investment in planning based on a cost-benefit analysis, so that planning only takes place when the perceived benefits outweigh the costs. But how can the brain know in advance how beneficial or costly planning will be? One potential solution is that people learn from experience how valuable planning would be in a given situation. It is, however, largely unknown how the brain implements such learning, especially in environments with large state spaces. This dissertation tested the hypothesis that humans construct and use so-called control contexts to efficiently adjust the degree of planning to the demands of the current situation. Control contexts can be seen as abstract state representations that conveniently cluster together situations with a similar demand for planning. Inferring the context thus allows the control system to be prospectively adjusted to the learned demands of the global context. To test the control context hypothesis, two complex sequential decision-making tasks were developed. Each task had to fulfil two important criteria. First, it should generate both situations in which planning had the potential to improve performance and situations in which a simple strategy was sufficient. Second, it had to feature a rich state space, requiring participants to compress their state representation for efficient regulation of planning. Participants’ planning was modelled using a parametrized dynamic programming solution to a Markov Decision Process, with parameters estimated via hierarchical Bayesian inference. The first study used a 15-step task in which participants had to make a series of decisions to achieve one or multiple goals. In this task, the computational costs of accurate forward planning increased exponentially with the length of the planning horizon. We therefore hypothesized that participants identify ‘distance from goal’ as the relevant contextual feature guiding their regulation of forward planning. As expected, we found that participants predominantly relied on a simple heuristic when still far from the goal but progressively switched towards forward planning as the goal approached. In the second study, participants had to sustainably invest a limited but replenishable energy resource, needed to accept offers, in order to accumulate a maximum number of points in the long run. The demand for planning varied across the different situations of the task, but due to the large number of possible situations (n = 448) it would have been difficult for participants to develop, for each individual situation, an expectation of how beneficial planning would be.
    We therefore hypothesized that, to regulate their forward planning, participants used a compressed task representation, clustering together states with similar demands for planning. Consistent with this, reaction times (operationalising planning duration) increased with trial-by-trial value conflict (operationalising approximate planning demand), and this increase was more pronounced in a context with a generally high demand for planning. We further found that fMRI activity in the dorsal anterior cingulate cortex (dACC) likewise increased with conflict, and more so in a context with a generally high demand for planning. Taken together, the results suggest that the dACC integrates representations of planning demand at different levels of abstraction to regulate prospective information sampling in an efficient and situation-appropriate way. This dissertation provides novel insights into the question of how humans adapt their planning to the demands of the current situation. The results are consistent with the view that the regulation of planning is based on an integrated signal of the expected costs and benefits of planning. Furthermore, they provide evidence that the regulation of planning in environments with real-world complexity critically relies on the brain’s powerful ability to construct and use abstract hierarchical representations.
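    To make the modelling approach concrete: planning can be formalised as finite-horizon backward induction in a Markov Decision Process, with the horizon playing the role of the 'degree of planning' that the hierarchical Bayesian fits estimate from behaviour. The 3-state toy MDP below is an assumption for illustration; it is not one of the dissertation's tasks.

```python
# Minimal sketch: planning as finite-horizon dynamic programming in an MDP.
import numpy as np

n_states, n_actions = 3, 2
T = np.zeros((n_actions, n_states, n_states))   # T[a, s, s']: transition probabilities
T[0] = np.eye(n_states)                         # action 0: stay put
T[1] = np.roll(np.eye(n_states), 1, axis=1)     # action 1: move one state to the right
R = np.array([0.0, 0.0, 1.0])                   # reward only on entering the last state

def plan(h):
    """Backward induction over a horizon of h >= 1 steps; returns start Q-values."""
    V = np.zeros(n_states)
    for _ in range(h):
        Q = np.array([T[a] @ (R + V) for a in range(n_actions)])
        V = Q.max(axis=0)
    return Q

print(plan(1)[:, 0])   # myopic: from state 0, neither action looks worthwhile
print(plan(3)[:, 0])   # a deeper horizon reveals the value of moving right
```

    Deepening the horizon uncovers value that a myopic evaluation misses, but each extra step costs another backward-induction sweep, which is precisely the cost-benefit trade-off the dissertation argues the brain resolves via learned control contexts.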