1,261 research outputs found

    Information processing in biology

    To survive, organisms must respond appropriately to a variety of challenges posed by a dynamic and uncertain environment. The mechanisms underlying such responses can in general be framed as input-output devices that map environment states (inputs) to associated responses (outputs). In this light, it is appealing to model these systems using information theory, a well-developed mathematical framework for describing input-output systems. From the information-theoretic perspective, an organism’s behavior is fully characterized by the repertoire of its outputs under different environmental conditions. Because of natural selection, it is reasonable to assume that this input-output mapping has been fine-tuned to maximize the organism’s fitness. If that is the case, it should be possible to abstract away the mechanistic implementation details and obtain the general principles that lead to fitness in a given environment. These principles can then be used inferentially both to generate hypotheses about the underlying implementation and to predict novel responses under external perturbations. In this work I use information theory to address the question of how biological systems generate complex outputs robustly using relatively simple mechanisms. In particular, I examine how communication and distributed processing can lead to emergent phenomena that allow collective systems to respond in a much richer way than a single organism could.
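
    The input-output framing in this abstract can be illustrated with a minimal sketch, which is not taken from the thesis. It assumes a discrete set of environment states and responses and computes the mutual information between them in bits. The function name `mutual_information` and the example channels are hypothetical, chosen only for illustration.

```python
from math import log2


def mutual_information(p_env, p_resp_given_env):
    """Mutual information I(E; R) in bits for a discrete input-output map.

    p_env[i] is P(E = i), the probability of environment state i.
    p_resp_given_env[i][j] is P(R = j | E = i), the response repertoire.
    (Both names are illustrative assumptions, not from the source.)
    """
    n_resp = len(p_resp_given_env[0])
    # Marginal distribution over responses: P(R = j) = sum_i P(E = i) P(R = j | E = i)
    p_resp = [
        sum(p_env[i] * p_resp_given_env[i][j] for i in range(len(p_env)))
        for j in range(n_resp)
    ]
    mi = 0.0
    for i, pe in enumerate(p_env):
        for j in range(n_resp):
            joint = pe * p_resp_given_env[i][j]
            if joint > 0:  # zero-probability terms contribute nothing
                mi += joint * log2(joint / (pe * p_resp[j]))
    return mi


# Two equally likely environment states, three hypothetical organisms:
perfect = mutual_information([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]])  # always responds correctly
blind = mutual_information([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]])    # response ignores the input
noisy = mutual_information([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]])    # correct 90% of the time
```

    Under these assumptions, the perfectly tuned mapping carries one full bit about the environment, the blind mapping carries none, and the noisy mapping falls in between.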

    Graphical Models for Multivariate Time-Series

    Gaussian graphical models have received much attention in recent years because of their flexibility and expressive power. In particular, much interest has been devoted to graphical models for temporal data, or dynamical graphical models, which describe the relations among variables evolving in time. While powerful in modelling complex systems, such models suffer from computational issues in terms of both convergence rates and memory requirements, and may fail to detect temporal patterns when the information on the system is partial. This thesis makes two main contributions in the context of dynamical graphical models, addressing these two aspects: the need for reliable and fast optimisation methods, and the need for greater modelling power, so that the model can be retrieved in practical applications. The first contribution is a forward-backward splitting (FBS) procedure for Gaussian graphical modelling of multivariate time-series, which relies on recent theoretical studies ensuring global convergence under mild assumptions. This FBS-based implementation achieves, with fast convergence rates, optimal results with respect to ground truth and standard methods for dynamical network inference. The second main contribution focuses on the problem of latent factors, which influence the system while remaining hidden or unobservable. This thesis proposes the novel latent variable time-varying graphical lasso method, which takes into account both temporal dynamics in the data and latent factors influencing the system. This is fundamental for the practical use of graphical models, where the information on the data is partial. Extensive validation of the method on both synthetic and real applications shows the effectiveness of considering latent factors to deal with incomplete information.

    The effect of noise on dynamics and the influence of biochemical systems

    Understanding a complex system requires integration and collective analysis of data from many levels of organisation. Predictive modelling of biochemical systems is particularly challenging because the data are plagued by noise operating at every level. Inevitably we have to decide whether we can reliably infer the structure and dynamics of biochemical systems from present data. Here we approach this problem on many fronts by analysing the interplay between deterministic and stochastic dynamics in a broad collection of biochemical models. In a classical mathematical model we first illustrate how this interplay can be described in surprisingly simple terms; we furthermore demonstrate the advantages of a statistical point of view for more complex systems as well. We then investigate strategies for the integrated analysis of models characterised by different organisational levels, and trace the propagation of noise through such systems. We use this approach to uncover, for the first time, the dynamics of metabolic adaptation of a plant pathogen throughout its life cycle and discuss the ecological implications. Finally, we investigate how reliably we can infer the parameters of biochemical models. We develop a novel sensitivity/inferability analysis framework that is generally applicable to a large fraction of current mathematical models of biochemical systems. By using this framework to quantify the effect of parametric variation on system dynamics, we provide practical guidelines as to when and why certain parameters are easily estimated while others are much harder to infer. We highlight the limitations on parameter inference due to model structure and qualitative dynamical behaviour, and identify candidate elements of control in biochemical pathways that are most likely to be subject to regulation.

    Machine learning approach to reconstructing signalling pathways and interaction networks in biology

    In this doctoral thesis, I present my research into applying machine learning techniques for reconstructing species interaction networks in ecology, reconstructing molecular signalling pathways and gene regulatory networks in systems biology, and inferring parameters in ordinary differential equation (ODE) models of signalling pathways. Together, the methods I have developed for these applications demonstrate the usefulness of machine learning for reconstructing networks and inferring network parameters from data. The thesis consists of three parts. The first part is a detailed comparison of applying static Bayesian networks, relevance vector machines, and linear regression with L1 regularisation (LASSO) to the problem of reconstructing species interaction networks from species presence/absence data in ecology (Faisal et al., 2010). I describe how I generated data from a stochastic population model to test the different methods and how the simulation study led us to introduce spatial autocorrelation as an important covariate. I also show how we used the results of the simulation study to apply the methods to presence/absence data of bird species from the European Bird Atlas. The second part of the thesis describes a time-varying, non-homogeneous dynamic Bayesian network model for reconstructing signalling pathways and gene regulatory networks, based on Lèbre et al. (2010). I show how my work has extended this model to incorporate different types of hierarchical Bayesian information sharing priors and different coupling strategies among nodes in the network. The introduction of these priors reduces the inference uncertainty by putting a penalty on the number of structure changes among network segments separated by inferred changepoints (Dondelinger et al., 2010; Husmeier et al., 2010; Dondelinger et al., 2012b).
    Using both synthetic and real data, I demonstrate that using information sharing priors leads to a better reconstruction accuracy of the underlying gene regulatory networks, and I compare the different priors and coupling strategies. I show the results of applying the model to gene expression datasets from Drosophila melanogaster and Arabidopsis thaliana, as well as to a synthetic biology gene expression dataset from Saccharomyces cerevisiae. In each case, the underlying network is time-varying: for Drosophila melanogaster, as a consequence of measuring gene expression during different developmental stages; for Arabidopsis thaliana, as a consequence of measuring gene expression for circadian clock genes under different conditions; and for the synthetic biology dataset, as a consequence of changing the growth environment. I show that in addition to inferring sensible network structures, the model also successfully predicts the locations of changepoints. The third and final part of this thesis is concerned with parameter inference in ODE models of biological systems. This problem is of interest to systems biology researchers, as kinetic reaction parameters often cannot be measured, or can only be estimated imprecisely from experimental data. Due to the cost of numerically solving the ODE system after each parameter adaptation, this is a computationally challenging problem. Gradient matching techniques circumvent this problem by directly fitting the derivatives of the ODE to the slope of an interpolant. I present an inference procedure for a model using nonparametric Bayesian statistics with Gaussian processes, based on Calderhead et al. (2008). I show that the new inference procedure improves on the original formulation in Calderhead et al. (2008) and I present the result of applying it to ODE models of predator-prey interactions, a circadian clock gene, a signal transduction pathway, and the JAK/STAT pathway.
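
    The gradient matching idea mentioned in this abstract can be sketched on a toy problem. This is only an illustration under stated assumptions, not the thesis procedure: the ODE is the hypothetical one-parameter decay model dx/dt = -theta * x, the data are noise-free, and central finite differences stand in for the slope of the Gaussian process interpolant. All function and variable names are invented for the example.

```python
from math import exp


def central_slopes(ts, xs):
    """Slope estimates at the interior time points by central differences.

    This is a crude stand-in for differentiating a smooth interpolant.
    """
    return [
        (xs[k + 1] - xs[k - 1]) / (ts[k + 1] - ts[k - 1])
        for k in range(1, len(ts) - 1)
    ]


def fit_decay_rate(ts, xs):
    """Least-squares gradient matching for the assumed model dx/dt = -theta * x.

    Minimises sum_k (slope_k + theta * x_k)^2 over theta, which has the
    closed form theta = -sum(slope_k * x_k) / sum(x_k^2). The ODE is never
    solved numerically.
    """
    slopes = central_slopes(ts, xs)
    interior = xs[1:-1]  # states at the points where slopes were estimated
    numerator = sum(s * x for s, x in zip(slopes, interior))
    denominator = sum(x * x for x in interior)
    return -numerator / denominator


# Synthetic, noise-free observations from the assumed decay model.
true_theta = 0.7
ts = [0.05 * k for k in range(41)]
xs = [exp(-true_theta * t) for t in ts]
theta_hat = fit_decay_rate(ts, xs)
```

    On this grid the estimate lands close to the true rate. With noisy data the finite-difference slopes would degrade quickly, which is one reason a smoothing interpolant such as a Gaussian process is used in practice.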

    Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain

    The present paper explores the technical efficiency of four hotels from the Teixeira Duarte Group, a renowned Portuguese hotel chain. An efficiency ranking of these four hotel units located in Portugal is established using Stochastic Frontier Analysis. This methodology makes it possible to discriminate between measurement error and systematic inefficiencies in the estimation process, and so to investigate the main causes of inefficiency. Several suggestions for improving efficiency are offered for each hotel studied.

    Differential geometric MCMC methods and applications

    This thesis presents novel Markov chain Monte Carlo methodology that exploits the natural representation of a statistical model as a Riemannian manifold. The methods developed provide generalisations of the Metropolis-adjusted Langevin algorithm and the Hybrid Monte Carlo algorithm for Bayesian statistical inference, and resolve many shortcomings of existing Monte Carlo algorithms when sampling from target densities that may be high dimensional and exhibit strong correlation structure. The performance of these Riemannian manifold Markov chain Monte Carlo algorithms is rigorously assessed by performing Bayesian inference on logistic regression models, log-Gaussian Cox point process models, stochastic volatility models, and both parameter-level and model-level inference of dynamical systems described by nonlinear differential equations.

    Investigating hybrids of evolution and learning for real-parameter optimization

    In recent years, increasingly advanced techniques have been developed for hybridizing evolution and learning, and more applications can benefit from this progress. One example of these techniques is the Learnable Evolution Model (LEM), which adopts learning as a guide for the general evolutionary search. Despite this trend and the progress in LEM, many ideas and approaches still deserve further investigation and testing. To this end, this thesis develops a number of new algorithms that combine additional learning algorithms with evolution in different ways. With these developments, we aim to understand the effects of, and relations between, evolution and learning, and also to achieve better performance in solving complex problems. The machine learning algorithms combined with the standard Genetic Algorithm (GA) are the supervised learning method k-nearest-neighbors (KNN), the Entropy-Based Discretization (ED) method, and the decision tree learning algorithm ID3. We test these algorithms on various real-parameter function optimization problems, especially the functions from the CEC 2005 special session on real-parameter function optimization. Additionally, a cancer chemotherapy treatment problem is solved in this thesis using some of our hybrid algorithms. The performance of these algorithms is compared with that of standard genetic algorithms and other well-known contemporary evolution and learning hybrid algorithms, including Covariance Matrix Adaptation Evolution Strategies (CMAES) and variants of the Estimation of Distribution Algorithms (EDA). Some important results have been derived from our experiments on these algorithms. Among them, we found that even some very simple learning methods, properly hybridized with the evolution procedure, can provide significant performance improvement; and when more complex learning algorithms are incorporated with evolution, the resulting algorithms are very promising and compete very well against state-of-the-art hybrid algorithms, both on well-defined real-parameter function optimization problems and on a practical evaluation-expensive problem.