12 research outputs found

    Benchmarks for identification of ordinary differential equations from time series data

    Motivation: In recent years, the biological literature has seen a significant increase in reported methods for identifying both the structure and the parameters of ordinary differential equations (ODEs) from time series data. A natural way to evaluate the performance of such methods is to try them on a sufficient number of realistic test cases. However, weak practices in specifying identification problems and a lack of commonly accepted benchmark problems make it difficult to evaluate and compare different methods.

    Object-oriented architecture of an information system for gene regulatory network reconstruction

    The paper describes the architecture of an information system for reconstructing gene regulatory networks. The architecture is based on an object-oriented approach, which makes the system open, general-purpose, and extensible. A reconstruction scenario is proposed in which the search space of the S-system parameter values is optimized. Introduction. The regulation and functional mechanisms of gene regulatory networks are not yet understood well enough to build mathematical models of them from the fundamental laws of component interaction. Many different models and methods for gene regulatory network reconstruction have been developed, each with advantages and disadvantages. When choosing a descriptive model, it is necessary to consider that mathematical models, as a rule, have their own structure and a number of parameters that need to be identified. A large number of computational methods have been developed for structural-parametric model identification. Most of them are robust to the noise and uncertainty contained in the initial data. This property is relevant when selecting a computational method for reconstructing a gene regulatory network from gene expression data. Purpose. The purpose of this work is to develop an information system architecture for gene regulatory network reconstruction based on the object-oriented approach. Method. The authors used object-oriented design to develop the information system. Results. An object-oriented architecture of an information system for gene regulatory network reconstruction is proposed. The S-system is applied as the computational model. The parameters and structure are calculated using the clonal selection algorithm. Gene expression profiles are used as input data. The developed system includes four basic components: the data source, the model, the solution converter, and the identification method. A scenario for solving the gene network reconstruction problem is developed. In addition, an iterative algorithm for optimizing the search space of the computational model's parameter values is implemented in this scenario. Conclusion. The developed architecture is open, which allows separate components to be added or replaced through extensions. Further research is expected to expand the range of models used, such as the radial-basis network model and the wavelet-neural network model, as well as the system for gene expression programming. In the future, the authors plan to add new evolutionary algorithms to the information system. In this way, the work of the evolutionary operators can be improved by developing new scenarios for solving gene regulatory network reconstruction problems.
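The S-system named in this abstract is commonly written as dx_i/dt = alpha_i * prod_j x_j^g_ij - beta_i * prod_j x_j^h_ij. The following is a minimal sketch of that model form, not the authors' implementation: the function names, the fixed-step RK4 integrator, and all parameter values are assumptions chosen for illustration.

```python
import math


def s_system_rhs(x, alpha, beta, g, h):
    """Return dx/dt for an S-system.

    dx_i/dt = alpha[i] * prod_j x[j]**g[i][j] - beta[i] * prod_j x[j]**h[i][j]
    """
    n = len(x)
    dx = []
    for i in range(n):
        production = alpha[i] * math.prod(x[j] ** g[i][j] for j in range(n))
        degradation = beta[i] * math.prod(x[j] ** h[i][j] for j in range(n))
        dx.append(production - degradation)
    return dx


def rk4_step(x, dt, alpha, beta, g, h):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(k, scale):
        return [xi + scale * ki for xi, ki in zip(x, k)]

    k1 = s_system_rhs(x, alpha, beta, g, h)
    k2 = s_system_rhs(shifted(k1, dt / 2), alpha, beta, g, h)
    k3 = s_system_rhs(shifted(k2, dt / 2), alpha, beta, g, h)
    k4 = s_system_rhs(shifted(k3, dt), alpha, beta, g, h)
    return [
        xi + dt / 6 * (a + 2 * b + 2 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]


def simulate(x0, alpha, beta, g, h, dt=0.01, steps=1000):
    """Integrate the S-system from x0 and return the list of visited states."""
    x = list(x0)
    trajectory = [x]
    for _ in range(steps):
        x = rk4_step(x, dt, alpha, beta, g, h)
        trajectory.append(x)
    return trajectory


# Hypothetical one-gene example: dx/dt = 2 - x, which relaxes towards x = 2.
trajectory = simulate([1.0], alpha=[2.0], beta=[1.0], g=[[0.0]], h=[[1.0]])
```

In a reconstruction setting, an identification method such as the clonal selection algorithm mentioned above would search over `alpha`, `beta`, `g`, and `h` so that the simulated trajectory matches measured expression profiles.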

    Comparison of evolutionary algorithms in gene regulatory network model inference

    Background: The evolution of high throughput technologies that measure gene expression levels has created a data base for inferring GRNs (a process also known as reverse engineering of GRNs). However, the nature of these data has made this process very difficult. At the moment, several methods of discovering qualitative causal relationships between genes with high accuracy from microarray data exist, but large scale quantitative analysis on real biological datasets cannot be performed, to date, as existing approaches are not suitable for real microarray data which are noisy and insufficient. Results: This paper performs an analysis of several existing evolutionary algorithms for quantitative gene regulatory network modelling. The aim is to present the techniques used and offer a comprehensive comparison of approaches, under a common framework. Algorithms are applied to both synthetic and real gene expression data from DNA microarrays, and ability to reproduce biological behaviour, scalability and robustness to noise are assessed and compared. Conclusions: Presented is a comparison framework for assessment of evolutionary algorithms, used to infer gene regulatory networks. Promising methods are identified and a platform for development of appropriate model formalisms is established.

    Inference of gene expression networks using memetic gene expression programming

    In this paper we aim to infer a model of genetic networks from time series data of gene expression profiles by using a new gene expression programming algorithm. Gene expression networks are modelled by differential equations which represent temporal gene expression relations. Gene Expression Programming is a new extension of genetic programming. Here we combine a local search method with gene expression programming to form a memetic algorithm that not only finds the system of differential equations but also fine-tunes its constant parameters. The effectiveness of the proposed method is demonstrated by comparing its performance with that of conventional genetic programming applied to this problem in previous studies.
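The memetic idea described here, a population-based global search combined with a local search that tunes constants, can be sketched as follows. This is not the paper's gene expression programming method: the sketch assumes the equation structure is already fixed to the hypothetical form dx/dt = a - b*x and only tunes the constants a and b. The mutation width, step sizes, and Euler integration are illustrative assumptions.

```python
import random


def simulate(a, b, x0, dt, steps):
    """Euler integration of the assumed model dx/dt = a - b * x."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] + dt * (a - b * xs[-1]))
    return xs


def error(params, data, dt):
    """Sum of squared differences between simulated and observed series."""
    a, b = params
    sim = simulate(a, b, data[0], dt, len(data) - 1)
    return sum((s - d) ** 2 for s, d in zip(sim, data))


def local_search(params, data, dt, step=0.1, rounds=20):
    """Coordinate hill climbing: the local refinement that makes the search memetic."""
    best, best_err = list(params), error(params, data, dt)
    for _ in range(rounds):
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                trial_err = error(trial, data, dt)
                if trial_err < best_err:
                    best, best_err, improved = trial, trial_err, True
        if not improved:
            step /= 2  # shrink the neighbourhood once no move helps
    return best


def memetic_fit(data, dt, pop_size=10, generations=15, seed=0):
    """Evolve (a, b) pairs; every offspring is refined locally before selection."""
    rng = random.Random(seed)
    population = [[rng.uniform(0.0, 3.0), rng.uniform(0.0, 3.0)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Global step: Gaussian mutation of every individual.
        offspring = [[p + rng.gauss(0.0, 0.3) for p in ind] for ind in population]
        # Local step: hill climbing on each mutated individual.
        refined = [local_search(ind, data, dt) for ind in offspring]
        # Elitist selection over parents and refined offspring.
        merged = sorted(population + refined, key=lambda ind: error(ind, data, dt))
        population = merged[:pop_size]
        history.append(error(population[0], data, dt))
    return population[0], history


# Hypothetical noise-free data generated from a = 2, b = 1.
observed = simulate(2.0, 1.0, 0.5, 0.1, 50)
best_params, best_errors = memetic_fit(observed, 0.1)
```

Because selection is elitist, the best error recorded in `best_errors` can never increase from one generation to the next; in the full method, the global step would also vary the structure of the equations rather than only their constants.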

    Defining WRKY factors involved in plant defense: characterization of Arabidopsis thaliana WRKY27, a gene affecting Ralstonia solanacearum disease development

    In the present study, it was shown that Arabidopsis plants lacking a functional gene, AtWRKY27, coding for a WRKY-type transcription factor, displayed an altered disease response towards the soil-borne pathogen Ralstonia solanacearum strain GMI1000. Two independent Atwrky27 knockout (KO) lines consistently exhibited clearly delayed wilting symptoms in response to the bacterium. The steady-state transcript levels of AtWRKY27 were not significantly affected in any of the SA or JA/ET signaling pathway mutants under study. Additionally, the Atwrky27-mediated delayed symptoms phenotype was not associated with constitutive expression of defense response marker genes such as PR1, PR5, Thi2.1 or PDF1.2. Loss of AtWRKY27 function did not affect the response of the plants towards other tested pathogens nor towards diverse abiotic stresses. Complementation of the KO lines with AtWRKY27 under the control of its own promoter restored wild type susceptibility to the GMI1000 strain, whereas ectopic overexpression of AtWRKY27 led to an even earlier wilting symptom response than in wild type plants. Surprisingly, the bacterial density in aerial parts of both KO lines versus wild type plants increased at similar levels throughout the period assayed. These observations point to a role of AtWRKY27 in a specific defense mechanism known as enhanced pathogen tolerance. AtWRKY27 expression appears mainly restricted to specific root parts and to vascular tissue, which is highly consistent with sites of bacterial colonization and propagation. Interestingly, however, AtWRKY27 also appears to be expressed in defined floral organs, and the ectopic overexpressor lines showed significant partial male sterility. Our data suggest that AtWRKY27 or a component(s) under the control of this transcription factor can contribute to enhanced pathogen tolerance. They also reveal, however, that AtWRKY27 has additional functions within certain stages of anther and pollen development.

    Gene regulatory network modelling with evolutionary algorithms - an integrative approach

    Building models for gene regulation has been an important aim of Systems Biology over the past years, driven by the large amount of gene expression data that has become available. Models represent regulatory interactions between genes and transcription factors and can provide better understanding of biological processes, and means of simulating both natural and perturbed systems (e.g. those associated with disease). Gene regulatory network (GRN) quantitative modelling is still limited, however, due to data issues such as noise and restricted length of time series, typically used for GRN reverse engineering. These issues create an under-determination problem, with many models possibly fitting the data. However, large amounts of other types of biological data and knowledge are available, such as cross-platform measurements, knockout experiments, annotations, binding site affinities for transcription factors and so on. It has been postulated that integration of these can improve model quality obtained, by facilitating further filtering of possible models. However, integration is not straightforward, as the different types of data can provide contradictory information, and are intrinsically noisy, hence large scale integration has not been fully explored, to date. Here, we present an integrative parallel framework for GRN modelling, which employs evolutionary computation and different types of data to enhance model inference. Integration is performed at different levels. (i) An analysis of cross-platform integration of time series microarray data, discussing the effects on the resulting models and exploring cross-platform normalisation techniques, is presented. This shows that time-course data integration is possible, and results in models more robust to noise and parameter perturbation, as well as reduced noise over-fitting.
(ii) Other types of measurements and knowledge, such as knock-out experiments, annotated transcription factors, binding site affinities and promoter sequences are integrated within the evolutionary framework to obtain more plausible GRN models. This is performed by customising initialisation, mutation and evaluation of candidate model solutions. The different data types are investigated and both qualitative and quantitative improvements are obtained. Results suggest that caution is needed in order to obtain improved models from combined data, and the case study presented here provides an example of how this can be achieved. Furthermore, (iii), RNA-seq data is studied in comparison to microarray experiments, to identify overlapping features and possibilities of integration within the framework. The extension of the framework to this data type is straightforward and qualitative improvements are obtained when combining predicted interactions from single-channel and RNA-seq datasets

    Metabolic Network Model Identification-Parameter Estimation and Ensemble Modeling

    Ph.D. (Doctor of Philosophy)

    DYNAMIC MATHEMATICAL TOOLS FOR THE IDENTIFICATION OF REGULATORY STRUCTURES AND KINETIC PARAMETERS IN

    In this thesis we present a systematic methodology to characterize dynamic biological systems from time series data. Three publications resulted from this work. In the first, we developed a deterministic global optimization method based on outer approximation for parameter estimation in dynamic biological systems. Our method is based on reformulating the set of ordinary differential equations into an equivalent set of algebraic equations through the use of orthogonal collocation methods, giving rise to a nonconvex nonlinear programming (NLP) problem. This nonconvex NLP is decomposed into two hierarchical levels: a master mixed-integer linear programming (MILP) problem that provides a rigorous lower bound on the global solution, and a reduced-space slave NLP that yields an upper bound. The algorithm iterates between these two levels until a termination criterion is satisfied. In the second and third publications we developed a method that is able to identify the regulatory structure and its corresponding kinetic parameters from time series data. In the second publication we defined a mixed-integer dynamic optimization (MIDO) problem that minimizes the Akaike information criterion. In the third publication we adopted a multi-criteria MIDO that minimizes complexity and fit simultaneously using the epsilon-constraint method, in which one objective is treated as the objective function while the rest are converted to auxiliary constraints. In both publications the MIDO problems were reformulated as mixed-integer nonlinear programming (MINLP) problems through the use of orthogonal collocation on finite elements, where binary variables are used to model the existence of regulatory interactions.
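The two model-selection ideas in this abstract, minimizing the Akaike information criterion and trading fit against complexity with the epsilon-constraint method, can be illustrated on a small discrete set of candidate models. This sketch is not the thesis' MIDO/MINLP formulation: it assumes the candidates have already been fitted, uses the common least-squares form of the AIC (valid up to an additive constant under Gaussian errors), and all candidate names and numbers are hypothetical.

```python
import math


def aic(rss, n_points, n_params):
    """AIC for a least-squares fit: n * ln(RSS / n) + 2k, up to an additive constant."""
    return n_points * math.log(rss / n_points) + 2 * n_params


def best_by_aic(candidates, n_points):
    """Pick the candidate with the lowest AIC."""
    return min(candidates, key=lambda c: aic(c["rss"], n_points, c["k"]))


def epsilon_constraint(candidates, eps_values):
    """Minimise the fit error subject to complexity k <= eps, for each eps in turn.

    Fit error is the objective; complexity becomes the auxiliary constraint.
    Sweeping eps traces out the trade-off between the two criteria.
    """
    front = []
    for eps in eps_values:
        feasible = [c for c in candidates if c["k"] <= eps]
        if not feasible:
            continue
        best = min(feasible, key=lambda c: c["rss"])
        if best not in front:
            front.append(best)
    return front


# Hypothetical candidates: k counts active regulatory interactions,
# rss is the residual sum of squares of the fitted model.
models = [
    {"name": "A", "k": 2, "rss": 8.0},
    {"name": "B", "k": 3, "rss": 2.0},
    {"name": "C", "k": 5, "rss": 1.9},
]
chosen = best_by_aic(models, n_points=20)
trade_off = epsilon_constraint(models, eps_values=[2, 3, 4, 5])
```

With these made-up numbers the AIC prefers the mid-sized model, since the small gain in fit from the largest model does not offset its extra parameters, while the epsilon sweep returns each candidate as the complexity bound is relaxed.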