7,989 research outputs found

    Netter: re-ranking gene network inference predictions using structural network properties

    Get PDF
    Background: Many algorithms have been developed to infer the topology of gene regulatory networks from gene expression data. These methods typically produce a ranking of links between genes with associated confidence scores, after which a certain threshold is chosen to produce the inferred topology. However, the structural properties of the predicted network do not resemble those typical for a gene regulatory network, as most algorithms only take into account connections found in the data and do not include known graph properties in their inference process. This lowers the prediction accuracy of these methods, limiting their usability in practice. Results: We propose a post-processing algorithm which is applicable to any confidence ranking of regulatory interactions obtained from a network inference method which can use, inter alia, graphlets and several graph-invariant properties to re-rank the links into a more accurate prediction. To demonstrate the potential of our approach, we re-rank predictions of six different state-of-the-art algorithms using three simple network properties as optimization criteria and show that Netter can improve the predictions made on both artificially generated data as well as the DREAM4 and DREAM5 benchmarks. Additionally, the DREAM5 E. coli. community prediction inferred from real expression data is further improved. Furthermore, Netter compares favorably to other post-processing algorithms and is not restricted to correlation-like predictions. Lastly, we demonstrate that the performance increase is robust for a wide range of parameter settings. Netter is available at http://bioinformatics. intec. ugent. be. Conclusions: Network inference from high-throughput data is a long-standing challenge. In this work, we present Netter, which can further refine network predictions based on a set of user-defined graph properties. Netter is a flexible system which can be applied in unison with any method producing a ranking from omics data. It can be tailored to specific prior knowledge by expert users but can also be applied in general uses cases. Concluding, we believe that Netter is an interesting second step in the network inference process to further increase the quality of prediction

    Extreme learning machines for reverse engineering of gene regulatory networks from expression time series

    Get PDF
    The reconstruction of gene regulatory networks (GRNs) from genes profiles has a growing interest in bioinformatics for understanding the complex regulatory mechanisms in cellular systems. GRNs explicitly represent the cause-effect of regulation among a group of genes and its reconstruction is today a challenging computational problem. Several methods were proposed, but most of them require different input sources to provide an acceptable prediction. Thus, it is a great challenge to reconstruct a GRN only from temporal gene-expression data. Results: Extreme Learning Machine (ELM) is a new supervised neural model that has gained interest in the last years because of its higher learning rate and better performance than existing supervised models in terms of predictive power. This work proposes a novel approach for GRNs reconstruction in which ELMs are used for modeling the relationships between gene expression time series. Artificial datasets generated with the well-known benchmark tool used in DREAM competitions were used. Real datasets were used for validation of this novel proposal with well-known GRNs underlying the time series. The impact of increasing the size of GRNs was analyzed in detail for the compared methods. The results obtained confirm the superiority of the ELM approach against very recent state-of-the-art methods in the same experimental conditions.Fil: Rubiolo, Mariano. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; ArgentinaFil: Milone, Diego Humberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; ArgentinaFil: Stegmayer, Georgina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentin

    Evolutionary connectionism: algorithmic principles underlying the evolution of biological organisation in evo-devo, evo-eco and evolutionary transitions

    Get PDF
    The mechanisms of variation, selection and inheritance, on which evolution by natural selection depends, are not fixed over evolutionary time. Current evolutionary biology is increasingly focussed on understanding how the evolution of developmental organisations modifies the distribution of phenotypic variation, the evolution of ecological relationships modifies the selective environment, and the evolution of reproductive relationships modifies the heritability of the evolutionary unit. The major transitions in evolution, in particular, involve radical changes in developmental, ecological and reproductive organisations that instantiate variation, selection and inheritance at a higher level of biological organisation. However, current evolutionary theory is poorly equipped to describe how these organisations change over evolutionary time and especially how that results in adaptive complexes at successive scales of organisation (the key problem is that evolution is self-referential, i.e. the products of evolution change the parameters of the evolutionary process). Here we first reinterpret the central open questions in these domains from a perspective that emphasises the common underlying themes. We then synthesise the findings from a developing body of work that is building a new theoretical approach to these questions by converting well-understood theory and results from models of cognitive learning. Specifically, connectionist models of memory and learning demonstrate how simple incremental mechanisms, adjusting the relationships between individually-simple components, can produce organisations that exhibit complex system-level behaviours and improve the adaptive capabilities of the system. We use the term “evolutionary connectionism” to recognise that, by functionally equivalent processes, natural selection acting on the relationships within and between evolutionary entities can result in organisations that produce complex system-level behaviours in evolutionary systems and modify the adaptive capabilities of natural selection over time. We review the evidence supporting the functional equivalences between the domains of learning and of evolution, and discuss the potential for this to resolve conceptual problems in our understanding of the evolution of developmental, ecological and reproductive organisations and, in particular, the major evolutionary transitions

    Learning the structure of Bayesian Networks: A quantitative assessment of the effect of different algorithmic schemes

    Full text link
    One of the most challenging tasks when adopting Bayesian Networks (BNs) is the one of learning their structure from data. This task is complicated by the huge search space of possible solutions, and by the fact that the problem is NP-hard. Hence, full enumeration of all the possible solutions is not always feasible and approximations are often required. However, to the best of our knowledge, a quantitative analysis of the performance and characteristics of the different heuristics to solve this problem has never been done before. For this reason, in this work, we provide a detailed comparison of many different state-of-the-arts methods for structural learning on simulated data considering both BNs with discrete and continuous variables, and with different rates of noise in the data. In particular, we investigate the performance of different widespread scores and algorithmic approaches proposed for the inference and the statistical pitfalls within them

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Get PDF
    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues

    Reverse engineering a gene network using an asynchronous parallel evolution strategy.

    Get PDF
    RIGHTS : This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief you may : copy, distribute, and display the work; make derivative works; or make commercial use of the work - under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.BACKGROUND: The use of reverse engineering methods to infer gene regulatory networks by fitting mathematical models to gene expression data is becoming increasingly popular and successful. However, increasing model complexity means that more powerful global optimisation techniques are required for model fitting. The parallel Lam Simulated Annealing (pLSA) algorithm has been used in such approaches, but recent research has shown that island Evolutionary Strategies can produce faster, more reliable results. However, no parallel island Evolutionary Strategy (piES) has yet been demonstrated to be effective for this task. RESULTS: Here, we present synchronous and asynchronous versions of the piES algorithm, and apply them to a real reverse engineering problem: inferring parameters in the gap gene network. We find that the asynchronous piES exhibits very little communication overhead, and shows significant speed-up for up to 50 nodes: the piES running on 50 nodes is nearly 10 times faster than the best serial algorithm. We compare the asynchronous piES to pLSA on the same test problem, measuring the time required to reach particular levels of residual error, and show that it shows much faster convergence than pLSA across all optimisation conditions tested. CONCLUSIONS: Our results demonstrate that the piES is consistently faster and more reliable than the pLSA algorithm on this problem, and scales better with increasing numbers of nodes. In addition, the piES is especially well suited to further improvements and adaptations: Firstly, the algorithm's fast initial descent speed and high reliability make it a good candidate for being used as part of a global/local search hybrid algorithm. Secondly, it has the potential to be used as part of a hierarchical evolutionary algorithm, which takes advantage of modern multi-core computing architectures

    Single-cell transcriptomics : a high-resolution avenue for plant functional genomics

    Get PDF
    Plant function is the result of the concerted action of single cells in different tissues. Advances in RNA-seq technologies and tissue processing allow us now to capture transcriptional changes at single-cell resolution. The incredible potential of single-cell RNA-seq lies in the novel ability to study and exploit regulatory processes in complex tissues based on the behaviour of single cells. Importantly, the independence from reporter lines allows the analysis of any given tissue in any plant. While there are challenges associated with the handling and analysis of complex datasets, the opportunities are unique to generate knowledge of tissue functions in unprecedented detail and to facilitate the application of such information by mapping cellular functions and interactions in a plant cell atlas. [Abstract copyright: Copyright © 2019 The Authors. Published by Elsevier Ltd.. All rights reserved.

    GENECI: A novel evolutionary machine learning consensus-based approach for the inference of gene regulatory networks

    Get PDF
    Gene regulatory networks define the interactions between DNA products and other substances in cells. Increasing knowledge of these networks improves the level of detail with which the processes that trigger different diseases are described and fosters the development of new therapeutic targets. These networks are usually represented by graphs, and the primary sources for their correct construction are usually time series from differential expression data. The inference of networks from this data type has been approached differently in the literature. Mostly, computational learning techniques have been implemented, which have finally shown some specialization in specific datasets. For this reason, the need arises to create new and more robust strategies for reaching a consensus based on previous results to gain a particular capacity for generalization. This paper presents GENECI (GEne NEtwork Consensus Inference), an evolutionary machine learning approach that acts as an organizer for constructing ensembles to process the results of the main inference techniques reported in the literature and to optimize the consensus network derived from them, according to their confidence levels and topological characteristics. After its design, the proposal was confronted with datasets collected from academic benchmarks (DREAM challenges and IRMA network) to quantify its accuracy. Subsequently, it was applied to a real-world biological network of melanoma patients whose results could be contrasted with medical research collected in the literature. Finally, it has been proved that its ability to optimize the consensus of several networks leads to outstanding robustness and accuracy, gaining a certain generalization capacity after facing the inference of multiple datasetsThis work has been partially funded by grant (funded by MCIN/AEI/10.13039/501100011033/) PID2020-112540RB-C41, AETHER-UMA, Spain (A smart data holistic approach for context-aware data analytics: semantics and context exploitation) and Andalusian PAIDI program, Spain with grant P18-RT-2799. Funding for open access charge: Universidad de Málaga, Spain/CBUA. Adrián SeguraOrtiz is supported by Grant FPU21/03837 (Spanish Ministry of Science, Innovation and Universities, Spain
    corecore