    Causality, Information and Biological Computation: An algorithmic software approach to life, disease and the immune system

    Biology has taken strong steps towards becoming a computer science aiming at reprogramming nature after the realisation that nature herself has reprogrammed organisms by harnessing the power of natural selection and the digital prescriptive nature of replicating DNA. Here we further unpack ideas related to computability, algorithmic information theory and software engineering, in the context of the extent to which biology can be (re)programmed, and how we may go about doing so in a more systematic way with all the tools and concepts offered by theoretical computer science in a translation exercise from computing to molecular biology and back. These concepts provide a means to a hierarchical organization, thereby blurring previously clear-cut lines between concepts like matter and life, or between tumour types that are otherwise taken as different and may not, however, have a different cause. This does not diminish the properties of life or make its components and functions less interesting. On the contrary, this approach makes for a more encompassing and integrated view of nature, one that subsumes observer and observed within the same system, and can generate new perspectives and tools with which to view complex diseases like cancer, approaching them afresh from a software-engineering viewpoint that casts evolution in the role of programmer, cells as computing machines, DNA and genes as instructions and computer programs, viruses as hacking devices, the immune system as a software debugging tool, and diseases as an information-theoretic battlefield where all these forces deploy. We show how information theory and algorithmic programming may explain fundamental mechanisms of life and death. Comment: 30 pages, 8 figures. Invited chapter contribution to Information and Causality: From Matter to Life. Sara I. Walker, Paul C.W. Davies and George Ellis (eds.), Cambridge University Press

    Statistical advances and challenges for analyzing correlated high dimensional SNP data in genomic study for complex diseases

    Recent advances of information technology in biomedical sciences and other applied areas have created numerous large, diverse data sets with a high dimensional feature space, which provide us with a tremendous amount of information and new opportunities for improving the quality of human life. Meanwhile, great challenges also arise from the continuous arrival of new data, which requires researchers to convert these raw data into scientific knowledge in order to benefit from them. Association studies of complex diseases using SNP data have become more and more popular in biomedical research in recent years. In this paper, we present a review of recent statistical advances and challenges for analyzing correlated high dimensional SNP data in genomic association studies for complex diseases. The review includes both general feature reduction approaches for high dimensional correlated data and more specific approaches for SNP data, which include unsupervised haplotype mapping, tag SNP selection, and supervised SNP selection using statistical testing/scoring, statistical modeling and machine learning methods, with an emphasis on how to identify interacting loci. Comment: Published at http://dx.doi.org/10.1214/07-SS026 in Statistics Surveys (http://www.i-journals.org/ss/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Designing Data-Driven Learning Algorithms: A Necessity to Ensure Effective Post-Genomic Medicine and Biomedical Research

    Advances in sequencing technology have significantly contributed to shaping the area of genetics and enabled the identification of genetic variants associated with complex traits through genome-wide association studies. This has provided insights into genetic medicine, in which genetic factors influence variability in disease and treatment outcomes. On the other hand, the missing or hidden heritability has suggested that the host quality of life and other environmental factors may also influence differences in disease risk and drug/treatment responses in genomic medicine, and orient biomedical research, even though this may be highly constrained by genetic capabilities. It is expected that combining these different factors can yield a paradigm shift in personalized medicine and lead to more effective medical treatment. With existing “big data” initiatives and high-performance computing infrastructures, there is a need for data-driven learning algorithms and models that enable the selection and prioritization of relevant genetic variants (post-genomic medicine) and trigger effective translation into clinical practice. In this chapter, we survey and discuss existing machine learning algorithms and post-genomic analysis models supporting the process of identifying valuable markers.

    How to understand the cell by breaking it: network analysis of gene perturbation screens

    Modern high-throughput gene perturbation screens are key technologies at the forefront of genetic research. Combined with rich phenotypic descriptors, they enable researchers to observe detailed cellular reactions to experimental perturbations on a genome-wide scale. This review surveys the current state of the art in analyzing perturbation screens from a network point of view. We describe approaches to make the step from the parts list to the wiring diagram by using phenotypes for network inference and integrating them with complementary data sources. The first part of the review describes methods to analyze one- or low-dimensional phenotypes like viability or reporter activity; the second part concentrates on high-dimensional phenotypes showing global changes in cell morphology, transcriptome or proteome. Comment: Review based on ISMB 2009 tutorial; after two rounds of revision

    Mining Pure, Strict Epistatic Interactions from High-Dimensional Datasets: Ameliorating the Curse of Dimensionality

    Background: The interaction between loci to affect phenotype is called epistasis. It is strict epistasis if no proper subset of the interacting loci exhibits a marginal effect. For many diseases, it is likely that unknown epistatic interactions affect disease susceptibility. A difficulty when mining epistatic interactions from high-dimensional datasets concerns the curse of dimensionality. There are too many combinations of SNPs to perform an exhaustive search. A method that could locate strict epistasis without an exhaustive search can be considered the brass ring of methods for analyzing high-dimensional datasets. Methodology/Findings: A SNP pattern is a Bayesian network representing SNP-disease relationships. The Bayesian score for a SNP pattern is the probability of the data given the pattern, and has been used to learn SNP patterns. We identified a bound for the score of a SNP pattern. The bound provides an upper limit on the Bayesian score of any pattern that could be obtained by expanding a given pattern. We felt that the bound might enable the data to say something about the promise of expanding a 1-SNP pattern even when there are no marginal effects. We tested the bound using simulated datasets and semi-synthetic high-dimensional datasets obtained from GWAS datasets. We found that the bound was able to dramatically reduce the search time for strict epistasis. Using an Alzheimer's dataset, we showed that it is possible to discover an interaction involving the APOE gene based on its score because of its large marginal effect, but that the bound is most effective at discovering interactions without marginal effects. Conclusions/Significance: We conclude that the bound appears to ameliorate the curse of dimensionality in high-dimensional datasets. This is a very consequential result and could be pivotal in our efforts to reveal the dark matter of genetic disease risk from high-dimensional datasets. © 2012 Jiang, Neapolitan
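
    The definition of strict epistasis given in this abstract can be illustrated with a toy model. The sketch below is not the authors' method or their Bayesian score: it assumes two hypothetical binary loci, a made-up XOR penetrance function, and a risk-allele frequency of 0.5 at each locus (the names `xor_penetrance` and `disease_prob` are illustrative only). Under those assumptions, neither locus alone changes the disease probability, while the pair determines it completely.

```python
from itertools import product


def xor_penetrance(a: int, b: int) -> float:
    """Hypothetical penetrance: disease iff exactly one locus carries the risk allele."""
    return float(a ^ b)


def disease_prob(given: dict, freq: float = 0.5) -> float:
    """P(disease | fixed alleles), marginalising over the loci not in `given`.

    `given` maps a locus index (0 or 1) to its allele (0 or 1).
    `freq` is the assumed risk-allele frequency, identical for both loci.
    """
    num = 0.0
    den = 0.0
    for geno in product((0, 1), repeat=2):
        # Skip genotypes inconsistent with the conditioning alleles.
        if any(geno[i] != v for i, v in given.items()):
            continue
        weight = 1.0
        for allele in geno:
            weight *= freq if allele == 1 else 1.0 - freq
        num += weight * xor_penetrance(*geno)
        den += weight
    return num / den


# Marginal effect of each single locus: identical risk for both alleles.
marginal_a = [disease_prob({0: v}) for v in (0, 1)]
marginal_b = [disease_prob({1: v}) for v in (0, 1)]

# Joint effect: risk is fully determined by the pair of loci.
joint = {g: disease_prob({0: g[0], 1: g[1]}) for g in product((0, 1), repeat=2)}

print("P(D | A):", marginal_a)
print("P(D | B):", marginal_b)
print("P(D | A, B):", joint)
```

    Because each single-locus marginal is flat, a search that ranks SNPs by marginal effect would never select either locus in this toy model, which is why a bound that avoids exhaustive pairwise search matters. The flat marginals depend on the assumed allele frequency of 0.5; other frequencies introduce a marginal effect.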

    Multi-scale modeling of gene-behavior associations in an artificial neural network model of cognitive development

    In the multi-disciplinary field of developmental cognitive neuroscience, statistical associations between levels of description play an increasingly important role. One example of such associations is the observation of correlations between relatively common gene variants and individual differences in behavior. It is perhaps surprising that such associations can be detected despite the remoteness of these levels of description, and the fact that behavior is the outcome of an extended developmental process involving interaction with a variable environment. Given that they have been detected, how do such associations inform cognitive-level theories? To investigate this question, we employed a multi-scale computational model of development, using a sample domain drawn from the field of language acquisition. The model comprised an artificial neural network model of past-tense acquisition trained using the backpropagation learning algorithm, extended to incorporate population modeling and genetic algorithms. It included five levels of description, four internal: genetic, network, neurocomputation, behavior; and one external: environment. Since the mechanistic assumptions of the model were known and its operation was relatively transparent, we could evaluate whether cross-level associations gave an accurate picture of causal processes. We established that associations could be detected between artificial genes and behavioral variation, even under polygenic assumptions of a many-to-one relationship between genes and neurocomputational parameters, and when an experience-dependent developmental process interceded between the action of genes and the emergence of behavior. We evaluated these associations with respect to their specificity (to different behaviors, to function versus structure), to their developmental stability, and to their replicability, as well as considering issues of missing heritability and gene-environment interactions. 
    We argue that gene-behavior associations can inform cognitive theory with respect to effect size, specificity, and timing. The model demonstrates a means by which researchers can undertake multi-scale modeling with respect to cognition, and develop highly specific and complex hypotheses across multiple levels of description.

    Mathematics at the eve of a historic transition in biology

    A century ago physicists and mathematicians worked in tandem and established quantum mechanics. Indeed, algebras, partial differential equations, group theory, and functional analysis underpin the foundation of quantum mechanics. Currently, biology is undergoing a historic transition from qualitative, phenomenological and descriptive to quantitative, analytical and predictive. Mathematics, again, becomes a driving force behind this new transition in biology. Comment: 5 pages, 2 figures