
    Pathway-Based Genomics Prediction using Generalized Elastic Net.

    We present a novel regularization scheme called the Generalized Elastic Net (GELnet) that incorporates gene pathway information into feature selection. The proposed formulation is applicable to a wide variety of problems in which the interpretation of predictive features using known molecular interactions is desired. The method naturally steers solutions toward sets of mechanistically interlinked genes. Using experiments on synthetic data, we demonstrate that pathway-guided results maintain, and often improve, the accuracy of predictors even in cases where the full gene network is unknown. We apply the method to predict the drug response of breast cancer cell lines. GELnet is able to reveal genetic determinants of sensitivity and resistance for several compounds. In particular, for an EGFR/HER2 inhibitor, it finds a possible trans-differentiation resistance mechanism missed by the corresponding pathway-agnostic approach.

    An Introduction to Programming for Bioscientists: A Python-based Primer

    Computing has revolutionized the biological sciences over the past several decades, such that virtually all contemporary research in the biosciences utilizes computer programs. The computational advances have come on many fronts, spurred by fundamental developments in hardware, software, and algorithms. These advances have influenced, and even engendered, a phenomenal array of bioscience fields, including molecular evolution and bioinformatics; genome-, proteome-, transcriptome- and metabolome-wide experimental studies; structural genomics; and atomistic simulations of cellular-scale molecular assemblies as large as ribosomes and intact viruses. In short, much of post-genomic biology is increasingly becoming a form of computational biology. The ability to design and write computer programs is among the most indispensable skills that a modern researcher can cultivate. Python has become a popular programming language in the biosciences, largely because (i) its straightforward semantics and clean syntax make it a readily accessible first language; (ii) it is expressive and well-suited to object-oriented programming, as well as other modern paradigms; and (iii) the many available libraries and third-party toolkits extend the functionality of the core language into virtually every biological domain (sequence and structure analyses, phylogenomics, workflow management systems, etc.). This primer offers a basic introduction to coding, via Python, and it includes concrete examples and exercises to illustrate the language's usage and capabilities; the main text culminates with a final project in structural bioinformatics. A suite of Supplemental Chapters is also provided. 
Starting with basic concepts, such as that of a 'variable', the Chapters methodically advance the reader to the point of writing a graphical user interface to compute the Hamming distance between two DNA sequences. Comment: 65 pages total, including 45 pages of text, 3 figures, 4 tables, numerous exercises, and 19 pages of Supporting Information; currently in press at PLOS Computational Biology.
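
The Hamming distance mentioned in the abstract above is the number of positions at which two equal-length sequences differ. A minimal Python sketch of that computation follows. It is not taken from the primer itself: the function name `hamming_distance`, the case-insensitive comparison, and the example sequences are illustrative assumptions.

```python
def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Count the positions at which two equal-length sequences differ.

    Hypothetical helper for illustration; not code from the primer.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("Hamming distance requires sequences of equal length")
    # Assumption: compare case-insensitively so 'a' and 'A' are the same base.
    return sum(a != b for a, b in zip(seq_a.upper(), seq_b.upper()))


if __name__ == "__main__":
    # The two example sequences differ at the third and sixth positions.
    print(hamming_distance("GATTACA", "GACTATA"))
```

Raising an error on unequal lengths, rather than silently truncating to the shorter sequence as `zip` would, keeps the result consistent with the usual definition of the distance.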

    Rhythmic dynamics and synchronization via dimensionality reduction: application to human gait

    Reliable characterization of locomotor dynamics of human walking is vital to understanding the neuromuscular control of human locomotion and disease diagnosis. However, the inherent oscillation and ubiquity of noise in such non-strictly periodic signals pose great challenges to current methodologies. To this end, we exploit the state-of-the-art technology in pattern recognition and, specifically, dimensionality reduction techniques, and propose to reconstruct and characterize the dynamics accurately on the cycle scale of the signal. This is achieved by deriving a low-dimensional representation of the cycles through global optimization, which effectively preserves the topology of the cycles that are embedded in a high-dimensional Euclidean space. Our approach demonstrates a clear advantage in capturing the intrinsic dynamics and probing the subtle synchronization patterns from uni/bivariate oscillatory signals over traditional methods. Application to human gait data for healthy subjects and diabetics reveals a significant difference in the dynamics of ankle movements and ankle-knee coordination, but not in knee movements. These results indicate that the impaired sensory feedback from the feet due to diabetes does not influence the knee movement in general, and that normal human walking is not critically dependent on the feedback from the peripheral nervous system.