
    Whole-exome sequencing for finding de novo mutations in sporadic mental retardation

    Recent work has used a family-based approach and whole-exome sequencing to identify de novo mutations in sporadic cases of mental retardation.
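
    The family-based (trio) strategy rests on a simple filter: a candidate de novo variant is one observed in the affected child but in neither parent. A minimal sketch of that filter follows, assuming variant calls are already available as per-sample sets of (chromosome, position, alternate allele) tuples; the data layout is illustrative, not taken from the study, and real pipelines add quality, depth, and allele-balance checks.

```python
# Minimal trio filter for candidate de novo variants.
# Assumes each sample's calls are a set of (chrom, pos, alt) tuples.
def candidate_de_novo(child_calls, mother_calls, father_calls):
    """Variants present in the child but absent from both parents."""
    return child_calls - (mother_calls | father_calls)

child  = {("chr1", 123456, "A"), ("chr2", 654321, "T")}
mother = {("chr2", 654321, "T")}
father = set()

print(candidate_de_novo(child, mother, father))
# -> {('chr1', 123456, 'A')}
```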

    An expanded evaluation of protein function prediction methods shows an improvement in accuracy

    Background: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. Results: We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using the Gene Ontology and gene-disease associations using the Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1 with regard to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2. Conclusions: The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology-specific, that different performance metrics can be used to probe the nature of accurate predictions, and that predictions were relatively diverse in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and the usefulness of individual methods remain context-dependent.
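
    CAFA's headline accuracy measure is the protein-centric Fmax: precision and recall are computed at every score threshold and the best harmonic mean is reported. The sketch below shows that calculation in a minimal form, assuming predictions are per-protein term-score maps and ignoring the propagation of annotations to ancestor terms that the real assessment performs; it is not the CAFA evaluation code.

```python
# Fmax over decision thresholds: precision is averaged over proteins that have
# at least one prediction at the threshold, recall over all benchmark proteins.
def f_max(predictions, truth, thresholds):
    best = 0.0
    for t in thresholds:
        precisions, recalls = [], []
        for protein, true_terms in truth.items():
            pred = {term for term, s in predictions.get(protein, {}).items() if s >= t}
            if pred:
                precisions.append(len(pred & true_terms) / len(pred))
            recalls.append(len(pred & true_terms) / len(true_terms))
        if precisions:
            p = sum(precisions) / len(precisions)
            r = sum(recalls) / len(recalls)
            if p + r > 0:
                best = max(best, 2 * p * r / (p + r))
    return best

truth = {"P1": {"GO:1", "GO:2"}, "P2": {"GO:3"}}
preds = {"P1": {"GO:1": 0.9, "GO:3": 0.4}, "P2": {"GO:3": 0.8}}
print(f_max(preds, truth, [i / 10 for i in range(1, 10)]))
```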

    Phenotype ontologies and cross-species analysis for translational research

    The use of model organisms as tools for the investigation of human genetic variation has significantly and rapidly advanced our understanding of the aetiologies underlying hereditary traits. However, while equivalences in the DNA sequence of two species may be readily inferred through evolutionary models, the identification of equivalence in the phenotypic consequences resulting from comparable genetic variation is far from straightforward, limiting the value of the modelling paradigm. In this review, we provide an overview of the emerging statistical and computational approaches to objectively identify phenotypic equivalence between human and model organisms, with examples from the vertebrate models mouse and zebrafish. Firstly, we discuss enrichment approaches, which treat the most frequent phenotype among the orthologues of a set of genes associated with a common human phenotype as the orthologous phenotype, or phenolog, in the model species. Secondly, we introduce and discuss computational reasoning approaches to identify phenotypic equivalences, made possible through the development of intra- and interspecies ontologies. Finally, we consider the particular challenges involved in modelling neuropsychiatric disorders, which illustrate many of the remaining difficulties in developing comprehensive and unequivocal interspecies phenotype mappings.
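
    In its simplest form, the enrichment approach reduces to an over-representation test: map the human disease genes to their model-organism orthologues and ask which model phenotype annotates more of them than expected by chance. A minimal sketch of that test follows; the gene sets and genome size are illustrative, not taken from the review.

```python
# One-sided hypergeometric test for phenotype enrichment among orthologues.
from scipy.stats import hypergeom

def phenolog_p_value(orthologues, model_phenotype_genes, genome_size):
    overlap = len(orthologues & model_phenotype_genes)
    # P(X >= overlap) when drawing len(orthologues) genes from genome_size genes,
    # of which len(model_phenotype_genes) carry the model phenotype annotation.
    return hypergeom.sf(overlap - 1, genome_size,
                        len(model_phenotype_genes), len(orthologues))

human_gene_orthologues = {"Pax6", "Sox2", "Otx2", "Rax"}   # illustrative
mouse_small_eye_genes  = {"Pax6", "Sox2", "Rax", "Mitf"}   # illustrative
print(phenolog_p_value(human_gene_orthologues, mouse_small_eye_genes, 20000))
```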

    HotSwap for bioinformatics: A STRAP tutorial

    BACKGROUND: Bioinformatics applications are now routinely used to analyze large amounts of data. Application development often requires many cycles of optimization, compiling, and testing. Repeatedly loading large datasets can significantly slow down the development process. We have incorporated HotSwap functionality into the protein workbench STRAP, allowing developers to create plugins using the Java HotSwap technique. RESULTS: Users can load multiple protein sequences or structures into the main STRAP user interface and simultaneously develop plugins using an editor of their choice, such as Emacs. Saving changes to the Java file causes STRAP to recompile the plugin and automatically update its user interface, without requiring recompilation of STRAP or reloading of the protein data. This article presents a tutorial on how to develop HotSwap plugins. STRAP is available at and . CONCLUSION: HotSwap is a useful, time-saving technique for bioinformatics developers, enabling efficient development of applications that must load large amounts of data into memory.
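
    STRAP's mechanism is Java-specific (class hot-swapping inside a running JVM), but the development pattern it enables, keeping the large data set in memory and reloading only the plugin code when its source file changes, can be sketched in Python with importlib.reload. This is an analogy to illustrate the workflow, not STRAP's implementation, and the plugin module name is hypothetical.

```python
# Analogy to the HotSwap workflow: heavy data stays in memory while the plugin
# module is re-imported whenever its source file changes on disk.
import importlib
import os
import time

import my_plugin  # hypothetical plugin module under development

def watch_and_run(loaded_proteins, source="my_plugin.py", interval=1.0):
    last_mtime = os.path.getmtime(source)
    while True:
        mtime = os.path.getmtime(source)
        if mtime != last_mtime:                 # the file was edited and saved
            importlib.reload(my_plugin)         # reload only the plugin code
            my_plugin.run(loaded_proteins)      # the loaded data is reused
            last_mtime = mtime
        time.sleep(interval)
```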

    FABIAN-variant: predicting the effects of DNA variants on transcription factor binding.

    While great advances in predicting the effects of coding variants have been made, the assessment of non-coding variants remains challenging. This is especially problematic for variants within promoter regions, which can lead to over-expression of a gene or reduce or even abolish its expression. The binding of transcription factors to DNA can be predicted using position weight matrices (PWMs). More recently, transcription factor flexible models (TFFMs) have been introduced and shown to be more accurate than PWMs. TFFMs are based on hidden Markov models and can account for complex positional dependencies. Our new web-based application FABIAN-variant uses 1224 TFFMs and 3790 PWMs to predict whether and to what degree DNA variants affect the binding of 1387 different human transcription factors. For each variant and transcription factor, the software combines the results of different models into a final prediction of the resulting binding-affinity change. The software is written in C++ for speed, but variants can be entered through a web interface. Alternatively, a VCF file can be uploaded to assess variants identified by high-throughput sequencing. The search can be restricted to variants in the vicinity of candidate genes. FABIAN-variant is freely available at https://www.genecascade.org/fabian/
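
    At its core, a PWM-based assessment of a variant compares the motif score of the reference sequence with that of the sequence carrying the alternate allele. The toy sketch below shows that comparison; the matrix and sequences are invented, and FABIAN-variant additionally uses TFFMs and combines many models per transcription factor.

```python
# Score a reference and an alternate sequence against a log-odds PWM and
# report the change; a negative delta suggests weakened binding.
PWM = {  # illustrative 4-position log-odds scores for A, C, G, T
    "A": [ 1.2, -0.5, -1.0,  0.8],
    "C": [-0.7,  1.0, -0.3, -1.2],
    "G": [-1.1, -0.8,  1.4, -0.4],
    "T": [ 0.1, -0.2, -1.3,  0.9],
}

def pwm_score(seq):
    return sum(PWM[base][i] for i, base in enumerate(seq))

ref, alt = "ACGT", "ACTT"   # single-nucleotide variant at the third position
delta = pwm_score(alt) - pwm_score(ref)
print(f"predicted binding-score change: {delta:+.2f}")   # -> -2.70
```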

    Prediction of Human Phenotype Ontology terms by means of hierarchical ensemble methods

    Background: The prediction of human gene–abnormal phenotype associations is a fundamental step toward the discovery of novel genes associated with human disorders, especially when no genes are known to be associated with a specific disease. In this context the Human Phenotype Ontology (HPO) provides a standard categorization of the abnormalities associated with human diseases. While the problem of predicting gene–disease associations has been widely investigated, the related problem of predicting gene–phenotypic feature (i.e., HPO term) associations has been largely overlooked, even though no HPO term associations are known for most human genes and despite the increasing application of the HPO to relevant medical problems. Moreover, most of the methods proposed in the literature are not able to capture the hierarchical relationships between HPO terms, resulting in inconsistent and relatively inaccurate predictions. Results: We present two hierarchical ensemble methods that we formally prove to provide biologically consistent predictions according to the hierarchical structure of the HPO. The modular structure of the proposed methods, which consists of a “flat” learning first step and a hierarchical combination of the predictions in the second step, allows the predictions of virtually any flat learning method to be enhanced. The experimental results show that hierarchical ensemble methods are able to predict novel associations between genes and abnormal phenotypes with results that are competitive with state-of-the-art algorithms and with a significant reduction of the computational complexity. Conclusions: Hierarchical ensembles are efficient computational methods that guarantee biologically meaningful predictions that obey the true path rule, and they can be used to improve and make consistent the HPO term predictions of virtually any flat learning method. The implementation of the proposed methods is available as an R package from the CRAN repository.
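
    The consistency requirement (the true path rule) means a term's score should never exceed the score of any of its ancestors. One simple way to repair flat scores is a top-down pass over the ontology that caps each term by its parents, as sketched below; the published ensemble methods are more elaborate, so this shows only the basic idea, with an invented three-term hierarchy.

```python
# Top-down consistency correction over an ontology DAG: after this pass no term
# scores higher than any of its parents, so predictions obey the true path rule.
from graphlib import TopologicalSorter

def hierarchical_correction(flat_scores, parents):
    """flat_scores: {term: score}; parents: {term: set of parent terms}."""
    corrected = dict(flat_scores)
    order = TopologicalSorter({t: parents.get(t, set()) for t in flat_scores})
    for term in order.static_order():     # parents are visited before children
        for p in parents.get(term, set()):
            if p in corrected:
                corrected[term] = min(corrected[term], corrected[p])
    return corrected

parents = {"HP:B": {"HP:A"}, "HP:C": {"HP:B"}}
flat = {"HP:A": 0.4, "HP:B": 0.9, "HP:C": 0.7}
print(hierarchical_correction(flat, parents))
# -> {'HP:A': 0.4, 'HP:B': 0.4, 'HP:C': 0.4}
```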

    the rare bone disorders use case

    Background: In recent years, ontologies have become a fundamental building block in the process of formalising and storing complex biomedical information. The community-driven ontology curation process, however, ignores the possibility of multiple communities building, in parallel, conceptualisations of the same domain and thus providing slightly different perspectives on the same knowledge. The individual nature of this effort creates the need for a mechanism to build an overarching and comprehensive overview of the different perspectives on the domain knowledge. Results: We introduce an approach that enables the loose integration of knowledge emerging from diverse sources under a single, coherent, interoperable resource. To accurately track the original knowledge statements, we record provenance at a very granular level. We exemplify the approach in the rare bone disorders domain by proposing the Rare Bone Disorders Ontology (RBDO). Using RBDO, researchers are able to answer queries such as “What phenotypes describe a particular disorder and are common to all sources?” or to understand similarities between disorders based on the divergent groupings (classifications) provided by the underlying sources.
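
    The query quoted above amounts to an intersection over sources once each disorder–phenotype statement carries its provenance. A minimal sketch of that idea follows; the statements and source names are invented placeholders, not RBDO content.

```python
# Each statement keeps its source; "common to all sources" is an intersection.
from collections import defaultdict

statements = [
    ("DisorderX", "short stature",   "source_A"),
    ("DisorderX", "short stature",   "source_B"),
    ("DisorderX", "bowing of limbs", "source_A"),
]

def phenotypes_common_to_all_sources(disorder, statements):
    by_source = defaultdict(set)
    for d, phenotype, source in statements:
        if d == disorder:
            by_source[source].add(phenotype)
    return set.intersection(*by_source.values()) if by_source else set()

print(phenotypes_common_to_all_sources("DisorderX", statements))
# -> {'short stature'}
```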

    The extinct, giant giraffid Sivatherium giganteum: skeletal reconstruction and body mass estimation

    Sivatherium giganteum is an extinct giraffid from the Plio–Pleistocene boundary of the Himalayan foothills. To date, there has been no rigorous skeletal reconstruction of this unusual mammal. Historical and contemporary accounts anecdotally state that Sivatherium rivalled the African elephant in body mass, but this statement has never been tested. Here, we present a three-dimensional composite skeletal reconstruction and calculate a representative body mass estimate for this species using a volumetric method. We find that the estimated adult body mass of 1246 kg (range 857–1812 kg) does not approach that of an African elephant, but it confirms that Sivatherium was certainly a large giraffid and may have been the largest ruminant mammal that has ever existed. We contrast this volumetric estimate with a bivariate scaling estimate derived from Sivatherium's humeral circumference and find a discrepancy between the two. The difference implies that the humeral circumference of Sivatherium is greater than expected for an animal of this size, and we speculate that this may be linked to a cranial shift in the centre of mass.
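
    The comparison hinges on two independent estimators: a volumetric reconstruction and a bivariate allometric equation of the form M = a · C^b that predicts mass from limb-bone shaft circumference. The sketch below only illustrates how such a comparison is quantified; the coefficients a and b and the circumference value are hypothetical placeholders, and only the 1246 kg volumetric figure comes from the abstract.

```python
# Compare a volumetric mass estimate with a generic limb-bone scaling estimate.
def scaling_mass(circumference_mm, a, b):
    """Bivariate allometric estimate: mass (kg) from shaft circumference (mm)."""
    return a * circumference_mm ** b

volumetric_kg = 1246.0                               # reported in the abstract
scaling_kg = scaling_mass(400.0, a=1.0e-4, b=2.75)   # hypothetical inputs
print(f"scaling estimate: {scaling_kg:.0f} kg, "
      f"ratio to volumetric: {scaling_kg / volumetric_kg:.2f}x")
```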

    Reasoning about goal-directed real-time teleo-reactive programs

    The teleo-reactive programming model is a high-level approach to developing real-time systems that supports hierarchical composition and durative actions. The model differs from frameworks such as action systems, timed automata and TLA+, and allows programs to be more compact and more descriptive of their intended behaviour. Teleo-reactive programs are particularly useful for implementing controllers for autonomous agents that must react robustly to dynamically changing environments. In this paper, we develop a real-time logic based on Duration Calculus and use it to formalise the semantics of teleo-reactive programs. We develop rely/guarantee rules that facilitate compositional reasoning about a program and its environment. We present several theorems that simplify proofs of teleo-reactive programs, together with a partially mechanised method for proving progress properties of goal-directed agents. © 2013 British Computer Society
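
    A teleo-reactive program is an ordered list of condition–action rules: on every cycle, the durative action of the first rule whose condition holds in the current state is the one that runs. The sketch below shows a single selection step of such an interpreter; the state variables and actions are invented, and the paper's contribution is the Duration Calculus semantics and rely/guarantee rules built on top of this model, not this loop.

```python
# One evaluation cycle of a teleo-reactive program: pick the action of the
# first rule whose condition is true; a controller would call this repeatedly.
def tr_step(rules, state):
    for condition, action in rules:
        if condition(state):
            return action
    return None

rules = [
    (lambda s: s["at_goal"],        "idle"),
    (lambda s: s["obstacle_ahead"], "turn"),
    (lambda s: True,                "move_forward"),   # default, lowest priority
]

state = {"at_goal": False, "obstacle_ahead": True}
print(tr_step(rules, state))   # -> "turn"
```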