2,612 research outputs found
Whole-exome sequencing for finding de novo mutations in sporadic mental retardation
Recent work has used a family-based approach and whole-exome sequencing to identify de novo mutations in sporadic cases of mental retardation
An expanded evaluation of protein function prediction methods shows an improvement in accuracy
Background: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. Results: We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using the Gene Ontology and gene-disease associations using the Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1 with regard to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2. Conclusions: The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific, that different performance metrics can be used to probe the nature of accurate predictions, and that predictions in the biological process and human phenotype ontologies are relatively diverse. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent
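As a concrete illustration of the kind of protein-centric evaluation used in the CAFA challenges, the sketch below computes an Fmax-style score: at each decision threshold, precision is averaged over proteins with at least one prediction, recall over all benchmark proteins, and the maximum F1 over thresholds is reported. The data structures and the fixed threshold grid are illustrative assumptions; the official CAFA evaluation additionally propagates annotations through the ontology and distinguishes several evaluation modes.

```python
def fmax(predictions, truth, thresholds=None):
    """Protein-centric Fmax, a simplified sketch of a CAFA-style metric.

    predictions: {protein: {term: score in [0, 1]}}
    truth:       {protein: set of experimentally annotated terms}
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 101)]

    best = 0.0
    for t in thresholds:
        precisions, recalls = [], []
        for protein, true_terms in truth.items():
            pred_terms = {term for term, s in predictions.get(protein, {}).items() if s >= t}
            if pred_terms:  # precision is averaged only over proteins with predictions
                precisions.append(len(pred_terms & true_terms) / len(pred_terms))
            recalls.append(len(pred_terms & true_terms) / len(true_terms) if true_terms else 0.0)
        if not precisions:
            continue
        pr = sum(precisions) / len(precisions)
        rc = sum(recalls) / len(recalls)
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best

# Toy usage with made-up term identifiers
preds = {"P1": {"GO:0003674": 0.9, "GO:0005215": 0.4}}
truth = {"P1": {"GO:0005215"}, "P2": {"GO:0003674"}}
print(fmax(preds, truth))
```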
Phenotype ontologies and cross-species analysis for translational research
The use of model organisms as tools for the investigation of human genetic variation has significantly and rapidly advanced our understanding of the aetiologies underlying hereditary traits. However, while equivalences in the DNA sequence of two species may be readily inferred through evolutionary models, the identification of equivalence in the phenotypic consequences resulting from comparable genetic variation is far from straightforward, limiting the value of the modelling paradigm. In this review, we provide an overview of the emerging statistical and computational approaches to objectively identify phenotypic equivalence between human and model organisms with examples from the vertebrate models, mouse and zebrafish. Firstly, we discuss enrichment approaches, which deem the most frequent phenotype among the orthologues of a set of genes associated with a common human phenotype as the orthologous phenotype, or phenolog, in the model species. Secondly, we introduce and discuss computational reasoning approaches to identify phenotypic equivalences made possible through the development of intra- and interspecies ontologies. Finally, we consider the particular challenges involved in modelling neuropsychiatric disorders, which illustrate many of the remaining difficulties in developing comprehensive and unequivocal interspecies phenotype mappings
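To make the enrichment idea concrete, the toy sketch below follows the description above: given a set of human genes linked to one human phenotype, it maps them to model-organism orthologues and nominates the most frequent model phenotype among those orthologues as the candidate phenolog. The dictionaries and gene names are hypothetical placeholders; real analyses assess the overlap statistically (for example with a hypergeometric test) rather than relying on raw counts.

```python
from collections import Counter

def candidate_phenolog(human_genes, orthologue_map, model_phenotypes):
    """Nominate the most frequent model-organism phenotype among the
    orthologues of a human phenotype's gene set (a toy enrichment sketch).

    human_genes:      set of human genes linked to one human phenotype
    orthologue_map:   {human gene: model-organism orthologue}
    model_phenotypes: {model gene: set of model phenotype labels}
    """
    counts = Counter()
    for gene in human_genes:
        orthologue = orthologue_map.get(gene)
        if orthologue is None:
            continue
        counts.update(model_phenotypes.get(orthologue, set()))
    if not counts:
        return None, 0
    phenotype, frequency = counts.most_common(1)[0]
    return phenotype, frequency

# Hypothetical example: three human genes linked to a hearing-loss phenotype
human_genes = {"GENE_A", "GENE_B", "GENE_C"}
orthologue_map = {"GENE_A": "gene_a", "GENE_B": "gene_b", "GENE_C": "gene_c"}
model_phenotypes = {
    "gene_a": {"abnormal cochlear hair cells", "circling behaviour"},
    "gene_b": {"abnormal cochlear hair cells"},
    "gene_c": {"reduced body size"},
}
print(candidate_phenolog(human_genes, orthologue_map, model_phenotypes))
```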
HotSwap for bioinformatics: A STRAP tutorial
BACKGROUND: Bioinformatics applications are now routinely used to analyze large amounts of data. Application development often requires many cycles of optimization, compiling, and testing. Repeatedly loading large datasets can significantly slow down the development process. We have incorporated HotSwap functionality into the protein workbench STRAP, allowing developers to create plugins using the Java HotSwap technique. RESULTS: Users can load multiple protein sequences or structures into the main STRAP user interface, and simultaneously develop plugins using an editor of their choice such as Emacs. Saving changes to the Java file causes STRAP to recompile the plugin and automatically update its user interface without requiring recompilation of STRAP or reloading of protein data. This article presents a tutorial on how to develop HotSwap plugins. STRAP is available at and . CONCLUSION: HotSwap is a useful and time-saving technique for bioinformatics developers. HotSwap can be used to efficiently develop bioinformatics applications that require loading large amounts of data into memory
FABIAN-variant: predicting the effects of DNA variants on transcription factor binding
While great advances in predicting the effects of coding variants have been made, the assessment of non-coding variants remains challenging. This is especially problematic for variants within promoter regions, which can lead to over-expression of a gene or reduce, or even abolish, its expression. The binding of transcription factors to the DNA can be predicted using position weight matrices (PWMs). More recently, transcription factor flexible models (TFFMs) have been introduced and shown to be more accurate than PWMs. TFFMs are based on hidden Markov models and can account for complex positional dependencies. Our new web-based application FABIAN-variant uses 1224 TFFMs and 3790 PWMs to predict whether and to what degree DNA variants affect the binding of 1387 different human transcription factors. For each variant and transcription factor, the software combines the results of different models into a final prediction of the resulting binding-affinity change. The software is written in C++ for speed, but variants can be entered through a web interface. Alternatively, a VCF file can be uploaded to assess variants identified by high-throughput sequencing. The search can be restricted to variants in the vicinity of candidate genes. FABIAN-variant is freely available at https://www.genecascade.org/fabian/
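As a rough illustration of the PWM side of this kind of analysis (the TFFM component and FABIAN-variant's model combination are not reproduced here), the sketch below scores a reference and an alternative sequence against a toy log-odds position weight matrix and reports the best-window score difference as a crude proxy for a binding-affinity change. The matrix values and sequences are invented for the example.

```python
# Toy log-odds PWM: one dict per motif position, scores for A/C/G/T (invented values)
PWM = [
    {"A": 1.2, "C": -0.8, "G": -0.5, "T": -1.0},
    {"A": -0.9, "C": 1.1, "G": -0.7, "T": -0.6},
    {"A": -1.0, "C": -0.4, "G": 1.3, "T": -0.9},
    {"A": 1.0, "C": -0.6, "G": -0.8, "T": -0.2},
]

def best_pwm_score(sequence, pwm=PWM):
    """Return the best log-odds score over all windows of the PWM's length."""
    width = len(pwm)
    scores = [
        sum(pwm[i][sequence[start + i]] for i in range(width))
        for start in range(len(sequence) - width + 1)
    ]
    return max(scores)

def binding_change(ref_seq, alt_seq):
    """Difference in best PWM score between alternative and reference alleles;
    a negative value suggests weakened binding, a positive value stronger binding."""
    return best_pwm_score(alt_seq) - best_pwm_score(ref_seq)

# Hypothetical promoter variant: a single substitution within the motif
print(binding_change("TTACGATT", "TTACTATT"))
```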
Prediction of Human Phenotype Ontology terms by means of hierarchical ensemble methods
Background: The prediction of human gene–abnormal phenotype associations is a fundamental step toward the discovery of novel genes associated with human disorders, especially when no genes are known to be associated with a specific disease. In this context the Human Phenotype Ontology (HPO) provides a standard categorization of the abnormalities associated with human diseases. While the problem of predicting gene–disease associations has been widely investigated, the related problem of predicting gene–phenotypic feature (i.e., HPO term) associations has been largely overlooked, even though no HPO term associations are known for most human genes and despite the increasing application of the HPO to relevant medical problems. Moreover, most of the methods proposed in the literature are not able to capture the hierarchical relationships between HPO terms, resulting in inconsistent and relatively inaccurate predictions. Results: We present two hierarchical ensemble methods that we formally prove to provide biologically consistent predictions according to the hierarchical structure of the HPO. The modular structure of the proposed methods, which consists of a “flat” learning first step and a hierarchical combination of the predictions in the second step, allows the predictions of virtually any flat learning method to be enhanced. The experimental results show that hierarchical ensemble methods are able to predict novel associations between genes and abnormal phenotypes with results that are competitive with state-of-the-art algorithms and with a significant reduction in computational complexity. Conclusions: Hierarchical ensembles are efficient computational methods that guarantee biologically meaningful predictions that obey the true path rule, and they can be used as a tool to improve and make consistent the HPO term predictions obtained from virtually any flat learning method. The implementation of the proposed methods is available as an R package from the CRAN repository
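The core consistency requirement mentioned above, the true path rule, says that if a gene is predicted to be associated with an HPO term, it must also be associated with all of that term's ancestors. One simple post-processing step that enforces this, sketched below on a toy DAG, is a bottom-up pass that raises each ancestor's score to at least the maximum of its descendants' scores; the paper's ensemble methods are more sophisticated than this, so treat the code purely as an illustration of hierarchical consistency.

```python
def enforce_true_path_rule(scores, parents):
    """Make flat per-term scores hierarchically consistent on a DAG by
    propagating each term's score up to all of its ancestors (bottom-up max).

    scores:  {term: flat prediction score in [0, 1]}
    parents: {term: set of parent terms}  (the ontology DAG)
    """
    consistent = dict(scores)

    def ancestors(term, seen=None):
        seen = set() if seen is None else seen
        for p in parents.get(term, set()):
            if p not in seen:
                seen.add(p)
                ancestors(p, seen)
        return seen

    for term, score in scores.items():
        for anc in ancestors(term):
            consistent[anc] = max(consistent.get(anc, 0.0), score)
    return consistent

# Toy HPO-like hierarchy: HP:C is a child of HP:B, which is a child of HP:A
parents = {"HP:C": {"HP:B"}, "HP:B": {"HP:A"}, "HP:A": set()}
flat = {"HP:A": 0.2, "HP:B": 0.1, "HP:C": 0.7}   # inconsistent: child outscores ancestors
print(enforce_true_path_rule(flat, parents))      # ancestors raised to at least 0.7
```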
The rare bone disorders use case
Background: Ontologies have lately become a fundamental building block in the process of formalising and storing complex biomedical information. The community-driven ontology curation process, however, ignores the possibility of multiple communities building, in parallel, conceptualisations of the same domain and thus providing slightly different perspectives on the same knowledge. The individual nature of this effort leads to the need for a mechanism that enables us to create an overarching and comprehensive overview of the different perspectives on the domain knowledge. Results: We introduce an approach that enables the loose integration of knowledge emerging from diverse sources under a single coherent interoperable resource. To accurately track the original knowledge statements, we record provenance at a very granular level. We exemplify the approach in the rare bone disorders domain by proposing the Rare Bone Disorders Ontology (RBDO). Using RBDO, researchers can answer queries such as “What phenotypes describe a particular disorder and are common to all sources?” and can explore similarities between disorders based on the divergent groupings (classifications) provided by the underlying sources
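The first example query quoted above (phenotypes common to all sources for a disorder) amounts to intersecting per-source phenotype sets while keeping track of which source asserted each statement. The sketch below uses a hypothetical in-memory structure rather than RBDO itself, just to show the shape of such a query.

```python
# Hypothetical provenance-aware store: disorder -> source -> set of phenotype labels
disorder_phenotypes = {
    "achondroplasia": {
        "source_A": {"short stature", "macrocephaly", "frontal bossing"},
        "source_B": {"short stature", "macrocephaly", "trident hand"},
    },
}

def phenotypes_common_to_all_sources(disorder, store=disorder_phenotypes):
    """Phenotypes asserted for a disorder by every underlying source."""
    per_source = list(store.get(disorder, {}).values())
    if not per_source:
        return set()
    return set.intersection(*per_source)

print(phenotypes_common_to_all_sources("achondroplasia"))
# {'short stature', 'macrocephaly'}
```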
The extinct, giant giraffid Sivatherium giganteum: skeletal reconstruction and body mass estimation
Sivatherium giganteum is an extinct giraffid from the Plio–Pleistocene boundary of the Himalayan foothills. To date, there has been no rigorous skeletal reconstruction of this unusual mammal. Historical and contemporary accounts anecdotally state that Sivatherium rivalled the African elephant in body mass, but this statement has never been tested. Here, we present a three-dimensional composite skeletal reconstruction and calculate a representative body mass estimate for this species using a volumetric method. We find that the estimated adult body mass of 1246 kg (range 857–1812 kg) does not approach that of an African elephant, but confirms that Sivatherium was certainly a large giraffid, and may have been the largest ruminant mammal that has ever existed. We contrast this volumetric estimate with a bivariate scaling estimate derived from Sivatherium's humeral circumference and find that there is a discrepancy between the two. The difference implies that the humeral circumference of Sivatherium is greater than expected for an animal of this size, and we speculate that this may be linked to a cranial shift in its centre of mass
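The comparison described above can be made concrete with a generic allometric relation of the form log10(mass) = a·log10(circumference) + b, fitted on extant quadrupeds. The coefficients below are placeholders, not the regression actually used in the study; the point of the sketch is only to show how such an equation is applied and inverted to ask what humeral circumference would be expected for the volumetric mass estimate.

```python
import math

# Placeholder allometric coefficients (illustrative only, NOT the study's values):
# log10(body mass in kg) = A * log10(humeral circumference in mm) + B
A, B = 2.5, -3.2

def mass_from_circumference(circumference_mm):
    """Bivariate scaling estimate of body mass (kg) from humeral circumference (mm)."""
    return 10 ** (A * math.log10(circumference_mm) + B)

def expected_circumference(mass_kg):
    """Invert the scaling relation: circumference (mm) expected for a given mass (kg)."""
    return 10 ** ((math.log10(mass_kg) - B) / A)

volumetric_estimate_kg = 1246  # the study's volumetric body mass estimate
print(expected_circumference(volumetric_estimate_kg))
# If the measured humeral circumference exceeds this expected value, the scaling
# estimate comes out above the volumetric one, which is the discrepancy discussed above.
```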
Reasoning about goal-directed real-time teleo-reactive programs
The teleo-reactive programming model is a high-level approach to developing real-time systems that supports hierarchical composition and durative actions. The model is different from frameworks such as action systems, timed automata, and TLA+, and allows programs to be more compact and descriptive of their intended behaviour. Teleo-reactive programs are particularly useful for implementing controllers for autonomous agents that must react robustly to their dynamically changing environments. In this paper, we develop a real-time logic that is based on Duration Calculus and use this logic to formalise the semantics of teleo-reactive programs. We develop rely/guarantee rules that facilitate reasoning about a program and its environment in a compositional manner. We present several theorems for simplifying proofs of teleo-reactive programs and a partially mechanised method for proving progress properties of goal-directed agents. © 2013 British Computer Society
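For readers unfamiliar with the model, a teleo-reactive program is essentially an ordered list of condition/action rules: the conditions are re-evaluated continuously and the (durative) action of the first rule whose condition holds is the one executed. The minimal interpreter below illustrates that evaluation scheme in ordinary sequential code; it is an informal sketch of the classic model, not the timed semantics or rely/guarantee reasoning developed in the paper, and all names in it are invented for the example.

```python
from typing import Callable, List, Tuple

# A teleo-reactive program: ordered (condition, durative action) pairs.
Rule = Tuple[Callable[[dict], bool], Callable[[dict], None]]

def tr_step(program: List[Rule], state: dict) -> None:
    """One evaluation cycle: fire the action of the first rule whose condition holds."""
    for condition, action in program:
        if condition(state):
            action(state)
            return

# Toy goal-directed example: move an agent toward a target position.
def at_target(s): return s["pos"] == s["target"]
def idle(s): pass                                 # goal achieved, nothing to do
def step_closer(s): s["pos"] += 1 if s["pos"] < s["target"] else -1

program: List[Rule] = [
    (at_target, idle),              # highest-priority rule: the goal condition
    (lambda s: True, step_closer),  # default durative action: keep moving
]

state = {"pos": 0, "target": 3}
for _ in range(5):                  # in a real controller this loop runs continuously
    tr_step(program, state)
print(state["pos"])                 # 3: the goal condition now holds on every cycle
```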