268 research outputs found

    Integrating protein-protein interactions and text mining for protein function prediction

    Background: Functional annotation of proteins remains a challenging task. Currently the scientific literature serves as the main source for yet uncurated functional annotations, but curation work is slow and expensive. Automatic techniques that support this work still lack reliability. We developed a method to identify conserved protein interaction graphs and to predict missing protein functions from orthologs in these graphs. To enhance the precision of the results, we furthermore implemented a procedure that validates all predictions based on findings reported in the literature. Results: Using this procedure, more than 80% of the GO annotations for proteins with highly conserved orthologs that are available in UniProtKB/Swiss-Prot could be verified automatically. For a subset of proteins we predicted new GO annotations that were not available in UniProtKB/Swiss-Prot. All predictions were correct (100% precision) according to the verifications from a trained curator. Conclusion: Our method of integrating CCSs and literature mining is thus a highly reliable approach to predict GO annotations for weakly characterized proteins with orthologs.

    A realistic assessment of methods for extracting gene/protein interactions from free text

    Background: The automated extraction of gene and/or protein interactions from the literature is one of the most important targets of biomedical text mining research. In this paper we present a realistic evaluation of gene/protein interaction mining relevant to potential non-specialist users. Hence we have specifically avoided methods that are complex to install or require reimplementation, and we coupled our chosen extraction methods with a state-of-the-art biomedical named entity tagger. Results: Our results show that performance across different evaluation corpora is extremely variable; that the use of tagged (as opposed to gold standard) gene and protein names has a significant impact on performance, with a drop in F-score of over 20 percentage points being commonplace; and that a simple keyword-based benchmark algorithm, when coupled with a named entity tagger, outperforms two of the tools most widely used to extract gene/protein interactions. Conclusion: In terms of availability, ease of use and performance, the potential non-specialist user community interested in automatically extracting gene and/or protein interactions from free text is poorly served by current tools and systems. The public release of extraction tools that are easy to install and use, and that achieve state-of-the-art levels of performance, should be treated as a high priority by the biomedical text mining community.
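    As a rough illustration of what a keyword-based benchmark and an F-score evaluation could look like, here is a minimal sketch. It is not the algorithm evaluated in the paper: the keyword list, the entity names (GeneA, GeneB, GeneC) and the functions predict_pairs and f_score are all hypothetical, and the entity list stands in for the output of a named entity tagger.

```python
from itertools import combinations

# Hypothetical interaction keywords; a real benchmark would use a curated list.
KEYWORDS = {"binds", "interacts", "phosphorylates", "activates"}


def predict_pairs(sentence, entities):
    """Predict an interaction for every pair of entities that co-occur
    in a sentence containing at least one interaction keyword."""
    tokens = set(sentence.lower().replace(".", " ").replace(",", " ").split())
    if not tokens & KEYWORDS:
        return set()
    present = sorted(e for e in entities if e.lower() in tokens)
    return {frozenset(pair) for pair in combinations(present, 2)}


def f_score(predicted, gold):
    """Balanced F-score (harmonic mean of precision and recall)."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy sentence with assumed (not real) entity names.
sentence = "GeneA binds GeneB and GeneC."
entities = ["GeneA", "GeneB", "GeneC"]
gold = {frozenset({"GeneA", "GeneB"})}

predicted = predict_pairs(sentence, entities)
score = f_score(predicted, gold)
```

    In this toy case the baseline proposes all three co-occurring pairs but only one is in the gold set, so recall is high and precision is low. Removing an entity from the list, as an imperfect tagger might, changes the predicted pairs and therefore the score.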

    Stability of sub-surface oxygen at Rh(111)

    Using density-functional theory (DFT) we investigate the incorporation of oxygen directly below the Rh(111) surface. We show that oxygen incorporation will only commence after near completion of a dense O adlayer (θ_tot = 1.0 monolayer) with O in the fcc on-surface sites. The experimentally suggested octahedral sub-surface site occupancy, inducing a site-switch of the on-surface species from fcc to hcp sites, is indeed found to be a rather low energy structure. Our results indicate that at even higher coverages oxygen incorporation is followed by oxygen agglomeration in two-dimensional sub-surface islands directly below the first metal layer. Inside these islands, the metastable hcp/octahedral (on-surface/sub-surface) site combination will undergo a barrierless displacement, introducing a stacking fault of the first metal layer with respect to the underlying substrate and leading to a stable fcc/tetrahedral site occupation. We suggest that these elementary steps, namely, oxygen incorporation, aggregation into sub-surface islands and destabilization of the metal surface, may be more general and precede the formation of a surface oxide at close-packed late transition metal surfaces. Comment: 9 pages including 9 figure files. Submitted to Phys. Rev. B. Related publications can be found at http://www.fhi-berlin.mpg.de/th/paper.htm

    Text mining for biology - the way forward: opinions from leading scientists

    This article collects opinions from leading scientists about how text mining can provide better access to the biological literature, how the scientific community can help with this process, what the next steps are, and what role future BioCreative evaluations can play. The responses identify several broad themes, including the possibility of fusing literature and biological databases through text mining; the need for user interfaces tailored to different classes of users and supporting community-based annotation; the importance of scaling text mining technology and inserting it into larger workflows; and suggestions for additional challenge evaluations, new applications, and additional resources needed to make progress.

    Effect of ion bombardment on the formation of surface layers during nitriding in hydrogen-free gas-discharge plasma

    This work presents the results of a study of the hardening of steel samples in the hydrogen-free plasma of a low-voltage gas discharge in vacuum. In contrast to the glow discharge, which is widely used in industrial ion-plasma nitriding, this type of gas discharge operates at low pressure, so sputtering of the surface of the treated parts is pronounced. This should definitely lead to an intensification of diffusion processes. The paper describes the parameters of the equipment and provides experimental data on the roughness, microhardness and surface structure of alloy steels. Special attention is paid to the nitriding of hardened steels in this type of discharge in the tempering temperature range.

    Cluster analysis of protein array results via similarity of Gene Ontology annotation

    BACKGROUND: With the advent of high-throughput proteomic experiments such as arrays of purified proteins comes the need to analyse sets of proteins as an ensemble, as opposed to the traditional one-protein-at-a-time approach. Although there are several publicly available tools that facilitate the analysis of protein sets, they do not display integrated results in an easily-interpreted image or do not allow the user to specify the proteins to be analysed. RESULTS: We developed a novel computational approach to analyse the annotation of sets of molecules. As proof of principle, we analysed two sets of proteins identified in published protein array screens. The distance between any two proteins was measured as the graph similarity between their Gene Ontology (GO) annotations. These distances were then clustered to highlight subsets of proteins sharing related GO annotation. In the first set of proteins found to bind small molecule inhibitors of rapamycin, we identified three subsets containing four or five proteins each that may help to elucidate how rapamycin affects cell growth whereas the original authors chose only one novel protein from the array results for further study. In a set of phosphoinositide-binding proteins, we identified subsets of proteins associated with different intracellular structures that were not highlighted by the analysis performed in the original publication. CONCLUSION: By determining the distances between annotations, our methodology reveals trends and enrichment of proteins of particular functions within high-throughput datasets at a higher sensitivity than perusal of end-point annotations. In an era of increasingly complex datasets, such tools will help in the formulation of new, testable hypotheses from high-throughput experimental data
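    The idea of measuring distances between proteins from their GO annotations and then clustering those distances can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the graph similarity measure or the clustering algorithm, so the sketch substitutes a Jaccard distance over ancestor sets and a simple threshold-based single-linkage grouping. The toy ontology, the protein names and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy ontology: each term maps to its parent terms (a DAG).
TOY_GO = {
    "root": [],
    "binding": ["root"],
    "protein_binding": ["binding"],
    "lipid_binding": ["binding"],
    "catalysis": ["root"],
    "kinase": ["catalysis"],
}


def ancestors(term, dag):
    """Return the term together with all of its ancestors in the DAG."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(dag[current])
    return seen


def annotation_closure(terms, dag):
    """Union of the ancestor sets of all terms annotated to one protein."""
    closure = set()
    for term in terms:
        closure |= ancestors(term, dag)
    return closure


def distance(terms_a, terms_b, dag):
    """Jaccard distance between annotation closures (an assumed stand-in
    for the graph similarity measure mentioned in the abstract)."""
    a = annotation_closure(terms_a, dag)
    b = annotation_closure(terms_b, dag)
    if not a | b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def cluster(annotations, dag, threshold):
    """Group proteins by linking any pair closer than the threshold."""
    parent = {protein: protein for protein in annotations}

    def find(protein):
        while parent[protein] != protein:
            parent[protein] = parent[parent[protein]]
            protein = parent[protein]
        return protein

    for p, q in combinations(annotations, 2):
        if distance(annotations[p], annotations[q], dag) < threshold:
            parent[find(p)] = find(q)

    groups = {}
    for protein in annotations:
        groups.setdefault(find(protein), set()).add(protein)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical proteins and annotations.
PROTEINS = {
    "P1": ["protein_binding"],
    "P2": ["lipid_binding"],
    "P3": ["kinase"],
}

clusters = cluster(PROTEINS, TOY_GO, threshold=0.6)
```

    Here P1 and P2 share the "binding" branch and fall into one group, while P3 shares only the root term with them and stays separate. Comparing ancestor sets rather than end-point terms is what lets related but non-identical annotations count as similar.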