The Autoimmune Disease Database: a dynamically compiled literature-derived database
BACKGROUND: Autoimmune diseases are disorders caused by an immune response directed against the body's own organs, tissues and cells. In practice, more than 80 clinically distinct diseases, among them systemic lupus erythematosus and rheumatoid arthritis, are classified as autoimmune diseases. Although their etiology is unclear, these diseases share certain similarities at the molecular level, such as susceptibility regions on the chromosomes or the involvement of common genes. A manual literature review is not a feasible way to gain an overview of these related diseases; it requires automated analysis of the more than 500,000 Medline documents related to autoimmune disorders. RESULTS: In this paper we present the first version of the Autoimmune Disease Database, which to our knowledge is the first comprehensive literature-based database covering all known or suspected autoimmune diseases. This dynamically compiled database allows researchers to link autoimmune diseases to candidate genes or proteins through named entity recognition, which identifies genes/proteins in the corresponding Medline abstracts. The Autoimmune Disease Database covers 103 autoimmune disease concepts. This list was expanded to include synonyms and spelling variants, yielding a list of over 1,200 disease names. The current version of the database provides links to 541,690 abstracts and over 5,000 unique genes/proteins. CONCLUSION: The Autoimmune Disease Database provides the researcher with a tool to navigate potential gene-disease relationships in Medline abstracts in the context of autoimmune diseases.
ProMiner: rule-based protein and gene entity recognition
doi:10.1186/1471-2105-6-S1-S14. Supplement: A critical assessment of text mining methods in molecular biology. Editors: Christian Blaschke, Lynette Hirschman, Alfonso Valencia, Alexander Yeh. Report. Background: Identification of gene and protein names in biomedical text is a challenging task as the corresponding nomenclature has evolved over time. This has led to multiple synonyms for individual genes and proteins, as well as names that may be ambiguous with other gene names or with general English words. The Gene List Task of the BioCreAtIvE challenge evaluation enables comparison of systems addressing the problem of protein and gene name identification on common benchmark data. Methods: The ProMiner system uses a pre-processed synonym dictionary to identify potential name occurrences in the biomedical text and associate protein and gene database identifiers with the detected matches. It follows a rule-based approach and its search algorithm is geared towards recognition of multi-word names [1]. To account for the large number of ambiguous synonyms in the considered organisms, the system has been extended to use specific variants of the detection procedure for highly ambiguous and case-sensitive synonyms. Based on all detected synonyms fo
Abstracts versus Full Texts and Patents: A Quantitative Analysis of Biomedical Entities
Müller B, Klinger R, Gurulingappa H, et al. Abstracts versus Full Texts and Patents: A Quantitative Analysis of Biomedical Entities. In: Proceedings of the 1st IRF Conference. Lecture Notes in Computer Science. Berlin, Heidelberg: Springer; 2010
For each result, the F-measure as determined from the published gold-standards is given in brackets
Copyright information: Taken from "ProMiner: rule-based protein and gene entity recognition". BMC Bioinformatics 2005;6(Suppl 1):S14. Published online 24 May 2005. PMCID: PMC1869006. Details on this figure are provided in the section
Both candidates are wrong matches because the significant token "receptor" is present in the text
Naive matching would accept both candidates
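The contrast drawn in the two figure captions above, between naive dictionary matching and matching that takes a neighbouring significant token into account, can be illustrated with a minimal sketch. This is not the ProMiner implementation: the synonym dictionary, the identifiers (`ID:0001`, `ID:0002`), the list of significant tokens, and the function names are all hypothetical, and the rejection rule is one simple assumption about how such a check could work.

```python
import re

# Hypothetical synonym dictionary: lower-cased token tuples -> made-up identifiers.
SYNONYMS = {
    ("interleukin", "2"): "ID:0001",
    ("il", "2"): "ID:0001",
    ("interleukin", "2", "receptor", "alpha"): "ID:0002",
}

# Assumed set of tokens that change the meaning of a name when they follow it.
SIGNIFICANT_TOKENS = {"receptor", "ligand", "inhibitor"}

MAX_LEN = max(len(key) for key in SYNONYMS)


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def find_matches(text, check_context=True):
    """Return (synonym, identifier) pairs found by greedy longest-match lookup.

    With check_context=False this is naive matching. With check_context=True
    a candidate is rejected when the token directly after it is a significant
    token that the matched synonym does not cover.
    """
    tokens = tokenize(text)
    matches = []
    i = 0
    while i < len(tokens):
        hit = None
        # Try the longest possible multi-word name first.
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + n])
            if candidate in SYNONYMS:
                hit = candidate
                break
        if hit is None:
            i += 1
            continue
        end = i + len(hit)
        following = tokens[end] if end < len(tokens) else None
        if not (check_context and following in SIGNIFICANT_TOKENS):
            matches.append((" ".join(hit), SYNONYMS[hit]))
        i = end
    return matches


if __name__ == "__main__":
    sentence = "Expression of the interleukin 2 receptor and the IL-2 receptor was measured."
    print(find_matches(sentence, check_context=False))  # naive: accepts both candidates
    print(find_matches(sentence, check_context=True))   # context check: rejects both
```

In the example sentence, naive matching accepts both "interleukin 2" and "IL-2", although each is followed by "receptor" and therefore refers to a different entity; the context check discards both candidates.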
Public microarray repository semantic annotation with ontologies employing text mining and expression profile correlation
Identifying Gene Specific Variations in Biomedical Text
Klinger R, Furlong LI, Friedrich CM, et al. Identifying Gene Specific Variations in Biomedical Text. Journal of Bioinformatics and Computational Biology. 2007;5(06):1277-1296
HuPSON: The human physiology simulation ontology
BACKGROUND: Large biomedical simulation initiatives, such as the Virtual Physiological Human (VPH), are substantially dependent on controlled vocabularies to facilitate the exchange of information, of data and of models. Hindering these initiatives is a lack of a comprehensive ontology that covers the essential concepts of the simulation domain. RESULTS: We propose a first version of a newly constructed ontology, HuPSON, as a basis for shared semantics and interoperability of simulations, of models, of algorithms and of other resources in this domain. The ontology is based on the Basic Formal Ontology and adheres to the MIREOT principles; the constructed ontology has been evaluated via structural features, competency questions and use case scenarios. The ontology is freely available at: http://www.scai.fraunhofer.de/en/business-research-areas/bioinformatics/downloads.html (owl files) and http://bishop.scai.fraunhofer.de/scaiview/ (browser). CONCLUSIONS: HuPSON provides a framework for a) annotating simulation experiments, b) retrieving relevant information that is required for modelling, c) enabling interoperability of algorithmic approaches used in biomedical simulation, d) comparing simulation results and e) linking knowledge-based approaches to simulation-based approaches. It is meant to foster a more rapid uptake of semantic technologies in the modelling and simulation domain, with particular focus on the VPH domain.