Writing clinical practice guidelines in controlled natural language
Clinicians could benefit from decision support systems incorporating the knowledge contained in clinical practice guidelines. However, the unstructured form of these guidelines makes them unsuitable for formal representation. To address this challenge we translated a complete set of pediatric guideline recommendations into Attempto Controlled English (ACE). One experienced pediatrician, one physician and a knowledge engineer assessed that a suitably extended version of ACE can accurately and naturally represent the clinical concepts and the proposed actions of the guidelines. Currently, we are developing a systematic and replicable approach to authoring guideline recommendations in ACE
OntoGene in BioCreative II
BACKGROUND: Research scientists and companies working in the domains of biomedicine and genomics are increasingly faced with the problem of efficiently locating, within the vast body of published scientific findings, the critical pieces of information that are needed to direct current and future research investment. RESULTS: In this report we describe approaches taken within the scope of the second BioCreative competition in order to solve two aspects of this problem: detection of novel protein interactions reported in scientific articles, and detection of the experimental method that was used to confirm the interaction. Our approach to the former problem is based on a high-recall protein annotation step, followed by two strict disambiguation steps. The remaining proteins are then combined according to a number of lexico-syntactic filters, which deliver high-precision results while maintaining reasonable recall. The detection of the experimental methods is tackled by a pattern matching approach, which has delivered the best results in the official BioCreative evaluation. CONCLUSION: Although the results of BioCreative clearly show that no tool is sufficiently reliable for fully automated annotations, a few of the proposed approaches (including our own) already perform at a competitive level. This makes them interesting either as standalone tools for preliminary document inspection, or as modules within an environment aimed at supporting the process of curation of biomedical literature
An environment for relation mining over richly annotated corpora: the case of GENIA
BACKGROUND: The biomedical domain is witnessing a rapid growth of the amount of published scientific results, which makes it increasingly difficult to filter the core information. There is a real need for support tools that 'digest' the published results and extract the most important information. RESULTS: We describe and evaluate an environment supporting the extraction of domain-specific relations, such as protein-protein interactions, from a richly-annotated corpus. We use full, deep-linguistic parsing and manually created, versatile patterns, expressing a large set of syntactic alternations, plus semantic ontology information. CONCLUSION: The experiments show that our approach is capable of delivering high-precision results, while maintaining sufficient levels of recall. The high level of abstraction of the rules used by the system, which are considerably more powerful and versatile than finite-state approaches, allows speedy interactive development and validation
The role of technical Terminology in Question Answering
Terminology is arguably the most vital linguistic unit of technical documentation. Characterising the content of documents by the terminology they contain is a key factor in satisfactory document retrieval. But when users require answers rather than documents, more complex strategies for exploiting terminology are needed. Dealing effectively with this problem requires not only good techniques for terminology extraction but also ways to organize and structure the terminology. We describe some potential solutions to this problem, taking a Question Answering system as an example. We show which benefits our techniques bring to the system
Exploiting language resources for semantic web annotations
A large portion of the useful information on the web is in the form of unstructured natural language documents. Currently such documents are understandable to humans but not to software agents. One of the goals of the Semantic Web activity is to enrich a considerable number of web documents with annotations, which will then allow new generation search engines and novel web services to access those documents in a more intelligent fashion than currently possible. Currently the most reliable method of providing such semantic markup is via manual annotation, possibly based on predefined ontologies and with the support of specialized editors. In this paper we propose an approach for the automatic processing of textual documents to be published on the web, which can be used to automatically generate (some of) the semantic annotations. In particular, we focus on detecting the entities mentioned in the documents, their roles and relationships to other entities