2,244 research outputs found

    Building a semantically annotated corpus of clinical texts

    In this paper, we describe the construction of a semantically annotated corpus of clinical texts for use in the development and evaluation of systems for automatically extracting clinically significant information from the textual component of patient records. The paper details the sampling of textual material from a collection of 20,000 cancer patient records, the development of a semantic annotation scheme, the annotation methodology, the distribution of annotations in the final corpus, and the use of the corpus for development of an adaptive information extraction system. The resulting corpus is the most richly semantically annotated resource for clinical text processing built to date, whose value has been demonstrated through its use in developing an effective information extraction system. The detailed presentation of our corpus construction and annotation methodology will be of value to others seeking to build high-quality semantically annotated corpora in biomedical domains

    Improving Term Extraction with Terminological Resources

    Studies of different term extractors on a corpus of the biomedical domain revealed decreasing performances when applied to highly technical texts. The difficulty or impossibility of customising them to new domains is an additional limitation. In this paper, we propose to use external terminologies to influence generic linguistic data in order to augment the quality of the extraction. The tool we implemented exploits testified terms at different steps of the process: chunking, parsing and extraction of term candidates. Experiments reported here show that, using this method, more term candidates can be acquired with a higher level of reliability. We further describe the extraction process involving endogenous disambiguation implemented in the term extractor YaTeA

    The use of Natural Language Processing techniques to support Health Literacy: an evidence-based review

    Background and objectives: To conduct a literature search and analysis of the existing research using natural language processing for improving or helping health literacy, as well as to discuss the importance and potentials of addressing both fields in a joint manner. This review targets researchers who are unfamiliar with natural language processing in the field of health literacy, and in general, any researcher, regardless of his or her background, interested in multi-disciplinary research involving technology and health care. Methods: We introduce the concepts of health literacy and natural language processing. Then, a thorough search is performed using relevant databases and well-defined criteria. We review the existing literature addressing these topics, both in an independent and joint manner, and provide an overview of the state of the art using natural language processing in health literacy. We additionally discuss how the different issues in health literacy that are related to the comprehension of specialised health texts can be improved using natural language processing techniques, and the challenges involved in these processes. Results: The search process yielded 235 potentially relevant references, 49 of which fully fulfilled the established search criteria, and therefore they were later analysed in more detail. These articles were clustered into groups with respect to their purpose, and most of them were focused on the development of specific natural language processing modules, such as question answering, information retrieval, text simplification or natural language generation in order to facilitate the understanding of health information. This research work has been partially funded by the University of Alicante, Generalitat Valenciana, Spanish Government and the European Commission through the projects, "Tratamiento inteligente de la informacion para la ayuda a la toma de decisiones" (GRE12-44), "Explotacion y tratamiento de la informacion disponible en Internet para la anotacion y generacion de textos adaptados al usuario" (GRE13-15), DIIM2.0 (PROMETEOII/2014/001), ATTOS (TIN2012-38536-C03-03), LEGOLANG-UAGE (TIN2012-31224), SAM (FP7-611312), and FIRST (FP7-287607)

    Annotating patient clinical records with syntactic chunks and named entities: the Harvey corpus

    The free text notes typed by physicians during patient consultations contain valuable information for the study of disease and treatment. These notes are difficult to process by existing natural language analysis tools since they are highly telegraphic (omitting many words), and contain many spelling mistakes, inconsistencies in punctuation, and non-standard word order. To support information extraction and classification tasks over such text, we describe a de-identified corpus of free text notes, a shallow syntactic and named entity annotation scheme for this kind of text, and an approach to training domain specialists with no linguistic background to annotate the text. Finally, we present a statistical chunking system for such clinical text with a stable learning rate and good accuracy, indicating that the manual annotation is consistent and that the annotation scheme is tractable for machine learning

    A roadmap toward the automatic composition of systematic literature reviews

    Objective.  This paper presents an overview of existing artificial intelligence tools to produce systematic literature reviews. Furthermore, we propose a general framework resulting from combining these techniques to highlight the challenges and possibilities currently existing in this research area. Design/Methodology/Approach. We undertook a scoping review on the systematic literature review steps to automate them via computational techniques. Results/Discussion. The process of creating a literature review is both creative and technical. The technical part of this process is liable to automation. Based on the literature, we chose to divide this technical part into four steps: searching, screening, extraction, and synthesis. For each one of these steps, we presented practical artificial intelligence techniques to carry them out. In addition, we presented the obstacles encountered in the application of each technique. Conclusion. We proposed a framework for automatically creating systematic literature reviews by combining and placing existing techniques in stages where they possess the greatest potential to be useful. Despite still lacking practical assessment in different areas of knowledge, this proposal indicates ways with the potential to reduce the time-consuming and repetitive work embedded in the systematic literature review process. Originality/Value. The paper presents the current possibilities for automating systematic literature reviews and how they can work together to reduce researchers’ operational workload