11,078 research outputs found

    Building a semantically annotated corpus of clinical texts

    In this paper, we describe the construction of a semantically annotated corpus of clinical texts for use in the development and evaluation of systems for automatically extracting clinically significant information from the textual component of patient records. The paper details the sampling of textual material from a collection of 20,000 cancer patient records, the development of a semantic annotation scheme, the annotation methodology, the distribution of annotations in the final corpus, and the use of the corpus for development of an adaptive information extraction system. The resulting corpus is the most richly semantically annotated resource for clinical text processing built to date, and its value has been demonstrated through its use in developing an effective information extraction system. The detailed presentation of our corpus construction and annotation methodology will be of value to others seeking to build high-quality semantically annotated corpora in biomedical domains.

    Report on the EHCR (Deliverable 26.2)

    This deliverable is the second for Workpackage 26. The first, submitted after Month 12, summarised the areas of research that the partners had identified as being relevant to the semantic indexing of the EHR. This second deliverable reports progress on the key threads of work that the partners identified during the project as contributing towards semantically interoperable and processable EHRs. It provides a set of short summaries on key topics that have emerged as important, and to which the partners are able to make strong contributions. Some of these topics are also being extended via two new EU Framework 6 proposals that include WP26 partners, which is itself a measure of the success of this Network of Excellence.

    Implementing a Portable Clinical NLP System with a Common Data Model - a Lisp Perspective

    This paper presents a Lisp architecture for a portable NLP system, termed LAPNLP, for processing clinical notes. LAPNLP integrates multiple standard, customized and in-house developed NLP tools. Our system facilitates portability across different institutions and data systems by incorporating an enriched Common Data Model (CDM) to standardize necessary data elements. It utilizes UMLS to perform domain adaptation when integrating generic domain NLP tools. It also features stand-off annotations that are specified by positional reference to the original document. We built an interval-tree-based search engine to efficiently query and retrieve the stand-off annotations by specifying positional requirements. We also developed a utility to convert an inline annotation format to stand-off annotations to enable the reuse of clinical text datasets with inline annotations. We experimented with our system on several NLP-facilitated tasks, including computational phenotyping for lymphoma patients and semantic relation extraction for clinical notes. These experiments showcased the broader applicability and utility of LAPNLP. Comment: 6 pages, accepted by IEEE BIBM 2018 as regular paper.
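The idea of stand-off annotation and positional querying described in the abstract above can be illustrated with a minimal Python sketch. This is not LAPNLP's code (LAPNLP is written in Lisp). The inline tag format `<label>text</label>`, the `Annotation` type, and the function names are all assumptions made for illustration, and a plain linear scan stands in for the interval tree the authors describe.

```python
import re
from typing import List, NamedTuple, Tuple


class Annotation(NamedTuple):
    start: int  # offset of the first covered character in the plain text
    end: int    # offset one past the last covered character
    label: str


# Hypothetical inline format: <label>covered text</label>, without nesting.
TAG = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)


def inline_to_standoff(inline: str) -> Tuple[str, List[Annotation]]:
    """Strip inline tags and record each tagged span as character offsets."""
    plain_parts: List[str] = []
    annotations: List[Annotation] = []
    cursor = 0  # read position in the inline string
    offset = 0  # length of the plain text produced so far
    for match in TAG.finditer(inline):
        before = inline[cursor:match.start()]
        plain_parts.append(before)
        offset += len(before)
        covered = match.group(2)
        annotations.append(
            Annotation(offset, offset + len(covered), match.group(1))
        )
        plain_parts.append(covered)
        offset += len(covered)
        cursor = match.end()
    plain_parts.append(inline[cursor:])
    return "".join(plain_parts), annotations


def overlapping(annotations: List[Annotation], start: int, end: int) -> List[Annotation]:
    """Return annotations that overlap the half-open range [start, end).

    Linear scan for clarity; an interval tree would answer the same
    query without visiting every annotation.
    """
    return [a for a in annotations if a.start < end and start < a.end]


text, anns = inline_to_standoff(
    "Patient reports <problem>chest pain</problem> "
    "after <treatment>aspirin</treatment>."
)
```

Because the annotations refer to the text only by offset, the original document stays untouched and several annotation layers can coexist over the same span.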

    Report on the EHCR (Deliverable 26.1)

    Richly interpreting electronic health information (in order to populate EHR instances with suitable terms, to provide decision support in the care of individuals, to identify suitable patients for teaching or clinical trials recruitment, and to mine populations of records for public health or to discover new medical knowledge) requires that the heterogeneous clinical entry instances within EHR repositories can be systematically analysed and interpreted. Achieving this requires the combination and co-operation of many different health informatics tools and technologies, underpinned by shared representations of clinical concepts and inferencing formalisms. Much of this work is at the level of R&D, and is well represented across the Semantic Mining consortium. The challenge of WP26 is to build up a vision of the ways in which these historically independent threads of health informatics research can collaborate, and to uncover the research challenges that must be addressed in order to deliver good demonstrations of semantically indexed and richly analysable EHRs. The partners have begun WP26 by acquiring a better knowledge of each other's areas of endeavour, and are beginning to steer their research interests towards future areas of collaboration.

    Hypotheses, evidence and relationships: The HypER approach for representing scientific knowledge claims

    Biological knowledge is increasingly represented as a collection of (entity-relationship-entity) triplets. These are queried, mined, appended to papers, and published. However, this representation ignores the argumentation contained within a paper and the relationships between the hypotheses, claims and evidence put forth in the article. In this paper, we propose an alternate view of the research article as a network of 'hypotheses and evidence'. Our knowledge representation focuses on scientific discourse as a rhetorical activity, which leads to a different direction in the development of tools and processes for modeling this discourse. We propose to extract knowledge from the article to allow the construction of a system where a specific scientific claim is connected, through trails of meaningful relationships, to experimental evidence. We discuss some current efforts and future plans in this area.
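The notion of a claim connected to evidence "through trails of meaningful relationships" can be sketched as a path search over labelled edges. The sketch below is only an illustration, not the HypER model itself: the node names, the relation labels (`refines`, `supported_by`), and the `evidence:` prefix convention are all hypothetical placeholders.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str, str]  # (source, relation, target)

# Hypothetical network; every name and relation here is a placeholder.
EDGES: List[Edge] = [
    ("claim:A", "refines", "hypothesis:H"),
    ("hypothesis:H", "supported_by", "evidence:E1"),
    ("claim:B", "supported_by", "evidence:E2"),
]


def trail_to_evidence(edges: List[Edge], claim: str) -> Optional[List[Edge]]:
    """Breadth-first search for the shortest trail from a claim to any evidence node."""
    graph: Dict[str, List[Tuple[str, str]]] = {}
    for source, relation, target in edges:
        graph.setdefault(source, []).append((relation, target))

    queue = deque([(claim, [])])
    seen = {claim}
    while queue:
        node, path = queue.popleft()
        if node.startswith("evidence:"):
            return path  # list of edges leading from the claim to the evidence
        for relation, target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, path + [(node, relation, target)]))
    return None  # no evidence reachable from this claim


trail = trail_to_evidence(EDGES, "claim:A")
```

Unlike a flat set of triplets, the returned trail preserves the chain of intermediate hypotheses that links the claim to its supporting evidence.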