5,864 research outputs found

Extracting clinical information from electronic medical records

Author: Ferreira J.
Lamy M.
Melo F.
Pereira R.
Vasconcelos J. B.
Velez I.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

As the adoption of Electronic Medical Records (EMRs) rises in healthcare institutions, these resources become more important each day because of the clinical data they contain about patients. However, the unstructured narrative text in those records makes it hard to extract and structure useful clinical information. This unstructured text limits the potential of EMRs, because the clinical data these records contain can be used to perform important operations inside healthcare institutions, such as searching, summarization, decision support and statistical analysis, as well as to support management decisions or serve research. These operations can only be performed if the clinical data in the narratives is properly extracted and structured. Usually this extraction is done manually by healthcare practitioners, which is inefficient and error-prone. The present work uses Natural Language Processing (NLP) and Information Extraction (IE) techniques to develop a pipeline system that can automatically extract clinical information directly from unstructured texts in Portuguese EMRs, in order to help EMRs fulfil their potential.

Repositório Institucional do ISCTE-IUL

Knowledge-based best of breed approach for automated detection of clinical events based on German free text digital hospital discharge letters

Author: Demuth Ilja
Diekmann Daniel
König Maximilian
Sander André
Steinhagen-Thiessen Elisabeth
Publication date: 01/01/2019

OBJECTIVES: The secondary use of medical data contained in electronic medical records, such as hospital discharge letters, is a valuable resource for the improvement of clinical care (e.g. in terms of medication safety) or for research purposes. However, the automated processing and analysis of medical free text still poses a huge challenge to available natural language processing (NLP) systems. The aim of this study was to implement a knowledge-based best of breed approach, combining a terminology server with integrated ontology, an NLP pipeline and a rules engine. METHODS: We tested the performance of this approach in a use case. The clinical event of interest was the particular drug-disease interaction "proton-pump inhibitor [PPI] use and osteoporosis". Cases were to be identified based on free text digital discharge letters as the source of information. Automated detection was validated against a gold standard. RESULTS: Precision of recognition of osteoporosis was 94.19%, and recall was 97.45%. PPIs were detected with 100% precision and 97.97% recall. The F-score for the detection of the given drug-disease interaction was 96.13%. CONCLUSION: We could show that our approach of combining an NLP pipeline, a terminology server, and a rules engine for the purpose of automated detection of clinical events such as drug-disease interactions from free text digital hospital discharge letters was effective. There is huge potential for implementation in clinical and research contexts, as this approach enables analyses of very high numbers of medical free text documents within a short time period.
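
As a reading aid for the precision and recall figures above, the following minimal sketch shows how an F-score is conventionally derived from the two, assuming the balanced F1 (harmonic mean) definition. The abstract does not state which F-score variant the authors used, and the function name `f1_score` is introduced here purely for illustration. The per-entity values it computes are derived from the reported precision and recall; they are not results reported by the study.

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall.

    Both arguments must use the same scale (here: percent).
    Assumption: the F1 variant; the abstract does not name the variant used.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract (percent).
osteoporosis_f1 = f1_score(94.19, 97.45)
ppi_f1 = f1_score(100.0, 97.97)

# Derived values, not figures reported by the authors.
print(f"osteoporosis F1: {osteoporosis_f1:.2f}%")
print(f"PPI F1: {ppi_f1:.2f}%")
```

Because the harmonic mean is pulled towards the lower of its two inputs, a system cannot reach a high F-score by maximizing precision alone or recall alone.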

Institutional Repository of the Freie Universität Berlin

Directory of Open Access Journals

MPG.PuRe

Lexical patterns, features and knowledge resources for coreference resolution in clinical notes

Author: Abdul Roudsari
D’Avolio
Miller
Phil Gooch
Rahman
Recasens
Rosse
Savova
Savova
Uzuner
van Deemter
Zheng
Zheng
Publication venue: 'Elsevier BV'
Publication date: 01/10/2012

Generation of entity coreference chains provides a means to extract linked narrative events from clinical notes, but despite being a well-researched topic in natural language processing, general-purpose coreference tools perform poorly on clinical texts. This paper presents a knowledge-centric and pattern-based approach to resolving coreference across a wide variety of clinical records comprising discharge summaries, progress notes, pathology, radiology and surgical reports from two corpora (Ontology Development and Information Extraction (ODIE) and i2b2/VA). In addition, a method for generating coreference chains using progressively pruned linked lists is demonstrated that reduces the search space and facilitates evaluation by a number of metrics. Independent evaluation results show an F-measure for each corpus of 79.2% and 87.5%, respectively, which offers performance at least as good as human annotators, greatly increased performance over general-purpose tools, and improvement on previously reported clinical coreference systems. The system uses a number of open-source components that are available to download.

City Research Online

Elsevier - Publisher Connector

Crossref

Automatic annotation of bioinformatics workflows with biomedical ontologies

Author: B. Smith
B.P. Vandervalk
D. Sáchez
D. Withers
J. Ison
M.D. Wilkinson
M.D. Wilkinson
P. Lord
P. Rice
S. Harispe
T. Oinn
U. Radetzki
Publication date: 01/01/2014

Legacy scientific workflows, and the services within them, often present scarce and unstructured (i.e. textual) descriptions. This makes it difficult to find, share and reuse them, thus dramatically reducing their value to the community. This paper presents an approach to annotating workflows and their subcomponents with ontology terms, in an attempt to describe these artifacts in a structured way. Despite a dearth of even textual descriptions, we automatically annotated 530 myExperiment bioinformatics-related workflows, including more than 2600 workflow-associated services, with relevant ontological terms. Quantitative evaluation of the Information Content of these terms suggests that, in cases where annotation was possible at all, the annotation quality was comparable to manually curated bioinformatics resources.

Comment: 6th International Symposium on Leveraging Applications (ISoLA 2014 conference), 15 pages, 4 figures

arXiv.org e-Print Archive

Crossref

Building a semantically annotated corpus of clinical texts

Author: Andrea Setzer
Angus Roberts
Denny
Franzén
Friedman
Gennari
George Demetriou
Hersh
Hripcsak
Ian Roberts
Kim
Lindberg
Mark Hepple
Meystre
Pestian
Robert Gaizauskas
Roberts
Tanabe
Yikun Guo
Publication venue: 'Elsevier BV'
Publication date: 01/10/2009

In this paper, we describe the construction of a semantically annotated corpus of clinical texts for use in the development and evaluation of systems for automatically extracting clinically significant information from the textual component of patient records. The paper details the sampling of textual material from a collection of 20,000 cancer patient records, the development of a semantic annotation scheme, the annotation methodology, the distribution of annotations in the final corpus, and the use of the corpus for development of an adaptive information extraction system. The resulting corpus is the most richly semantically annotated resource for clinical text processing built to date, and its value has been demonstrated through its use in developing an effective information extraction system. The detailed presentation of our corpus construction and annotation methodology will be of value to others seeking to build high-quality semantically annotated corpora in biomedical domains.

Elsevier - Publisher Connector

Crossref

White Rose Research Online

Protected Health Information filter (Philter): accurately and securely de-identifying free-text clinical notes.

Author: Butte Atul J
Fan Xuancheng
Glicksberg Benjamin S
Goldstein Theodore
Ludwig Dana
Muenzen Kathleen
Norgeot Beau
Oskotsky Boris
Peterson Thomas A
Rutenberg Eugenia
Schenk Gundolf
Schmajuk Gabriela
Sirota Marina
Yazdany Jinoos
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

There is a great and growing need to ascertain the exact state of a patient, in terms of disease progression, actual care practices, pathology, adverse events, and much more, beyond the paucity of data available in structured medical record data. Ascertaining these harder-to-reach data elements is now critical for the accurate phenotyping of complex traits, detection of adverse outcomes, efficacy of off-label drug use, and longitudinal patient surveillance. Clinical notes often contain the most detailed and relevant digital information about individual patients, the nuances of their diseases, the treatment strategies selected by physicians, and the resulting outcomes. However, notes remain largely unused for research because they contain Protected Health Information (PHI), which is synonymous with individually identifying data. Previous clinical note de-identification approaches have been rigid and still too inaccurate to see any substantial real-world use, primarily because they have been trained on medical text corpora that were too small. To build a new de-identification tool, we created the largest manually annotated clinical note corpus for PHI and developed a customizable open-source de-identification software called Philter ("Protected Health Information filter"). Here we describe the design and evaluation of Philter, and show how it offers substantial real-world improvements over prior methods.

eScholarship - University of California

Event Representations for Automated Story Generation with Deep Neural Nets

Author: Ammanabrolu Prithviraj
Hancock William
Harrison Brent
Martin Lara J.
Riedl Mark O.
Singh Shruti
Wang Xinyu
Publication date: 12/09/2017

Automated story generation is the problem of automatically selecting a sequence of events, actions, or words that can be told as a story. We seek to develop a system that can generate stories by learning everything it needs to know from textual story corpora. To date, recurrent neural networks that learn language models at character, word, or sentence levels have had little success generating coherent stories. We explore the question of event representations that provide a mid-level of abstraction between words and sentences in order to retain the semantic information of the original data while minimizing event sparsity. We present a technique for preprocessing textual story data into event sequences. We then present a technique for automated story generation whereby we decompose the problem into the generation of successive events (event2event) and the generation of natural language sentences from events (event2sentence). We give empirical results comparing different event representations and their effects on event successor generation and the translation of events to natural language.

Comment: Submitted to AAAI'1

arXiv.org e-Print Archive

Extracting clinical knowledge from electronic medical records

Author: Ferreira J. C.
Lamy M.
Melo F.
Pereira R.
Velez I.
Publication venue: International Association of Engineers
Publication date: 01/01/2018

As the adoption of Electronic Medical Records (EMRs) rises in healthcare institutions, the importance of these resources increases because of the clinical information they contain about patients. However, the unstructured information in the form of clinical narratives in those records makes it hard to extract and structure useful clinical knowledge. This unstructured information limits the potential of EMRs, because the clinical information these records contain can be used to perform important tasks inside healthcare institutions, such as searching, summarization, decision support and statistical analysis, as well as to support management decisions or serve research. These tasks can only be performed if the unstructured clinical information in the narratives is properly extracted, structured and transformed into clinical knowledge. Usually, this extraction is done manually by healthcare practitioners, which is inefficient and error-prone. This research uses Natural Language Processing (NLP) and Information Extraction (IE) techniques to develop a pipeline system that can automatically extract clinical knowledge from unstructured clinical information in Portuguese EMRs, in order to help EMRs fulfil their potential.

Repositório Institucional do ISCTE-IUL