7 research outputs found
Recommended from our members
A lightweight, pattern-based approach to identification and formalisation of TimeML expressions in clinical narratives
General Architecture for Text Engineering (GATE) components for identifying clinical events and temporal expressions are developed and evaluated against a corpus of 120 discharge summaries
Automatically Recognizing Medication and Adverse Event Information From Food and Drug Administration’s Adverse Event Reporting System Narratives
BACKGROUND: The Food and Drug Administration’s (FDA) Adverse Event Reporting System (FAERS) is a repository of spontaneously-reported adverse drug events (ADEs) for FDA-approved prescription drugs. FAERS reports include both structured reports and unstructured narratives. The narratives often include essential information for evaluation of the severity, causality, and description of ADEs that are not present in the structured data. The timely identification of unknown toxicities of prescription drugs is an important, unsolved problem.
OBJECTIVE: The objective of this study was to develop an annotated corpus of FAERS narratives and biomedical named entity tagger to automatically identify ADE related information in the FAERS narratives.
METHODS: We developed an annotation guideline and annotated medication information and adverse event related entities on 122 FAERS narratives comprising approximately 23,000 word tokens. A named entity tagger using supervised machine learning approaches was built for detecting medication information and adverse event entities using various categories of features.
RESULTS: The annotated corpus had an inter-annotator agreement of over 0.9 Cohen’s kappa for medication and adverse event entities. The best performing tagger achieved an overall F1 score of 0.73 for detection of medication, adverse event and other named entities.
CONCLUSIONS: In this study, we developed an annotated corpus of FAERS narratives and machine learning based models for automatically extracting medication and adverse event information from the FAERS narratives. Our study is an important step towards enriching the FAERS data for postmarketing pharmacovigilance
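The two figures reported above, Cohen's kappa for annotator agreement and F1 for tagger performance, can be illustrated with a minimal sketch. The label sequences and counts below are toy values chosen for illustration, not data from the study, and the function names are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label rates.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(
        count_a[label] * count_b[label] for label in set(count_a) | set(count_b)
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


def f1_score(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall from entity-level counts."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Toy token labels (assumed): MED = medication, ADE = adverse event, O = other.
annotator_a = ["MED", "ADE", "O", "O", "MED", "O"]
annotator_b = ["MED", "ADE", "O", "MED", "MED", "O"]
kappa = cohens_kappa(annotator_a, annotator_b)
f1 = f1_score(true_pos=8, false_pos=2, false_neg=4)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is usually preferred over simple percent agreement when reporting corpus quality.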
Integrating speculation detection and deep learning to extract lung cancer diagnosis from clinical notes
Despite efforts to develop models for extracting medical concepts from clinical notes, challenges remain, in particular in relating concepts to dates. The high number of clinical notes written for each patient, the use of negation and speculation, and the variety of date formats cause ambiguity that has to be resolved to reconstruct the patient’s natural history. In this paper, we concentrate on extracting the cancer diagnosis from clinical narratives and relating it to the diagnosis date. To address this challenge, a hybrid approach that combines deep learning-based and rule-based methods is proposed. The approach integrates three steps: (i) lung cancer named entity recognition, (ii) negation and speculation detection, and (iii) relating the cancer diagnosis to a valid date. In particular, we apply the proposed approach to extract the lung cancer diagnosis and its diagnosis date from clinical narratives written in Spanish. Results obtained show an F-score of 90% in the named entity recognition task, and an 89% F-score in the task of relating the cancer diagnosis to the diagnosis date. Our findings suggest that speculation detection, together with negation detection, is a key component for properly extracting cancer diagnoses from clinical notes.
This work is supported by the EU Horizon 2020 innovation program under grant agreement
No. 780495, project BigMedilytics (Big Data for Medical Analytics). It has also been supported
by Fundación AECC and Instituto de Salud Carlos III (grant AC19/00034), under the frame of
ERA-NET PerMe
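Steps (ii) and (iii) of the approach described above can be sketched with simple rules. This is only an illustration under stated assumptions, not the authors' method: the cue lists are hypothetical English phrases (the paper works on Spanish notes), the DD/MM/YYYY date format is assumed, and the diagnosis mention is taken as given rather than produced by a deep learning recogniser.

```python
import re
from datetime import date

# Hypothetical cue lists, assumed for illustration only.
NEGATION_CUES = ("no evidence of", "negative for", "ruled out")
SPECULATION_CUES = ("possible", "suspected", "cannot exclude")
# Assumed date format: DD/MM/YYYY.
DATE_PATTERN = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")


def classify_assertion(sentence):
    """Label a sentence as 'negated', 'speculated', or 'affirmed' by cue lookup."""
    lowered = sentence.lower()
    if any(cue in lowered for cue in NEGATION_CUES):
        return "negated"
    if any(cue in lowered for cue in SPECULATION_CUES):
        return "speculated"
    return "affirmed"


def link_date(sentence, mention):
    """Return the valid calendar date closest to the mention, or None."""
    start = sentence.lower().find(mention.lower())
    if start < 0:
        return None
    best, best_distance = None, None
    for match in DATE_PATTERN.finditer(sentence):
        day, month, year = (int(group) for group in match.groups())
        try:
            candidate = date(year, month, day)
        except ValueError:
            continue  # skip impossible dates such as 31/02/2020
        distance = abs(match.start() - start)
        if best_distance is None or distance < best_distance:
            best, best_distance = candidate, distance
    return best


note = "Lung adenocarcinoma diagnosed on 14/03/2019 after biopsy."
status = classify_assertion(note)
diagnosis_date = link_date(note, "lung adenocarcinoma")
```

Only mentions labelled as affirmed would be carried forward to the date-linking step in a pipeline of this kind; negated and speculated mentions would be discarded or flagged.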
Self-mention and uncertain communication in the British Medical Journal (1840–2007): The decrease of subjectivity uncertainty markers
The communication of a scientific finding as certain or uncertain largely determines whether that information will be translated into practice. In this study, a corpus of 80 articles published in the British Medical Journal over 167 years (1840–2007) is analysed by focusing on three categories of uncertainty markers, which explicitly reveal a writer’s subjectivity: (1) I/we epistemic verbs; (2) I/we modal verbs; and (3) epistemic non-verbs conveying personal opinions. The quantitative analysis shows their progressive decrease over time, which may be due to several variables, including the evolution of medical knowledge and practice, changes in medical research and within the scientific community, and more stringent guidelines for scientific writing (regarding types of articles, their structure and rhetorical style)
Science & Speculation
Despite wide recognition that speculation is critical for successful science, philosophers have attended little to it. When they have, speculation has been characterized in narrowly epistemic terms: a hypothesis is speculative due to its (lack of) evidential support. These ‘evidence-first’ accounts provide little guidance for what makes speculation productive or egregious, nor how to foster the former while avoiding the latter. I examine how scientists discuss speculation and identify various functions speculations play. On this basis, I develop a ‘function-first’ account of speculation. This analysis grounds a richer discussion of when speculation is egregious and when it is productive, based in both fine-grained analysis of the speculation’s purpose, and what I call the ‘epistemic situation’ scientists face
A modular, open-source information extraction framework for identifying clinical concepts and processes of care in clinical narratives
In this thesis, a synthesis is presented of the knowledge models required by clinical information systems that provide decision support for longitudinal processes of care. Qualitative research techniques and thematic analysis are novelly applied to a systematic review of the literature on the challenges in implementing such systems, leading to the development of an original conceptual framework. The thesis demonstrates how these process-oriented systems make use of a knowledge base derived from workflow models and clinical guidelines, and argues that one of the major barriers to implementation is the need to extract explicit and implicit information from diverse resources in order to construct the knowledge base. Moreover, concepts in both the knowledge base and in the electronic health record (EHR) must be mapped to a common ontological model. However, the majority of clinical guideline information remains in text form, and much of the useful clinical information residing in the EHR resides in the free text fields of progress notes and laboratory reports. In this thesis, it is shown how natural language processing and information extraction techniques provide a means to identify and formalise the knowledge components required by the knowledge base. Original contributions are made in the development of lexico-syntactic patterns and the use of external domain knowledge resources to tackle a variety of information extraction tasks in the clinical domain, such as recognition of clinical concepts, events, temporal relations, term disambiguation and abbreviation expansion. Methods are developed for adapting existing tools and resources in the biomedical domain to the processing of clinical texts, and approaches to improving the scalability of these tools are proposed and evaluated. These tools and techniques are then combined in the creation of a novel approach to identifying processes of care in the clinical narrative.
It is demonstrated that resolution of coreferential and anaphoric relations as narratively and temporally ordered chains provides a means to extract linked narrative events and processes of care from clinical notes. Coreference performance in discharge summaries and progress notes is largely dependent on correct identification of protagonist chains (patient, clinician, family relation), pronominal resolution, and string matching that takes account of experiencer, temporal, spatial, and anatomical context; whereas for laboratory reports additional, external domain knowledge is required. The types of external knowledge and their effects on system performance are identified and evaluated. Results are compared against existing systems for solving these tasks and are found to improve on them, or to approach the performance of recently reported, state-of-the-art systems. Software artefacts developed in this research have been made available as open-source components within the General Architecture for Text Engineering framework