CCheXR-Attention: Clinical concept extraction and chest x-ray reports classification using modified Mogrifier and bidirectional LSTM with multihead attention
Radiology reports cover different aspects of an imaging examination, from radiological observations to the diagnosis, for modalities such as X-ray, MRI, and CT. The abundant patient information presented in radiology reports poses two major challenges. First, radiology reports follow a free-text format, so a large amount of information is locked in unstructured text. Second, extracting important features from these reports is a major bottleneck for machine learning models. Extracting key features such as symptoms, comparison/priors, technique, findings, and impression is particularly important because they support decision-making about a patient's health. To address these issues, a novel architecture, CCheXR-Attention, is proposed to extract clinical features from radiology reports and classify each report as normal or abnormal based on the extracted information. We propose a modified Mogrifier LSTM model integrated with a multihead attention mechanism to extract the most relevant features. Experimental results on two benchmark datasets demonstrate that the proposed model surpasses state-of-the-art models.
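The Mogrifier variant mentioned above alternately gates the input and the hidden state before the standard LSTM update runs, so each is conditioned on the other. A minimal pure-Python sketch of that alternating interaction, with toy dimensions and made-up weight matrices `Q` and `R` (not the paper's actual model or weights):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(M, v):
    # plain matrix-vector product over nested lists
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def mogrify(x, h, Q, R, rounds=4):
    """Mogrifier-style interaction (a sketch, not the exact published model):
    odd rounds rescale the input using the hidden state, even rounds
    rescale the hidden state using the input."""
    for i in range(1, rounds + 1):
        if i % 2 == 1:
            gate = [2.0 * sigmoid(g) for g in matvec(Q, h)]
            x = [g * xi for g, xi in zip(gate, x)]
        else:
            gate = [2.0 * sigmoid(g) for g in matvec(R, x)]
            h = [g * hi for g, hi in zip(gate, h)]
    return x, h

# toy 2-dimensional example with illustrative weights
x, h = [0.5, -1.0], [0.2, 0.3]
Q = [[0.1, 0.0], [0.0, 0.1]]
R = [[0.2, 0.0], [0.0, 0.2]]
x2, h2 = mogrify(x, h, Q, R)
```

In the real model `Q` and `R` are learned matrices and the mogrified `x2`, `h2` then feed the usual LSTM gate computations; the sketch only illustrates the alternating-gating idea.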
Enhancing Clinical Concept Extraction with Contextual Embeddings
Neural network-based representations ("embeddings") have dramatically
advanced natural language processing (NLP) tasks, including clinical NLP tasks
such as concept extraction. Recently, however, more advanced embedding methods
and representations (e.g., ELMo, BERT) have further pushed the state-of-the-art
in NLP, yet there are no common best practices for how to integrate these
representations into clinical tasks. The purpose of this study, then, is to
explore the space of possible options in utilizing these new models for
clinical concept extraction, including comparing these to traditional word
embedding methods (word2vec, GloVe, fastText). Both off-the-shelf open-domain
embeddings and pre-trained clinical embeddings from MIMIC-III are evaluated. We
explore a battery of embedding methods consisting of traditional word
embeddings and contextual embeddings, and compare these on four concept
extraction corpora: i2b2 2010, i2b2 2012, SemEval 2014, and SemEval 2015. We
also analyze the impact of the pre-training time of a large language model like
ELMo or BERT on the extraction performance. Last, we present an intuitive way
to understand the semantic information encoded by contextual embeddings.
Contextual embeddings pre-trained on a large clinical corpus achieve new
state-of-the-art performances across all concept extraction tasks. The
best-performing model outperforms all state-of-the-art methods with respective
F1-measures of 90.25, 93.18 (partial), 80.74, and 81.65. We demonstrate the
potential of contextual embeddings through the state-of-the-art performance
these methods achieve on clinical concept extraction. Additionally, we
demonstrate that contextual embeddings encode valuable semantic information not
accounted for in traditional word representations.
Comment: Journal of the American Medical Informatics Association
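The F1-measures reported above are entity-level scores over labeled spans. A small sketch of how such a score can be computed from BIO tag sequences, using exact-span matching only (each shared task's actual matching rules, e.g. partial matching for i2b2 2012, differ; the tag sequences below are illustrative):

```python
def bio_spans(tags):
    """Collect (start, end, type) entity spans from a BIO tag sequence."""
    out, start, etype = [], None, None
    for i, t in enumerate(tags):
        # close an open span when the entity ends or a new one begins
        if start is not None and (t == "O" or t.startswith("B-") or t[2:] != etype):
            out.append((start, i, etype))
            start = None
        if t.startswith("B-"):
            start, etype = i, t[2:]
    if start is not None:
        out.append((start, len(tags), etype))
    return set(out)

def span_f1(gold_tags, pred_tags):
    """Exact-match entity-level F1 between gold and predicted BIO tags."""
    g, p = bio_spans(gold_tags), bio_spans(pred_tags)
    tp = len(g & p)
    prec = tp / len(p) if p else 0.0
    rec = tp / len(g) if g else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

# toy example: one concept found, one missed
gold = ["B-problem", "I-problem", "O", "B-test"]
pred = ["B-problem", "I-problem", "O", "O"]
f1 = span_f1(gold, pred)  # precision 1.0, recall 0.5
```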
A Practical Incremental Learning Framework For Sparse Entity Extraction
This work addresses challenges arising from extracting entities from textual
data, including the high cost of data annotation, model accuracy, selecting
appropriate evaluation criteria, and the overall quality of annotation. We
present a framework that integrates Entity Set Expansion (ESE) and Active
Learning (AL) to reduce the annotation cost of sparse data and provide an
online evaluation method as feedback. This incremental and interactive learning
framework allows for rapid annotation and subsequent extraction of sparse data
while maintaining high accuracy. We evaluate our framework on three publicly
available datasets and show that it drastically reduces the cost of sparse
entity annotation by an average of 85% and 45% to reach 0.9 and 1.0 F-Scores
respectively. Moreover, the method exhibited robust performance across all
datasets.
Comment: https://www.aclweb.org/anthology/C18-1059
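The active-learning half of such a framework ranks unlabeled candidates by model uncertainty so annotators label the most informative examples first, which is what drives the annotation-cost reduction. A minimal sketch of one selection step; the candidate mentions and their model probabilities are made up for illustration, not taken from the paper:

```python
def least_confidence(prob):
    """Uncertainty of a binary entity/non-entity score:
    probabilities near 0.5 are the least confident."""
    return 1.0 - max(prob, 1.0 - prob)

def select_batch(candidates, score, budget):
    """One active-learning step: rank unlabeled candidates by
    uncertainty and send the top `budget` to the annotator."""
    ranked = sorted(candidates, key=lambda c: least_confidence(score[c]),
                    reverse=True)
    return ranked[:budget]

# hypothetical candidate mentions with made-up model probabilities
score = {"aspirin": 0.97, "taking": 0.05, "cold agglutinin": 0.55, "mg": 0.48}
batch = select_batch(list(score), score, budget=2)
```

After the batch is labeled, the model is retrained and the loop repeats; the entity-set-expansion side would additionally seed the pool with candidates lexically similar to already-confirmed entities.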
Comparative Analysis of Contextual Relation Extraction based on Deep Learning Models
Contextual Relation Extraction (CRE) is mainly used for constructing a knowledge graph with the help of an ontology. It supports tasks such as semantic search, query answering, and textual entailment. Relation extraction identifies the entities in raw text and the relations among them. An efficient and accurate CRE system is essential for creating domain knowledge in the biomedical industry. Existing machine learning and Natural Language Processing (NLP) techniques cannot efficiently predict complex relations from sentences that contain more than two relations or unspecified entities. In this work, deep learning techniques are used to identify the appropriate semantic relation based on context across multiple sentences. Although various machine learning models have been applied to relation extraction, they perform well only for binary relations, i.e., relations holding between exactly two entities in a sentence, and they are poorly suited to complex sentences containing words with multiple meanings. To address these issues, hybrid deep learning models have been used to extract relations from complex sentences effectively. This paper presents an analysis of the various deep learning models used for relation extraction.
Comment: This paper was presented at the International Conference on FOSS Approaches towards Computational Intelligence and Language Technology, February 2023, Thiruvananthapuram
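The binary-relation limitation discussed above becomes concrete once a sentence contains more than two entities: the classifier must score every candidate pair, and the number of candidates grows combinatorially. A tiny sketch (the entity names are illustrative):

```python
from itertools import combinations

def candidate_pairs(entities):
    """All unordered entity pairs in one sentence. With more than two
    entities there are several candidate relations to classify, which
    is exactly where binary-only relation models break down."""
    return list(combinations(entities, 2))

# three entities -> three candidate relations in a single sentence
pairs = candidate_pairs(["aspirin", "headache", "ibuprofen"])
```

A relation-extraction model would then classify each pair (plus its sentence context) into a relation type or "no relation", rather than assuming a single relation per sentence.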
Text Classification: A Review, Empirical, and Experimental Evaluation
The explosive and widespread growth of data necessitates the use of text
classification to extract crucial information from vast amounts of data.
Consequently, there has been a surge of research in both classical and deep
learning text classification methods. Despite the numerous methods proposed in
the literature, there is still a pressing need for a comprehensive and
up-to-date survey. Existing survey papers categorize algorithms for text
classification into broad classes, which can lead to the misclassification of
unrelated algorithms and incorrect assessments of their qualities and behaviors
using the same metrics. To address these limitations, our paper introduces a
novel methodological taxonomy that classifies algorithms hierarchically into
fine-grained classes and specific techniques. The taxonomy includes methodology
categories, methodology techniques, and methodology sub-techniques. Our study
is the first survey to utilize this methodological taxonomy for classifying
algorithms for text classification. Furthermore, our study conducts
empirical evaluations and experimental comparisons, ranking different
algorithms that employ the same specific sub-technique, different
sub-techniques within the same technique, different techniques within the same
category, and different categories.
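One way to picture the three-level taxonomy is as a nested mapping from category to technique to sub-technique, where only algorithms sharing a sub-technique are compared with identical metrics. All category, technique, and algorithm names below are placeholders, not the survey's actual taxonomy:

```python
# hypothetical category -> technique -> sub-technique -> algorithms
taxonomy = {
    "deep learning": {
        "recurrent networks": {"LSTM": ["algorithm A", "algorithm B"]},
        "transformers": {"encoder-only": ["algorithm C"]},
    },
    "classical": {
        "linear models": {"SVM": ["algorithm D"]},
    },
}

def peers(taxonomy, algorithm):
    """Algorithms sharing the same sub-technique: the finest-grained
    group, and the fairest one to rank with a single set of metrics."""
    for techniques in taxonomy.values():
        for sub_techniques in techniques.values():
            for algos in sub_techniques.values():
                if algorithm in algos:
                    return algos
    return []

group = peers(taxonomy, "algorithm A")
```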
Coreference resolution in clinical discharge summaries, progress notes, surgical and pathology reports: a unified lexical approach
We developed a lexical rule-based system that uses a unified approach to resolving coreference across a wide variety of clinical records comprising discharge summaries, progress notes, pathology, radiology, and surgical reports from two corpora (Ontology Development and Information Extraction (ODIE) and i2b2/VA) provided for the fifth i2b2/VA shared task. Taking the unweighted mean of four coreference metrics, validation of the system against the i2b2/VA corpus attained an overall F-score of 87.7% across all mention classes, with a maximum of 93.1% for coreference of persons and a minimum of 77.2% for coreference of tests. For the ODIE corpus, the overall F-score across all mention classes was 79.4%, with a maximum of 82.0% for coreference of persons and a minimum of 13.1% for coreference of diagnostic reagents. For the ODIE corpus our results are comparable to the mean reported inter-annotator agreement with the gold standard. We discuss the four categories of errors we identified and how these might be addressed. The system uses a number of reusable modules and techniques that may be of benefit to the research community.
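A lexical rule-based resolver of this kind typically links mentions whose normalized surface forms match. A minimal sketch of one such rule; the stop-word list and the sample mentions are illustrative, not the paper's actual rules:

```python
def normalize(mention):
    """Lowercase a mention and drop determiners before comparison."""
    stop = {"the", "a", "an", "this", "that"}
    return tuple(w for w in mention.lower().split() if w not in stop)

def lexical_chains(mentions):
    """Group mentions whose normalized forms match exactly: the
    simplest lexical rule in a rule-based coreference resolver."""
    chains = {}
    for m in mentions:
        chains.setdefault(normalize(m), []).append(m)
    return [c for c in chains.values() if len(c) > 1]

# "the chest x-ray" and "chest x-ray" normalize identically
chains = lexical_chains(["the chest x-ray", "chest x-ray", "the patient"])
```

A full system layers further rules on top (head-word matching, acronym expansion, mention-class-specific heuristics), which is where the per-class F-score differences reported above come from.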