Intelligent Word Embeddings of Free-Text Radiology Reports
Radiology reports are a rich resource for advancing deep learning
applications in medicine by leveraging the large volume of data continuously
being updated, integrated, and shared. However, there are significant
challenges as well, largely due to the ambiguity and subtlety of natural
language. We propose a hybrid strategy that combines semantic-dictionary
mapping and word2vec modeling for creating dense vector embeddings of free-text
radiology reports. Our method leverages the benefits of both
semantic-dictionary mapping and unsupervised learning. Using the vector
representation, we automatically classify the radiology reports into three
classes denoting confidence in the diagnosis of intracranial hemorrhage by the
interpreting radiologist. We performed experiments with varying hyperparameter
settings of the word embeddings and a range of different classifiers. The best
performance achieved was a weighted precision of 88% and weighted recall of
90%. Our work offers the potential to leverage unstructured electronic health
record data by allowing direct analysis of narrative clinical notes.
Comment: AMIA Annual Symposium 201
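The semantic-dictionary mapping step described in the abstract above can be illustrated with a minimal sketch. The dictionary entries, the canonical tokens `ICH` and `NEGEX`, and the function name `map_terms` are all hypothetical and chosen only for illustration; the paper's actual dictionary and pipeline are not given here.

```python
import re

# Hypothetical semantic dictionary: surface phrase -> canonical token.
# These entries are illustrative assumptions, not the paper's dictionary.
SEMANTIC_DICT = {
    "intracranial hemorrhage": "ICH",
    "intracranial haemorrhage": "ICH",
    "no evidence of": "NEGEX",
    "without evidence of": "NEGEX",
}


def map_terms(text, dictionary=SEMANTIC_DICT):
    """Lowercase the text, replace known phrases with canonical tokens,
    and return the resulting list of tokens."""
    text = text.lower()
    # Replace longer phrases first so shorter ones cannot pre-empt them.
    for phrase in sorted(dictionary, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, dictionary[phrase], text)
    # Keep alphanumeric tokens only; punctuation is dropped.
    return re.findall(r"[A-Za-z0-9_]+", text)


print(map_terms("No evidence of intracranial hemorrhage."))
```

Token lists produced this way could then be passed as training sentences to an unsupervised embedding model such as gensim's `Word2Vec`, so that spelling variants and multi-word phrases share a single vector.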
Information needs after stroke: What to include and how to structure it on a website. A qualitative study using focus groups and card sorting
Background: Use of the Internet to obtain health and other information is increasing. Previous studies have identified the specific information needs of people with stroke but not in relation to the Internet. People with aphasia (PwA) may face barriers in accessing the Internet: Navigating websites requires an ability to categorise information and this ability is often impaired in PwA. The website categorisation preferences of people with stroke and with aphasia have not yet been reported.
Aims: This study aimed: (a) to determine what information people who have had a stroke would like to see on a website about living with stroke; (b) to determine the most effective means of structuring information on the website so that it is accessible to people with stroke; and (c) to identify any differences between people with and without aphasia in terms of preferences for structuring information on the website.
Methods & Procedures: Participants were recruited from a hospital's Stroke Database. Focus groups were used to elicit what information participants wanted on a website about living with stroke. The themes raised were depicted on 133 cards. To determine the most effective way of structuring information on the website, and whether there were any differences in preferences between PwA and PwoA, participants used a modified closed card-sorting technique to sort the cards under website categories.
Outcomes & Results: A total of 48 people were invited, and 12 (25%) agreed to take part. We ran three focus groups: one with PwA (n = 5) and two with people without aphasia (PwoA) (n = 3, n = 4). Participants wanted more information about stroke causes and effects (particularly emotional issues), roles of local agencies, and returning to previous activities (driving, going out). All participants completed the card-sorting exercise. Few cards (6%) were categorised identically by everyone. Cards relating to local agencies and groups were not consistently categorised together. Cards relating to emotions were segregated. The categorisation preferences for PwA were more fragmented than those for PwoA: 60% of PwA agreed on the categorisation of 51% of the cards, whereas 60% of PwoA agreed on the categorisation of 76% of the cards.
Conclusions: Information needs covered all stages of the stroke journey. The card sorting was accessible to everyone, and provided evidence of structuring preferences and of some of the categorisation difficulties faced by PwA. More research is needed on what an accessible website looks like for PwA.
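One possible reading of the agreement figures in the results (for example, 60% of a group agreeing on the categorisation of a given share of the cards) is sketched below. The function name, the card labels, the category names, and the toy data are hypothetical, and the study's exact agreement calculation is not stated in the abstract.

```python
from collections import Counter


def share_of_cards_with_agreement(sorts, threshold=0.6):
    """sorts maps each card to the list of categories chosen for it,
    one entry per participant. Returns the fraction of cards for which
    at least `threshold` of participants chose the same category."""
    agreed = 0
    for labels in sorts.values():
        # Size of the largest group of participants sharing one category.
        top_count = Counter(labels).most_common(1)[0][1]
        if top_count / len(labels) >= threshold:
            agreed += 1
    return agreed / len(sorts)


# Hypothetical toy data: four cards sorted by five participants.
toy_sorts = {
    "driving again": ["Daily life"] * 4 + ["Practical help"],
    "feeling low": ["Emotions", "Emotions", "Effects", "Effects", "Support"],
    "local stroke club": ["Support"] * 3 + ["Practical help", "Local services"],
    "what causes a stroke": ["About stroke"] * 5,
}

print(share_of_cards_with_agreement(toy_sorts))
```

Running the same function separately on each group's sorts would give the kind of per-group comparison reported for PwA and PwoA.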
Towards a proteomics meta-classification
that can serve as a foundation for more refined ontologies in the field of proteomics. Standard data sources classify proteins in terms of just one or two specific aspects. Thus SCOP (Structural Classification of Proteins) is described as classifying proteins on the basis of structural features; SWISSPROT annotates proteins on the basis of their structure and of parameters like post-translational modifications. Such data sources are connected to each other by pairwise term-to-term mappings. However, there are obstacles which stand in the way of combining them to form a robust meta-classification of the needed sort. We discuss some formal ontological principles which should be taken into account within the existing data sources in order to make such a meta-classification possible, taking into account also the Gene Ontology (GO) and its application to the annotation of proteins.
Improving Syntactic Parsing of Clinical Text Using Domain Knowledge
Syntactic parsing is one of the fundamental tasks of Natural Language Processing (NLP). However, few studies have explored syntactic parsing in the medical domain. This dissertation systematically investigated different methods to improve the performance of syntactic parsing of clinical text, including (1) constructing two clinical treebanks of discharge summaries and progress notes by developing annotation guidelines that handle missing elements in clinical sentences; (2) retraining four state-of-the-art parsers, including the Stanford parser, Berkeley parser, Charniak parser, and Bikel parser, using clinical treebanks, and comparing their performance to identify better parsing approaches; and (3) developing new methods to reduce syntactic ambiguity caused by Prepositional Phrase (PP) attachment and coordination using semantic information.
Our evaluation showed that clinical treebanks greatly improved the performance of existing parsers. The Berkeley parser achieved the best F-1 score of 86.39% on the MiPACQ treebank. For PP attachment, our proposed methods improved accuracy by 2.35% on the MiPACQ corpus and 1.77% on the i2b2 corpus. For coordination, our method achieved precisions of 94.9% and 90.3% on the MiPACQ and i2b2 corpora, respectively. To further demonstrate the effectiveness of the improved parsing approaches, we applied the outputs of our parsers to two external NLP tasks: semantic role labeling and temporal relation extraction. The experimental results showed that the performance of both tasks was improved by using the parse tree information from our optimized parsers, with an improvement of 3.26% in F-measure for semantic role labeling and 1.5% in F-measure for temporal relation extraction.
A Review of Negation in Clinical Texts
Negation is commonly seen in clinical documents [Chapman et al., 2001a]: “In clinical reports the presence of a term does not necessarily indicate the presence of the clinical condition represented by that term. In fact, many of the most frequently described findings and diseases in discharge summaries, radiology reports, history and physical exams, and other transcribed reports are denied in the patient” [Chapman et al., 2001b, p. 301].
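The point that a term's presence does not imply the condition's presence can be illustrated with a deliberately simplified trigger-window rule. This is a toy sketch, not the method of the cited works: the trigger list, the window size, and the function name `is_negated` are assumptions made for illustration.

```python
import re

# Hypothetical negation triggers and look-back window (in tokens).
NEGATION_TRIGGERS = ("no", "denies", "without", "negative for")
WINDOW = 5


def is_negated(sentence, term):
    """Return True if `term` occurs in `sentence` with a negation trigger
    among the WINDOW tokens before it. Returns False if the term is
    absent or no trigger precedes it."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    term_tokens = term.lower().split()
    n = len(term_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == term_tokens:
            # Text of the tokens immediately preceding the matched term.
            window = " ".join(tokens[max(0, i - WINDOW):i])
            if any(re.search(r"\b" + re.escape(t) + r"\b", window)
                   for t in NEGATION_TRIGGERS):
                return True
    return False


print(is_negated("The patient denies chest pain.", "chest pain"))
print(is_negated("Patient reports chest pain.", "chest pain"))
```

A rule this small ignores scope terminators, post-term negation, and uncertainty cues, which is part of why negation in clinical text needs dedicated treatment.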