Chunking clinical text containing non-canonical language

Carroll, John; Cassell, Jackie; Savkov, Aleksandar

Publication date: 1 January 2014

Abstract

Free-text notes typed by primary care physicians during patient consultations typically contain highly non-canonical language. Shallow syntactic analysis of these notes can help to reveal valuable information for the study of disease and treatment. We present an exploratory study into chunking such text using off-the-shelf language processing tools and pre-trained statistical models. We evaluate chunking accuracy with respect to part-of-speech tagging quality, choice of chunk representation, and breadth of context features. Our results indicate that narrow context feature windows give the best results, but that chunk representation and minor differences in tagging quality do not have a significant impact on chunking accuracy.
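To make two of the variables named in the abstract concrete, the sketch below shows what a "chunk representation" and a "context feature window" can look like in practice. It is a generic illustration, not the authors' setup: the abstract does not say which representations or window sizes were compared, so the IOB2 and IOBES schemes, the function names, the feature keys, and the example note are all assumptions made for illustration.

```python
from typing import Dict, List


def iob2_to_iobes(tags: List[str]) -> List[str]:
    """Convert IOB2 chunk tags (B-X, I-X, O) to IOBES.

    IOBES marks the last token of a multi-token chunk with E- and a
    single-token chunk with S-. Both schemes encode the same chunks.
    """
    out = []
    for i, tag in enumerate(tags):
        if tag == "O":
            out.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        nxt = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = nxt == "I-" + label  # does the chunk extend to the next token?
        if prefix == "B":
            out.append(("B-" if continues else "S-") + label)
        else:  # prefix == "I"
            out.append(("I-" if continues else "E-") + label)
    return out


def window_features(tokens: List[str], pos: List[str], i: int, width: int) -> Dict[str, str]:
    """Collect word and POS features for token i from a symmetric window.

    width=1 is a narrow window (previous, current, next token);
    larger widths add more surrounding context.
    """
    feats = {}
    for offset in range(-width, width + 1):
        j = i + offset
        inside = 0 <= j < len(tokens)
        feats[f"w[{offset}]"] = tokens[j].lower() if inside else "<pad>"
        feats[f"p[{offset}]"] = pos[j] if inside else "<pad>"
    return feats


# Hypothetical clinical-style note: "pt c/o chest pain"
tokens = ["pt", "c/o", "chest", "pain"]
pos = ["NN", "VBZ", "NN", "NN"]           # assumed POS tags
iob2 = ["B-NP", "B-VP", "B-NP", "I-NP"]   # assumed gold chunks

iobes = iob2_to_iobes(iob2)
narrow = window_features(tokens, pos, 0, width=1)
wide = window_features(tokens, pos, 0, width=2)
```

Both tag sequences describe the same three chunks and differ only in how chunk boundaries are labelled. A wider window gives the model more features per token, and with abbreviated, noisy text many of those extra features may be sparse or uninformative.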

Available Versions

Crossref

info:doi/10.3115%2Fv1%2Fw14-34...

Last time updated on 05/06/2019

CiteSeerX

oai:CiteSeerX.psu:10.1.1.646.9...

Last time updated on 29/10/2017

Sussex Research Online

oai:figshare.com:article/23415...

Last time updated on 05/12/2023