8 research outputs found

    Clinical text data in machine learning: Systematic review

    Background: Clinical narratives represent the main form of communication within healthcare, providing a personalized account of patient history and assessments and offering rich information for clinical decision making. Natural language processing (NLP) has repeatedly demonstrated its ability to unlock evidence buried in clinical narratives. Machine learning can facilitate rapid development of NLP tools by leveraging large amounts of text data. Objective: The main aim of this study is to provide systematic evidence on the properties of text data used to train machine learning approaches to clinical NLP. We also investigate the types of NLP tasks that have been supported by machine learning and how they can be applied in clinical practice. Methods: Our methodology was based on the guidelines for performing systematic reviews. In August 2018, we used PubMed, a multi-faceted interface, to perform a literature search against MEDLINE. We identified a total of 110 relevant studies and extracted information about the text data used to support machine learning, the NLP tasks supported and their clinical applications. The data properties considered included size, provenance, collection methods, annotation and any relevant statistics. Results: The vast majority of datasets used to train machine learning models included only hundreds or thousands of documents. Only 10 studies used tens of thousands of documents, with a handful utilizing more. Relatively small datasets were used for training even when much larger datasets were available. The main reason for such poor data utilization is the annotation bottleneck faced by supervised machine learning algorithms. Active learning was explored as a strategy for iteratively sampling a subset of data for manual annotation, minimizing the annotation effort while maximizing the predictive performance of the model. Supervised learning was used successfully where clinical codes, recorded alongside free-text notes in electronic health records, served as class labels. Similarly, distant supervision used an existing knowledge base to automatically annotate raw text. Where manual annotation was unavoidable, crowdsourcing was explored, but it remains unsuitable due to the sensitive nature of the data considered. Besides their small volume, training data were typically sourced from a small number of institutions, thus offering no hard evidence about the transferability of machine learning models. The vast majority of studies focused on the task of text classification. Most commonly, the classification results were used to support phenotyping, prognosis, care improvement, resource management and surveillance. Conclusions: We identified the data annotation bottleneck as one of the key obstacles to machine learning approaches in clinical NLP. Active learning and distant supervision were explored as ways of reducing the annotation effort. Future research in this field would benefit from alternatives such as data augmentation and transfer learning, or from unsupervised learning, which does not require data annotation.
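
    The active learning strategy mentioned in this abstract can be illustrated with a minimal uncertainty-sampling loop. The sketch below is not taken from any of the reviewed studies; the toy note snippets, labels and scikit-learn pipeline are assumptions chosen purely for illustration.

```python
# Minimal uncertainty-sampling sketch (illustrative only; toy data, not from the review).
# A small labelled seed set trains a classifier; the most uncertain unlabelled notes are
# then surfaced for manual annotation, mimicking the active learning loop described above.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical clinical-note snippets (invented placeholders).
labelled_docs = ["patient denies chest pain", "severe chest pain radiating to left arm"]
labels = [0, 1]  # 0 = negative, 1 = positive for the target phenotype
unlabelled_docs = [
    "mild discomfort after exertion",
    "no complaints at discharge",
    "crushing substernal pain with diaphoresis",
]

vectorizer = TfidfVectorizer()
X_labelled = vectorizer.fit_transform(labelled_docs)
X_unlabelled = vectorizer.transform(unlabelled_docs)

model = LogisticRegression().fit(X_labelled, labels)

# Uncertainty = closeness of the positive-class probability to 0.5.
proba = model.predict_proba(X_unlabelled)[:, 1]
uncertainty = -np.abs(proba - 0.5)
query_order = np.argsort(uncertainty)[::-1]  # most uncertain first

for idx in query_order:
    print(f"annotate next: {unlabelled_docs[idx]!r} (p_positive={proba[idx]:.2f})")
```

    In the workflow the abstract describes, the selected documents would then be annotated by experts and folded back into the training set, repeating the loop until the model's predictive performance plateaus.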

    Automatic detection of patients with invasive fungal disease from free-text computed tomography (CT) scans

    BACKGROUND: Invasive fungal diseases (IFDs) are associated with considerable health and economic costs. Surveillance of the more diagnostically challenging invasive fungal diseases, specifically of the sino-pulmonary system, is not feasible for many hospitals because case finding is a costly and labour-intensive exercise. We developed text classifiers for detecting such IFDs from free-text radiology (CT) reports, using machine-learning techniques. METHOD: We obtained free-text reports of CT scans performed over a specific hospitalisation period (2003-2011), for 264 IFD and 289 control patients from three tertiary hospitals. We analysed IFD evidence at patient, report, and sentence levels. Three infectious disease experts annotated the reports of 73 IFD-positive patients for language suggestive of IFD at sentence level, and graded the sentences as to whether they suggested or excluded the presence of IFD. Reliable agreement between annotators was obtained, and this was used as training data for our classifiers. We tested a variety of machine learning (ML), rule-based, and hybrid systems, with feature types including bags of words, bags of phrases, and bags of concepts, as well as report-level structured features. Evaluation was carried out within a robust framework with separate development and held-out datasets. RESULTS: The best systems (using support vector machines) achieved very high recall at report and patient levels over unseen data: 95% and 100%, respectively. Precision at report level over held-out data was 71%; however, most of the associated false-positive reports (53%) belonged to patients who had a previous positive report appropriately flagged by the classifier, reducing the negative impact in practice. CONCLUSIONS: Our machine learning application holds the potential for developing systematic IFD surveillance systems for hospital populations.
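
    As a rough illustration of the report-level classification setup described in this abstract, the sketch below trains a linear SVM over bag-of-words features, with bigrams standing in for bag-of-phrases features. The report texts, labels and pipeline are invented placeholders and assumptions, not the authors' implementation.

```python
# Illustrative sketch of an SVM text classifier over bag-of-words / bag-of-phrase
# features, in the spirit of the IFD report classifiers described above.
# The reports and labels below are invented placeholders.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

train_reports = [
    "new nodular opacity with surrounding ground-glass halo",  # suggestive of IFD
    "cavitating lesion in the right upper lobe",               # suggestive of IFD
    "lungs are clear, no focal consolidation",                 # not suggestive
    "stable appearance, no new pulmonary findings",            # not suggestive
]
train_labels = [1, 1, 0, 0]  # 1 = IFD-suggestive report, 0 = control

# Unigrams give bag-of-words; bigrams approximate bag-of-phrases.
classifier = make_pipeline(
    CountVectorizer(ngram_range=(1, 2)),
    LinearSVC(),
)
classifier.fit(train_reports, train_labels)

held_out = ["ground-glass halo around a new nodule", "clear lungs bilaterally"]
print(classifier.predict(held_out))  # e.g. [1 0]
```

    A patient-level decision could then be derived by aggregating report-level predictions, for example by flagging any patient with at least one positively classified report, which is consistent with the report- and patient-level evaluation the abstract reports.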
