
    Classification of Radiology Reports Using Neural Attention Models

    The electronic health record (EHR) contains a large amount of multi-dimensional and unstructured clinical data of significant operational and research value. In contrast to previous studies, our approach uses a doubly annotated dataset and moves away from opaque "black-box" models toward comprehensible deep learning models. In this paper, we present a novel neural attention mechanism that not only classifies clinically important findings but also highlights the report terms that support each classification. Specifically, convolutional neural networks (CNN) with attention analysis are used to classify radiology head computed tomography reports based on five categories that radiologists would account for in assessing acute and communicable findings in daily practice. The experiments show that our CNN attention models outperform non-neural models, especially when trained on a larger dataset. Our attention analysis demonstrates the intuition behind the classifier's decision by generating a heatmap that highlights attended terms used by the CNN model; this is valuable when potential downstream medical decisions are to be performed by human experts or the classifier information is to be used in cohort construction such as for epidemiological studies.
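    The attention heatmap the abstract describes can be illustrated with a minimal sketch: each token gets a relevance score, the scores are softmax-normalized into attention weights, and the weighted sum of token vectors feeds a linear classifier. (This is a deliberate simplification with made-up weights; the paper's actual models add convolutional feature extraction, and all names here are illustrative assumptions.)

    ```python
    import numpy as np

    def attention_classify(token_vecs, w_att, w_cls):
        """Score a report and return per-token attention weights.

        token_vecs : (n_tokens, d) token embedding matrix
        w_att      : (d,) attention scoring vector
        w_cls      : (d, n_classes) linear classifier weights
        """
        scores = token_vecs @ w_att                  # one relevance score per token
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                     # softmax -> attention "heatmap"
        context = weights @ token_vecs               # attention-weighted report vector
        logits = context @ w_cls                     # one logit per finding category
        return logits, weights

    rng = np.random.default_rng(0)
    tokens = ["no", "acute", "intracranial", "hemorrhage"]
    vecs = rng.normal(size=(len(tokens), 8))
    logits, heat = attention_classify(vecs, rng.normal(size=8), rng.normal(size=(8, 5)))
    # `heat` sums to 1; larger entries mark the terms the classifier attended to,
    # which is what the heatmap visualization renders over the report text.
    ```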

    Privacy in text documents

    Sensitive data preservation is currently a manual or semi-automatic procedure that suffers from several problems affecting the handling of confidential, sensitive and personal information: identifying sensitive data in documents requires human intervention, which is costly and error-prone, and identifying sensitive data across large-scale document collections does not permit an approach that depends on human expertise for identification and relationship extraction. DataSense will be highly exportable software that enables organizations to identify and understand the sensitive data they hold in unstructured textual information (digital documents) for legal, compliance and security purposes. The goal is to identify and classify sensitive data (personal data) present in large-scale structured and non-structured information in a way that allows entities and/or organizations to understand it without calling security or confidentiality into question. The DataSense project will be based on European-Portuguese text documents and will combine NLP (Natural Language Processing) technologies with advances in machine learning, such as Named Entity Recognition, Disambiguation, Co-referencing (ARE) and Automatic Learning with Human Feedback. It will also be characterized by the ability to assist organizations in complying with standards such as the GDPR (General Data Protection Regulation), which regulates data protection in the European Union.
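    As a toy illustration of sensitive-data detection, the sketch below flags personal-data spans with regular expressions. This is a hypothetical stand-in, not the DataSense pipeline, which the abstract says relies on NER and machine learning; every pattern, label, and name here is an assumption for illustration only.

    ```python
    import re

    # Illustrative patterns for two common personal-data types.
    PATTERNS = {
        "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
        "PHONE": re.compile(r"\+?\d[\d \-]{7,}\d"),
    }

    def find_sensitive(text):
        """Return (label, start, end, match) tuples for each detected span."""
        hits = []
        for label, pat in PATTERNS.items():
            for m in pat.finditer(text):
                hits.append((label, m.start(), m.end(), m.group()))
        return sorted(hits, key=lambda h: h[1])

    doc = "Contact Ana at ana.silva@example.pt or +351 912 345 678."
    spans = find_sensitive(doc)
    ```

    A real system would replace the regex layer with trained NER models and add disambiguation and co-reference resolution, as the project description outlines.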

    Enhanced Neurologic Concept Recognition using a Named Entity Recognition Model based on Transformers

    Although deep learning has been applied to the recognition of diseases and drugs in electronic health records and the biomedical literature, relatively little study has been devoted to the utility of deep learning for the recognition of signs and symptoms. The recognition of signs and symptoms is critical to the success of deep phenotyping and precision medicine. We have developed a named entity recognition model that uses deep learning to identify text spans containing neurological signs and symptoms and then maps these text spans to the clinical concepts of a neuro-ontology. We compared a model based on convolutional neural networks to one based on bidirectional encoder representations from transformers. Models were evaluated for accuracy of text span identification on three text corpora: physician notes from an electronic health record, case histories from neurologic textbooks, and clinical synopses from an online database of genetic diseases. Both models performed best on the professionally written clinical synopses and worst on the physician-written clinical notes. Both models performed better when signs and symptoms were represented as shorter text spans. Consistent with prior studies that examined the recognition of diseases and drugs, the model based on bidirectional encoder representations from transformers outperformed the model based on convolutional neural networks for recognizing signs and symptoms. Recall for signs and symptoms ranged from 59.5% to 82.0% and precision ranged from 61.7% to 80.4%. With further advances in NLP, fully automated recognition of signs and symptoms in electronic health records and the medical literature should be feasible.
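    The span-identification step the abstract evaluates is commonly framed as BIO sequence tagging: the encoder (CNN or BERT) labels each token, and the labels are decoded into contiguous spans. The decoding step can be sketched as follows; the tokens and tags are invented examples, not data from the paper.

    ```python
    def bio_to_spans(tokens, tags):
        """Decode BIO tags into (start, end, text) spans over a token list."""
        spans, start = [], None
        for i, tag in enumerate(tags + ["O"]):       # "O" sentinel flushes the last span
            if tag in ("B", "O") and start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
                start = None
            if tag == "B":
                start = i
            elif tag == "I" and start is None:       # tolerate a stray I- tag
                start = i
        return spans

    tokens = "patient reports right facial droop and slurred speech".split()
    tags   = ["O", "O", "B", "I", "I", "O", "B", "I"]
    spans = bio_to_spans(tokens, tags)
    # -> [(2, 5, 'right facial droop'), (6, 8, 'slurred speech')]
    ```

    Each decoded span would then be mapped to a concept in the neuro-ontology, the second stage the abstract describes.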

    An attentive neural architecture for joint segmentation and parsing and its application to real estate ads

    In processing human-produced text using natural language processing (NLP) techniques, two fundamental subtasks that arise are (i) segmentation of the plain text into meaningful subunits (e.g., entities), and (ii) dependency parsing, to establish relations between subunits. In this paper, we develop a relatively simple and effective neural joint model that performs both segmentation and dependency parsing together, instead of one after the other as in most state-of-the-art works. We focus in particular on the real estate ad setting, aiming to convert an ad to a structured description, which we name a property tree, comprising the tasks of (1) identifying important entities of a property (e.g., rooms) from classifieds and (2) structuring them into a tree format. In this work, we propose a new joint model that is able to tackle the two tasks simultaneously and construct the property tree by (i) avoiding the error propagation that would arise from performing the subtasks one after the other in a pipelined fashion, and (ii) exploiting the interactions between the subtasks. For this purpose, we perform an extensive comparative study of the pipeline methods and the newly proposed joint model, reporting an improvement of over three percentage points in the overall edge F1 score of the property tree. Also, we propose attention methods to encourage our model to focus on salient tokens during the construction of the property tree. Thus we experimentally demonstrate the usefulness of attentive neural architectures for the proposed joint model, showcasing a further improvement of two percentage points in edge F1 score for our application. Comment: Preprint - Accepted for publication in Expert Systems with Applications
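    The edge F1 score the abstract reports compares the edges of a predicted property tree against the gold tree. A hedged sketch of that metric (not the authors' exact scorer; the example trees are invented):

    ```python
    def edge_f1(predicted, gold):
        """Precision, recall, and F1 over the (head, dependent) edge sets of two trees."""
        predicted, gold = set(predicted), set(gold)
        tp = len(predicted & gold)                         # edges found in both trees
        p = tp / len(predicted) if predicted else 0.0
        r = tp / len(gold) if gold else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        return p, r, f1

    gold = {("house", "floor1"), ("floor1", "bedroom"), ("floor1", "bathroom")}
    pred = {("house", "floor1"), ("floor1", "bedroom"), ("house", "bathroom")}
    p, r, f1 = edge_f1(pred, gold)   # p = r = f1 = 2/3: one edge attached wrongly
    ```

    Under this metric, a percentage-point gain means more entity edges attached to the correct parent in the tree.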

    Knowledge will Propel Machine Understanding of Content: Extrapolating from Current Examples

    Machine learning has been a big success story during the AI resurgence. One particular standout success relates to learning from a massive amount of data. In spite of early assertions of the unreasonable effectiveness of data, there is increasing recognition of the value of utilizing knowledge whenever it is available or can be created purposefully. In this paper, we discuss the indispensable role of knowledge for deeper understanding of content where (i) large amounts of training data are unavailable, (ii) the objects to be recognized are complex (e.g., implicit entities and highly subjective content), and (iii) applications need to use complementary or related data in multiple modalities/media. What brings us to the cusp of rapid progress is our ability to (a) create relevant and reliable knowledge and (b) carefully exploit knowledge to enhance ML/NLP techniques. Using diverse examples, we seek to foretell unprecedented progress in our ability for deeper understanding and exploitation of multimodal data and continued incorporation of knowledge in learning techniques. Comment: Pre-print of the paper accepted at 2017 IEEE/WIC/ACM International Conference on Web Intelligence (WI). arXiv admin note: substantial text overlap with arXiv:1610.0770