17,029 research outputs found

    Linking social media, medical literature, and clinical notes using deep learning.

    Researchers analyze data, information, and knowledge through many sources, formats, and methods. The dominant data formats are text and images. In the healthcare industry, professionals generate a large quantity of unstructured data. The complexity of this data and the lack of computational power cause delays in analysis. However, with emerging deep learning algorithms and access to computational resources such as graphics processing units (GPUs) and tensor processing units (TPUs), processing text and images is becoming more accessible. Deep learning algorithms achieve remarkable results in natural language processing (NLP) and computer vision. In this study, we focus on NLP in the healthcare industry and collect data not only from electronic medical records (EMRs) but also from medical literature and social media. We propose a framework for linking social media, medical literature, and EMR clinical notes using deep learning algorithms. Connecting data sources requires defining a link between them, and our key is finding concepts in the medical text. The National Library of Medicine (NLM) provides the Unified Medical Language System (UMLS), and we use this system as the foundation of our own system. We recognize social media’s dynamic nature and apply supervised and semi-supervised methodologies to generate concepts. Named entity recognition (NER) allows efficient extraction of information, or entities, from medical literature, and we extend the model to process EMR clinical notes via transfer learning. The results include an integrated, end-to-end, web-based system solution that unifies social media, literature, and clinical notes, and improves access to medical knowledge for the public and experts.
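    The linking idea in this abstract, where shared medical concepts act as the key between sources, can be pictured with a minimal sketch. Everything here is an assumption for illustration: the concept IDs (`CONCEPT_A` and so on) are placeholders rather than real UMLS identifiers, the document records are invented, and the function names are hypothetical. It is not the authors' system, which extracts concepts with deep learning models.

```python
from collections import defaultdict


def build_concept_index(documents):
    """Map each concept ID to the set of (source, doc_id) pairs that mention it."""
    index = defaultdict(set)
    for source, doc_id, concept_ids in documents:
        for cid in concept_ids:
            index[cid].add((source, doc_id))
    return index


def linked_documents(index, source, doc_id):
    """Return documents from OTHER sources that share at least one concept."""
    links = set()
    for mentions in index.values():
        if (source, doc_id) in mentions:
            links |= {m for m in mentions if m[0] != source}
    return links


# Invented records: (source, document id, placeholder concept IDs).
DOCS = [
    ("social_media", "post-1", {"CONCEPT_A", "CONCEPT_B"}),
    ("literature", "paper-1", {"CONCEPT_A"}),
    ("clinical_note", "note-1", {"CONCEPT_B", "CONCEPT_C"}),
    ("literature", "paper-2", {"CONCEPT_C"}),
]

index = build_concept_index(DOCS)
links = linked_documents(index, "social_media", "post-1")
```

    In this toy data, `post-1` links to `paper-1` and `note-1` because each shares a concept with it, while `paper-2` stays unlinked.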

    Extracting Medical Entities from Radiology Reports with Ontology-based Distant Supervision

    Doctors need to review a substantial number of medical documents, such as radiology reports, to make medical decisions. Named Entity Recognition (NER) structures raw medical text by detecting and classifying medical entities. Structured documents annotated with medical concepts improve doctors' work effectiveness and make important information easier to extract. Nevertheless, deploying NER on Finnish medical text is still challenging because of data annotation, in-domain adaptation, label incompleteness, and label noise. To solve these problems, we develop an NER system called Auto-labeling and Noise-suppressed Network (ANT). An automated annotation mechanism provides supervision signals for the training samples of the NER dataset. Domain continual pretraining transfers in-domain knowledge to the NER model for better performance. We use a weak label completion scheme to complete the weak labels generated by the automated annotation mechanism. Several noise suppression approaches are applied to further reduce label noise. Experimental results show that our model achieves relatively strong performance on a silver-standard dataset. We also conduct ablation experiments to explore the effectiveness of our framework's components.
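    As a rough picture of what ontology-based automated annotation can look like, the sketch below assigns BIO tags by longest-match lookup against a term dictionary. It is an assumption-laden toy, not the ANT mechanism: the `auto_label` function, the entity types, and the English ontology entries are all invented for illustration (the paper works on Finnish text). Labels produced this way are exactly the kind the abstract calls incomplete and noisy, since any term missing from the dictionary is silently tagged `O`.

```python
def auto_label(tokens, ontology):
    """Assign BIO tags by longest-match lookup against a term dictionary.

    `ontology` maps a tuple of lowercase tokens to an entity type.
    """
    max_len = max((len(term) for term in ontology), default=0)
    labels = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first so multi-word terms win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(lowered[i:i + n])
            if span in ontology:
                etype = ontology[span]
                labels[i] = "B-" + etype
                for j in range(i + 1, i + n):
                    labels[j] = "I-" + etype
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return labels


# Hypothetical toy ontology; a real one would come from a medical vocabulary.
TOY_ONTOLOGY = {
    ("pleural", "effusion"): "FINDING",
    ("pneumonia",): "FINDING",
    ("lung",): "ANATOMY",
}

tokens = "Small pleural effusion in the left lung".split()
labels = auto_label(tokens, TOY_ONTOLOGY)
# labels == ["O", "B-FINDING", "I-FINDING", "O", "O", "O", "B-ANATOMY"]
```

    Matching the longest span first keeps "pleural effusion" as one entity instead of splitting it if "effusion" were also listed on its own.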

    Spanish named entity recognition in the biomedical domain

    Named Entity Recognition in the clinical domain and in languages other than English is hampered by the absence of complete dictionaries, the informality of texts, the polysemy of terms, the lack of agreement on entity boundaries, and the scarcity of corpora and other available resources. We present a Named Entity Recognition method for poorly resourced languages. The method was tested with Spanish radiology reports and compared with a conditional random fields system. Peer Reviewed. Postprint (author's final draft).