5,420 research outputs found

    Enhanced Neurologic Concept Recognition using a Named Entity Recognition Model based on Transformers

    Get PDF
    Although Deep Learning Has Been Applied to the Recognition of Diseases and Drugs in Electronic Health Records and the Biomedical Literature, Relatively Little Study Has Been Devoted to the Utility of Deep Learning for the Recognition of Signs and Symptoms. the Recognition of Signs and Symptoms is Critical to the Success of Deep Phenotyping and Precision Medicine. We Have Developed a Named Entity Recognition Model that Uses Deep Learning to Identify Text Spans Containing Neurological Signs and Symptoms and Then Maps These Text Spans to the Clinical Concepts of a Neuro-Ontology. We Compared a Model based on Convolutional Neural Networks to One based on Bidirectional Encoder Representation from Transformers. Models Were Evaluated for Accuracy of Text Span Identification on Three Text Corpora: Physician Notes from an Electronic Health Record, Case Histories from Neurologic Textbooks, and Clinical Synopses from an Online Database of Genetic Diseases. Both Models Performed Best on the Professionally-Written Clinical Synopses and Worst on the Physician-Written Clinical Notes. Both Models Performed Better When Signs and Symptoms Were Represented as Shorter Text Spans. Consistent with Prior Studies that Examined the Recognition of Diseases and Drugs, the Model based on Bidirectional Encoder Representations from Transformers Outperformed the Model based on Convolutional Neural Networks for Recognizing Signs and Symptoms. Recall for Signs and Symptoms Ranged from 59.5% to 82.0% and Precision Ranged from 61.7% to 80.4%. with Further Advances in NLP, Fully Automated Recognition of Signs and Symptoms in Electronic Health Records and the Medical Literature Should Be Feasible

    Utility-Preserving Anonymization of Textual Documents

    Get PDF
    Cada dia els éssers humans afegim una gran quantitat de dades a Internet, tals com piulades, opinions, fotos i vídeos. Les organitzacions que recullen aquestes dades tan diverses n'extreuen informació per tal de millorar llurs serveis o bé per a propòsits comercials. Tanmateix, si les dades recollides contenen informació personal sensible, hom no les pot compartir amb tercers ni les pot publicar sense el consentiment o una protecció adequada dels subjectes de les dades. Els mecanismes de preservació de la privadesa forneixen maneres de sanejar les dades per tal que no revelin identitats o atributs confidencials. S'ha proposat una gran varietat de mecanismes per anonimitzar bases de dades estructurades amb atributs numèrics i categòrics; en canvi, la protecció automàtica de dades textuals no estructurades ha rebut molta menys atenció. En general, l'anonimització de dades textuals exigeix, primer, detectar trossos del text que poden revelar informació sensible i, després, emmascarar aquests trossos mitjançant supressió o generalització. En aquesta tesi fem servir diverses tecnologies per anonimitzar documents textuals. De primer, millorem les tècniques existents basades en etiquetatge de seqüències. Després, estenem aquestes tècniques per alinear-les millor amb el risc de revelació i amb les exigències de privadesa. Finalment, proposem un marc complet basat en models d'immersió de paraules que captura un concepte més ampli de protecció de dades i que forneix una protecció flexible guiada per les exigències de privadesa. També recorrem a les ontologies per preservar la utilitat del text emmascarat, és a dir, la seva semàntica i la seva llegibilitat. La nostra experimentació extensa i detallada mostra que els nostres mètodes superen els mètodes existents a l'hora de proporcionar anonimització robusta tot preservant raonablement la utilitat del text protegit.Cada día las personas añadimos una gran cantidad de datos a Internet, tales como tweets, opiniones, fotos y vídeos. Las organizaciones que recogen dichos datos los usan para extraer información para mejorar sus servicios o para propósitos comerciales. Sin embargo, si los datos recogidos contienen información personal sensible, no pueden compartirse ni publicarse sin el consentimiento o una protección adecuada de los sujetos de los datos. Los mecanismos de protección de la privacidad proporcionan maneras de sanear los datos de forma que no revelen identidades ni atributos confidenciales. Se ha propuesto una gran variedad de mecanismos para anonimizar bases de datos estructuradas con atributos numéricos y categóricos; en cambio, la protección automática de datos textuales no estructurados ha recibido mucha menos atención. En general, la anonimización de datos textuales requiere, primero, detectar trozos de texto que puedan revelar información sensible, para luego enmascarar dichos trozos mediante supresión o generalización. En este trabajo empleamos varias tecnologías para anonimizar documentos textuales. Primero mejoramos las técnicas existentes basadas en etiquetaje de secuencias. Posteriormente las extendmos para alinearlas mejor con la noción de riesgo de revelación y con los requisitos de privacidad. Finalmente, proponemos un marco completo basado en modelos de inmersión de palabras que captura una noción más amplia de protección de datos y ofrece protección flexible guiada por los requisitos de privacidad. También recurrimos a las ontologías para preservar la utilidad del texto enmascarado, es decir, su semantica y legibilidad. Nuestra experimentación extensa y detallada muestra que nuestros métodos superan a los existentes a la hora de proporcionar una anonimización más robusta al tiempo que se preserva razonablemente la utilidad del texto protegido.Every day, people post a significant amount of data on the Internet, such as tweets, reviews, photos, and videos. Organizations collecting these types of data use them to extract information in order to improve their services or for commercial purposes. Yet, if the collected data contain sensitive personal information, they cannot be shared with third parties or released publicly without consent or adequate protection of the data subjects. Privacy-preserving mechanisms provide ways to sanitize data so that identities and/or confidential attributes are not disclosed. A great variety of mechanisms have been proposed to anonymize structured databases with numerical and categorical attributes; however, automatically protecting unstructured textual data has received much less attention. In general, textual data anonymization requires, first, to detect pieces of text that may disclose sensitive information and, then, to mask those pieces via suppression or generalization. In this work, we leverage several technologies to anonymize textual documents. We first improve state-of-the-art techniques based on sequence labeling. After that, we extend them to make them more aligned with the notion of privacy risk and the privacy requirements. Finally, we propose a complete framework based on word embedding models that captures a broader notion of data protection and provides flexible protection driven by privacy requirements. We also leverage ontologies to preserve the utility of the masked text, that is, its semantics and readability. Extensive experimental results show that our methods outperform the state of the art by providing more robust anonymization while reasonably preserving the utility of the protected outcome

    Using Case-Level Context to Classify Cancer Pathology Reports

    Get PDF
    Individual electronic health records (EHRs) and clinical reports are often part of a larger sequence-for example, a single patient may generate multiple reports over the trajectory of a disease. In applications such as cancer pathology reports, it is necessary not only to extract information from individual reports, but also to capture aggregate information regarding the entire cancer case based off case-level context from all reports in the sequence. In this paper, we introduce a simple modular add-on for capturing case-level context that is designed to be compatible with most existing deep learning architectures for text classification on individual reports. We test our approach on a corpus of 431,433 cancer pathology reports, and we show that incorporating case-level context significantly boosts classification accuracy across six classification tasks-site, subsite, laterality, histology, behavior, and grade. We expect that with minimal modifications, our add-on can be applied towards a wide range of other clinical text-based tasks

    Machine learning and electronic health records

    Get PDF
    In this work, we investigate the benefits and complications of using machine learning on EHR data. We survey some recent literature and conduct experiments on real data collected from hospital EHR systems.Masteroppgave i informatikkINF399MAMN-INFMAMN-PRO

    Identifying Locations of Violent Injuries in Las Vegas to Implement the Cardiff Violence Prevention Model

    Get PDF
    Public violence in the United States is a major health concern. Incidents involving violent crimes are often not reported to law enforcement. The Cardiff Model is a violence prevention program developed in the UK to identify and enable data sharing of violent injury locations between Emergency Rooms (ER) and local law enforcement to help identify areas for community improvements. The model is now in use in several major cities in the US to reduce violence. Las Vegas has seen an increase in public violence in recent years. As a result, researchers from the Southern Nevada Health District (SNHD) and University of Las Vegas (UNLV) believe the Cardiff Model is a viable solution to address this public health crisis. This research explores natural language processing and machine learning models to extract violent injury location information from ER records in preparation for introducing the Cardiff Violence Prevention Model in Clark County, Nevada

    Application of Machine Learning in Healthcare and Medicine: A Review

    Get PDF
    This extensive literature review investigates the integration of Machine Learning (ML) into the healthcare sector, uncovering its potential, challenges, and strategic resolutions. The main objective is to comprehensively explore how ML is incorporated into medical practices, demonstrate its impact, and provide relevant solutions. The research motivation stems from the necessity to comprehend the convergence of ML and healthcare services, given its intricate implications. Through meticulous analysis of existing research, this method elucidates the broad spectrum of ML applications in disease prediction and personalized treatment. The research's precision lies in dissecting methodologies, scrutinizing studies, and extrapolating critical insights. The article establishes that ML has succeeded in various aspects of medical care. In certain studies, ML algorithms, especially Convolutional Neural Networks (CNNs), have achieved high accuracy in diagnosing diseases such as lung cancer, colorectal cancer, brain tumors, and breast tumors. Apart from CNNs, other algorithms like SVM, RF, k-NN, and DT have also proven effective. Evaluations based on accuracy and F1-score indicate satisfactory results, with some studies exceeding 90% accuracy. This principal finding underscores the impressive accuracy of ML algorithms in diagnosing diverse medical conditions. This outcome signifies the transformative potential of ML in reshaping conventional diagnostic techniques. Discussions revolve around challenges like data quality, security risks, potential misinterpretations, and obstacles in integrating ML into clinical realms. To mitigate these, multifaceted solutions are proposed, encompassing standardized data formats, robust encryption, model interpretation, clinician training, and stakeholder collaboration
    corecore