5,744 research outputs found

    Extreme multi-label deep neural classification of Spanish health records according to the International Classification of Diseases

    111 p. This work deals with clinical text mining, a field of Natural Language Processing applied to the biomedical domain. The goal is to automate the task of medical coding. Electronic health records (EHRs) are documents that contain clinical information about a patient's health. The diagnoses and medical procedures recorded in the Electronic Health Record are coded according to the International Classification of Diseases (ICD). Indeed, the ICD is the basis for compiling international health statistics and the standard for reporting diseases and health conditions. From a machine learning perspective, the goal is to solve an extreme multi-label text classification problem, since each health record is assigned multiple ICD codes from a set of more than 70,000 diagnostic terms. A significant amount of resources is devoted to medical coding, a laborious task that is currently performed manually. EHRs are lengthy narratives, and medical coders review the records written by physicians and assign the corresponding ICD codes. The texts are technical, since physicians use specialized medical jargon, yet they are rich in abbreviations, acronyms, and spelling errors, because physicians document the records during actual clinical practice. To address the automatic classification of health records, we investigate and develop a set of deep learning text classification techniques.

    Clinical Risk Prediction Using Language Models: Benefits And Considerations

    The utilization of Electronic Health Records (EHRs) for clinical risk prediction is on the rise. However, strict privacy regulations limit access to comprehensive health records, making it challenging to apply standard machine learning algorithms in practical real-world scenarios. Previous research has addressed this data limitation by incorporating medical ontologies and employing transfer learning methods. In this study, we investigate the potential of leveraging language models (LMs) as a means to incorporate supplementary domain knowledge for improving the performance of various EHR-based risk prediction tasks. Unlike applying LMs to unstructured EHR data such as clinical notes, this study focuses on using textual descriptions within structured EHR to make predictions exclusively based on that information. We extensively compare against previous approaches across various data types and sizes. We find that employing LMs to represent structured EHRs, such as diagnostic histories, leads to improved or at least comparable performance in diverse risk prediction tasks. Furthermore, LM-based approaches offer numerous advantages, including few-shot learning, the capability to handle previously unseen medical concepts, and adaptability to various medical vocabularies. Nevertheless, we underscore, through various experiments, the importance of being cautious when employing such models, as concerns regarding the reliability of LMs persist. Comment: 12 pages, 6 figures, 4 tables

    Representing Health Data and Medical Knowledge for Deep Learning

    HEALTH OUTCOME PATHWAY PREDICTION. A GRAPH-BASED FRAMEWORK

    This dissertation is part of the project FrailCare.AI, which aims to detect frailty in the elderly Portuguese population in order to optimize the SNS24 (telemonitoring) service, with the goal of suggesting health pathways to reduce patients' frailty. Frailty can be defined as the condition of being weak and delicate, which normally increases with age and is the consequence of several health and non-health related factors. A patient's health journey is recorded in Electronic Health Records (EHRs), which are rich but sparse, noisy, and multi-modal sources of truth. These can be used to train predictive models to predict future health states, of which frailty is just one. In this work, due to lack of data access, we pivoted our focus to phenotype prediction, that is, predicting diagnoses. Moreover, we tackle the problem of data insufficiency and class imbalance (e.g. rare diseases and other infrequent occurrences in the training data) by integrating standardized healthcare ontologies within graph neural networks. We study the broad task of phenotype prediction, multi-task scenarios, and few-shot scenarios, in which a class rarely occurs in the training set. Furthermore, during the development of this work we detected some reproducibility issues in related literature, which we detail, and we also open-source all of our implementations, introducing a framework to aid the development of similar systems.
    This dissertation is part of the FrailCare.AI project, which aims to detect frailty in the elderly Portuguese population in order to optimize the telemonitoring service of the Portuguese National Health System (SNS24), and also to suggest actions to take to reduce patients' frailty. Frailty is a risk condition composed of multiple factors. Nowadays, a large part of each patient's clinical history is recorded digitally. These vast and diverse data can be used to train predictive models whose goal is to predict future health states, of which frailty is only one. Due to the lack of access to data, we changed the main task of this work to diagnosis prediction, where we explore the problem of data insufficiency and class imbalance (for example, rare diseases and other infrequent occurrences in the training data) by integrating ontologies of medical concepts through graph neural networks. We also explore other tasks and the impact they have on one another. In addition, during the development of this dissertation we identified reproducibility issues in the literature studied, which we detail, implementing the missing concepts. With reproducibility in mind, we release our code, introducing a library that enables the development of systems similar to ours.