
    Self-Supervised Time-to-Event Modeling with Structured Medical Records

    Time-to-event (TTE) models are used in medicine and other fields for estimating the probability distribution of the time until a specific event occurs. TTE models provide many advantages over classification using fixed time horizons, including naturally handling censored observations, but require more parameters and are challenging to train in settings with limited labeled data. Existing approaches, e.g. proportional hazards or accelerated failure time models, employ distributional assumptions to reduce parameters but are vulnerable to model misspecification. In this work, we address these challenges with MOTOR (Many Outcome Time Oriented Representations), a self-supervised model that leverages temporal structure found in collections of timestamped events in electronic health records (EHR) and health insurance claims. MOTOR uses a TTE pretraining objective that predicts the probability distribution of times when events occur, making it well-suited to transfer learning for medical prediction tasks. Having pretrained on EHR and claims data of up to 55M patient records (9B clinical events), we evaluate performance after finetuning for 19 tasks across two datasets. Task-specific models built using MOTOR improve time-dependent C statistics by 4.6% over state-of-the-art while greatly improving sample efficiency, achieving comparable performance to existing methods using only 5% of available task data.
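    The abstract describes a TTE pretraining objective that predicts the distribution of event times while handling censoring and avoiding strong distributional assumptions. As a rough illustration only (not the actual MOTOR architecture or loss), the sketch below shows one common way to set up such an objective: a discrete-time hazard head over a fixed set of time bins with a censoring-aware negative log-likelihood. All names, shapes, and the binning scheme are assumptions made for this example.

```python
import torch
import torch.nn as nn

class DiscreteTimeTTEHead(nn.Module):
    """Hypothetical discrete-time time-to-event head (illustrative sketch)."""

    def __init__(self, hidden_dim: int, num_time_bins: int):
        super().__init__()
        self.hazard_logits = nn.Linear(hidden_dim, num_time_bins)

    def forward(self, patient_repr: torch.Tensor) -> torch.Tensor:
        # Per-bin hazard probabilities in (0, 1), one column per time bin.
        return torch.sigmoid(self.hazard_logits(patient_repr))

def tte_nll(hazards: torch.Tensor,
            event_bin: torch.Tensor,
            observed: torch.Tensor) -> torch.Tensor:
    """Censoring-aware negative log-likelihood for discrete-time survival.

    hazards:   (batch, num_bins) per-bin event probabilities
    event_bin: (batch,) long tensor; bin of the event, or the last fully
               observed bin for censored patients
    observed:  (batch,) 1.0 if the event was observed, 0.0 if censored
    """
    eps = 1e-7
    num_bins = hazards.shape[1]
    bins = torch.arange(num_bins, device=hazards.device).unsqueeze(0)
    # Patients survive every bin strictly before their event/censoring bin.
    before = (bins < event_bin.unsqueeze(1)).float()
    log_survive = (before * torch.log(1.0 - hazards + eps)).sum(dim=1)
    # Observed events contribute the hazard of their bin; censored patients
    # instead survive that final bin as well.
    h_at_event = hazards.gather(1, event_bin.unsqueeze(1)).squeeze(1)
    log_event = observed * torch.log(h_at_event + eps) \
        + (1.0 - observed) * torch.log(1.0 - h_at_event + eps)
    return -(log_survive + log_event).mean()
```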

    BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision

    We study the open-domain named entity recognition (NER) problem under distant supervision. Distant supervision, though it does not require large amounts of manual annotation, yields highly incomplete and noisy labels via external knowledge bases. To address this challenge, we propose a new computational framework -- BOND, which leverages the power of pre-trained language models (e.g., BERT and RoBERTa) to improve the prediction performance of NER models. Specifically, we propose a two-stage training algorithm: in the first stage, we adapt the pre-trained language model to the NER task using the distant labels, which can significantly improve recall and precision; in the second stage, we drop the distant labels and use a self-training approach to further improve model performance. Thorough experiments on 5 benchmark datasets demonstrate the superiority of BOND over existing distantly supervised NER methods. The code and distantly labeled data have been released at https://github.com/cliang1453/BOND.
    Comment: Proceedings of the 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '20).
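    The two-stage procedure described above can be sketched as follows. This is an illustrative outline, not the released BOND code: it assumes a Hugging Face-style token-classification model (loss computed from `labels`, with -100 as the ignore index), and the function names, the fixed step budget, and the confidence threshold are hypothetical choices for the example.

```python
import torch
from torch.utils.data import DataLoader

def stage_one(model, distant_loader: DataLoader, optimizer, max_steps: int):
    """Stage 1: adapt a pre-trained LM to NER using noisy distant labels.

    A simple step budget stands in for early stopping, which keeps the
    model from overfitting the label noise.
    """
    model.train()
    for step, (input_ids, attention_mask, distant_labels) in enumerate(distant_loader):
        loss = model(input_ids=input_ids, attention_mask=attention_mask,
                     labels=distant_labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        if step + 1 >= max_steps:
            break

def stage_two(student, teacher, unlabeled_loader: DataLoader, optimizer,
              confidence: float = 0.9):
    """Stage 2: drop the distant labels and self-train.

    A frozen teacher produces pseudo-labels; only high-confidence tokens
    contribute to the student's loss (low-confidence tokens get -100).
    """
    teacher.eval()
    student.train()
    for input_ids, attention_mask in unlabeled_loader:
        with torch.no_grad():
            logits = teacher(input_ids=input_ids,
                             attention_mask=attention_mask).logits
            probs = torch.softmax(logits, dim=-1)
            conf, pseudo = probs.max(dim=-1)
        labels = pseudo.masked_fill(conf < confidence, -100)
        loss = student(input_ids=input_ids, attention_mask=attention_mask,
                       labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```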

    The Shaky Foundations of Clinical Foundation Models: A Survey of Large Language Models and Foundation Models for EMRs

    The successes of foundation models such as ChatGPT and AlphaFold have spurred significant interest in building similar models for electronic medical records (EMRs) to improve patient care and hospital operations. However, recent hype has obscured critical gaps in our understanding of these models' capabilities. We review over 80 foundation models trained on non-imaging EMR data (i.e. clinical text and/or structured data) and create a taxonomy delineating their architectures, training data, and potential use cases. We find that most models are trained on small, narrowly-scoped clinical datasets (e.g. MIMIC-III) or broad, public biomedical corpora (e.g. PubMed) and are evaluated on tasks that do not provide meaningful insights on their usefulness to health systems. In light of these findings, we propose an improved evaluation framework for measuring the benefits of clinical foundation models that is more closely grounded in metrics that matter in healthcare.
    Comment: Reformatted figures, updated contribution.