Clairvoyance: A Pipeline Toolkit for Medical Time Series
Time-series learning is the bread and butter of data-driven *clinical
decision support*, and the recent explosion in ML research has demonstrated
great potential in various healthcare settings. At the same time, medical
time-series problems in the wild are challenging due to their highly
*composite* nature: They entail design choices and interactions among
components that preprocess data, impute missing values, select features, issue
predictions, estimate uncertainty, and interpret models. Despite exponential
growth in electronic patient data, there is a remarkable gap between the
potential and realized utilization of ML for clinical research and decision
support. In particular, orchestrating a real-world project lifecycle poses
challenges in engineering (i.e. hard to build), evaluation (i.e. hard to
assess), and efficiency (i.e. hard to optimize). Designed to address these
issues simultaneously, Clairvoyance proposes a unified, end-to-end,
autoML-friendly pipeline that serves as (i) a software toolkit, (ii) an empirical
standard, and (iii) an interface for optimization. Our ultimate goal lies in
facilitating transparent and reproducible experimentation with complex
inference workflows, providing integrated pathways for (1) personalized
prediction, (2) treatment-effect estimation, and (3) information acquisition.
Through illustrative examples on real-world data in outpatient, general-ward,
and intensive-care settings, we show the applicability of the pipeline
paradigm on core tasks in the healthcare journey. To the best of our knowledge,
Clairvoyance is the first to demonstrate the viability of a comprehensive and
automatable pipeline for clinical time-series ML.
Non-Imaging Medical Data Synthesis for Trustworthy AI: A Comprehensive Survey
Data quality is the key factor for the development of trustworthy AI in
healthcare. A large volume of curated datasets with controlled confounding
factors can help improve the accuracy, robustness and privacy of downstream AI
algorithms. However, access to good-quality datasets is limited by the
technical difficulty of data acquisition, and large-scale sharing of healthcare
data is hindered by strict ethical restrictions. Data synthesis algorithms,
which generate data with a distribution similar to that of real clinical data, can
serve as a potential solution to address the scarcity of good-quality data
during the development of trustworthy AI. However, state-of-the-art data
synthesis algorithms, especially deep learning algorithms, focus more on
imaging data while neglecting the synthesis of non-imaging healthcare data,
including clinical measurements, medical signals and waveforms, and electronic
healthcare records (EHRs). Thus, in this paper, we will review the synthesis
algorithms, particularly for non-imaging medical data, with the aim of
providing trustworthy AI in this domain. This tutorial-styled review paper will
provide comprehensive descriptions of non-imaging medical data synthesis on
aspects including algorithms, evaluations, limitations and future research
directions.
Comment: 35 pages, submitted to ACM Computing Surveys