Synthesizing information from multiple data sources plays a crucial role in
the practice of modern medicine. Current applications of artificial
intelligence in medicine often focus on single-modality data due to a lack of
publicly available, multimodal medical datasets. To address this limitation, we
introduce INSPECT, which contains de-identified longitudinal records from a
large cohort of patients at risk for pulmonary embolism (PE), along with ground
truth labels for multiple outcomes. INSPECT contains data from 19,402 patients,
including CT images, radiology report impression sections, and structured
electronic health record (EHR) data (i.e. demographics, diagnoses, procedures,
vitals, and medications). Using INSPECT, we develop and release a benchmark for
evaluating several baseline modeling approaches on a variety of important PE
related tasks. We evaluate image-only, EHR-only, and multimodal fusion models.
Trained models and the de-identified dataset are made available for
non-commercial use under a data use agreement. To the best of our knowledge,
INSPECT is the largest multimodal dataset integrating 3D medical imaging and
EHR for reproducible methods evaluation and research