We present multimodal neural posterior estimation (MultiNPE), a method to
integrate heterogeneous data from different sources in simulation-based
inference with neural networks. Inspired by advances in deep fusion learning,
it empowers researchers to analyze data from different domains and infer the
parameters of complex mathematical models with increased accuracy. We formulate
multimodal fusion approaches for \hbox{MultiNPE} (early, late, hybrid) and
evaluate their performance in three challenging experiments. MultiNPE not only
outperforms single-source baselines on a reference task, but also achieves
superior inference on scientific models from neuroscience and cardiology. We
systematically investigate the impact of partially missing data on the
different fusion strategies. Across our experiments, late and hybrid fusion
techniques emerge as the methods of choice for practical applications of
multimodal simulation-based inference.