Transfer learning represents a recent paradigm shift in the way we build
artificial intelligence (AI) systems. In contrast to training task-specific
models, transfer learning involves pre-training deep learning models on a large
corpus of data and minimally fine-tuning them for adaptation to specific tasks.
Even so, for 3D medical imaging tasks, it remains unclear whether it is best to
pre-train models on natural images, medical images, or even synthetically
generated MRI scans or video data. To evaluate these alternatives, we
benchmarked vision transformers (ViTs) and convolutional neural networks
(CNNs), initialized with varied upstream pre-training approaches. These models
were then adapted to three distinct downstream neuroimaging tasks spanning a
range of difficulty: Alzheimer's disease (AD) classification, Parkinson's
disease (PD) classification, and "brain age" prediction. Experiments led to the
following key observations: (1) pre-training improved performance across all
tasks, including gains of 7.4% for AD classification and 4.6% for PD
classification for the ViT, and a gain of 19.1% for PD classification and a
1.26-year reduction in brain age prediction error for CNNs; (2) pre-training on
large-scale video or synthetic MRI data boosted the performance of ViTs; (3)
CNNs were robust in limited-data settings, and in-domain pre-training enhanced
their performance; and (4) pre-training improved generalization to
out-of-distribution datasets and
sites. Overall, we benchmarked different vision architectures, revealing the
value of pre-training them with emerging datasets for model initialization. The
resulting pre-trained models can be adapted to a range of downstream
neuroimaging tasks, even when training data for the target task is limited.