Many high-performing models for Alzheimer's Disease classification rely on
complex CNN-based architectures. We investigate two questions about
classifying Alzheimer's Disease from MRI:
"Do Vision Transformer-based models perform better than CNN-based models?" and
"Is it possible to use a shallow 3D CNN-based model to obtain satisfactory
results?" To answer these questions, we propose two models that take 3D MRI
scans as input: the Convolutional Voxel Vision Transformer (CVVT)
architecture and ConvNet3D-4, a shallow 4-block 3D CNN-based model. Our
results indicate that shallow 3D CNN-based models are sufficient to achieve
good classification results for Alzheimer's Disease using MRI scans.