The architecture of the brain is too complex to be intuitively surveyable
without the use of compressed representations that project its variation into a
compact, navigable space. The task is especially challenging with
high-dimensional data, such as gene expression, where the joint complexity of
anatomical and transcriptional patterns demands maximum compression.
Established practice is to use standard principal component analysis (PCA),
whose computational felicity is offset by limited expressivity, especially at
high compression ratios. Employing whole-brain, voxel-wise Allen Brain Atlas
transcription data, here we systematically compare compressed representations
based on the most widely supported linear and non-linear methods (PCA, kernel
PCA, non-negative matrix factorization (NMF), t-distributed stochastic neighbour
embedding (t-SNE), uniform manifold approximation and projection (UMAP), and
deep auto-encoding), quantifying reconstruction fidelity, anatomical coherence, and
predictive utility with respect to signalling, microstructural, and metabolic
targets. We show that deep auto-encoders yield superior representations across
all metrics of performance and target domains, supporting their use as the
reference standard for representing transcription patterns in the human brain.

Comment: 21 pages, 5 main figures, 1 supplementary figure