Designing more efficient, reliable, and explainable neural network
architectures is critical to studies that are based on artificial intelligence
(AI) techniques. Previous post-hoc analyses have found that the best-performing
artificial neural networks (ANNs) surprisingly resemble biological neural
networks (BNNs), which suggests that ANNs and BNNs may share common principles
for achieving optimal performance in machine learning or cognitive/behavioral tasks.
Inspired by this phenomenon, we proactively instill organizational principles
of BNNs to guide the redesign of ANNs. We leverage the Core-Periphery (CP)
organization, which is widely found in human brain networks, to guide the
information communication mechanism in the self-attention of the vision transformer
(ViT), and we name this novel framework CP-ViT. In CP-ViT, the attention
operation between nodes is defined by a sparse graph with a Core-Periphery
structure (CP graph), where the core nodes are redesigned and reorganized to
play an integrative role and serve as a center for other periphery nodes to
exchange information. We evaluated the proposed CP-ViT on multiple public
datasets, including medical image datasets (INbreast) and natural image
datasets. Interestingly, by incorporating the BNN-derived principle (CP
structure) into the redesign of ViT, our CP-ViT outperforms other
state-of-the-art ANNs. In general, our work advances the state of the art in
three aspects: 1) This work provides novel insights for brain-inspired AI: we
can utilize the principles found in BNNs to guide and improve our ANN
architecture design; 2) We show that there exist sweet spots of CP graphs that
lead to CP-ViTs with significantly improved performance; and 3) The core nodes
in CP-ViT correspond to meaningful, task-relevant image patches,
which can significantly enhance the interpretability of the trained deep model.
Comment: Core-periphery, functional brain networks, Vi
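
The idea of restricting self-attention to the edges of a CP graph can be illustrated with a minimal NumPy sketch. This is not the CP-ViT implementation: the connectivity rule used here (core nodes attend to every node, periphery nodes attend only to core nodes and to themselves), the choice of the first n_core nodes as the core, and the function names core_periphery_mask and masked_self_attention are all assumptions made for illustration.

```python
import numpy as np


def core_periphery_mask(n_nodes, n_core):
    """Return a boolean matrix that is True where attention is allowed.

    Assumed (illustrative) rule: the first n_core nodes are core nodes.
    Core nodes connect to every node; periphery nodes connect only to
    core nodes and to themselves.
    """
    is_core = np.arange(n_nodes) < n_core
    # An edge exists if at least one endpoint is a core node.
    mask = is_core[:, None] | is_core[None, :]
    # Every node may always attend to itself.
    np.fill_diagonal(mask, True)
    return mask


def masked_self_attention(q, k, v, mask):
    """Scaled dot-product attention restricted to the edges in mask."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    # Disallowed pairs get -inf so their softmax weight is exactly zero.
    scores = np.where(mask, scores, -np.inf)
    # Subtract the row maximum for numerical stability (finite because
    # the diagonal is always allowed).
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


# Toy example: 6 nodes (e.g. image patches), 2 of them core, 4-dim features.
rng = np.random.default_rng(0)
n_nodes, n_core, dim = 6, 2, 4
x = rng.normal(size=(n_nodes, dim))
mask = core_periphery_mask(n_nodes, n_core)
out, weights = masked_self_attention(x, x, x, mask)
```

In this toy setting, two periphery nodes never exchange information directly; any such exchange has to pass through a core node in a later layer, which is the integrative role the abstract attributes to the core.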