Efficient Multi-order Gated Aggregation Network

Chen, Zhiyuan; Li, Siyuan; Li, Stan Z.; Lin, Haitao; Liu, Zicheng; Tan, Cheng; Wang, Zedong; Wu, Di; Zheng, Jiangbin

Authors: Zhiyuan Chen, Siyuan Li, Stan Z. Li, Haitao Lin, Zicheng Liu, Cheng Tan, Zedong Wang, Di Wu, Jiangbin Zheng
Publication date: 6 November 2022

Abstract

Since the recent success of Vision Transformers (ViTs), explorations toward transformer-style architectures have triggered the resurgence of modern ConvNets. In this work, we explore the representation ability of deep neural networks (DNNs) through the lens of interaction complexities. We empirically show that interaction complexity is an overlooked but essential indicator for visual recognition. Accordingly, a new family of efficient ConvNets, named MogaNet, is presented to pursue informative context mining in pure ConvNet-based models, with preferable complexity-performance trade-offs. In MogaNet, interactions across multiple complexities are facilitated and contextualized by leveraging two specially designed aggregation blocks in both spatial and channel interaction spaces. Extensive studies are conducted on ImageNet classification, COCO object detection, and ADE20K semantic segmentation tasks. The results demonstrate that MogaNet establishes a new state of the art over other popular methods in mainstream scenarios and at all model scales. In particular, the lightweight MogaNet-T achieves 80.0% top-1 accuracy with only 1.44G FLOPs using a refined training setup on ImageNet-1K, surpassing ParC-Net-S by 1.4% in accuracy while saving 59% (2.04G) of the FLOPs.

Comment: Preprint with 14 pages of main body and 5 pages of appendix
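
The efficiency comparison in the abstract can be checked with a few lines of arithmetic. The sketch below is only a consistency check on the quoted figures, not code from the paper. It assumes the 59% saving is measured relative to ParC-Net-S's own FLOPs, and that the 1.4% accuracy gap is in absolute percentage points; the variable names are illustrative.

```python
# Figures quoted in the abstract.
moganet_t_gflops = 1.44   # compute cost of MogaNet-T
saved_gflops = 2.04       # FLOPs saved relative to ParC-Net-S
moganet_t_top1 = 80.0     # ImageNet-1K top-1 accuracy (%)
top1_gain = 1.4           # accuracy gap over ParC-Net-S (assumed absolute points)

# Assumption: the saving is expressed as a fraction of ParC-Net-S's FLOPs.
parc_net_s_gflops = moganet_t_gflops + saved_gflops
saving_fraction = saved_gflops / parc_net_s_gflops
parc_net_s_top1 = moganet_t_top1 - top1_gain

print(f"Implied ParC-Net-S cost: {parc_net_s_gflops:.2f} GFLOPs")
print(f"Relative saving: {saving_fraction:.1%}")
print(f"Implied ParC-Net-S top-1: {parc_net_s_top1:.1f}%")
```

Under these assumptions the implied ParC-Net-S cost is 3.48G FLOPs and the relative saving is about 58.6%, which rounds to the 59% stated in the abstract.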

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2211.03295

Last time updated on 12/12/2022