GM-Net: Learning Features with More Efficiency
Deep Convolutional Neural Networks (CNNs) are capable of learning
unprecedentedly effective features from images. Some researchers have sought
to improve parameter efficiency using grouped convolution. However, the
relation between the optimal number of convolutional groups and the recognition
performance remains an open problem. In this paper, we propose a series of
Basic Units (BUs) and a two-level merging strategy to construct deep CNNs,
referred to as a joint Grouped Merging Net (GM-Net), which can produce joint
grouped and reused deep features while maintaining the feature discriminability
for classification tasks. Our GM-Net architectures with the proposed BU_A
(dense connection) and BU_B (straight mapping) lead to a significant reduction in
the number of network parameters and to improved performance in image
classification tasks. Extensive experiments validate the superior performance
of GM-Net over state-of-the-art methods on benchmark datasets, e.g., MNIST,
CIFAR-10, CIFAR-100 and SVHN.
Comment: 6 pages, 5 figures
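As a rough illustration of why grouped convolution improves parameter efficiency, the sketch below counts the weights of a standard 2D convolution layer with and without groups. It is not taken from the GM-Net paper: the function name `conv_params` and the channel sizes are hypothetical, and the formula is the usual one for grouped convolution, where each of the g groups maps C_in/g input channels to C_out/g output channels.

```python
def conv_params(c_in: int, c_out: int, k: int, groups: int = 1, bias: bool = False) -> int:
    """Count the learnable parameters of a k x k 2D convolution.

    Hypothetical helper for illustration only. With `groups` = g, each
    output channel sees only c_in / g input channels, so the weight
    count shrinks by a factor of g compared with a dense convolution.
    """
    if c_in % groups != 0 or c_out % groups != 0:
        raise ValueError("c_in and c_out must both be divisible by groups")
    weights = (c_in // groups) * c_out * k * k
    return weights + (c_out if bias else 0)


# Illustrative sizes (assumed, not from the paper): 64 -> 128 channels, 3x3 kernel.
dense = conv_params(64, 128, 3)               # 64 * 128 * 9 = 73728
grouped = conv_params(64, 128, 3, groups=4)   # 16 * 128 * 9 = 18432
print(dense, grouped, dense // grouped)
```

The count alone says nothing about accuracy; as the abstract notes, how the number of groups relates to recognition performance is the open question.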
POSTER: A Pyramid Cross-Fusion Transformer Network for Facial Expression Recognition
Facial Expression Recognition (FER) has received increasing interest in the
computer vision community. FER is a challenging task with three especially
prevalent key issues: inter-class similarity, intra-class discrepancy,
and scale sensitivity. Existing methods typically address some of these issues,
but do not tackle them all in a unified framework. Therefore, in this paper, we
propose a two-stream Pyramid crOss-fuSion TransformER network (POSTER) that
aims to holistically solve these issues. Specifically, we design a
transformer-based cross-fusion paradigm that enables effective collaboration of
facial landmark and direct image features to maximize proper attention to
salient facial regions. Furthermore, POSTER employs a pyramid structure to
promote scale invariance. Extensive experimental results demonstrate that our
POSTER outperforms SOTA methods on RAF-DB with 92.05%, FERPlus with 91.62%,
AffectNet (7 cls) with 67.31%, and AffectNet (8 cls) with 63.34%, respectively
- …