Mitigating Face Recognition Bias via Group Adaptive Classifier
Face recognition is known to exhibit bias: subjects in one demographic
group can be recognized more accurately than those in others. This work aims
to learn a fair face representation in which faces from every group are
equally well represented. Our proposed group adaptive classifier mitigates
bias by applying adaptive convolution kernels and attention mechanisms to
faces based on their demographic attributes. The adaptive module comprises
kernel masks and channel-wise attention maps for each demographic group,
activating different facial regions for identification and yielding more
discriminative features pertinent to each demographic. Our automated
adaptation strategy determines whether to apply adaptation to a given layer
by iteratively computing the dissimilarity among demographic-adaptive
parameters. A new de-biasing loss function is proposed to narrow the gap in
average intra-class distance between demographic groups. Experiments on face
benchmarks (RFW, LFW, IJB-A, and IJB-C) show that our method mitigates face
recognition bias across demographic groups while maintaining competitive
accuracy.
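
As a concrete illustration, here is a minimal PyTorch sketch of the two mechanisms the abstract describes: a per-group adaptive convolution (kernel mask plus channel-wise attention) and a de-biasing loss on the spread of average intra-class distance across groups. All names (GroupAdaptiveConv, debias_loss) and design details are illustrative assumptions, not the paper's released code.

```python
# Sketch only: illustrative names and shapes, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GroupAdaptiveConv(nn.Module):
    """Conv layer with a per-group kernel mask and channel-wise attention."""
    def __init__(self, in_ch, out_ch, k, num_groups):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2, bias=False)
        # One learnable kernel mask per demographic group (same shape as weight).
        self.kernel_masks = nn.Parameter(torch.ones(num_groups, out_ch, in_ch, k, k))
        # One channel-wise attention vector per group.
        self.channel_attn = nn.Parameter(torch.zeros(num_groups, out_ch))

    def forward(self, x, group_id):
        # group_id: LongTensor of shape (B,) with each sample's group index.
        outs = []
        for i, g in enumerate(group_id.tolist()):
            w = self.conv.weight * torch.sigmoid(self.kernel_masks[g])
            y = F.conv2d(x[i : i + 1], w, padding=self.conv.padding)
            attn = torch.sigmoid(self.channel_attn[g]).view(1, -1, 1, 1)
            outs.append(y * attn)  # emphasize group-specific channels/regions
        return torch.cat(outs, dim=0)

def debias_loss(embeddings, labels, groups):
    """Penalize the spread of average intra-class distance across groups."""
    per_group = []
    for g in groups.unique():
        gm = groups == g
        dists = []
        for c in labels[gm].unique():
            feats = embeddings[gm][labels[gm] == c]
            if len(feats) > 1:
                center = feats.mean(dim=0, keepdim=True)
                dists.append((feats - center).norm(dim=1).mean())
        if dists:
            per_group.append(torch.stack(dists).mean())
    per_group = torch.stack(per_group)
    # Zero when every group has the same average intra-class distance.
    return (per_group - per_group.mean()).abs().mean()
```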
Graph Structure of Neural Networks
Neural networks are often represented as graphs of connections between
neurons. However, despite their wide use, there is currently little
understanding of the relationship between a neural network's graph structure
and its predictive performance. Here we systematically investigate how the
graph structure of neural networks affects their predictive performance. To
this end, we develop a novel graph-based representation of neural networks
called the relational graph, where layers of neural network computation
correspond to rounds of message exchange along the graph structure. Using
this representation we show that: (1) a "sweet spot" of relational graphs
leads to neural networks with significantly improved predictive performance;
(2) a neural network's performance is approximately a smooth function of the
clustering coefficient and average path length of its relational graph; (3)
our findings are consistent across many different tasks and datasets; (4) the
sweet spot can be identified efficiently; (5) top-performing neural networks
have graph structures surprisingly similar to those of real biological neural
networks. Our work opens new directions for the design of neural
architectures and for the understanding of neural networks in general.
Comment: ICML 2020, with open-source code.
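The two graph statistics in finding (2), and one round of message exchange, are easy to make concrete. The sketch below uses networkx; the function names are mine, and the Watts-Strogatz example graph is only an assumed stand-in for a graph near the reported sweet spot.

```python
# Illustrative sketch: a relational graph where one layer of computation is
# one round of message exchange along edges; graph statistics via networkx.
import networkx as nx
import numpy as np

def graph_measures(G):
    """The two axes the paper reports the 'sweet spot' over."""
    return nx.average_clustering(G), nx.average_shortest_path_length(G)

def message_exchange_round(G, feats, W):
    """One round: each node transforms and aggregates neighbor features
    (a graph-level view of one neural-network layer)."""
    new = np.zeros_like(feats)
    for v in G.nodes:
        nbrs = list(G.neighbors(v)) + [v]            # include self-loop
        msgs = feats[nbrs] @ W                       # per-message transform
        new[v] = np.maximum(msgs.mean(axis=0), 0.0)  # aggregate + ReLU
    return new

# Example: a small-world graph (moderate clustering, short paths).
G = nx.connected_watts_strogatz_graph(n=64, k=8, p=0.5, seed=0)
C, L = graph_measures(G)
print(f"clustering={C:.3f}, avg path length={L:.3f}")

d = 16
feats = np.random.randn(64, d)
W = np.random.randn(d, d) / np.sqrt(d)
for _ in range(4):  # four rounds ~ four layers of computation
    feats = message_exchange_round(G, feats, W)
```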
Channel Equilibrium Networks for Learning Deep Representation
Convolutional Neural Networks (CNNs) are typically constructed by stacking
multiple building blocks, each of which contains a normalization layer such as
batch normalization (BN) and a rectified linear function such as ReLU.
However, this work shows that the combination of normalization and the
rectified linear function leads to inhibited channels, which have small
magnitude and contribute little to the learned feature representation,
impeding the generalization ability of CNNs. Unlike prior work that simply
removed inhibited channels, we propose to "wake them up" during training by
designing a novel neural building block, termed the Channel Equilibrium (CE)
block, which enables channels at the same layer to contribute equally to the
learned representation. We show, both empirically and theoretically, that CE
prevents inhibited channels. CE has several appealing benefits. (1) It can be
integrated into many advanced CNN architectures such as ResNet and MobileNet,
outperforming the original networks. (2) CE has an interesting connection
with the Nash equilibrium, a well-known solution concept in non-cooperative
games. (3) Extensive experiments show that CE achieves state-of-the-art
performance on challenging benchmarks such as ImageNet and COCO.
Comment: 19 pages, 8 figures.
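To make the "wake up inhibited channels" idea concrete, here is a simplified PyTorch stand-in that rescales channels after BN + ReLU so low-magnitude channels are amplified and strong ones damped. This is a heuristic sketch under my own assumptions, not the paper's decorrelation-based CE formulation, and ChannelEqualizer is an assumed name.

```python
# Heuristic stand-in for channel equilibration, not the paper's CE block.
import torch
import torch.nn as nn

class ChannelEqualizer(nn.Module):
    def __init__(self, channels, eps=1e-5):
        super().__init__()
        self.bn = nn.BatchNorm2d(channels)
        self.eps = eps
        self.gamma = nn.Parameter(torch.ones(channels))  # learnable scale

    def forward(self, x):
        y = torch.relu(self.bn(x))
        # Per-channel mean magnitude over batch and spatial dimensions.
        mag = y.mean(dim=(0, 2, 3))  # shape (C,)
        # Equalizing weights: inverse magnitude normalized to mean 1, so
        # weak ("inhibited") channels are boosted toward equal contribution.
        w = (mag.mean() / (mag + self.eps)) * self.gamma
        return y * w.view(1, -1, 1, 1)

# Usage: drop-in after a conv in a ResNet-style block.
block = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), ChannelEqualizer(32))
out = block(torch.randn(2, 3, 8, 8))
```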