Towards Interpretable and Robust Hand Detection via Pixel-wise Prediction
The lack of interpretability of existing CNN-based hand detection methods
makes it difficult to understand the rationale behind their predictions. In
this paper, we propose a novel neural network model, which introduces
interpretability into hand detection for the first time. The main improvements
include: (1) Hands are detected at the pixel level, which reveals the pixels each
decision is based on and improves the transparency of the model. (2) The explainable
Highlight Feature Fusion block highlights distinctive features among multiple
layers and learns discriminative ones to gain robust performance. (3) We
introduce a transparent representation, the rotation map, to learn rotation
features instead of complex and non-transparent rotation and derotation layers.
(4) Auxiliary supervision accelerates training, saving more
than 10 hours in our experiments. Experimental results on the VIVA and Oxford
hand detection and tracking datasets show that our method achieves accuracy
competitive with state-of-the-art methods while running faster.
Comment: Accepted to Pattern Recognitio
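
To make the idea of pixel-wise prediction with a rotation map more concrete, the following is a minimal sketch of how such outputs might be decoded into hand regions. It is not the authors' implementation: the function name `decode_hands`, the 0.5 threshold, the 4-connected grouping, and the assumption that the rotation map stores one angle in radians per pixel are all hypothetical choices made for illustration.

```python
import math
from collections import deque


def decode_hands(score_map, rotation_map, threshold=0.5):
    """Group above-threshold pixels into regions (hypothetical decoding step).

    score_map:    2D list of per-pixel hand scores in [0, 1] (assumed format).
    rotation_map: 2D list of per-pixel angles in radians (assumed format).
    Returns one dict per region with its bounding box, mean angle, and pixels.
    """
    height, width = len(score_map), len(score_map[0])
    seen = [[False] * width for _ in range(height)]
    detections = []
    for r0 in range(height):
        for c0 in range(width):
            if seen[r0][c0] or score_map[r0][c0] < threshold:
                continue
            # Breadth-first flood fill over 4-connected neighbours.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            pixels = []
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < height and 0 <= nc < width
                            and not seen[nr][nc]
                            and score_map[nr][nc] >= threshold):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            rows = [p[0] for p in pixels]
            cols = [p[1] for p in pixels]
            # Circular mean, so angles near +pi and -pi average sensibly.
            sin_sum = sum(math.sin(rotation_map[r][c]) for r, c in pixels)
            cos_sum = sum(math.cos(rotation_map[r][c]) for r, c in pixels)
            detections.append({
                "box": (min(rows), min(cols), max(rows), max(cols)),
                "angle": math.atan2(sin_sum, cos_sum),
                "pixels": pixels,  # the pixels that support this detection
            })
    return detections


# Toy inputs: two separate blobs of high-scoring pixels.
score = [
    [0.9, 0.8, 0.0, 0.0, 0.0],
    [0.7, 0.9, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.6],
    [0.0, 0.0, 0.0, 0.9, 0.8],
]
rotation = [
    [0.5, 0.5, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, -1.0],
    [0.0, 0.0, 0.0, -1.0, -1.0],
]
hands = decode_hands(score, rotation)
for hand in hands:
    print(hand["box"], round(hand["angle"], 3), len(hand["pixels"]))
```

Because every detection carries the list of pixels that produced it, the basis for each decision can be inspected directly, which is the kind of transparency the abstract describes. The per-region angle is read from the map rather than produced by dedicated rotation and derotation layers.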