3 research outputs found
Iterative and Adaptive Sampling with Spatial Attention for Black-Box Model Explanations
Deep neural networks have achieved great success in many real-world
applications, yet it remains unclear and difficult to explain their
decision-making process to an end-user. In this paper, we address the
explainable AI problem for deep neural networks with our proposed framework,
named IASSA, which generates an importance map indicating how salient each
pixel is for the model's prediction with an iterative and adaptive sampling
module. We employ an affinity matrix calculated on multi-level deep learning
features to explore long-range pixel-to-pixel correlation, which can shift the
saliency values guided by our long-range and parameter-free spatial attention.
Extensive experiments on the MS-COCO dataset show that our proposed approach
matches or exceeds the performance of state-of-the-art black-box explanation
methods.
Comment: The paper was accepted to the IEEE Winter Conference on Applications
of Computer Vision (WACV'2020).
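To make the idea of a sampled importance map concrete, the following is a minimal sketch of the plain random-masking baseline that black-box explanation methods of this kind build on. It is not the IASSA procedure itself: the iterative and adaptive sampling and the affinity-based spatial attention described in the abstract are not reproduced here. The names `random_mask_saliency`, `score_fn`, `num_masks`, and `keep_prob` are hypothetical, and the "model" is a toy function that only looks at one pixel.

```python
import random

def random_mask_saliency(image, score_fn, num_masks=500, keep_prob=0.5, seed=0):
    """Estimate per-pixel importance for a black-box score_fn by random masking.

    Assumption: `image` is a 2D list of floats and `score_fn` maps such an
    image to a scalar score. Only the model's output is used, never its internals.
    """
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    weighted = [[0.0] * w for _ in range(h)]  # sum of scores over masks keeping the pixel
    counts = [[0] * w for _ in range(h)]      # number of masks keeping the pixel
    for _ in range(num_masks):
        # Keep each pixel independently with probability keep_prob.
        mask = [[1 if rng.random() < keep_prob else 0 for _ in range(w)]
                for _ in range(h)]
        masked = [[image[r][c] * mask[r][c] for c in range(w)] for r in range(h)]
        score = score_fn(masked)
        for r in range(h):
            for c in range(w):
                if mask[r][c]:
                    weighted[r][c] += score
                    counts[r][c] += 1
    # Importance of a pixel = average model score over the masks that kept it.
    return [[weighted[r][c] / counts[r][c] if counts[r][c] else 0.0
             for c in range(w)] for r in range(h)]

def toy_score(img):
    # Hypothetical black-box model: its output depends only on pixel (1, 2).
    return img[1][2]

image = [[1.0] * 4 for _ in range(3)]
saliency = random_mask_saliency(image, toy_score)
```

In this toy setup the pixel the model actually depends on receives the highest importance, while all other pixels receive a lower average score. An adaptive scheme, as the abstract describes, would presumably concentrate later samples on the regions that earlier rounds found salient rather than sampling uniformly throughout.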
VITAL: A Visual Interpretation on Text with Adversarial Learning for Image Labeling
In this paper, we propose a novel way to interpret text information by
extracting visual feature presentation from multiple high-resolution and
photo-realistic synthetic images generated by a text-to-image Generative
Adversarial Network (GAN) to improve the performance of image labeling.
Firstly, we design a stacked Generative Multi-Adversarial Network (GMAN),
StackGMAN++, a modified version of the current state-of-the-art Text-to-image
GAN, StackGAN++, to generate multiple synthetic images with various prior
noises conditioned on a text. We then extract deep visual features from the
generated synthetic images to explore the underlying visual concepts for text.
Finally, we combine image-level visual features, text-level features, and visual
features from the synthetic images to predict labels for images. We
conduct experiments on two benchmark datasets, and the experimental results
clearly demonstrate the efficacy of our proposed approach.
SGCN: Sparse Graph Convolution Network for Pedestrian Trajectory Prediction
Pedestrian trajectory prediction is a key technology in autopilot, yet it
remains very challenging due to complex interactions between pedestrians.
Previous works based on dense undirected interaction suffer from
modeling superfluous interactions and neglecting the trajectory motion tendency,
and thus inevitably deviate considerably from reality. To cope
with these issues, we present a Sparse Graph Convolution Network (SGCN) for
pedestrian trajectory prediction. Specifically, the SGCN explicitly models the
sparse directed interaction with a sparse directed spatial graph to capture
adaptive interactions between pedestrians. Meanwhile, we use a sparse directed temporal
graph to model the motion tendency, thereby facilitating prediction based on
the observed direction. Finally, parameters of a bi-Gaussian distribution for
trajectory prediction are estimated by fusing the above two sparse graphs. We
evaluate our proposed method on the ETH and UCY datasets, and the experimental
results show our method outperforms comparative state-of-the-art methods by 9%
in Average Displacement Error (ADE) and 13% in Final Displacement Error (FDE).
Notably, visualizations indicate that our method can capture adaptive
interactions between pedestrians and their effective motion tendencies.
Comment: Accepted by CVPR202
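For readers unfamiliar with the two metrics quoted above, here is a minimal sketch of how ADE and FDE are commonly computed for a single predicted trajectory. It assumes trajectories are lists of (x, y) positions at matching time steps; the function name `displacement_errors` is hypothetical, and the paper's exact evaluation protocol (for example, averaging over pedestrians or over several sampled predictions) is not stated in the abstract and is not reproduced here.

```python
import math

def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one trajectory given as lists of (x, y) points."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean distance between predicted and true position at each time step.
    dists = [math.hypot(px - tx, py - ty)
             for (px, py), (tx, ty) in zip(pred, truth)]
    ade = sum(dists) / len(dists)  # mean error over all predicted time steps
    fde = dists[-1]                # error at the final time step only
    return ade, fde

# Toy example: the prediction goes straight while the pedestrian drifts upward.
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, truth)
```

Here the per-step errors are 0, 1, and 2, so ADE is their mean and FDE is the last value. FDE is typically the larger of the two because prediction error tends to accumulate over the horizon.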