Multi-scale Alignment and Contextual History for Attention Mechanism in Sequence-to-sequence Model
A sequence-to-sequence model is a neural network module for mapping between two
sequences of different lengths. The sequence-to-sequence model has three core
modules: encoder, decoder, and attention. Attention is the bridge that connects
the encoder and decoder modules and improves model performance in many tasks.
In this paper, we propose two ideas to improve sequence-to-sequence model
performance by enhancing the attention module. First, we maintain the history
of the location and the expected context from several previous time-steps.
Second, we apply multiscale convolution from several previous attention vectors
to the current decoder state. We applied our proposed framework to
sequence-to-sequence speech recognition and text-to-speech systems. The results
show that the proposed extension significantly improves performance over a
standard attention baseline.
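The second idea in the abstract above, multiscale convolution over several previous attention vectors, can be illustrated with a minimal sketch. This is not the authors' architecture: the function names, the fixed kernels, and the way the location features are added to the content scores are all assumptions made for illustration, and a real model would learn the kernels and the projection weights.

```python
import math


def conv1d_same(signal, kernel):
    """Slide `kernel` over `signal` with zero padding (output has the same length).

    This is cross-correlation (no kernel flip), which is what most deep
    learning libraries call "convolution".
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multiscale_location_features(history, kernels):
    """Return one feature per (previous alignment, kernel) for each encoder position."""
    feats = [[] for _ in history[0]]
    for prev_alignment in history:
        for kernel in kernels:  # kernels of different widths = different scales
            for pos, value in enumerate(conv1d_same(prev_alignment, kernel)):
                feats[pos].append(value)
    return feats


def attend(content_scores, history, kernels, weights):
    """Hypothetical scoring: content score plus a weighted sum of location features."""
    feats = multiscale_location_features(history, kernels)
    scores = [
        c + sum(w * f for w, f in zip(weights, fs))
        for c, fs in zip(content_scores, feats)
    ]
    return softmax(scores)


# Two previous alignments over four encoder positions, two kernel widths.
history = [[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
kernels = [[1.0], [0.25, 0.5, 0.25]]
alignment = attend([0.0] * 4, history, kernels, [1.0] * 4)
```

In this toy setup the new alignment concentrates on the positions that the two previous alignments attended to, which is the intuition behind feeding alignment history back into the attention module.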
Attention-Aware Face Hallucination via Deep Reinforcement Learning
Face hallucination is a domain-specific super-resolution problem whose goal is
to generate high-resolution (HR) faces from low-resolution (LR) input images.
In contrast to existing methods, which often learn a single patch-to-patch
mapping from LR to HR images and disregard the contextual interdependency
between patches, we propose a novel Attention-aware Face Hallucination
(Attention-FH) framework that uses deep reinforcement learning to sequentially
discover attended patches and then perform the facial part enhancement by
fully exploiting the global interdependency of the image. Specifically, at
each time step, a recurrent policy network dynamically specifies a new
attended region by incorporating what happened in the past. The state (i.e.,
face hallucination result for the whole
image) can thus be exploited and updated by the local enhancement network on
the selected region. The Attention-FH approach jointly learns the recurrent
policy network and local enhancement network through maximizing the long-term
reward that reflects the hallucination performance over the whole image.
Therefore, our proposed Attention-FH is capable of adaptively personalizing an
optimal search path for each face image according to its own characteristics.
Extensive experiments show that our approach significantly surpasses the
state of the art on in-the-wild faces with large pose and illumination
variations.
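The sequential procedure described in this abstract (choose a region, enhance it, update the whole-image state, repeat) can be sketched as a plain loop. Everything named here is hypothetical: `hallucinate`, `make_scan_policy`, and the toy enhancement function are stand-ins, whereas the paper uses a recurrent policy network and a local enhancement network trained jointly with a long-term reward. The sketch shows only the control flow, not the learning.

```python
def hallucinate(image, patch, steps, choose_region, enhance):
    """Run `steps` rounds of attend-then-enhance on a copy of `image`.

    image: list of rows of floats (the current whole-image state).
    choose_region(state, path) -> (top, left): stand-in for the policy network.
    enhance(region) -> region: stand-in for the local enhancement network.
    """
    state = [row[:] for row in image]
    path = []  # regions visited so far, i.e. "what happened in the past"
    for _ in range(steps):
        top, left = choose_region(state, path)
        region = [row[left:left + patch] for row in state[top:top + patch]]
        improved = enhance(region)
        # Write the enhanced patch back into the whole-image state.
        for r, new_row in enumerate(improved):
            state[top + r][left:left + patch] = new_row
        path.append((top, left))
    return state, path


def make_scan_policy(patch):
    """Hypothetical fixed policy: visit unvisited grid cells in raster order."""
    def policy(state, path):
        height, width = len(state), len(state[0])
        cells = [(t, l) for t in range(0, height, patch)
                 for l in range(0, width, patch)]
        for cell in cells:
            if cell not in path:
                return cell
        return cells[-1]
    return policy


def brighten(region):
    """Toy enhancement: add 1.0 to every pixel of the selected region."""
    return [[v + 1.0 for v in row] for row in region]


low_res = [[0.0] * 4 for _ in range(4)]
result, visited = hallucinate(low_res, 2, 2, make_scan_policy(2), brighten)
```

A learned policy would replace the raster scan with a per-image ordering of regions, which is what the abstract means by a personalized search path.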