Sensor Transformation Attention Networks
Recent work on encoder-decoder models for sequence-to-sequence mapping has
shown that integrating both temporal and spatial attention mechanisms into
neural networks increases the performance of the system substantially. In this
work, we report on the application of an attentional signal not on temporal and
spatial regions of the input, but instead as a method of switching among inputs
themselves. We evaluate the particular role of attentional switching in the
presence of dynamic noise in the sensors, and demonstrate how the attentional
signal responds dynamically to changing noise levels in the environment to
achieve increased performance on both audio and visual tasks in three
commonly used datasets: TIDIGITS, Wall Street Journal, and GRID. Moreover, the
proposed sensor transformation network architecture naturally introduces a
number of advantages that merit exploration, including ease of adding new
sensors to existing architectures, attentional interpretability, and increased
robustness in a variety of noisy environments not seen during training.
Finally, we demonstrate that the sensor selection attention mechanism of a
model trained only on the small TIDIGITS dataset can be transferred directly to
a pre-existing larger network trained on the Wall Street Journal dataset,
maintaining functionality of switching between sensors to yield a dramatic
reduction of error in the presence of noise.
Comment: 8 pages, 5 figures, 3 tables
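The idea of using attention to switch among sensors, rather than among regions of a single input, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each sensor has already been mapped to a feature vector of the same length and has been assigned a scalar attention score, and the names `softmax`, `merge_sensors`, `audio`, and `video`, as well as all numeric values, are hypothetical. In a trained network the scores would be produced by learned layers; here they are supplied by hand.

```python
import math


def softmax(scores):
    """Normalize scalar scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def merge_sensors(features, scores):
    """Combine per-sensor feature vectors using attention weights.

    features: list of equal-length feature vectors, one per sensor
    scores:   list of scalar attention scores, one per sensor
    Returns the attention weights and the weighted sum of the features.
    """
    weights = softmax(scores)
    dim = len(features[0])
    merged = [
        sum(w * f[i] for w, f in zip(weights, features))
        for i in range(dim)
    ]
    return weights, merged


# Hypothetical feature vectors for two sensors at one time step.
audio = [1.0, 0.0, 2.0]
video = [0.0, 4.0, 2.0]

# Hand-picked scores: the audio sensor is scored higher, as might
# happen when the video sensor is the noisier of the two.
weights, merged = merge_sensors([audio, video], scores=[2.0, 0.0])
print(weights)
print(merged)
```

Because the weights are a softmax over sensors, lowering one sensor's score shifts the merged representation toward the others, and adding a sensor only adds one more feature vector and one more score to the lists.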