Aligned Image-Word Representations Improve Inductive Transfer Across Vision-Language Tasks
An important goal of computer vision is to build systems that learn visual
representations over time that can be applied to many tasks. In this paper, we
investigate a vision-language embedding as a core representation and show that
it leads to better cross-task transfer than standard multi-task learning. In
particular, the task of visual recognition is aligned to the task of visual
question answering by forcing each to use the same word-region embeddings. We
show this leads to greater inductive transfer from recognition to VQA than
standard multitask learning. Visual recognition also improves, especially for
categories that have relatively few recognition training labels but appear
often in the VQA setting. Thus, our paper takes a small step towards creating
more general vision systems by showing the benefit of interpretable, flexible,
and trainable core representations.
Comment: Accepted at ICCV 2017. The arXiv version has an extra analysis on correlation with human attention.
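A minimal PyTorch-style sketch of the shared word-region embedding idea the abstract describes: region features and word indices are mapped into one joint space, and the recognition head scores region features against the same word vectors that a VQA head (not shown) would reuse. All names, layer sizes, and the region-feature source below are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch: one word-region embedding shared across tasks.
import torch
import torch.nn as nn


class SharedWordRegionEmbedding(nn.Module):
    """Maps region features and word indices into a common embedding space."""

    def __init__(self, vocab_size, region_dim, embed_dim):
        super().__init__()
        self.word_embed = nn.Embedding(vocab_size, embed_dim)  # shared word vectors
        self.region_proj = nn.Linear(region_dim, embed_dim)    # project CNN region features

    def score(self, region_feats, word_ids):
        # region_feats: (batch, num_regions, region_dim)
        # word_ids:     (num_words,) indices of candidate words
        regions = self.region_proj(region_feats)               # (B, R, D)
        words = self.word_embed(word_ids)                      # (W, D)
        return regions @ words.t()                             # (B, R, W) similarity scores


# Recognition head: pool region-word scores over regions to classify categories.
# A VQA head would call the same `score` on question words, which is the
# alignment that drives the inductive transfer discussed in the abstract.
embed = SharedWordRegionEmbedding(vocab_size=10000, region_dim=2048, embed_dim=300)
region_feats = torch.randn(2, 36, 2048)                        # e.g. 36 detected regions per image
category_word_ids = torch.arange(80)                           # words naming 80 categories
recognition_logits = embed.score(region_feats, category_word_ids).max(dim=1).values
```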
MUST-CNN: A Multilayer Shift-and-Stitch Deep Convolutional Architecture for Sequence-based Protein Structure Prediction
Predicting protein properties such as solvent accessibility and secondary
structure from the primary amino acid sequence is an important task in
bioinformatics. Recently, a few deep learning models have surpassed the
traditional window-based multilayer perceptron. Taking inspiration from the
image classification domain, we propose a deep convolutional neural network
architecture, MUST-CNN, to predict protein properties. This architecture uses a
novel multilayer shift-and-stitch (MUST) technique to generate fully dense
per-position predictions on protein sequences. Our model is significantly
simpler than the state-of-the-art, yet achieves better results. By combining
MUST and the efficient convolution operation, we can consider far more
parameters while retaining very fast prediction speeds. We beat the
state-of-the-art performance on two large protein property prediction datasets.
Comment: 8 pages; 3 figures; deep learning based sequence-to-sequence prediction. In AAAI 2016.
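A rough sketch of the shift-and-stitch idea on a toy strided 1-D conv net over one-hot amino-acid channels: the net is run on k shifted copies of the sequence and the coarse outputs are interleaved to recover one prediction per residue. The layer sizes, pooling factor, and class count are placeholders, not the MUST-CNN configuration, and the wrap-around shift stands in for proper padding.

```python
# Hypothetical shift-and-stitch sketch for dense per-position sequence prediction.
import torch
import torch.nn as nn

k = 2  # total downsampling factor of the toy network

net = nn.Sequential(
    nn.Conv1d(21, 64, kernel_size=5, padding=2),  # 21 = amino-acid one-hot channels
    nn.ReLU(),
    nn.MaxPool1d(kernel_size=k),                  # coarsens the sequence by a factor of k
    nn.Conv1d(64, 8, kernel_size=5, padding=2),   # 8 = e.g. secondary-structure classes
)

def shift_and_stitch(x):
    # x: (batch, channels, length), length divisible by k for simplicity.
    # Run the net on k shifted copies (torch.roll wraps at the boundary;
    # a real implementation would pad instead), then interleave the outputs.
    outputs = [net(torch.roll(x, shifts=-s, dims=2)) for s in range(k)]  # each (B, C, L//k)
    stacked = torch.stack(outputs, dim=3)          # (B, C, L//k, k)
    return stacked.flatten(start_dim=2)            # interleave -> (B, C, L)

x = torch.randn(1, 21, 100)          # toy protein of length 100
dense_logits = shift_and_stitch(x)   # one class-score vector per residue position
```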