OSU Multimodal Machine Translation System Report
This paper describes Oregon State University's submissions to the shared
WMT'17 task "multimodal translation task I". In this task, all the sentence
pairs are image captions in different languages. The key difference between
this task and conventional machine translation is that we have corresponding
images as additional information for each sentence pair. In this paper, we
introduce a simple but effective system which takes an image shared between
different languages, feeding it into both the encoding and decoding sides. We
report our system's performance for English-French and English-German with
Flickr30K (in-domain) and MSCOCO (out-of-domain) datasets. Our system achieves
the best performance in TER for English-German on the MSCOCO dataset. Comment: 5, WMT 201
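For readers unfamiliar with the TER (Translation Edit Rate) metric named in the abstract above, the following is a minimal sketch of a simplified variant: the word-level edit distance between hypothesis and reference, divided by the reference length. It is an assumption-laden illustration, not the official scorer used in the shared task: full TER also allows phrase shifts, which this sketch omits, and the function names `edit_distance` and `simplified_ter` are hypothetical.

```python
def edit_distance(hyp, ref):
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(ref) + 1))  # distance from empty hypothesis to ref[:j]
    for i, h in enumerate(hyp, start=1):
        curr = [i]  # distance from hyp[:i] to empty reference
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def simplified_ter(hypothesis, reference):
    """Edits per reference word; lower is better. Ignores TER's shift operation."""
    hyp, ref = hypothesis.split(), reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(hyp, ref) / len(ref)


# Illustrative sentences only; they are not taken from the task data.
score = simplified_ter("a cat runs fast", "a dog runs")
```

Because shifts are not modeled, this variant can only overestimate the true TER of a hypothesis whose words are correct but reordered.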
Structural heterogeneity in the megathrust zone and mechanism of the 2011 Tohoku-oki earthquake (Mw 9.0)
The great 2011 Tohoku-oki earthquake (Mw 9.0) and its 339 foreshocks and 5,609 aftershocks (9–27 March 2011) were relocated using a three-dimensional seismic velocity model and local P and S wave arrival times. The distribution of relocated hypocenters was compared with a tomographic image of the Northeast Japan forearc. The comparison indicates that the rupture nucleation of the largest events in the Tohoku-oki sequence, including the mainshock, was controlled by structural heterogeneities in the megathrust zone.
Tensor Product Generation Networks for Deep NLP Modeling
We present a new approach to the design of deep networks for natural language
processing (NLP), based on the general technique of Tensor Product
Representations (TPRs) for encoding and processing symbol structures in
distributed neural networks. A network architecture --- the Tensor Product
Generation Network (TPGN) --- is proposed which is capable in principle of
carrying out TPR computation, but which uses unconstrained deep learning to
design its internal representations. Instantiated in a model for image-caption
generation, TPGN outperforms LSTM baselines when evaluated on the COCO dataset.
The TPR-capable structure enables interpretation of internal representations
and operations, which prove to contain considerable grammatical content. Our
caption-generation model can be interpreted as generating sequences of
grammatical categories and retrieving words by their categories from a plan
encoded as a distributed representation.
Attentive Tensor Product Learning
This paper proposes a new architecture - Attentive Tensor Product Learning
(ATPL) - to represent grammatical structures in deep learning models. ATPL
exploits Tensor Product Representations (TPR), a structured neural-symbolic
model developed in cognitive science, to integrate deep learning with explicit
language structures and rules. The key ideas of ATPL are: 1) unsupervised
learning of role-unbinding vectors of words via a TPR-based deep neural
network; 2) employing
attention modules to compute TPR; and 3) integration of TPR with typical deep
learning architectures including Long Short-Term Memory (LSTM) and Feedforward
Neural Network (FFNN). The novelty of our approach lies in its ability to
extract the grammatical structure of a sentence by using role-unbinding
vectors, which are obtained in an unsupervised manner. This ATPL approach is
applied to 1) image captioning, 2) part of speech (POS) tagging, and 3)
constituency parsing of a sentence. Experimental results demonstrate the
effectiveness of the proposed approach.
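The two abstracts above both rely on Tensor Product Representations. As a rough illustration of the underlying idea only, the sketch below binds filler vectors (standing in for words) to role vectors (standing in for positions) by summing outer products, then recovers one filler with an unbinding vector. It assumes hand-picked orthonormal roles, so each role serves as its own unbinding vector; in ATPL the unbinding vectors are learned, which this sketch does not attempt. All names and values here are hypothetical.

```python
def outer(filler, role):
    """Outer product filler (x) role as a len(filler) x len(role) matrix."""
    return [[f * r for r in role] for f in filler]


def bind(fillers, roles):
    """TPR of a structure: the sum of filler (x) role over all bindings."""
    tpr = [[0.0] * len(roles[0]) for _ in range(len(fillers[0]))]
    for filler, role in zip(fillers, roles):
        product = outer(filler, role)
        tpr = [[a + b for a, b in zip(row_t, row_p)]
               for row_t, row_p in zip(tpr, product)]
    return tpr


def unbind(tpr, unbinding_vector):
    """Recover the filler bound to a role by multiplying the TPR by its unbinding vector."""
    return [sum(t * u for t, u in zip(row, unbinding_vector)) for row in tpr]


# Assumption: orthonormal one-hot roles, so role i is also unbinding vector i.
roles = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Assumption: toy 2-dimensional filler embeddings for a three-word sequence.
fillers = [[0.2, 0.9], [0.7, 0.1], [0.5, 0.5]]

sentence = bind(fillers, roles)
recovered = unbind(sentence, roles[1])  # filler stored in the second role
```

With orthonormal roles the unbinding is exact; with merely linearly independent roles, the unbinding vectors would instead be the dual basis of the role vectors.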
Hierarchically Structured Reinforcement Learning for Topically Coherent Visual Story Generation
We propose a hierarchically structured reinforcement learning approach to
address the challenges of planning for generating coherent multi-sentence
stories for the visual storytelling task. Within our framework, the task of
generating a story given a sequence of images is divided across a two-level
hierarchical decoder. The high-level decoder constructs a plan by generating a
semantic concept (i.e., topic) for each image in sequence. The low-level
decoder generates a sentence for each image using a semantic compositional
network, which effectively grounds the sentence generation conditioned on the
topic. The two decoders are jointly trained end-to-end using reinforcement
learning. We evaluate our model on the visual storytelling (VIST) dataset.
Empirical results from both automatic and human evaluations demonstrate that
the proposed hierarchically structured reinforced training achieves
significantly better performance compared to a strong flat deep reinforcement
learning baseline. Comment: Accepted to AAAI 201
IMPROVING DETERMINISM FOR WIRELESS: SETTING DYNAMIC CLEAR CHANNEL ASSESSMENT THRESHOLD IN LOW-POWER AND LOSSY NETWORKS
Techniques are described herein for determining Clear Channel Assessment (CCA) thresholds per link neighbor in low-power and lossy wireless networks. The dynamic CCA threshold is unique to each link neighbor, adds little extra traffic, and helps improve wireless network performance.