
Hierarchical LSTM with Adjusted Temporal Attention for Video Captioning

Author: Gao Lianli
Guo Zhao
Liu Wu
Shen Heng Tao
Song Jingkuan
Zhang Dongxiang
Publication date: 01/01/2017

Recent progress has been made in using the attention-based encoder-decoder framework for video captioning. However, most existing decoders apply the attention mechanism to every generated word, including both visual words (e.g., "gun" and "shooting") and non-visual words (e.g., "the" and "a"). These non-visual words can be easily predicted using a natural language model without considering visual signals or attention, and imposing the attention mechanism on them could mislead the decoder and decrease the overall performance of video captioning. To address this issue, we propose a hierarchical LSTM with adjusted temporal attention (hLSTMat) approach for video captioning. Specifically, the proposed framework uses temporal attention to select specific frames for predicting the related words, while the adjusted temporal attention decides whether to depend on the visual information or the language context information. In addition, a hierarchical LSTM is designed to simultaneously consider both low-level visual information and high-level language context information to support caption generation. To demonstrate the effectiveness of the proposed framework, we test our method on two prevalent datasets, MSVD and MSR-VTT, and experimental results show that our approach outperforms the state-of-the-art methods on both datasets.
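
The gating idea in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the dot-product scoring, the scalar sigmoid gate, the function names, and the requirement that frame features and the decoder state share one dimension are all assumptions made for illustration.

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def temporal_attention(frame_feats, hidden):
    """Weight each frame by its dot-product relevance to the decoder state."""
    weights = softmax([dot(f, hidden) for f in frame_feats])
    dim = len(frame_feats[0])
    context = [sum(w * f[i] for w, f in zip(weights, frame_feats))
               for i in range(dim)]
    return context, weights

def adjusted_context(frame_feats, hidden, gate_weights):
    """Blend visual context and language state with a gate beta in (0, 1).

    Assumption: beta = sigmoid(gate_weights . hidden). A beta near 1 relies
    on the attended frames; a beta near 0 relies on the language state.
    """
    visual, weights = temporal_attention(frame_feats, hidden)
    beta = 1.0 / (1.0 + math.exp(-dot(gate_weights, hidden)))
    mixed = [beta * v + (1.0 - beta) * h for v, h in zip(visual, hidden)]
    return mixed, beta, weights
```

For a non-visual word such as "the", a trained gate of this kind would be expected to produce a small beta, so the prediction leans on the language state rather than on the frames.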

arXiv.org e-Print Archive

Crossref

OPUS - University of Technology Sydney

Finite-element implementation for electron transport in nanostructures

Author: A. S. Foster
George A.
M. H. Hakala
M. J. Puska
P. Havu
R. M. Nieminen
V. Havu
Publication venue: AIP Publishing
Publication date: 01/01/2006

We have modeled transport properties of nanostructures using the Green’s-function method within the framework of density-functional theory. The scheme is computationally demanding, so numerical methods have to be chosen carefully. A typical solution to the numerical burden is to use a special basis-function set that is tailored to the problem in question, for example, the atomic-orbital basis. In this paper we present our solution to the problem. We have used the finite-element method with a hierarchical high-order polynomial basis, the so-called p elements. This method allows the discretization error to be controlled in a systematic way. The p elements work so efficiently that they can be used to solve interesting nanosystems described by nonlocal pseudopotentials. We demonstrate the potential of the implementation with two different systems. As a test system, a simple Na-atom chain between two leads is modeled and the results are compared with several previous calculations. Secondly, we consider a thin hafnium dioxide (HfO2) layer on a silicon surface as a model for a gate structure of the next generation of microelectronics. Peer reviewed
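
To make the phrase "hierarchical high-order polynomial basis" concrete, the sketch below evaluates one common one-dimensional construction: two linear vertex functions plus integrated-Legendre bubble functions on the reference element [-1, 1]. The abstract does not say which basis the authors use, so this choice and the function names are assumptions, and the paper's elements are three-dimensional.

```python
import math

def legendre(n, x):
    """Legendre polynomial P_n(x) via the three-term recurrence."""
    if n == 0:
        return 1.0
    prev, curr = 1.0, x
    for k in range(1, n):
        prev, curr = curr, ((2 * k + 1) * x * curr - k * prev) / (k + 1)
    return curr

def hierarchical_basis(p, x):
    """Values at x in [-1, 1] of a hierarchical basis of polynomial order p.

    The first two entries are the linear vertex functions. Each further
    entry is a bubble function of order k that vanishes at both endpoints.
    """
    values = [(1.0 - x) / 2.0, (1.0 + x) / 2.0]
    for k in range(2, p + 1):
        scale = math.sqrt(2.0 * (2 * k - 1))
        values.append((legendre(k, x) - legendre(k - 2, x)) / scale)
    return values
```

The basis is hierarchical in the sense that raising p only appends functions: `hierarchical_basis(3, x)` is a prefix of `hierarchical_basis(5, x)`. This is what lets the polynomial order be increased step by step to control the discretization error without rebuilding the lower-order part.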

Crossref

Aaltodoc Publication Archive

Long Text Generation via Adversarial Training with Leaked Information

Author: Cai Han
Guo Jiaxian
Lu Sidi
Wang Jun
Yu Yong
Zhang Weinan
Publication date: 08/12/2017

Automatically generating coherent and semantically meaningful text has many applications in machine translation, dialogue systems, image captioning, etc. Recently, by combining with policy gradient, Generative Adversarial Nets (GAN) that use a discriminative model to guide the training of the generative model as a reinforcement learning policy have shown promising results in text generation. However, the scalar guiding signal is only available after the entire text has been generated and lacks intermediate information about text structure during the generative process. As such, it limits its success when the generated text samples are long (more than 20 words). In this paper, we propose a new framework, called LeakGAN, to address the problem of long text generation. We allow the discriminative net to leak its own high-level extracted features to the generative net to further help the guidance. The generator incorporates such informative signals into all generation steps through an additional Manager module, which takes the extracted features of the currently generated words and outputs a latent vector to guide the Worker module for next-word generation. Our extensive experiments on synthetic data and various real-world tasks with a Turing test demonstrate that LeakGAN is highly effective in long text generation and also improves the performance in short text generation scenarios. More importantly, without any supervision, LeakGAN is able to implicitly learn sentence structures only through the interaction between Manager and Worker. Comment: 14 pages, AAAI 201
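
The Manager and Worker interaction described in this abstract can be pictured with a toy forward pass. This is a hedged sketch, not LeakGAN itself: the linear Manager, the unit-length goal vector, the dot-product scoring in the Worker, and all names are assumptions, and no training or discriminator is included.

```python
import math

def matvec(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def normalize(vec):
    # Fall back to 1.0 so an all-zero vector does not divide by zero.
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def manager_goal(leaked_features, manager_weights):
    """Map features leaked by the discriminator to a unit-length goal vector."""
    return normalize(matvec(manager_weights, leaked_features))

def worker_next_token_probs(token_embeddings, goal):
    """Score each candidate token by its alignment with the Manager's goal."""
    return softmax(matvec(token_embeddings, goal))
```

In this sketch the guidance reaches the Worker at every step through the goal vector, in contrast to a single scalar reward that arrives only after the whole text has been generated.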

arXiv.org e-Print Archive

UCL Discovery