
    Hierarchical LSTM with Adjusted Temporal Attention for Video Captioning

    Recent progress has been made in using attention-based encoder-decoder frameworks for video captioning. However, most existing decoders apply the attention mechanism to every generated word, including both visual words (e.g., "gun" and "shooting") and non-visual words (e.g., "the", "a"). These non-visual words can be easily predicted using a natural language model without considering visual signals or attention. Imposing the attention mechanism on non-visual words could mislead the decoder and decrease the overall performance of video captioning. To address this issue, we propose a hierarchical LSTM with adjusted temporal attention (hLSTMat) approach for video captioning. Specifically, the proposed framework uses temporal attention to select specific frames for predicting the related words, while the adjusted temporal attention decides whether to depend on the visual information or the language context information. In addition, a hierarchical LSTM is designed to simultaneously consider both low-level visual information and high-level language context information to support video caption generation. To demonstrate the effectiveness of the proposed framework, we test our method on two prevalent datasets, MSVD and MSR-VTT, and experimental results show that our approach outperforms the state-of-the-art methods on both datasets.
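
The two attention steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the scalar gate `beta`, the softmax weighting over frames, and all function names here are assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Normalize raw relevance scores into attention weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def temporal_attention(frame_features, scores):
    """Temporal attention: a weighted average of per-frame feature vectors."""
    weights = softmax(scores)
    dim = len(frame_features[0])
    return [
        sum(w * frame[d] for w, frame in zip(weights, frame_features))
        for d in range(dim)
    ]


def adjusted_context(visual_context, language_context, gate_logit):
    """Adjusted attention (assumed form): blend visual and language context.

    beta near 1 relies on the attended frames (visual words);
    beta near 0 relies on the language context (non-visual words).
    """
    beta = sigmoid(gate_logit)
    return [
        beta * v + (1.0 - beta) * l
        for v, l in zip(visual_context, language_context)
    ]
```

In this reading, a word such as "the" would correspond to a strongly negative `gate_logit`, so the blended context is dominated by the language state rather than by the attended frames.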

    Finite-element implementation for electron transport in nanostructures

    We have modeled transport properties of nanostructures using the Green’s-function method within the framework of density-functional theory. The scheme is computationally demanding, so numerical methods have to be chosen carefully. A typical solution to the numerical burden is to use a special basis-function set tailored to the problem in question, for example, the atomic-orbital basis. In this paper we present our solution to the problem. We have used the finite-element method with a hierarchical high-order polynomial basis, the so-called p elements. This method allows the discretization error to be controlled in a systematic way. The p elements work so efficiently that they can be used to solve interesting nanosystems described by nonlocal pseudopotentials. We demonstrate the potential of the implementation with two different systems. First, as a test system, a simple Na-atom chain between two leads is modeled and the results are compared with several previous calculations. Second, we consider a thin hafnium dioxide (HfO2) layer on a silicon surface as a model for a gate structure of the next generation of microelectronics. Peer reviewed

    Long Text Generation via Adversarial Training with Leaked Information

    Automatically generating coherent and semantically meaningful text has many applications in machine translation, dialogue systems, image captioning, etc. Recently, by combining with policy gradient, Generative Adversarial Nets (GANs) that use a discriminative model to guide the training of the generative model as a reinforcement learning policy have shown promising results in text generation. However, the scalar guiding signal is only available after the entire text has been generated and lacks intermediate information about text structure during the generative process. This limits its success when the generated text samples are long (more than 20 words). In this paper, we propose a new framework, called LeakGAN, to address the problem of long text generation. We allow the discriminative net to leak its own high-level extracted features to the generative net to further help the guidance. The generator incorporates such informative signals into all generation steps through an additional Manager module, which takes the extracted features of the currently generated words and outputs a latent vector to guide the Worker module for next-word generation. Our extensive experiments on synthetic data and various real-world tasks with Turing test demonstrate that LeakGAN is highly effective in long text generation and also improves the performance in short text generation scenarios. More importantly, without any supervision, LeakGAN is able to implicitly learn sentence structures only through the interaction between Manager and Worker. Comment: 14 pages, AAAI 201