
    Semantically Invariant Text-to-Image Generation

    Image captioning has demonstrated models that are capable of generating plausible text given input images or videos. Further, recent work in image generation has shown significant improvements in image quality when text is used as a prior. Our work ties these concepts together by creating an architecture that can enable bidirectional generation of images and text. We call this network Multi-Modal Vector Representation (MMVR). Along with MMVR, we propose two improvements to text-conditioned image generation. First, an n-gram metric-based cost function is introduced that generalizes the caption with respect to the image. Second, multiple semantically similar sentences are shown to help in generating better images. Qualitative and quantitative evaluations demonstrate that MMVR improves upon existing text-conditioned image generation results by over 20%, while integrating visual and text modalities.
    Comment: 5 pages, 5 figures. Published in the 2018 25th IEEE International Conference on Image Processing (ICIP).
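    The abstract does not spell out the n-gram cost function itself; as a rough illustration, the Python sketch below computes a BLEU-style n-gram precision between a generated caption and a reference, the kind of overlap score such a cost term could build on. The function names are assumptions for illustration, not the paper's code.

        from collections import Counter

        def ngrams(tokens, n):
            """Return the multiset of n-grams in a token list."""
            return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

        def ngram_precision(candidate, reference, n=2):
            """Fraction of candidate n-grams that also appear in the reference
            (clipped counts, as in BLEU). Returns a score in [0, 1]."""
            cand, ref = ngrams(candidate, n), ngrams(reference, n)
            if not cand:
                return 0.0
            overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
            return overlap / sum(cand.values())

        # Example: bigram precision of a generated caption against a human one.
        generated = "a dog runs on the grass".split()
        reference = "a dog is running on the grass".split()
        print(ngram_precision(generated, reference))  # 0.6

    Turning such a score into a training cost (e.g., one minus the precision, averaged over several reference sentences) would reward captions that share phrasing with the references without requiring an exact match.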

    Computer image generation: Reconfigurability as a strategy in high fidelity space applications

    The demand for realistic, high-fidelity computer image generation systems to support space simulation is well established. However, as the number and diversity of space applications increase, the complexity and cost of computer image generation systems also increase. One strategy used to harmonize cost with varied requirements is the establishment of a reconfigurable image generation system that can be adapted rapidly and easily to meet new and changing requirements. The reconfigurability strategy through the life cycle of system conception, specification, design, implementation, operation, and support for high-fidelity computer image generation systems is discussed. The discussion is limited to those issues directly associated with the reconfigurability and adaptability of a specialized scene generation system in a multi-faceted space applications environment. Examples and insights gained through the recent development and installation of the Improved Multi-function Scene Generation System at the Johnson Space Center Systems Engineering Simulator are reviewed and compared with current simulator industry practices. The results are clear: the strategy of reconfigurability applied to space simulation requirements provides a viable path to supporting diverse applications with an adaptable computer image generation system.

    Learning a Recurrent Visual Representation for Image Caption Generation

    In this paper we explore the bi-directional mapping between images and their sentence-based descriptions. We propose learning this mapping using a recurrent neural network. Unlike previous approaches that map both sentences and images to a common embedding, we enable the generation of novel sentences given an image. Using the same model, we can also reconstruct the visual features associated with an image given its visual description. We use a novel recurrent visual memory that automatically learns to remember long-term visual concepts to aid in both sentence generation and visual feature reconstruction. We evaluate our approach on several tasks. These include sentence generation, sentence retrieval and image retrieval. State-of-the-art results are shown for the task of generating novel image descriptions. When compared to human generated captions, our automatically generated captions are preferred by humans over 19.8% of the time. Results are better than or comparable to state-of-the-art results on the image and sentence retrieval tasks for methods using similar visual features.
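    As a rough illustration of the bidirectional idea, the PyTorch sketch below (with assumed layer names and sizes, not the authors' implementation) shows a single recurrent model whose hidden state acts as a visual memory: it can be seeded from image features to generate words, and its final state can be decoded back into visual features.

        import torch
        import torch.nn as nn

        class RecurrentVisualRep(nn.Module):
            def __init__(self, vocab_size, embed_dim=256, hidden_dim=512, feat_dim=4096):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, embed_dim)
                self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
                self.to_word = nn.Linear(hidden_dim, vocab_size)  # sentence generation head
                self.to_feat = nn.Linear(hidden_dim, feat_dim)    # visual reconstruction head
                self.from_feat = nn.Linear(feat_dim, hidden_dim)  # seed memory from an image

            def forward(self, words, image_feats=None):
                # Seed the recurrent memory from the image when one is given;
                # start from zeros when only a sentence is available.
                h0 = None
                if image_feats is not None:
                    h0 = torch.tanh(self.from_feat(image_feats)).unsqueeze(0)
                out, _ = self.rnn(self.embed(words), h0)
                # Per-step word logits, plus visual features decoded from the last state.
                return self.to_word(out), self.to_feat(out[:, -1])

        model = RecurrentVisualRep(vocab_size=10000)
        tokens = torch.randint(0, 10000, (2, 12))   # a batch of 2 token sequences
        logits, recon = model(tokens)               # word scores and reconstructed features

    Training both heads jointly is what lets one set of recurrent weights serve caption generation and visual feature reconstruction at once; the paper's dedicated recurrent visual memory refines this simple shared-state picture.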