
3,771 research outputs found

Long and Diverse Text Generation with Planning-based Hierarchical Variational Model

Authors: Huang Minlie, Shao Zhihong, Wen Jiangtao, Xu Wenfei, Zhu Xiaoyan
Publication date: 01/01/2019

Existing neural methods for data-to-text generation are still struggling to produce long and diverse texts: they are insufficient to model input data dynamically during generation, to capture inter-sentence coherence, or to generate diversified expressions. To address these issues, we propose a Planning-based Hierarchical Variational Model (PHVM). Our model first plans a sequence of groups (each group is a subset of input items to be covered by a sentence) and then realizes each sentence conditioned on the planning result and the previously generated context, thereby decomposing long text generation into dependent sentence generation sub-tasks. To capture expression diversity, we devise a hierarchical latent structure where a global planning latent variable models the diversity of reasonable planning and a sequence of local latent variables controls sentence realization. Experiments show that our model outperforms state-of-the-art baselines in long and diverse text generation.
Comment: To appear in EMNLP 201

arXiv.org e-Print Archive

Crossref

SALSA-TEXT: self attentive latent space based adversarial text generation

Authors: Gagnon-Marchand Jules, Haidar Md. Akmal, Rezagholizadeh Mehdi, Sadeghi Hamed
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 08/10/2018

Inspired by the success of the self attention mechanism and the Transformer architecture in sequence transduction and image generation applications, we propose novel self attention-based architectures to improve the performance of adversarial latent code-based schemes in text generation. Adversarial latent code-based text generation has recently gained a lot of attention due to its promising results. In this paper, we take a step to fortify the architectures used in these setups, specifically AAE and ARAE. We benchmark two latent code-based methods (AAE and ARAE) designed based on adversarial setups. In our experiments, the Google sentence compression dataset is utilized to compare our method with these methods using various objective and subjective measures. The experiments demonstrate that the proposed (self) attention-based models outperform the state-of-the-art in adversarial code-based text generation.
Comment: 10 pages, 3 figures, under review at ICLR 201

arXiv.org e-Print Archive

Crossref

Neural representation in active inference: using generative models to interact with -- and understand -- the lived world

Authors: D'Amato Leo, Friston Karl, Mannella Francesco, Pezzulo Giovanni, Priorelli Matteo, Stoianov Ivilin Peev, Van de Maele Toon
Publication date: 23/10/2023

This paper considers neural representation through the lens of active inference, a normative framework for understanding brain function. It delves into how living organisms employ generative models to minimize the discrepancy between predictions and observations (as scored with variational free energy). The ensuing analysis suggests that the brain learns generative models to navigate the world adaptively, not (or not solely) to understand it. Different living organisms may possess an array of generative models, spanning from those that support action-perception cycles to those that underwrite planning and imagination; namely, from "explicit" models that entail variables for predicting concurrent sensations, like objects, faces, or people - to "action-oriented models" that predict action outcomes. It then elucidates how generative models and belief dynamics might link to neural representation and the implications of different types of generative models for understanding an agent's cognitive capabilities in relation to its ecological niche. The paper concludes with open questions regarding the evolution of generative models and the development of advanced cognitive abilities - and the gradual transition from "pragmatic" to "detached" neural representations. The analysis on offer foregrounds the diverse roles that generative models play in cognitive processes and the evolution of neural representation.

arXiv.org e-Print Archive

Predictive World Models from Real-World Partial Observations

Authors: Carballo Alexander, Fujii Keisuke, Karlsson Robin, Ohtani Kento, Takeda Kazuya
Publication date: 25/04/2023

Cognitive scientists believe adaptable intelligent agents like humans perform reasoning through learned causal mental simulations of agents and environments. The problem of learning such simulations is called predictive world modeling. Recently, reinforcement learning (RL) agents leveraging world models have achieved SOTA performance in game environments. However, understanding how to apply the world modeling approach in complex real-world environments relevant to mobile robots remains an open question. In this paper, we present a framework for learning a probabilistic predictive world model for real-world road environments. We implement the model using a hierarchical VAE (HVAE) capable of predicting a diverse set of fully observed plausible worlds from accumulated sensor observations. While prior HVAE methods require complete states as ground truth for learning, we present a novel sequential training method to allow HVAEs to learn to predict complete states from partially observed states only. We experimentally demonstrate accurate spatial structure prediction of deterministic regions achieving 96.21 IoU, and close the gap to perfect prediction by 62% for stochastic regions using the best prediction. By extending HVAEs to cases where complete ground truth states do not exist, we facilitate continual learning of spatial prediction as a step towards realizing explainable and comprehensive predictive world models for real-world mobile robotics applications. Code is available at https://github.com/robin-karlsson0/predictive-world-models
Comment: Accepted for IEEE MOST 202

arXiv.org e-Print Archive