Variational Template Machine for Data-to-Text Generation
How can we generate descriptions from structured data organized in tables?
Existing approaches based on neural encoder-decoder models often lack
diversity. We argue that an open set of templates is crucial for enriching
phrase constructions and producing varied outputs. Learning
such templates is prohibitive since it often requires a large paired <table,
description> corpus, which is seldom available. This paper explores the problem
of automatically learning reusable "templates" from paired and non-paired data.
We propose the variational template machine (VTM), a novel method to generate
text descriptions from data tables. Our contributions include: a) we carefully
devise a model architecture and losses that explicitly disentangle template
and semantic content information in the latent space, and b) we
utilize both small parallel data and large raw text without aligned tables to
enrich the template learning. Experiments on datasets from a variety of
different domains show that VTM generates more diverse outputs while
maintaining good fluency and quality.
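A minimal sketch of the disentangling idea, assuming a VAE-style split latent;
the module names, sizes, and the deterministic table encoder below are
illustrative assumptions, not the paper's architecture:

```python
# Toy VTM-style model: the latent space is split into a template part z
# (sampled, VAE-style) and a content part c (derived from the table).
import torch
import torch.nn as nn

class ToyVTM(nn.Module):
    def __init__(self, vocab=1000, emb=64, z_dim=16, c_dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.enc = nn.GRU(emb, 64, batch_first=True)
        self.to_z = nn.Linear(64, 2 * z_dim)      # template latent (mu, logvar)
        self.table_enc = nn.Linear(32, c_dim)     # content latent from table fields
        self.dec = nn.GRU(emb, z_dim + c_dim, batch_first=True)
        self.out = nn.Linear(z_dim + c_dim, vocab)

    def forward(self, text_ids, table_feats):
        h, _ = self.enc(self.embed(text_ids))
        mu, logvar = self.to_z(h[:, -1]).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        c = self.table_enc(table_feats)                       # deterministic content
        state = torch.cat([z, c], dim=-1).unsqueeze(0)
        dec_h, _ = self.dec(self.embed(text_ids), state)
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return self.out(dec_h), kl  # reconstruction + KL (+ disentangling losses)
```

Under such a split, diversity comes from sampling different template codes z
for a fixed content code c, and a raw-text corpus can train the z path without
any aligned table.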
Deep Generative Models with Learnable Knowledge Constraints
The broad set of deep generative models (DGMs) has achieved remarkable
advances. However, it is often difficult to incorporate rich structured domain
knowledge into end-to-end DGMs. Posterior regularization (PR) offers a
principled framework to impose structured constraints on probabilistic models,
but has limited applicability to diverse DGMs, which can lack a Bayesian
formulation or even explicit density evaluation. PR also requires constraints
to be fully specified a priori, which is impractical or suboptimal for complex
knowledge with learnable uncertain parts. In this paper, we establish
mathematical correspondence between PR and reinforcement learning (RL), and,
based on the connection, expand PR to learn constraints as the extrinsic reward
in RL. The resulting algorithm is model-agnostic, applying to any DGM, and can
flexibly adapt arbitrary constraints jointly with the model. Experiments on
human image generation and templated sentence generation show that models with
constraints learned by our algorithm improve substantially over the base
generative models.
Comment: Neural Information Processing Systems (NeurIPS) 201
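For orientation, one common way to write the PR objective the abstract builds
on is sketched below; the notation is a paraphrase, not the paper's exact
formulation. Reading q as a policy and f_phi as a reward makes the RL
correspondence visible, so the constraint parameters phi can be updated the
way rewards are learned in inverse RL.

```latex
% q is an auxiliary distribution, p_theta the generative model, and
% f_phi a constraint function whose parameters phi are learnable.
\[
\min_{\theta,\,q}\; \mathrm{KL}\big(q(x)\,\|\,p_\theta(x)\big)
  \;-\; \alpha\,\mathbb{E}_{q(x)}\big[f_\phi(x)\big],
\qquad
q^\ast(x)\;\propto\; p_\theta(x)\,\exp\{\alpha f_\phi(x)\}
\]
```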
Data Generation as Sequential Decision Making
We connect a broad class of generative models through their shared reliance
on sequential decision making. Motivated by this view, we develop extensions to
an existing model, and then explore the idea further in the context of data
imputation -- perhaps the simplest setting in which to investigate the relation
between unconditional and conditional generative modelling. We formulate data
imputation as an MDP and develop models capable of representing effective
policies for it. We construct the models using neural networks and train them
using a form of guided policy search. Our models generate predictions through
an iterative process of feedback and refinement. We show that this approach can
learn effective policies for imputation problems of varying difficulty and
across multiple datasets.
Comment: Accepted for publication at Advances in Neural Information Processing Systems (NIPS) 201
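A toy sketch (assumptions mine, not the paper's code) of casting imputation as
an MDP: the state is the partially filled vector plus its missing-value mask,
and each action writes a value into one missing slot.

```python
import numpy as np

class ImputationMDP:
    def __init__(self, x_true, mask):
        self.x_true = x_true      # full vector: observed entries come from it,
        self.mask = mask.copy()   # missing ones are scored against it
        self.x = np.where(mask, 0.0, x_true)  # True in mask = missing, zeroed

    def state(self):
        return np.concatenate([self.x, self.mask.astype(float)])

    def step(self, idx, value):
        """Fill slot idx with value; reward is negative squared error."""
        assert self.mask[idx], "can only write to missing entries"
        self.x[idx] = value
        self.mask[idx] = False
        reward = -(value - self.x_true[idx]) ** 2
        done = not self.mask.any()
        return self.state(), reward, done

rng = np.random.default_rng(0)
x = rng.normal(size=5)
env = ImputationMDP(x, mask=np.array([False, True, False, True, False]))
s, r, done = env.step(1, 0.0)  # a learned policy would choose idx and value
```

A policy trained on this MDP fills entries iteratively, which matches the
feedback-and-refinement generation process the abstract describes.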
Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation
We introduce Texar, an open-source toolkit aiming to support the broad set of
text generation tasks that transform any inputs into natural language, such as
machine translation, summarization, dialog, content manipulation, and so forth.
With the design goals of modularity, versatility, and extensibility in mind,
Texar extracts common patterns underlying the diverse tasks and methodologies,
creates a library of highly reusable modules, and allows arbitrary model
architectures and algorithmic paradigms. In Texar, model architecture,
inference, and learning processes are properly decomposed. Modules at a high
conceptual level can be freely assembled, plugged in, or swapped out. The toolkit
also supports a rich set of large-scale pretrained models. Texar is thus
particularly suitable for researchers and practitioners to do fast prototyping
and experimentation. The versatile toolkit also fosters technique sharing
across different text generation tasks. Texar supports both TensorFlow and
PyTorch, and is released under Apache License 2.0 at https://www.texar.io.
Comment: ACL 2019 demo, expanded version
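A schematic sketch of the plug-and-swap decomposition the abstract describes,
written in plain PyTorch rather than Texar's actual API: the encoder and
decoder sit behind a fixed interface, so swapping one out touches neither
inference nor training code.

```python
import torch
import torch.nn as nn

def build_model(encoder_cls, decoder_cls, vocab=1000, dim=64):
    embed = nn.Embedding(vocab, dim)
    encoder = encoder_cls(dim, dim, batch_first=True)  # e.g. nn.GRU or nn.LSTM
    decoder = decoder_cls(dim, dim, batch_first=True)
    proj = nn.Linear(dim, vocab)
    return embed, encoder, decoder, proj

def forward(modules, src_ids, tgt_ids):
    embed, encoder, decoder, proj = modules
    _, state = encoder(embed(src_ids))   # same call works for GRU and LSTM
    out, _ = decoder(embed(tgt_ids), state)
    return proj(out)

# Swapping architectures is a one-line change:
gru_model = build_model(nn.GRU, nn.GRU)
lstm_model = build_model(nn.LSTM, nn.LSTM)
logits = forward(gru_model,
                 torch.randint(0, 1000, (2, 7)),
                 torch.randint(0, 1000, (2, 5)))
```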
Syntax-guided Controlled Generation of Paraphrases
Given a sentence (e.g., "I like mangoes") and a constraint (e.g., sentiment
flip), the goal of controlled text generation is to produce a sentence that
adapts the input sentence to meet the requirements of the constraint (e.g., "I
hate mangoes"). Going beyond such simple constraints, recent works have started
exploring the incorporation of complex syntactic-guidance as constraints in the
task of controlled paraphrase generation. In these methods, syntactic-guidance
is sourced from a separate exemplar sentence. However, these prior works have
only utilized limited syntactic information available in the parse tree of the
exemplar sentence. We address this limitation in this paper and propose the
Syntax Guided Controlled Paraphraser (SGCP), an end-to-end framework for
syntactic paraphrase generation. We find that SGCP can generate
syntax-conforming sentences without compromising relevance. We perform
extensive automated
and human evaluations over multiple real-world English language datasets to
demonstrate the efficacy of SGCP over state-of-the-art baselines. To drive
future research, we have made SGCP's source code available.
Comment: 16 pages, 3 figures, Accepted to TACL 202
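A toy illustration (my own simplification, not SGCP's actual mechanism) of how
an exemplar sentence's constituency parse can be pruned to a chosen depth to
form a syntactic template of adjustable granularity:

```python
from nltk import Tree

def template(tree, depth):
    """Keep the parse down to `depth` levels; deeper structure is elided."""
    if depth == 1 or not isinstance(tree, Tree):
        return tree.label() if isinstance(tree, Tree) else tree
    return Tree(tree.label(), [template(child, depth - 1) for child in tree])

exemplar = Tree.fromstring("(S (NP (PRP I)) (VP (VBP hate) (NP (NNS mangoes))))")
print(template(exemplar, 2))  # (S NP VP)            -- coarse skeleton
print(template(exemplar, 3))  # (S (NP PRP) (VP VBP NP))
```

A decoder guided by the deeper template is constrained more tightly; the
coarse skeleton leaves it more freedom, which is the granularity trade-off
such syntactic guidance exposes.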
Text Generation with Exemplar-based Adaptive Decoding
We propose a novel conditioned text generation model. It draws inspiration
from traditional template-based text generation techniques, where the source
provides the content (i.e., what to say), and the template influences how to
say it. Building on the successful encoder-decoder paradigm, it first encodes
the content representation from the given input text; to produce the output, it
retrieves exemplar text from the training data as "soft templates," which are
then used to construct an exemplar-specific decoder. We evaluate the proposed
model on abstractive text summarization and data-to-text generation. Empirical
results show that this model achieves strong performance and outperforms
comparable baselines.
Comment: NAACL 201
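A minimal sketch (assumptions mine) of the retrieval step the abstract
describes: pick the training example whose encoded input is closest to the
query, and reuse its output text as a "soft template" for the decoder.

```python
import numpy as np

def retrieve_exemplar(query_vec, train_vecs, train_outputs):
    """Cosine-similarity nearest neighbor over encoded training inputs."""
    sims = train_vecs @ query_vec / (
        np.linalg.norm(train_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-8)
    return train_outputs[int(np.argmax(sims))]

train_vecs = np.random.rand(100, 64)   # encoder outputs for training inputs
train_outputs = [f"summary {i}" for i in range(100)]
exemplar = retrieve_exemplar(np.random.rand(64), train_vecs, train_outputs)
# The retrieved exemplar would then condition an exemplar-specific decoder.
```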
Music Generation by Deep Learning - Challenges and Directions
In addition to traditional tasks such as prediction, classification and
translation, deep learning is receiving growing attention as an approach for
music generation, as witnessed by recent research groups such as Magenta at
Google and CTRL (Creator Technology Research Lab) at Spotify. The motivation is
in using the capacity of deep learning architectures and training techniques to
automatically learn musical styles from arbitrary musical corpora and then to
generate samples from the estimated distribution. However, a direct application
of deep learning to generate content rapidly reaches limits as the generated
content tends to mimic the training set without exhibiting true creativity.
Moreover, deep learning architectures do not offer direct ways for controlling
generation (e.g., imposing some tonality or other arbitrary constraints).
Furthermore, deep learning architectures alone are autistic automata which
generate music autonomously without human user interaction, far from the
objective of interactively assisting musicians to compose and refine music.
Issues such as control, structure, creativity, and interactivity are the focus
of our analysis. In this paper, we examine several limitations of a direct
application of deep learning to music generation, analyze why these issues
remain unresolved, and discuss possible approaches to address them. Various
recent systems are cited as examples of promising directions.
Comment: 17 pages. arXiv admin note: substantial text overlap with
arXiv:1709.01620. Accepted for publication in Special Issue on Deep learning
for music and audio, Neural Computing & Applications, Springer Nature, 201
InferSpark: Statistical Inference at Scale
The Apache Spark stack has enabled fast large-scale data processing. Despite
a rich library of statistical models and inference algorithms, it does not give
domain users the ability to develop their own models. The emergence of
probabilistic programming languages has shown the promise of developing
sophisticated probabilistic models in a succinct and programmatic way. These
frameworks have the potential to automatically generate inference algorithms
for user-defined models and to answer various statistical queries about them.
It is an opportune time to unite these two directions into a programmable
big-data analysis framework. We thus propose InferSpark, a
probabilistic programming framework on top of Apache Spark. Efficient
statistical inference can be implemented easily on this framework, and the
inference process can leverage Spark's distributed in-memory processing power.
This framework makes statistical inference on big data practical and speeds up
the adoption of probabilistic programming in the data engineering domain.
Comment: 13 pages, 22 figures
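A minimal sketch (not InferSpark's actual API, which the paper defines) of why
Spark suits statistical inference: posterior updates that depend only on
sufficient statistics reduce to a map/reduce over partitioned data, here for a
Beta-Bernoulli model.

```python
from pyspark import SparkContext

sc = SparkContext(appName="beta-bernoulli-sketch")
coin_flips = sc.parallelize([1, 0, 1, 1, 0, 1, 1, 1], numSlices=4)

heads = coin_flips.reduce(lambda a, b: a + b)  # distributed sum
n = coin_flips.count()

alpha0, beta0 = 1.0, 1.0                       # Beta(1, 1) prior
alpha_post, beta_post = alpha0 + heads, beta0 + (n - heads)
print(f"posterior: Beta({alpha_post}, {beta_post})")
sc.stop()
```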
Russian Natural Language Generation: Creation of a Language Modelling Dataset and Evaluation with Modern Neural Architectures
Generating coherent, grammatically correct, and meaningful text is very
challenging; however, it is crucial to many modern NLP systems. So far,
research has mostly focused on the English language; for other languages, both
standardized datasets and experiments with state-of-the-art models are rare. In
this work, we i) provide a novel reference dataset for Russian language
modeling and ii) experiment with popular modern methods for text generation,
namely variational autoencoders and generative adversarial networks, trained on
the new dataset. We evaluate the generated text with respect to metrics such as
perplexity, grammatical correctness, and lexical diversity.
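A short sketch of the perplexity metric mentioned above: the exponentiated
average negative log-likelihood a model assigns to held-out text.

```python
import math

def perplexity(token_log_probs):
    """token_log_probs: natural-log probabilities the model gave each token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# e.g. a model assigning probability 0.1 to each of 4 tokens:
print(perplexity([math.log(0.1)] * 4))  # 10.0
```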
Variation Network: Learning High-level Attributes for Controlled Input Manipulation
This paper presents the Variation Network (VarNet), a generative model
providing means to manipulate the high-level attributes of a given input. The
originality of our approach is that VarNet is not only capable of handling
pre-defined attributes but can also learn the relevant attributes of the
dataset by itself. These two settings can also be easily considered at the same
time, which makes this model applicable to a wide variety of tasks. Further,
VarNet has a sound information-theoretic interpretation, which provides
interpretable means to control how these high-level attributes are learned. We
demonstrate experimentally that this model is capable of performing interesting
input manipulation and that the learned attributes are relevant and meaningful.
Comment: 15 pages, 7 figures
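A schematic sketch (architecture details are my assumptions, not VarNet's) of
attribute manipulation in a latent-variable model: encode the input, overwrite
the attribute part of the code, and decode.

```python
import torch
import torch.nn as nn

class AttrAutoencoder(nn.Module):
    def __init__(self, x_dim=32, z_dim=8, a_dim=2):
        super().__init__()
        self.enc = nn.Linear(x_dim, z_dim + a_dim)
        self.dec = nn.Linear(z_dim + a_dim, x_dim)
        self.a_dim = a_dim

    def manipulate(self, x, new_attr):
        code = self.enc(x)
        z, _ = code.split(code.shape[-1] - self.a_dim, dim=-1)
        return self.dec(torch.cat([z, new_attr], dim=-1))  # swap attributes

model = AttrAutoencoder()
x = torch.randn(1, 32)
x_edited = model.manipulate(x, new_attr=torch.tensor([[1.0, 0.0]]))
```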