Variational Template Machine for Data-to-Text Generation
How can we generate descriptions from structured data organized in tables?
Existing approaches based on neural encoder-decoder models often lack
diversity. We argue that an open set of templates is crucial for enriching
phrase constructions and producing varied outputs. Learning
such templates is prohibitive since it often requires a large paired <table,
description> corpus, which is seldom available. This paper explores the problem
of automatically learning reusable "templates" from paired and non-paired data.
We propose the variational template machine (VTM), a novel method to generate
text descriptions from data tables. Our contributions include: a) we carefully
devise a model architecture and losses that explicitly disentangle template
and semantic content information in the latent space, and b) we
utilize both small parallel data and large raw text without aligned tables to
enrich the template learning. Experiments on datasets from a variety of
different domains show that VTM generates more diverse outputs while
maintaining good fluency and quality.
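A minimal sketch of the disentangling idea, assuming a VAE-style split latent;
the module names, sizes, and the deterministic table encoder below are
illustrative assumptions, not the paper's architecture:

```python
# Toy VTM-style model: the latent space is split into a template part z
# (sampled, VAE-style) and a content part c (derived from the table).
import torch
import torch.nn as nn

class ToyVTM(nn.Module):
    def __init__(self, vocab=1000, emb=64, z_dim=16, c_dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.enc = nn.GRU(emb, 64, batch_first=True)
        self.to_z = nn.Linear(64, 2 * z_dim)      # template latent (mu, logvar)
        self.table_enc = nn.Linear(32, c_dim)     # content latent from table fields
        self.dec = nn.GRU(emb, z_dim + c_dim, batch_first=True)
        self.out = nn.Linear(z_dim + c_dim, vocab)

    def forward(self, text_ids, table_feats):
        h, _ = self.enc(self.embed(text_ids))
        mu, logvar = self.to_z(h[:, -1]).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        c = self.table_enc(table_feats)                       # deterministic content
        state = torch.cat([z, c], dim=-1).unsqueeze(0)
        dec_h, _ = self.dec(self.embed(text_ids), state)
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return self.out(dec_h), kl  # reconstruction + KL (+ disentangling losses)
```

Under such a split, diversity comes from sampling different template codes z
for a fixed content code c, and a raw-text corpus can train the z path without
any aligned table.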
Deep Generative Models with Learnable Knowledge Constraints
The broad set of deep generative models (DGMs) has achieved remarkable
advances. However, it is often difficult to incorporate rich structured domain
knowledge into end-to-end DGMs. Posterior regularization (PR) offers a
principled framework to impose structured constraints on probabilistic models,
but has limited applicability to diverse DGMs, which can lack a Bayesian
formulation or even explicit density evaluation. PR also requires constraints
to be fully specified a priori, which is impractical or suboptimal for complex
knowledge with learnable uncertain parts. In this paper, we establish
mathematical correspondence between PR and reinforcement learning (RL), and,
based on the connection, expand PR to learn constraints as the extrinsic reward
in RL. The resulting algorithm is model-agnostic, applying to any DGM, and can
flexibly adapt arbitrary constraints jointly with the model. Experiments on
human image generation and templated sentence generation show that models with
constraints learned by our algorithm improve substantially over the base
generative models.
Comment: Neural Information Processing Systems (NeurIPS) 201
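For orientation, one common way to write the PR objective the abstract builds
on is sketched below; the notation is a paraphrase, not the paper's exact
formulation. Reading q as a policy and f_phi as a reward makes the RL
correspondence visible, so the constraint parameters phi can be updated the
way rewards are learned in inverse RL.

```latex
% q is an auxiliary distribution, p_theta the generative model, and
% f_phi a constraint function whose parameters phi are learnable.
\[
\min_{\theta,\,q}\; \mathrm{KL}\big(q(x)\,\|\,p_\theta(x)\big)
  \;-\; \alpha\,\mathbb{E}_{q(x)}\big[f_\phi(x)\big],
\qquad
q^\ast(x)\;\propto\; p_\theta(x)\,\exp\{\alpha f_\phi(x)\}
\]
```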
Data Generation as Sequential Decision Making
We connect a broad class of generative models through their shared reliance
on sequential decision making. Motivated by this view, we develop extensions to
an existing model, and then explore the idea further in the context of data
imputation -- perhaps the simplest setting in which to investigate the relation
between unconditional and conditional generative modelling. We formulate data
imputation as an MDP and develop models capable of representing effective
policies for it. We construct the models using neural networks and train them
using a form of guided policy search. Our models generate predictions through
an iterative process of feedback and refinement. We show that this approach can
learn effective policies for imputation problems of varying difficulty and
across multiple datasets.
Comment: Accepted for publication at Advances in Neural Information Processing Systems (NIPS) 201
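A toy sketch (assumptions mine, not the paper's code) of casting imputation as
an MDP: the state is the partially filled vector plus its missing-value mask,
and each action writes a value into one missing slot.

```python
import numpy as np

class ImputationMDP:
    def __init__(self, x_true, mask):
        self.x_true = x_true      # full vector: observed entries come from it,
        self.mask = mask.copy()   # missing ones are scored against it
        self.x = np.where(mask, 0.0, x_true)  # True in mask = missing, zeroed

    def state(self):
        return np.concatenate([self.x, self.mask.astype(float)])

    def step(self, idx, value):
        """Fill slot idx with value; reward is negative squared error."""
        assert self.mask[idx], "can only write to missing entries"
        self.x[idx] = value
        self.mask[idx] = False
        reward = -(value - self.x_true[idx]) ** 2
        done = not self.mask.any()
        return self.state(), reward, done

rng = np.random.default_rng(0)
x = rng.normal(size=5)
env = ImputationMDP(x, mask=np.array([False, True, False, True, False]))
s, r, done = env.step(1, 0.0)  # a learned policy would choose idx and value
```

A policy trained on this MDP fills entries iteratively, which matches the
feedback-and-refinement generation process the abstract describes.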
Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation
We introduce Texar, an open-source toolkit aiming to support the broad set of
text generation tasks that transform any inputs into natural language, such as
machine translation, summarization, dialog, content manipulation, and so forth.
With the design goals of modularity, versatility, and extensibility in mind,
Texar extracts common patterns underlying the diverse tasks and methodologies,
creates a library of highly reusable modules, and allows arbitrary model
architectures and algorithmic paradigms. In Texar, model architecture,
inference, and learning processes are properly decomposed. Modules at a high
conceptual level can be freely assembled, plugged in, or swapped out. The toolkit
also supports a rich set of large-scale pretrained models. Texar is thus
particularly suitable for researchers and practitioners to do fast prototyping
and experimentation. The versatile toolkit also fosters technique sharing
across different text generation tasks. Texar supports both TensorFlow and
PyTorch, and is released under Apache License 2.0 at https://www.texar.io.
Comment: ACL 2019 demo, expanded version
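A schematic sketch of the plug-and-swap decomposition the abstract describes,
written in plain PyTorch rather than Texar's actual API: the encoder and
decoder sit behind a fixed interface, so swapping one out touches neither
inference nor training code.

```python
import torch
import torch.nn as nn

def build_model(encoder_cls, decoder_cls, vocab=1000, dim=64):
    embed = nn.Embedding(vocab, dim)
    encoder = encoder_cls(dim, dim, batch_first=True)  # e.g. nn.GRU or nn.LSTM
    decoder = decoder_cls(dim, dim, batch_first=True)
    proj = nn.Linear(dim, vocab)
    return embed, encoder, decoder, proj

def forward(modules, src_ids, tgt_ids):
    embed, encoder, decoder, proj = modules
    _, state = encoder(embed(src_ids))   # same call works for GRU and LSTM
    out, _ = decoder(embed(tgt_ids), state)
    return proj(out)

# Swapping architectures is a one-line change:
gru_model = build_model(nn.GRU, nn.GRU)
lstm_model = build_model(nn.LSTM, nn.LSTM)
logits = forward(gru_model,
                 torch.randint(0, 1000, (2, 7)),
                 torch.randint(0, 1000, (2, 5)))
```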
Syntax-guided Controlled Generation of Paraphrases
Given a sentence (e.g., "I like mangoes") and a constraint (e.g., sentiment
flip), the goal of controlled text generation is to produce a sentence that
adapts the input sentence to meet the requirements of the constraint (e.g., "I
hate mangoes"). Going beyond such simple constraints, recent works have started
exploring the incorporation of complex syntactic-guidance as constraints in the
task of controlled paraphrase generation. In these methods, syntactic-guidance
is sourced from a separate exemplar sentence. However, these prior works have
only utilized limited syntactic information available in the parse tree of the
exemplar sentence. We address this limitation in this paper and propose the
Syntax Guided Controlled Paraphraser (SGCP), an end-to-end framework for
syntactic paraphrase generation. We find that SGCP can generate
syntax-conforming sentences without compromising relevance. We perform
extensive automated
and human evaluations over multiple real-world English language datasets to
demonstrate the efficacy of SGCP over state-of-the-art baselines. To drive
future research, we have made SGCP's source code available.
Comment: 16 pages, 3 figures, Accepted to TACL 202
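A toy illustration (my own simplification, not SGCP's actual mechanism) of how
an exemplar sentence's constituency parse can be pruned to a chosen depth to
form a syntactic template of adjustable granularity:

```python
from nltk import Tree

def template(tree, depth):
    """Keep the parse down to `depth` levels; deeper structure is elided."""
    if depth == 1 or not isinstance(tree, Tree):
        return tree.label() if isinstance(tree, Tree) else tree
    return Tree(tree.label(), [template(child, depth - 1) for child in tree])

exemplar = Tree.fromstring("(S (NP (PRP I)) (VP (VBP hate) (NP (NNS mangoes))))")
print(template(exemplar, 2))  # (S NP VP)            -- coarse skeleton
print(template(exemplar, 3))  # (S (NP PRP) (VP VBP NP))
```

A decoder guided by the deeper template is constrained more tightly; the
coarse skeleton leaves it more freedom, which is the granularity trade-off
such syntactic guidance exposes.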
Text Generation with Exemplar-based Adaptive Decoding
We propose a novel conditioned text generation model. It draws inspiration
from traditional template-based text generation techniques, where the source
provides the content (i.e., what to say), and the template influences how to
say it. Building on the successful encoder-decoder paradigm, it first encodes
the content representation from the given input text; to produce the output, it
retrieves exemplar text from the training data as "soft templates," which are
then used to construct an exemplar-specific decoder. We evaluate the proposed
model on abstractive text summarization and data-to-text generation. Empirical
results show that this model achieves strong performance and outperforms
comparable baselines.
Comment: NAACL 201
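A minimal sketch (assumptions mine) of the retrieval step the abstract
describes: pick the training example whose encoded input is closest to the
query, and reuse its output text as a "soft template" for the decoder.

```python
import numpy as np

def retrieve_exemplar(query_vec, train_vecs, train_outputs):
    """Cosine-similarity nearest neighbor over encoded training inputs."""
    sims = train_vecs @ query_vec / (
        np.linalg.norm(train_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-8)
    return train_outputs[int(np.argmax(sims))]

train_vecs = np.random.rand(100, 64)   # encoder outputs for training inputs
train_outputs = [f"summary {i}" for i in range(100)]
exemplar = retrieve_exemplar(np.random.rand(64), train_vecs, train_outputs)
# The retrieved exemplar would then condition an exemplar-specific decoder.
```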
Music Generation by Deep Learning - Challenges and Directions
In addition to traditional tasks such as prediction, classification and
translation, deep learning is receiving growing attention as an approach for
music generation, as witnessed by recent research groups such as Magenta at
Google and CTRL (Creator Technology Research Lab) at Spotify. The motivation is
in using the capacity of deep learning architectures and training techniques to
automatically learn musical styles from arbitrary musical corpora and then to
generate samples from the estimated distribution. However, a direct application
of deep learning to generate content rapidly reaches limits as the generated
content tends to mimic the training set without exhibiting true creativity.
Moreover, deep learning architectures do not offer direct ways for controlling
generation (e.g., imposing some tonality or other arbitrary constraints).
Furthermore, deep learning architectures alone are autistic automata which
generate music autonomously without human user interaction, far from the
objective of interactively assisting musicians to compose and refine music.
Issues such as control, structure, creativity, and interactivity are the focus
of our analysis. In this paper, we examine several limitations of a direct
application of deep learning to music generation, analyze why these issues
remain unresolved, and discuss possible approaches to address them. Various
recent systems are cited as examples of promising directions.
Comment: 17 pages. arXiv admin note: substantial text overlap with
arXiv:1709.01620. Accepted for publication in Special Issue on Deep learning
for music and audio, Neural Computing & Applications, Springer Nature, 201
InferSpark: Statistical Inference at Scale
The Apache Spark stack has enabled fast large-scale data processing. Despite
a rich library of statistical models and inference algorithms, it does not give
domain users the ability to develop their own models. The emergence of
probabilistic programming languages has shown the promise of developing
sophisticated probabilistic models in a succinct and programmatic way. These
frameworks have the potential to automatically generate inference algorithms
for user-defined models and to answer various statistical queries about them.
It is an opportune time to unite these two directions into a programmable
big-data analysis framework. We thus propose InferSpark, a
probabilistic programming framework on top of Apache Spark. Efficient
statistical inference can be implemented easily on this framework, and the
inference process can leverage Spark's distributed in-memory processing power.
This framework makes statistical inference on big data practical and speeds up
the adoption of probabilistic programming in the data engineering domain.
Comment: 13 pages, 22 figures
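A minimal sketch (not InferSpark's actual API, which the paper defines) of why
Spark suits statistical inference: posterior updates that depend only on
sufficient statistics reduce to a map/reduce over partitioned data, here for a
Beta-Bernoulli model.

```python
from pyspark import SparkContext

sc = SparkContext(appName="beta-bernoulli-sketch")
coin_flips = sc.parallelize([1, 0, 1, 1, 0, 1, 1, 1], numSlices=4)

heads = coin_flips.reduce(lambda a, b: a + b)  # distributed sum
n = coin_flips.count()

alpha0, beta0 = 1.0, 1.0                       # Beta(1, 1) prior
alpha_post, beta_post = alpha0 + heads, beta0 + (n - heads)
print(f"posterior: Beta({alpha_post}, {beta_post})")
sc.stop()
```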
Russian Natural Language Generation: Creation of a Language Modelling Dataset and Evaluation with Modern Neural Architectures
Generating coherent, grammatically correct, and meaningful text is very
challenging; however, it is crucial to many modern NLP systems. So far,
research has mostly focused on the English language; for other languages, both
standardized datasets and experiments with state-of-the-art models are rare. In
this work, we i) provide a novel reference dataset for Russian language
modeling and ii) experiment with popular modern methods for text generation,
namely variational autoencoders and generative adversarial networks, trained on
the new dataset. We evaluate the generated text with respect to metrics such as
perplexity, grammatical correctness, and lexical diversity.
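A short sketch of the perplexity metric mentioned above: the exponentiated
average negative log-likelihood a model assigns to held-out text.

```python
import math

def perplexity(token_log_probs):
    """token_log_probs: natural-log probabilities the model gave each token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# e.g. a model assigning probability 0.1 to each of 4 tokens:
print(perplexity([math.log(0.1)] * 4))  # 10.0
```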
Variation Network: Learning High-level Attributes for Controlled Input Manipulation
This paper presents the Variation Network (VarNet), a generative model
providing means to manipulate the high-level attributes of a given input. The
originality of our approach is that VarNet is not only capable of handling
pre-defined attributes but can also learn the relevant attributes of the
dataset by itself. These two settings can also be easily considered at the same
time, which makes this model applicable to a wide variety of tasks. Further,
VarNet has a sound information-theoretic interpretation, which provides
interpretable means to control how these high-level attributes are learned. We
demonstrate experimentally that this model is capable of performing interesting
input manipulation and that the learned attributes are relevant and meaningful.
Comment: 15 pages, 7 figures
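A schematic sketch (architecture details are my assumptions, not VarNet's) of
attribute manipulation in a latent-variable model: encode the input, overwrite
the attribute part of the code, and decode.

```python
import torch
import torch.nn as nn

class AttrAutoencoder(nn.Module):
    def __init__(self, x_dim=32, z_dim=8, a_dim=2):
        super().__init__()
        self.enc = nn.Linear(x_dim, z_dim + a_dim)
        self.dec = nn.Linear(z_dim + a_dim, x_dim)
        self.a_dim = a_dim

    def manipulate(self, x, new_attr):
        code = self.enc(x)
        z, _ = code.split(code.shape[-1] - self.a_dim, dim=-1)
        return self.dec(torch.cat([z, new_attr], dim=-1))  # swap attributes

model = AttrAutoencoder()
x = torch.randn(1, 32)
x_edited = model.manipulate(x, new_attr=torch.tensor([[1.0, 0.0]]))
```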