Controlling Output Length in Neural Encoder-Decoders
Neural encoder-decoder models have shown great success in many sequence
generation tasks. However, previous work has not investigated situations in
which we would like to control the length of encoder-decoder outputs. This
capability is crucial for applications such as text summarization, in which we
have to generate concise summaries with a desired length. In this paper, we
propose methods for controlling the output sequence length for neural
encoder-decoder models: two decoding-based methods and two learning-based
methods. Results show that our learning-based methods have the capability to
control length without degrading summary quality in a summarization task.
Comment: 11 pages. To appear in EMNLP 201
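The decoding-based side of this idea can be illustrated with a toy greedy decoder that suppresses the end-of-sequence token before a minimum length and forces termination at a maximum length. This is a minimal sketch of the general technique, not the paper's exact methods; the scoring hook and token ids below are hypothetical.

```python
import math

def length_constrained_decode(step_scores, eos_id, min_len, max_len):
    """Greedy decoding with hard length constraints.

    step_scores(prefix) -> {token_id: log-probability} is a hypothetical
    model hook standing in for an encoder-decoder's next-token scorer.
    EOS is forbidden before min_len and decoding halts at max_len.
    """
    prefix = []
    while True:
        scores = dict(step_scores(prefix))
        if len(prefix) < min_len:
            scores[eos_id] = -math.inf   # too short: forbid ending here
        if len(prefix) >= max_len:
            return prefix                # long enough: stop unconditionally
        best = max(scores, key=scores.get)
        if best == eos_id:
            return prefix
        prefix.append(best)

# Toy scorer: always prefers token 1; EOS (id 0) gets a middling score.
toy = lambda prefix: {0: -1.0, 1: -0.5, 2: -2.0}
```

With `min_len=3, max_len=5` the toy decoder emits exactly five tokens of id 1, since EOS never outscores token 1 and the maximum-length cutoff applies.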
Closed loop interactions between spiking neural network and robotic simulators based on MUSIC and ROS
In order to properly assess the function and computational properties of
simulated neural systems, it is necessary to account for the nature of the
stimuli that drive the system. However, providing stimuli that are rich and yet
both reproducible and amenable to experimental manipulations is technically
challenging, and even more so if a closed-loop scenario is required. In this
work, we present a novel approach to solve this problem, connecting robotics
and neural network simulators. We implement a middleware solution that bridges
the Robot Operating System (ROS) to the Multi-Simulator Coordinator (MUSIC).
This enables any robotic and neural simulators that implement the corresponding
interfaces to be efficiently coupled, allowing real-time performance for a wide
range of configurations. This work extends the toolset available for
researchers in both neurorobotics and computational neuroscience, and creates
the opportunity to perform closed-loop experiments of arbitrary complexity to
address questions in multiple areas, including embodiment, agency, and
reinforcement learning.
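Independent of the ROS/MUSIC specifics, the closed-loop pattern the abstract describes can be sketched as a sensor-to-network-to-actuator cycle. All names below are hypothetical stand-ins, not the actual middleware API: a toy robot integrates motor commands, and a stand-in "network" issues a corrective command from each observation.

```python
def run_closed_loop(robot_step, network_step, steps):
    """Toy closed loop: each robot observation drives the network,
    whose output becomes the next motor command."""
    command = 0.0
    trace = []
    for _ in range(steps):
        observation = robot_step(command)     # robot simulator tick
        command = network_step(observation)   # neural simulator tick (abstracted)
        trace.append((observation, command))
    return trace

def make_robot():
    """Hypothetical robot: a one-dimensional integrator of commands."""
    state = {"x": 1.0}
    def step(cmd):
        state["x"] += cmd
        return state["x"]
    return step

# A proportional "controller" stands in for the spiking network.
trace = run_closed_loop(make_robot(), lambda obs: -0.5 * obs, steps=4)
```

In the real system each tick would cross the ROS-MUSIC bridge, but the control-flow shape, observation in, command out, every step, is the same.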
Improving Variational Encoder-Decoders in Dialogue Generation
Variational encoder-decoders (VEDs) have shown promising results in dialogue
generation. However, the latent variable distributions are usually approximated
by a much simpler model than the powerful RNN structure used for encoding and
decoding, yielding the KL-vanishing problem and an inconsistent training
objective. In this paper, we separate training into two phases: the
first phase learns to autoencode discrete texts into continuous embeddings,
from which the second phase learns to generalize latent representations by
reconstructing the encoded embedding. In this case, latent variables are
sampled by transforming Gaussian noise through multi-layer perceptrons and are
trained with a separate VED model, which has the potential of realizing a much
more flexible distribution. We compare our model with current popular models
and the experiment demonstrates substantial improvement in both metric-based
and human evaluations.
Comment: Accepted by AAAI201
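The sampling idea in the second phase, pushing Gaussian noise through a multi-layer perceptron to obtain latent variables, can be sketched in pure Python. The dimensions and random weights here are illustrative, not taken from the paper:

```python
import math
import random

def mlp_transform(noise, w1, w2):
    """Map a Gaussian noise vector through a tiny two-layer MLP to get a
    latent sample. Because the mapping is a learned nonlinearity rather
    than a fixed reparameterization, the induced latent distribution can
    be far more flexible than a diagonal Gaussian."""
    hidden = [math.tanh(sum(n * w for n, w in zip(noise, row))) for row in w1]
    return [sum(h * w for h, w in zip(hidden, row)) for row in w2]

random.seed(0)
noise_dim, hid, latent_dim = 4, 6, 2
# Illustrative random weights; in the model these would be trained.
w1 = [[random.gauss(0, 0.5) for _ in range(noise_dim)] for _ in range(hid)]
w2 = [[random.gauss(0, 0.5) for _ in range(hid)] for _ in range(latent_dim)]
z = mlp_transform([random.gauss(0, 1) for _ in range(noise_dim)], w1, w2)
```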
Multi-space Variational Encoder-Decoders for Semi-supervised Labeled Sequence Transduction
Labeled sequence transduction is a task of transforming one sequence into
another sequence that satisfies desiderata specified by a set of labels. In
this paper we propose multi-space variational encoder-decoders, a new model for
labeled sequence transduction with semi-supervised learning. The generative
model can use neural networks to handle both discrete and continuous latent
variables to exploit various features of the data. Experiments show that our
model not only provides a powerful supervised framework but can also
effectively take advantage of unlabeled data. On the SIGMORPHON morphological inflection
benchmark, our model outperforms single-model state-of-the-art results by a large
margin for the majority of languages.
Comment: Accepted by ACL 201
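The "discrete and continuous latent variables" can be pictured as one categorical draw (e.g. a morphological tag) paired with one continuous vector (e.g. lemma content). A minimal sketch, with illustrative label names and dimensions rather than the paper's actual variables:

```python
import random

def sample_multispace_latent(label_probs, cont_dim, rng):
    """Draw one discrete latent from a categorical distribution and one
    continuous latent from a standard Gaussian, mimicking the mixed
    latent spaces of a multi-space variational encoder-decoder."""
    labels = list(label_probs)
    weights = [label_probs[l] for l in labels]
    discrete = rng.choices(labels, weights=weights, k=1)[0]
    continuous = [rng.gauss(0.0, 1.0) for _ in range(cont_dim)]
    return discrete, continuous

rng = random.Random(0)
# Hypothetical label inventory for illustration only.
tag, z = sample_multispace_latent({"PAST": 0.7, "PRESENT": 0.3}, cont_dim=3, rng=rng)
```

In semi-supervised training, the discrete label is observed for labeled examples and marginalized or inferred for unlabeled ones, which is how the model exploits unlabeled data.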
SEQ^3: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder for Unsupervised Abstractive Sentence Compression
Neural sequence-to-sequence models are currently the dominant approach in
several natural language processing tasks, but require large parallel corpora.
We present a sequence-to-sequence-to-sequence autoencoder (SEQ^3), consisting
of two chained encoder-decoder pairs, with words used as a sequence of discrete
latent variables. We apply the proposed model to unsupervised abstractive
sentence compression, where the first and last sequences are the input and
reconstructed sentences, respectively, while the middle sequence is the
compressed sentence. Constraining the length of the latent word sequences
forces the model to distill important information from the input. A pretrained
language model, acting as a prior over the latent sequences, encourages the
compressed sentences to be human-readable. Continuous relaxations enable us to
sample from categorical distributions, allowing gradient-based optimization,
unlike alternatives that rely on reinforcement learning. The proposed model
does not require parallel text-summary pairs, achieving promising results in
unsupervised sentence compression on benchmark datasets.
Comment: Accepted to NAACL 201
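Continuous relaxations of categorical sampling are commonly realized with the Gumbel-softmax trick, which is one standard way to make discrete word choices amenable to gradient-based optimization; whether SEQ^3 uses exactly this variant is an assumption here. A minimal pure-Python version with illustrative logits:

```python
import math
import random

def gumbel_softmax(logits, temperature, rng):
    """Continuous relaxation of a categorical sample: perturb each logit
    with Gumbel noise, then apply a temperature-controlled softmax.
    As temperature -> 0 the output approaches a one-hot sample; higher
    temperatures give smoother, more uniform vectors."""
    noisy = [l - math.log(-math.log(rng.random())) for l in logits]
    exps = [math.exp(n / temperature) for n in noisy]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
# Illustrative logits over a 3-word "vocabulary".
soft = gumbel_softmax([2.0, 0.5, -1.0], temperature=0.5, rng=rng)
```

Because the output is a smooth probability vector rather than a hard index, gradients can flow through the latent word choice, avoiding the reinforcement-learning estimators the abstract contrasts against.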
Style Transfer in Text: Exploration and Evaluation
Style transfer is an important problem in natural language processing (NLP).
However, progress in language style transfer lags behind other domains, such as
computer vision, mainly because of the lack of parallel data and principled
evaluation metrics. In this paper, we propose to learn style transfer with
non-parallel data. We explore two models to achieve this goal; the key idea
behind the proposed models is to learn separate content representations and
style representations using adversarial networks. We also propose novel
evaluation metrics that measure two aspects of style transfer: transfer
strength and content preservation. We assess our models and the evaluation
metrics on two tasks: paper-news title transfer and positive-negative review
transfer. Results show that the proposed content preservation metric is highly
correlated with human judgments, and that the proposed models are able to
generate sentences with higher style transfer strength and a similar content
preservation score compared to an auto-encoder.
Comment: To appear in AAAI-1
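A content-preservation metric of the kind described is often computed as the cosine similarity between embeddings of the source sentence and its transferred output; the exact embedding scheme in the paper is not reproduced here, and the vectors below are hypothetical.

```python
import math

def content_preservation(src_vec, out_vec):
    """Cosine similarity between source and transferred sentence
    embeddings; higher means more of the original content survived
    the style transfer."""
    dot = sum(a * b for a, b in zip(src_vec, out_vec))
    norm = (math.sqrt(sum(a * a for a in src_vec))
            * math.sqrt(sum(b * b for b in out_vec)))
    return dot / norm

# Hypothetical 3-d embeddings of a source sentence and its
# style-transferred output.
score = content_preservation([1.0, 2.0, 0.0], [1.0, 1.8, 0.2])
```

Transfer strength, the complementary metric, is typically the accuracy of a style classifier on the transferred outputs, so the two metrics together capture the trade-off between changing style and keeping meaning.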