Unsupervised Sentence Compression using Denoising Auto-Encoders
In sentence compression, the task of shortening sentences while retaining the
original meaning, models tend to be trained on large corpora containing pairs
of verbose and compressed sentences. To remove the need for paired corpora, we
emulate a summarization task and add noise to extend sentences and train a
denoising auto-encoder to recover the original, constructing an end-to-end
training regime without the need for any examples of compressed sentences. We
conduct a human evaluation of our model on a standard text summarization
dataset and show that it performs comparably to a supervised baseline based on
grammatical correctness and retention of meaning. Despite being exposed to no
target data, our unsupervised models learn to generate imperfect but reasonably
readable sentence summaries. Although we underperform supervised models based
on ROUGE scores, our models are competitive with a supervised baseline based on
human evaluation for grammatical correctness and retention of meaning.
Comment: CoNLL 201
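The noising step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract only says that noise is added to extend sentences, so the function name `extend_with_noise`, the distractor list, and the choice to insert words at random positions are all assumptions.

```python
import random


def extend_with_noise(tokens, distractors, n_extra, seed=0):
    """Return a longer, noisy copy of `tokens`.

    Assumption: noise means inserting `n_extra` words drawn from
    `distractors` at random positions; original word order is kept.
    """
    rng = random.Random(seed)
    noisy = list(tokens)
    for _ in range(n_extra):
        word = rng.choice(distractors)
        pos = rng.randint(0, len(noisy))  # randint is inclusive at both ends
        noisy.insert(pos, word)
    return noisy


# Hypothetical example sentence and distractor words.
source = "the cat sat on the mat".split()
distractors = ["quickly", "blue", "yesterday"]
noisy = extend_with_noise(source, distractors, n_extra=3)

# A denoising auto-encoder would be trained on (input=noisy, target=source).
training_pair = (noisy, source)
```

The training target is always the original sentence, so no compressed examples are needed to build the pairs.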
Modelling Computational Resources for Next Generation Sequencing Bioinformatics Analysis of 16S rRNA Samples
In the rapidly evolving domain of next generation sequencing and
bioinformatics analysis, data generation is one aspect that is increasing at a
concomitant rate. The burden associated with processing large amounts of
sequencing data has emphasised the need to allocate sufficient computing
resources to complete analyses in the shortest possible time with manageable
and predictable costs. A novel method for predicting time to completion for a
popular bioinformatics software package (QIIME) was developed using key variables
characteristic of the input data assumed to impact processing time. Multiple
Linear Regression models were developed to determine run time for two denoising
algorithms and a general bioinformatics pipeline. The models were able to
accurately predict clock time for denoising sequences from a naturally
assembled community dataset, but not an artificial community. Speedup and
efficiency tests for AmpliconNoise also highlighted that caution was needed
when allocating resources for parallel processing of data. Accurate modelling
of computational processing time using easily measurable predictors can assist
NGS analysts in determining resource requirements for bioinformatics software
and pipelines. Whilst demonstrated on a specific group of scripts, the
methodology can be extended to encompass other packages running on multiple
architectures, either in parallel or sequentially.
Comment: 23 pages, 8 figure
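The speedup and efficiency tests mentioned in this abstract rest on two standard ratios: speedup S = T_1 / T_p and efficiency E = S / p for p processors. A minimal sketch follows. The timings are hypothetical values chosen for illustration and are not results from the paper.

```python
def speedup(t_serial, t_parallel):
    """Speedup S = T_1 / T_p."""
    if t_parallel <= 0:
        raise ValueError("parallel time must be positive")
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """Efficiency E = S / p, the fraction of ideal linear speedup achieved."""
    return speedup(t_serial, t_parallel) / n_procs


# Hypothetical wall-clock times in seconds, keyed by processor count.
timings = {1: 1200.0, 2: 640.0, 4: 400.0, 8: 300.0}
t1 = timings[1]
report = {p: (speedup(t1, t), efficiency(t1, t, p)) for p, t in timings.items()}
```

With these made-up numbers, efficiency falls as processors are added, which is the kind of pattern that calls for caution when allocating resources for parallel runs.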
Deconvolutional Paragraph Representation Learning
Learning latent representations from long text sequences is an important
first step in many natural language processing applications. Recurrent Neural
Networks (RNNs) have become a cornerstone for this challenging task. However,
the quality of sentences during RNN-based decoding (reconstruction) decreases
with the length of the text. We propose a sequence-to-sequence, purely
convolutional and deconvolutional autoencoding framework that is free of the
above issue, while also being computationally efficient. The proposed method is
simple, easy to implement and can be leveraged as a building block for many
applications. We show empirically that compared to RNNs, our framework is
better at reconstructing and correcting long paragraphs. Quantitative
evaluation on semi-supervised text classification and summarization tasks
demonstrates the potential for better utilization of long unlabeled text data.
Comment: Accepted by NIPS 201
Exploring Contextual Word-level Style Relevance for Unsupervised Style Transfer
Unsupervised style transfer aims to change the style of an input sentence
while preserving its original content without using parallel training data.
Owing to the lack of fine-grained control over the influence of the target
style, current dominant approaches are unable to yield desirable output
sentences. In this paper, we propose a novel attentional sequence-to-sequence
(Seq2seq) model that dynamically exploits the relevance of each output word to
the target style for unsupervised style transfer. Specifically, we first
pretrain a style classifier, where the relevance of each input word to the
original style can be quantified via layer-wise relevance propagation. In a
denoising auto-encoding manner, we train an attentional Seq2seq model to
reconstruct input sentences and simultaneously re-predict the previously
quantified word-level style relevance. In this way, this model is endowed with the
ability to automatically predict the style relevance of each output word. Then,
we equip the decoder of this model with a neural style component to exploit the
predicted word-level style relevance for better style transfer. In particular, we
fine-tune this model using a carefully-designed objective function involving
style transfer, style relevance consistency, content preservation and fluency
modeling loss terms. Experimental results show that our proposed model achieves
state-of-the-art performance in terms of both transfer accuracy and content
preservation.
Comment: Accepted by ACL202
Discrete Optimization for Unsupervised Sentence Summarization with Word-Level Extraction
Automatic sentence summarization produces a shorter version of a sentence,
while preserving its most important information. A good summary is
characterized by language fluency and high information overlap with the source
sentence. We model these two aspects in an unsupervised objective function,
consisting of language modeling and semantic similarity metrics. We search for
a high-scoring summary by discrete optimization. Our proposed method achieves a
new state of the art for unsupervised sentence summarization according to ROUGE
scores. Additionally, we demonstrate that the commonly reported ROUGE F1 metric
is sensitive to summary length. Since this is unwittingly exploited in recent
work, we emphasize that future evaluation should explicitly group summarization
systems by output length brackets.
Comment: Accepted at ACL 202
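The length sensitivity of ROUGE F1 noted in this abstract can be seen with a simplified unigram ROUGE-1. The sketch below is an assumption-laden illustration: it uses whitespace tokenization with no stemming or stopword handling, so it is not the official ROUGE toolkit, and the sentences are invented examples.

```python
from collections import Counter


def rouge1(candidate, reference):
    """Simplified ROUGE-1: clipped unigram overlap -> (precision, recall, F1)."""
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    overlap = sum((cand & ref).values())  # per-word minimum of the two counts
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical reference and two candidate summaries of different lengths.
reference = "police arrest two suspects after bank robbery"
short_summary = "police arrest suspects"
long_summary = ("police arrest two suspects after a daring downtown "
                "bank robbery on friday")

p_short, r_short, f1_short = rouge1(short_summary, reference)
p_long, r_long, f1_long = rouge1(long_summary, reference)
```

In this toy case the longer candidate gains recall faster than it loses precision, so its F1 is higher even though it contains padding words. That is one way summary length alone can move the score.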
Educating Text Autoencoders: Latent Representation Guidance via Denoising
Generative autoencoders offer a promising approach for controllable text
generation by leveraging their latent sentence representations. However,
current models struggle to maintain coherent latent spaces required to perform
meaningful text manipulations via latent vector operations. Specifically, we
demonstrate by example that neural encoders do not necessarily map similar
sentences to nearby latent vectors. A theoretical explanation for this
phenomenon establishes that high capacity autoencoders can learn an arbitrary
mapping between sequences and associated latent representations. To remedy this
issue, we augment adversarial autoencoders with a denoising objective where
original sentences are reconstructed from perturbed versions (referred to as
DAAE). We prove that this simple modification guides the latent space geometry
of the resulting model by encouraging the encoder to map similar texts to
similar latent representations. In empirical comparisons with various types of
autoencoders, our model provides the best trade-off between generation quality
and reconstruction capacity. Moreover, the improved geometry of the DAAE latent
space enables zero-shot text style transfer via simple latent vector
arithmetic.
Comment: ICML 2020 camera-read
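One common form of latent vector arithmetic for style transfer is to shift a sentence's latent code by the difference between the mean codes of the target and source styles. The abstract does not specify the exact operation, so the sketch below is an assumption. The function names are hypothetical, the vectors are toy values, and the encoder and decoder are omitted.

```python
def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def shift_style(z, source_latents, target_latents):
    """Return z + (mean(target) - mean(source)).

    Assumption: style is captured by the offset between class means.
    """
    mu_source = mean_vector(source_latents)
    mu_target = mean_vector(target_latents)
    return [zi + (t - s) for zi, s, t in zip(z, mu_source, mu_target)]


# Toy 2-D latent codes standing in for encoder outputs.
source_latents = [[1.0, 0.0], [3.0, 0.0]]
target_latents = [[1.0, 2.0], [3.0, 4.0]]
z = [0.5, 1.0]
shifted = shift_style(z, source_latents, target_latents)
```

In a full system, `shifted` would be passed to the decoder to produce the transferred sentence. This only works well if similar sentences sit near each other in latent space, which is the property the denoising objective is meant to encourage.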
Latent Variable Algorithms for Multimodal Learning and Sensor Fusion
Multimodal learning has been lacking principled ways of combining information
from different modalities and learning a low-dimensional manifold of meaningful
representations. We study multimodal learning and sensor fusion from a latent
variable perspective. We first present a regularized recurrent attention filter
for sensor fusion. This algorithm can dynamically combine information from
different types of sensors in a sequential decision making task. Each sensor is
coupled with a modular neural network to maximize utility of its own
information. A gating modular neural network dynamically generates a set of
mixing weights for outputs from sensor networks by balancing utility of all
sensors' information. We design a co-learning mechanism to encourage
co-adaptation and independent learning of each sensor at the same time, and
propose a regularization based co-learning method. In the second part, we focus
on recovering the manifold of latent representation. We propose a co-learning
approach using a probabilistic graphical model that imposes a structural prior
on the generative model, the multimodal variational RNN (MVRNN), and derive a
variational lower bound for its objective functions. In the third part, we
extend the siamese structure to sensor fusion for robust acoustic event
detection. We perform experiments to investigate the latent representations
that are extracted; further work will be done in the following months. Our experiments
show that the recurrent attention filter can dynamically combine different
sensor inputs according to the information carried in the inputs. We believe that
MVRNN can identify latent representations that are useful for many downstream
tasks such as speech synthesis, activity recognition, and control and planning.
Both algorithms are general frameworks which can be applied to other tasks
where different types of sensors are jointly used for decision making.
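The gating idea in this abstract, where a gating network produces mixing weights for the outputs of per-sensor networks, can be sketched as a weighted sum. Using a softmax to normalize the gate scores is an assumption, as are the function names. The sensor outputs and gate scores below are placeholder values, not learned ones.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(sensor_outputs, gate_scores):
    """Mix per-sensor output vectors using softmax-normalized gate scores."""
    weights = softmax(gate_scores)
    dim = len(sensor_outputs[0])
    fused = [
        sum(w * out[i] for w, out in zip(weights, sensor_outputs))
        for i in range(dim)
    ]
    return fused, weights


# Placeholder outputs from two sensor networks, with equal gate scores.
outputs = [[1.0, 0.0], [0.0, 1.0]]
fused, weights = fuse(outputs, [0.0, 0.0])
```

With equal gate scores the fusion reduces to a plain average. A trained gating network would instead raise the score of whichever sensor is more informative at each step.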
Unsupervised Neural Machine Translation
In spite of the recent success of neural machine translation (NMT) in
standard benchmarks, the lack of large parallel corpora poses a major practical
problem for many language pairs. There have been several proposals to alleviate
this issue with, for instance, triangulation and semi-supervised learning
techniques, but they still require a strong cross-lingual signal. In this work,
we completely remove the need for parallel data and propose a novel method to
train an NMT system in a completely unsupervised manner, relying on nothing but
monolingual corpora. Our model builds upon the recent work on unsupervised
embedding mappings, and consists of a slightly modified attentional
encoder-decoder model that can be trained on monolingual corpora alone using a
combination of denoising and backtranslation. Despite the simplicity of the
approach, our system obtains 15.56 and 10.21 BLEU points in WMT 2014
French-to-English and German-to-English translation. The model can also profit
from small parallel corpora, and attains 21.81 and 15.24 points when combined
with 100,000 parallel sentences, respectively. Our implementation is released
as an open source project.
Comment: Published as a conference paper at ICLR 201
Unsupervised Neural Text Simplification
The paper presents a first attempt towards unsupervised neural text
simplification that relies only on unlabeled text corpora. The core framework
is composed of a shared encoder and a pair of attentional decoders, and gains
knowledge of simplification through discrimination-based losses and denoising.
The framework is trained using unlabeled text collected from an en-Wikipedia dump.
Our analysis (both quantitative and qualitative, involving human evaluators) on
public test data shows that the proposed model can perform
text simplification at both lexical and syntactic levels, competitive with
existing supervised methods. Adding a few labelled pairs improves
performance further.
Comment: ACL 201
A Lightweight Music Texture Transfer System
Deep learning research on transformation problems for images and text
has attracted great attention. However, present methods for music feature
transfer using neural networks are far from practical application. In this
paper, we introduce a novel system for transferring the texture of music, and
release it as an open source project. Its core algorithm is composed of a
converter which represents sounds as texture spectra, a corresponding
reconstructor and a feed-forward transfer network. We evaluate this system from
multiple perspectives, and experimental results reveal that it achieves
convincing results in both sound effects and computational performance.
Comment: 12 page