Unsupervised Paraphrasing without Translation
Paraphrasing exemplifies the ability to abstract semantic content from
surface forms. Recent work on automatic paraphrasing is dominated by methods
leveraging Machine Translation (MT) as an intermediate step. This contrasts
with humans, who can paraphrase without being bilingual. This work proposes to
learn paraphrasing models from an unlabeled monolingual corpus only. To that
end, we propose a residual variant of the vector-quantized variational
auto-encoder (VQ-VAE).
We compare with MT-based approaches on paraphrase identification, generation,
and training augmentation. Monolingual paraphrasing outperforms unsupervised
translation in all settings. Comparisons with supervised translation are more
mixed: monolingual paraphrasing is advantageous for identification and
augmentation, while supervised translation is superior for generation.
Comment: ACL 2019
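The residual vector-quantization bottleneck mentioned above can be sketched in a few lines. This is a minimal NumPy illustration of residual VQ as used in a VQ-VAE-style bottleneck, not the paper's actual model; the codebook sizes and dimensions are arbitrary:

```python
import numpy as np

def quantize(codebook, vectors):
    """Map each vector to its nearest codebook entry (one VQ bottleneck step)."""
    # Squared Euclidean distance between every vector and every code.
    d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d.argmin(axis=1)
    return codebook[idx], idx

def residual_quantize(codebooks, vectors):
    """Residual VQ: each stage quantizes what the previous stages left over."""
    approx = np.zeros_like(vectors)
    indices = []
    for cb in codebooks:
        q, idx = quantize(cb, vectors - approx)
        approx = approx + q
        indices.append(idx)
    return approx, indices

rng = np.random.default_rng(0)
vecs = rng.normal(size=(4, 8))                       # toy latent vectors
books = [rng.normal(size=(16, 8)) for _ in range(3)]  # three small codebooks
approx, codes = residual_quantize(books, vecs)
print(approx.shape)  # (4, 8)
```

Each stage quantizes the residual left by the previous stages, so a stack of small codebooks can describe the latent space more finely than a single codebook of the same total size.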
Representation Learning: A Review and New Perspectives
The success of machine learning algorithms generally depends on data
representation, and we hypothesize that this is because different
representations can entangle and hide more or less the different explanatory
factors of variation behind the data. Although specific domain knowledge can be
used to help design representations, learning with generic priors can also be
used, and the quest for AI is motivating the design of more powerful
representation-learning algorithms implementing such priors. This paper reviews
recent work in the area of unsupervised feature learning and deep learning,
covering advances in probabilistic models, auto-encoders, manifold learning,
and deep networks. This motivates longer-term unanswered questions about the
appropriate objectives for learning good representations, for computing
representations (i.e., inference), and the geometrical connections between
representation learning, density estimation, and manifold learning.
Statistical Parametric Speech Synthesis Using Bottleneck Representation From Sequence Auto-encoder
In this paper, we describe a statistical parametric speech synthesis approach
with unit-level acoustic representation. In conventional deep neural network
based speech synthesis, the input text features are repeated for the entire
duration of phoneme for mapping text and speech parameters. This mapping is
learnt at the frame level, which is the de facto acoustic representation.
However, much of this computational requirement can be drastically reduced if
every unit can be represented with a fixed-dimensional representation. Using
recurrent neural network based auto-encoder, we show that it is indeed possible
to map units of varying duration to a single vector. We then use this acoustic
representation at unit-level to synthesize speech using deep neural network
based statistical parametric speech synthesis technique. Results show that the
proposed approach is able to synthesize at the same quality as the conventional
frame-based approach at a greatly reduced computational cost.
Comment: 5 pages (with references)
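The unit-to-vector mapping described above can be illustrated with a toy recurrent encoder. This is a hypothetical NumPy sketch, not the paper's network; the dimensions and weights are arbitrary:

```python
import numpy as np

def rnn_encode(frames, W_in, W_h, b):
    """Run a simple tanh RNN over a (T, d) frame sequence; the final
    hidden state is a fixed-dimensional representation of the whole unit."""
    h = np.zeros(W_h.shape[0])
    for x in frames:
        h = np.tanh(W_in @ x + W_h @ h + b)
    return h

rng = np.random.default_rng(1)
d, hdim = 5, 8
W_in = rng.normal(size=(hdim, d))
W_h = 0.1 * rng.normal(size=(hdim, hdim))
b = np.zeros(hdim)

# Units of different duration map to vectors of the same size.
short_unit = rng.normal(size=(3, d))   # e.g. a short phoneme
long_unit = rng.normal(size=(12, d))   # e.g. a long phoneme
print(rnn_encode(short_unit, W_in, W_h, b).shape,
      rnn_encode(long_unit, W_in, W_h, b).shape)  # (8,) (8,)
```

Because every unit collapses to the same fixed dimensionality regardless of duration, the synthesis network downstream no longer needs to operate frame by frame.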
How Generative Adversarial Networks and Their Variants Work: An Overview
Generative Adversarial Networks (GANs) have received wide attention in the
machine learning field for their potential to learn high-dimensional, complex
real-world data distributions. Specifically, they do not rely on any
assumptions about the distribution and can generate realistic samples from a
latent space in a simple manner. This powerful property has led GANs to be
applied to tasks such as image synthesis, image attribute editing, image
translation, and domain adaptation, among other fields. In this paper, we aim
to discuss the details of GANs for readers who are familiar with them but do
not comprehend them deeply, or who wish to view GANs from various
perspectives. In addition, we explain how GANs operate and the fundamental
meaning of the various objective functions that have been suggested recently.
We then focus on how a GAN can be combined with an autoencoder framework.
Finally, we enumerate the GAN variants that are applied to various tasks and
other fields, for those who are interested in exploiting GANs in their
research.
Comment: 41 pages, 16 figures, published in ACM Computing Surveys (CSUR)
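The standard GAN objective the survey discusses can be written down directly. Below is a minimal NumPy sketch of the original discriminator loss and the non-saturating generator loss, with made-up discriminator outputs:

```python
import numpy as np

def bce(probs, labels):
    """Binary cross-entropy, the building block of the standard GAN objective."""
    eps = 1e-12
    return -np.mean(labels * np.log(probs + eps)
                    + (1 - labels) * np.log(1 - probs + eps))

def gan_losses(d_real, d_fake):
    """d_real / d_fake are discriminator probabilities on real / generated
    samples.  Discriminator: push d_real -> 1 and d_fake -> 0.
    Generator (non-saturating form): push d_fake -> 1."""
    d_loss = bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))
    g_loss = bce(d_fake, np.ones_like(d_fake))
    return d_loss, g_loss

# A confident discriminator yields a low D loss and a high G loss.
d_loss, g_loss = gan_losses(np.array([0.9, 0.95]), np.array([0.1, 0.05]))
print(d_loss < g_loss)  # True
```

The two losses pull the discriminator output on fake samples in opposite directions, which is the adversarial game the survey unpacks objective by objective.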
Contextual Parameter Generation for Universal Neural Machine Translation
We propose a simple modification to existing neural machine translation (NMT)
models that enables using a single universal model to translate between
multiple languages while allowing for language specific parameterization, and
that can also be used for domain adaptation. Our approach requires no changes
to the model architecture of a standard NMT system, but instead introduces a
new component, the contextual parameter generator (CPG), that generates the
parameters of the system (e.g., weights in a neural network). This parameter
generator accepts source and target language embeddings as input, and generates
the parameters for the encoder and the decoder, respectively. The rest of the
model remains unchanged and is shared across all languages. We show how this
simple modification enables the system to use monolingual data for training and
also perform zero-shot translation. We further show that it is able to surpass
state-of-the-art performance for both the IWSLT-15 and IWSLT-17 datasets and
that the learned language embeddings are able to uncover interesting
relationships between languages.
Comment: Published in the proceedings of Empirical Methods in Natural Language Processing (EMNLP), 2018
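The contextual parameter generator can be caricatured in a few lines. This is a hypothetical NumPy sketch, not the paper's implementation: a purely linear generator, toy dimensions, and invented language embeddings:

```python
import numpy as np

rng = np.random.default_rng(2)
emb_dim = 4
n_params = 6 * 3   # toy: the "encoder" is a single 6x3 weight matrix
lang_emb = {"en": rng.normal(size=emb_dim), "de": rng.normal(size=emb_dim)}
G = rng.normal(size=(n_params, emb_dim))   # the shared parameter generator

def generate_encoder_weights(lang):
    """CPG idea: the network's weights are a (here linear) function of the
    language embedding, so one shared generator serves every language."""
    return (G @ lang_emb[lang]).reshape(6, 3)

W_en = generate_encoder_weights("en")
W_de = generate_encoder_weights("de")
print(W_en.shape)  # (6, 3) -- same architecture, language-specific weights
```

Because only the low-dimensional language embeddings differ per language while the generator itself is shared, unseen combinations of source and target embeddings can still produce a full set of weights, which is what enables zero-shot translation.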
Semi-supervised Learning with Contrastive Predictive Coding
Semi-supervised learning (SSL) provides a powerful framework for leveraging
unlabeled data when labels are limited or expensive to obtain. SSL algorithms
based on deep neural networks have recently proven successful on standard
benchmark tasks. However, many of them have thus far been either inflexible,
inefficient, or non-scalable. This paper explores the recently developed
contrastive predictive coding (CPC) technique to improve the discriminative
power of deep learning models when a large portion of labels is absent. Two
models, cpc-SSL and a class-conditional variant (ccpc-SSL), are presented.
They effectively exploit
the unlabeled data by extracting shared information between different parts of
the (high-dimensional) data. The proposed approaches are inductive, and scale
well to very large datasets like ImageNet, making them good candidates in
real-world, large-scale applications.
Comment: 6 pages, 4 figures, conference
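The shared-information extraction underlying CPC is usually trained with the InfoNCE loss, which can be sketched as follows. This is a generic NumPy illustration, not the cpc-SSL models themselves; the batch size and dimensions are arbitrary:

```python
import numpy as np

def info_nce(context, futures):
    """InfoNCE: each context vector must pick out its own paired future
    among all futures in the batch (positives sit on the diagonal)."""
    logits = context @ futures.T                       # (B, B) similarity scores
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(3)
z = rng.normal(size=(8, 16))
# Correctly paired context/future vectors give a lower loss than mismatched ones.
aligned = info_nce(z, z)
mismatched = info_nce(z, np.roll(z, 1, axis=0))
print(aligned < mismatched)  # True
```

Minimizing this loss forces the context representation to share information with the correct future segment, which is the "shared information between different parts of the data" that the abstract refers to.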
Semi-Supervised Convolutional Neural Networks for Human Activity Recognition
Labeled data used for training activity recognition classifiers are usually
limited in terms of size and diversity. Thus, the learned model may not
generalize well when used in real-world use cases. Semi-supervised learning
augments labeled examples with unlabeled examples, often resulting in improved
performance. However, the semi-supervised methods studied in the activity
recognition literature assume that feature engineering is already done. In
this paper, we lift this assumption and present two semi-supervised methods
based on convolutional neural networks (CNNs) to learn discriminative hidden
features. Our semi-supervised CNNs learn from both labeled and unlabeled data
while also performing feature learning on raw sensor data. In experiments on
three real world datasets, we show that our CNNs outperform supervised methods
and traditional semi-supervised learning methods by up to 18% in mean F1-score
(Fm).
Comment: Accepted by IEEE BigData 2017
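One common semi-supervised recipe, self-training with pseudo-labels, gives the flavor of how unlabeled examples can augment a small labeled set. This is a generic NumPy illustration with a nearest-centroid classifier, not the paper's CNN method; the data and dimensions are invented:

```python
import numpy as np

def self_train(X_lab, y_lab, X_unlab, steps=3):
    """Fit class centroids on labeled data, pseudo-label the unlabeled pool,
    then refit on labeled + pseudo-labeled points."""
    classes = np.unique(y_lab)
    centroids = np.stack([X_lab[y_lab == c].mean(0) for c in classes])
    for _ in range(steps):
        dists = ((X_unlab[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        pseudo = classes[np.argmin(dists, axis=1)]
        X_all = np.vstack([X_lab, X_unlab])
        y_all = np.concatenate([y_lab, pseudo])
        centroids = np.stack([X_all[y_all == c].mean(0) for c in classes])
    return centroids

rng = np.random.default_rng(4)
X0 = rng.normal(loc=-2, size=(5, 2))   # class-0 cluster
X1 = rng.normal(loc=+2, size=(5, 2))   # class-1 cluster
X_lab = np.vstack([X0[:1], X1[:1]])    # only one labeled point per class
y_lab = np.array([0, 1])
X_unlab = np.vstack([X0[1:], X1[1:]])  # the rest is unlabeled
c = self_train(X_lab, y_lab, X_unlab)
print(c.shape)  # (2, 2): refined per-class centroids
```

With a single labeled point per class the initial centroids are noisy; the unlabeled pool pulls them toward the true cluster centers, which is the generalization benefit the abstract describes.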
Paraphrase Thought: Sentence Embedding Module Imitating Human Language Recognition
Sentence embedding is an important research topic in natural language
processing. It is essential to generate a good embedding vector that fully
reflects the semantic meaning of a sentence in order to achieve an enhanced
performance for various natural language processing tasks, such as machine
translation and document classification. Thus far, various sentence embedding
models have been proposed, and their feasibility has been demonstrated through
good performances on tasks following embedding, such as sentiment analysis and
sentence classification. However, because performance on sentence
classification and sentiment analysis can be improved even with simple
sentence representation methods, good performance on such tasks is not
sufficient to claim that these models fully reflect the meanings of sentences.
In this paper, inspired by human language recognition, we propose the
following concept of semantic coherence, which should be satisfied for a good
sentence embedding method: similar sentences should be located close to each
other in the embedding space. Then, we propose the Paraphrase-Thought
(P-thought) model to pursue semantic coherence as much as possible.
Experimental results on two paraphrase identification datasets (MS COCO and STS
benchmark) show that the P-thought models outperform the benchmarked sentence
embedding methods.
Comment: 10 pages
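The semantic-coherence criterion can be checked with plain cosine similarity. Below is a toy NumPy sketch; the sentences and their embedding vectors are invented purely for illustration:

```python
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def semantic_coherence(embed, pairs):
    """The criterion in the abstract: paraphrase pairs should lie close
    (high cosine similarity) in the embedding space."""
    return np.mean([cosine(embed[a], embed[b]) for a, b in pairs])

# Hypothetical 2-d embeddings, chosen so paraphrases point the same way.
embed = {
    "a cat sat": np.array([1.0, 0.1]),
    "a feline sat": np.array([0.9, 0.2]),
    "stock prices fell": np.array([-0.2, 1.0]),
}
paraphrases = [("a cat sat", "a feline sat")]
unrelated = [("a cat sat", "stock prices fell")]
print(semantic_coherence(embed, paraphrases)
      > semantic_coherence(embed, unrelated))  # True
```

An embedding model satisfies the criterion when this gap holds across a whole paraphrase corpus, which is what the P-thought model is trained to maximize.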
Learning Audio Sequence Representations for Acoustic Event Classification
Acoustic Event Classification (AEC) has become a significant task for
machines to perceive the surrounding auditory scene. However, extracting
effective representations that capture the underlying characteristics of the
acoustic events is still challenging. Previous methods mainly focused on
designing the audio features in a 'hand-crafted' manner. Interestingly,
data-learnt features have been recently reported to show better performance. Up
to now, these have only been considered at the frame level. In this paper, we
propose an unsupervised learning framework to learn a vector representation of
an audio sequence for AEC. This framework consists of a Recurrent Neural
Network (RNN) encoder and an RNN decoder, which respectively transform the
variable-length audio sequence into a fixed-length vector and reconstruct the
input sequence from that vector. After training the encoder-decoder, we
feed the audio sequences to the encoder and then take the learnt vectors as the
audio sequence representations. Compared with previous methods, the proposed
method can not only deal with the problem of arbitrary-lengths of audio
streams, but also learn the salient information of the sequence. Extensive
evaluation on a large-size acoustic event database is performed, and the
empirical results demonstrate that the learnt audio sequence representation
outperforms other state-of-the-art hand-crafted sequence features for AEC by a
large margin.
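The encoder-decoder scheme described above can be sketched as follows. This is a hypothetical untrained NumPy RNN, not the paper's network; in practice the two halves are trained jointly to minimize the reconstruction error:

```python
import numpy as np

def encode(frames, W):
    """Fold a variable-length (T, d) audio sequence into one hidden vector."""
    h = np.zeros(W["h"].shape[0])
    for x in frames:
        h = np.tanh(W["in"] @ x + W["h"] @ h)
    return h

def decode(h, W, T):
    """Unroll the fixed vector back into T frames; training would compare
    these against the input sequence."""
    out, state = [], h
    for _ in range(T):
        state = np.tanh(W["h"] @ state)
        out.append(W["out"] @ state)
    return np.stack(out)

rng = np.random.default_rng(5)
d, hdim = 4, 6
W = {"in": rng.normal(size=(hdim, d)),
     "h": 0.1 * rng.normal(size=(hdim, hdim)),
     "out": rng.normal(size=(d, hdim))}
seq = rng.normal(size=(9, d))          # an arbitrary-length audio sequence
z = encode(seq, W)                     # the learnt sequence representation
recon = decode(z, W, len(seq))
print(z.shape, recon.shape)  # (6,) (9, 4)
```

After training, only the encoder is kept: its output vector is the fixed-length sequence representation fed to the AEC classifier.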
A Context-Aware Citation Recommendation Model with BERT and Graph Convolutional Networks
With the tremendous growth in the number of scientific papers being
published, searching for references while writing a scientific paper is a
time-consuming process. A technique that could add a reference citation at the
appropriate place in a sentence would be beneficial. With this in mind,
context-aware citation recommendation has been studied for around two decades.
Many researchers have utilized the text data called the context
sentence, which surrounds the citation tag, and the metadata of the target
paper to find the appropriate cited research. However, the lack of
well-organized benchmark datasets and of models that attain high performance
has made this research difficult.
In this paper, we propose a deep learning based model and well-organized
dataset for context-aware paper citation recommendation. Our model comprises a
document encoder and a context encoder, which use a Graph Convolutional
Network (GCN) layer and Bidirectional Encoder Representations from
Transformers (BERT), a pre-trained model for textual data, respectively. By
modifying the related PeerRead
dataset, we propose a new dataset called FullTextPeerRead containing context
sentences for cited references and paper metadata. To the best of our
knowledge, this dataset is the first well-organized dataset for context-aware
paper
recommendation. The results indicate that the proposed model with the proposed
datasets can attain state-of-the-art performance and achieve a more than 28%
improvement in mean average precision (MAP).
Comment: 7 pages, 5 figures
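A single GCN layer of the kind the model uses can be written compactly. This is a generic NumPy sketch of a GCN propagation step; the toy citation graph and feature dimensions are invented, and in the actual model the node features would come from BERT:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One Graph Convolutional Network layer: symmetrically normalized
    adjacency (with self-loops) times node features times a weight matrix."""
    A_hat = A + np.eye(A.shape[0])                       # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return np.maximum(0, d_inv_sqrt @ A_hat @ d_inv_sqrt @ H @ W)  # ReLU

# Toy citation graph: 3 papers, where paper 0 cites papers 1 and 2.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)
rng = np.random.default_rng(6)
H = rng.normal(size=(3, 4))   # stand-ins for per-paper text embeddings
W = rng.normal(size=(4, 2))
out = gcn_layer(A, H, W)
print(out.shape)  # (3, 2): each paper now mixes its neighbours' features
```

Each layer mixes a paper's features with those of the papers it cites, so stacked layers let the recommender score candidates using citation-graph structure as well as the BERT-encoded context sentence.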