Representation Learning by Reconstructing Neighborhoods
Since its introduction, unsupervised representation learning has attracted a
lot of attention from the research community, as it has been demonstrated to be
highly effective and easy to apply in tasks such as dimension reduction,
clustering, visualization, information retrieval, and semi-supervised learning.
In this work, we propose a novel unsupervised representation learning framework
called neighbor-encoder, in which domain knowledge can be easily incorporated
into the learning process without modifying the general encoder-decoder
architecture of the classic autoencoder. In contrast to an autoencoder, which
reconstructs the input data itself, neighbor-encoder reconstructs the input
data's neighbors. As the proposed representation learning problem is
essentially a neighbor reconstruction problem, domain knowledge can be easily
incorporated in the form of an appropriate definition of similarity between
objects. Based on that observation, our framework can leverage any
off-the-shelf similarity search algorithms or side information to find the
neighbor of an input object. Applications of other algorithms (e.g.,
association rule mining) in our framework are also possible, given that the
appropriate definition of neighbor can vary in different contexts. We have
demonstrated the effectiveness of our framework in many diverse domains,
including images, text, and time series, and for various data mining tasks
including classification, clustering, and visualization. Experimental results
show that neighbor-encoder not only outperforms autoencoder in most of the
scenarios we consider, but also achieves state-of-the-art performance on
text document clustering.
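To make the contrast with the autoencoder concrete, the following is a minimal sketch of how the training pairs could be formed. It assumes Euclidean distance and a brute-force 1-nearest-neighbor search as the definition of similarity; the function names and the toy data are hypothetical and are not taken from the paper, whose framework allows any similarity search algorithm or side information.

```python
import math

def nearest_neighbor_index(data, i):
    """Index of the point closest to data[i] (Euclidean), excluding i itself."""
    best_j, best_d = None, math.inf
    for j, other in enumerate(data):
        if j == i:
            continue
        d = math.dist(data[i], other)
        if d < best_d:
            best_j, best_d = j, d
    return best_j

def autoencoder_pairs(data):
    """(input, target) pairs for a classic autoencoder: the target is the input."""
    return [(x, x) for x in data]

def neighbor_encoder_pairs(data):
    """(input, target) pairs where the target is the input's nearest neighbor."""
    return [(x, data[nearest_neighbor_index(data, i)]) for i, x in enumerate(data)]

# Toy data (assumed for illustration): two well-separated groups of 2-D points.
data = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.0, 5.2)]
ae_pairs = autoencoder_pairs(data)
ne_pairs = neighbor_encoder_pairs(data)
```

The same encoder-decoder network could be trained on either list of pairs; only the reconstruction target changes, which is where a domain-specific notion of "neighbor" would enter.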
Variational Autoencoders for Semi-supervised Text Classification
Although the semi-supervised variational autoencoder (SemiVAE) works in image
classification tasks, it fails in text classification tasks when a vanilla LSTM
is used as its decoder. From the perspective of reinforcement learning, it is verified
that the decoder's capability to distinguish between different categorical
labels is essential. Therefore, the Semi-supervised Sequential Variational
Autoencoder (SSVAE) is proposed, which increases this capability by feeding
the label into its decoder RNN at each time-step. Two specific decoder structures
are investigated and both of them are verified to be effective. Besides, in
order to reduce the computational complexity in training, a novel optimization
method is proposed, which estimates the gradient of the unlabeled objective
function by sampling, along with two variance reduction techniques.
Experimental results on the Large Movie Review Dataset (IMDB) and the AG's News
corpus show that the proposed approach significantly improves the classification
accuracy compared with purely supervised classifiers, and achieves competitive
performance against previous advanced methods. State-of-the-art results can be
obtained by integrating other pretraining-based methods.
Comment: 8 pages, 4 figures
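As a rough illustration of feeding the label into the decoder at each time-step, the sketch below appends a one-hot label vector to every per-step decoder input. Concatenation is an assumption made here for illustration; the abstract does not specify the two decoder structures that were investigated, and the helper names and toy values are hypothetical.

```python
def one_hot(label, num_classes):
    """Return a one-hot list encoding `label` among `num_classes` classes."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    return [1.0 if k == label else 0.0 for k in range(num_classes)]

def condition_on_label(step_inputs, label, num_classes):
    """Append the same one-hot label to the decoder input at every time-step."""
    y = one_hot(label, num_classes)
    return [list(x) + y for x in step_inputs]

# Toy example (assumed): 3 time-steps of 2-dimensional word embeddings,
# conditioned on label 1 out of 2 classes.
step_inputs = [[0.5, -0.2], [0.1, 0.3], [0.0, 0.9]]
conditioned = condition_on_label(step_inputs, label=1, num_classes=2)
```

Because the label is present in every step's input rather than only in the initial state, the decoder cannot lose track of it over a long sequence.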
Improved Variational Autoencoders for Text Modeling using Dilated Convolutions
Recent work on generative modeling of text has found that variational
auto-encoders (VAE) incorporating LSTM decoders perform worse than simpler LSTM
language models (Bowman et al., 2015). This negative result is so far poorly
understood, but has been attributed to the propensity of LSTM decoders to
ignore conditioning information from the encoder. In this paper, we experiment
with a new type of decoder for VAE: a dilated CNN. By changing the decoder's
dilation architecture, we control the effective context from previously
generated words. In experiments, we find that there is a trade-off between the
contextual capacity of the decoder and the amount of encoding information used.
We show that with the right decoder, VAE can outperform LSTM language models.
We demonstrate perplexity gains on two datasets, representing the first
positive experimental result on the use of VAE for generative modeling of text.
Further, we conduct an in-depth investigation of the use of VAE (with our new
decoding architecture) for semi-supervised and unsupervised labeling tasks,
demonstrating gains over several strong baselines.
Comment: camera ready
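The way dilation controls the effective context can be sketched with the standard receptive-field formula for a stack of causal 1-D convolutions: with stride 1 and kernel size k, a layer with dilation d adds (k - 1) * d positions. The configurations below are illustrative assumptions, not the settings used in the paper.

```python
def receptive_field(kernel_size, dilations):
    """Positions (current plus previous) visible to the top layer of a stack
    of stride-1 causal 1-D convolutions, one layer per entry in `dilations`."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

# Hypothetical decoder configurations, all with kernel size 3.
undilated = receptive_field(3, [1, 1, 1])        # 1 + 2 * 3
small = receptive_field(3, [1, 2, 4])            # 1 + 2 * 7
large = receptive_field(3, [1, 2, 4, 8, 16])     # 1 + 2 * 31
```

A decoder with a small receptive field sees few previously generated words and so has more to gain from the latent code, while a large one can model the text largely on its own.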
Representation Learning with Autoencoders for Electronic Health Records: A Comparative Study
The increasing volume of Electronic Health Records (EHR) in recent years provides
great opportunities for data scientists to collaborate on different aspects of
healthcare research by applying advanced analytics to these EHR clinical data.
A key requirement, however, is obtaining meaningful insights from
high-dimensional, sparse and complex clinical data. Data science approaches
typically address this challenge by performing feature learning in order to
build more reliable and informative feature representations from clinical data
followed by supervised learning. In this paper, we propose a predictive
modeling approach based on deep learning based feature representations and word
embedding techniques. Our method uses different deep architectures (stacked
sparse autoencoders, deep belief network, adversarial autoencoders and
variational autoencoders) for feature representation in higher-level
abstraction to obtain effective and robust features from EHRs, and then builds
prediction models on top of them. Our approach is particularly useful when the
unlabeled data is abundant whereas labeled data is scarce. We investigate the
performance of representation learning through a supervised learning approach.
Our focus is to present a comparative study to evaluate the performance of
different deep architectures through supervised learning and provide insights
into the choice of deep feature representation techniques. Our experiments
demonstrate that for small data sets, the stacked sparse autoencoder shows
superior generalization performance in prediction due to sparsity regularization,
whereas variational autoencoders outperform the competing approaches for large
data sets due to their capability of learning the representation distribution.
Deconvolutional Latent-Variable Model for Text Sequence Matching
A latent-variable model is introduced for text matching, inferring sentence
representations by jointly optimizing generative and discriminative objectives.
To alleviate typical optimization challenges in latent-variable models for
text, we employ deconvolutional networks as the sequence decoder (generator),
providing learned latent codes with more semantic information and better
generalization. Our model, trained in an unsupervised manner, yields stronger
empirical predictive performance than a decoder based on Long Short-Term Memory
(LSTM), with fewer parameters and considerably faster training. Further, we
apply it to text sequence-matching problems. The proposed model significantly
outperforms several strong sentence-encoding baselines, especially in the
semi-supervised setting.
Comment: Accepted by AAAI-201
Variational Semi-supervised Aspect-term Sentiment Analysis via Transformer
Aspect-term sentiment analysis (ATSA) is a longstanding challenge in natural
language understanding. It requires fine-grained semantic reasoning about a
target entity appearing in the text. As manual annotation over the aspects is
laborious and time-consuming, the amount of labeled data is limited for
supervised learning. This paper proposes a semi-supervised method for the ATSA
problem by using the Variational Autoencoder based on Transformer (VAET), which
models the latent distribution via variational inference. By disentangling the
latent representation into the aspect-specific sentiment and the lexical
context, our method induces the underlying sentiment prediction for the
unlabeled data, which then benefits the ATSA classifier. Our method is
classifier agnostic, i.e., the classifier is an independent module and various
advanced supervised models can be integrated. Experimental results are obtained
on the SemEval 2014 task 4 and show that our method is effective with four
classical classifiers. The proposed method outperforms two general
semi-supervised methods and achieves state-of-the-art performance.
Comment: Accepted by CoNLL 201
Adversarial Autoencoders
In this paper, we propose the "adversarial autoencoder" (AAE), which is a
probabilistic autoencoder that uses the recently proposed generative
adversarial networks (GAN) to perform variational inference by matching the
aggregated posterior of the hidden code vector of the autoencoder with an
arbitrary prior distribution. Matching the aggregated posterior to the prior
ensures that generating from any part of prior space results in meaningful
samples. As a result, the decoder of the adversarial autoencoder learns a deep
generative model that maps the imposed prior to the data distribution. We show
how the adversarial autoencoder can be used in applications such as
semi-supervised classification, disentangling style and content of images,
unsupervised clustering, dimensionality reduction and data visualization. We
performed experiments on MNIST, Street View House Numbers and Toronto Face
datasets and show that adversarial autoencoders achieve competitive results in
generative modeling and semi-supervised classification tasks.
PixelGAN Autoencoders
In this paper, we describe the "PixelGAN autoencoder", a generative
autoencoder in which the generative path is a convolutional autoregressive
neural network on pixels (PixelCNN) that is conditioned on a latent code, and
the recognition path uses a generative adversarial network (GAN) to impose a
prior distribution on the latent code. We show that different priors result in
different decompositions of information between the latent code and the
autoregressive decoder. For example, by imposing a Gaussian distribution as the
prior, we can achieve a global vs. local decomposition, or by imposing a
categorical distribution as the prior, we can disentangle the style and content
information of images in an unsupervised fashion. We further show how the
PixelGAN autoencoder with a categorical prior can be directly used in
semi-supervised settings and achieve competitive semi-supervised classification
results on the MNIST, SVHN and NORB datasets.
Deconvolutional Paragraph Representation Learning
Learning latent representations from long text sequences is an important
first step in many natural language processing applications. Recurrent Neural
Networks (RNNs) have become a cornerstone for this challenging task. However,
the quality of sentences during RNN-based decoding (reconstruction) decreases
with the length of the text. We propose a sequence-to-sequence, purely
convolutional and deconvolutional autoencoding framework that is free of the
above issue, while also being computationally efficient. The proposed method is
simple, easy to implement and can be leveraged as a building block for many
applications. We show empirically that compared to RNNs, our framework is
better at reconstructing and correcting long paragraphs. Quantitative
evaluation on semi-supervised text classification and summarization tasks
demonstrates the potential for better utilization of long unlabeled text data.
Comment: Accepted by NIPS 201
Learning Graph Embedding with Adversarial Training Methods
Graph embedding aims to transfer a graph into vectors to facilitate
subsequent graph analytics tasks like link prediction and graph clustering.
Most approaches to graph embedding focus on preserving the graph structure or
minimizing the reconstruction errors for graph data. They have mostly
overlooked the embedding distribution of the latent codes, which unfortunately
may lead to inferior representation in many cases. In this paper, we present a
novel adversarially regularized framework for graph embedding. By employing the
graph convolutional network as an encoder, our framework embeds the topological
information and node content into a vector representation, from which a graph
decoder is further built to reconstruct the input graph. The adversarial
training principle is applied to enforce our latent codes to match a prior
Gaussian or Uniform distribution. Based on this framework, we derive two
variants of adversarial models, the adversarially regularized graph autoencoder
(ARGA) and its variational version, adversarially regularized variational graph
autoencoder (ARVGA), to learn the graph embedding effectively. We also exploit
other potential variations of ARGA and ARVGA to get a deeper understanding of
our designs. Experimental results compared among twelve algorithms for link
prediction and twenty algorithms for graph clustering validate our solutions.
Comment: To appear in IEEE Transactions on Cybernetics. arXiv admin note:
substantial text overlap with arXiv:1802.0440