Neural Random Projections for Language Modelling
Neural network-based language models deal with data sparsity problems by
mapping the large discrete space of words into a smaller continuous space of
real-valued vectors. By learning distributed vector representations for words,
each training sample informs the neural network model about a combinatorial
number of other patterns. In this paper, we exploit the sparsity in natural
language even further by encoding each unique input word using a fixed sparse
random representation. These sparse codes are then projected onto a smaller
embedding space which allows for the encoding of word occurrences from a
possibly unknown vocabulary, along with the creation of more compact language
models using a reduced number of parameters. We investigate the properties of
our encoding mechanism empirically, by evaluating its performance on the widely
used Penn Treebank corpus. We show that guaranteeing approximately equidistant
(nearly orthogonal) vector representations for unique discrete inputs provides
the neural network model with enough information to learn, and make use of,
distributed representations for these inputs.
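The encoding mechanism can be sketched as follows. This is a toy numpy illustration only: the hash-derived seeds, the dimensions, and the random (rather than trained) projection matrix are assumptions for the sake of a self-contained example, not the paper's exact setup.

```python
import hashlib
import numpy as np

def sparse_random_code(word, dim=1000, nnz=10):
    """Fixed sparse ternary code for a word: nnz entries of +/-1, rest zero.
    Seeding the RNG from a hash of the word means codes for previously
    unseen vocabulary items can be generated on the fly, without a table."""
    seed = int(hashlib.md5(word.encode()).hexdigest()[:8], 16)
    rng = np.random.default_rng(seed)
    code = np.zeros(dim)
    idx = rng.choice(dim, size=nnz, replace=False)  # nnz distinct positions
    code[idx] = rng.choice([-1.0, 1.0], size=nnz)   # random signs
    return code

# Sparse random codes for distinct words are nearly orthogonal with high
# probability, which is the property the model relies on.
a, b = sparse_random_code("cat"), sparse_random_code("dog")
cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# A projection (learned by backprop in the paper; random here) maps the
# high-dimensional sparse code into a compact embedding space.
W = np.random.default_rng(0).normal(scale=0.1, size=(1000, 64))
embedding = sparse_random_code("cat") @ W
```

The fixed codes replace a one-hot lookup table, so the parameter count no longer scales with the vocabulary size.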
Hierarchical Text Generation and Planning for Strategic Dialogue
End-to-end models for goal-orientated dialogue are challenging to train,
because linguistic and strategic aspects are entangled in latent state vectors.
We introduce an approach to learning representations of messages in dialogues
by maximizing the likelihood of subsequent sentences and actions, which
decouples the semantics of the dialogue utterance from its linguistic
realization. We then use these latent sentence representations for hierarchical
language generation, planning and reinforcement learning. Experiments show that
our approach increases the end-task reward achieved by the model, improves the
effectiveness of long-term planning using rollouts, and allows self-play
reinforcement learning to improve decision making without diverging from human
language. Our hierarchical latent-variable model outperforms previous work both
linguistically and strategically.
Semi-Amortized Variational Autoencoders
Amortized variational inference (AVI) replaces instance-specific local
inference with a global inference network. While AVI has enabled efficient
training of deep generative models such as variational autoencoders (VAE),
recent empirical work suggests that inference networks can produce suboptimal
variational parameters. We propose a hybrid approach: use AVI to initialize
the variational parameters, then run stochastic variational inference (SVI) to
refine them. Crucially, the local SVI procedure is itself differentiable, so
the inference network and generative model can be trained end-to-end with
gradient-based optimization. This semi-amortized approach enables the use of
rich generative models without experiencing the posterior-collapse phenomenon
common in training VAEs for problems like text generation. Experiments show
this approach outperforms strong autoregressive and variational baselines on
standard text and image datasets.
Comment: ICML 201
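The amortize-then-refine idea can be sketched on a toy linear-Gaussian model where the ELBO is analytic. This is a hypothetical minimal example (prior z ~ N(0,1), likelihood x|z ~ N(theta*z, 1), Gaussian q); the paper applies the same refinement to deep generative models and backpropagates through the SVI steps.

```python
import numpy as np

def elbo(mu, sigma, x, theta):
    """Analytic ELBO for the toy model with q(z) = N(mu, sigma^2)."""
    recon = -0.5 * np.log(2 * np.pi) \
            - 0.5 * ((x - theta * mu) ** 2 + theta ** 2 * sigma ** 2)
    kl = 0.5 * (mu ** 2 + sigma ** 2 - 1.0 - np.log(sigma ** 2))
    return recon - kl

def svi_refine(mu0, x, theta, lr=0.1, steps=20):
    """Local SVI step of semi-amortized VI: gradient ascent on the ELBO,
    starting from the (amortized) inference-network output mu0."""
    mu = mu0
    for _ in range(steps):
        grad = theta * (x - theta * mu) - mu   # d ELBO / d mu
        mu += lr * grad
    return mu

x, theta = 2.0, 1.0
mu_amortized = 0.3                 # stand-in for an inference-network output
mu_refined = svi_refine(mu_amortized, x, theta)
# Refinement moves mu toward the optimum theta*x / (1 + theta^2) = 1.0.
```

In the full method this refinement loop is itself differentiated, so the inference network learns to produce good starting points.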
Strongly-Typed Recurrent Neural Networks
Recurrent neural networks are increasingly popular models for sequential
learning. Unfortunately, although the most effective RNN architectures are
perhaps excessively complicated, extensive searches have not found simpler
alternatives. This paper imports ideas from physics and functional programming
into RNN design to provide guiding principles. From physics, we introduce type
constraints, analogous to the constraints that forbid adding meters to
seconds. From functional programming, we require that strongly-typed
architectures factorize into stateless learnware and state-dependent firmware,
reducing the impact of side-effects. The features learned by strongly-typed
nets have a simple semantic interpretation via dynamic average-pooling on
one-dimensional convolutions. We also show that strongly-typed gradients are
better behaved than in classical architectures, and characterize the
representational power of strongly-typed nets. Finally, experiments show that,
despite being more constrained, strongly-typed architectures achieve lower
training and comparable generalization error to classical architectures.
Comment: 10 pages, final version, ICML 201
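The dynamic-average-pooling interpretation above can be sketched with a minimal typed recurrence. This is a hypothetical simplified variant for illustration (the paper defines several typed architectures): both the candidate features and the gate depend only on the current input, so the learned weights are "stateless learnware" and all state handling lives in the convex-combination update.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def typed_rnn(xs, W, V, b):
    """Minimal strongly-typed recurrent cell (illustrative).
    z_t and f_t are functions of x_t alone; the state update is dynamic
    average pooling: a gated convex combination of old state and z_t."""
    h = np.zeros(W.shape[0])
    for x in xs:
        z = W @ x                      # candidate features: depends on x_t only
        f = sigmoid(V @ x + b)         # gate: also depends on x_t only
        h = f * h + (1.0 - f) * z      # dynamic average pooling over time
    return h

rng = np.random.default_rng(0)
xs = rng.normal(size=(5, 3))           # a sequence of 5 three-dim inputs
W, V, b = rng.normal(size=(4, 3)), rng.normal(size=(4, 3)), np.zeros(4)
h = typed_rnn(xs, W, V, b)
```

Because the state is always a convex combination of candidate features, each coordinate of h stays bounded by the largest candidate seen, one reason typed gradients are better behaved.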
dpUGC: Learn Differentially Private Representation for User Generated Contents
This paper first proposes a simple yet efficient generalized approach to
apply differential privacy to text representation (i.e., word embedding). Based
on it, we propose a user-level approach to learn personalized differentially
private word embedding model on user generated contents (UGC). To the best of
our knowledge, this is the first work to learn a user-level differentially
private word embedding model from text for sharing. The proposed approaches
protect the privacy of the individual from re-identification and, in
particular, provide a better trade-off between privacy and data utility on UGC
data for sharing. The experimental
results show that the trained embedding models are applicable for the classic
text analysis tasks (e.g., regression). Moreover, the proposed approaches of
learning differentially private embedding models are both framework- and
data-independent, which facilitates deployment and sharing. The source code is
available at https://github.com/sonvx/dpText
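As a generic illustration of applying differential privacy to an embedding matrix, the textbook Laplace mechanism can be sketched as below. This is an assumption-laden toy, not the paper's user-level mechanism: it simply perturbs each coordinate with noise scaled by sensitivity/epsilon, making the released matrix noisier (more private) as epsilon shrinks.

```python
import numpy as np

def privatize_embeddings(E, epsilon, sensitivity=1.0, rng=None):
    """Release a noisy copy of an embedding matrix via the Laplace mechanism.

    Illustrative sketch only: per-coordinate Laplace noise with scale
    sensitivity / epsilon.  Smaller epsilon => larger noise => stronger
    privacy but lower downstream utility.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon, size=E.shape)
    return E + noise

E = np.random.default_rng(1).normal(size=(100, 16))  # toy "trained" embeddings
E_private = privatize_embeddings(E, epsilon=2.0)     # shareable noisy copy
```

The privacy/utility trade-off the abstract mentions is then explored by varying epsilon and measuring performance on the downstream task (e.g. regression).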
Generative Adversarial Networks: An Overview
Generative adversarial networks (GANs) provide a way to learn deep
representations without extensively annotated training data. They achieve this
through deriving backpropagation signals through a competitive process
involving a pair of networks. The representations that can be learned by GANs
may be used in a variety of applications, including image synthesis, semantic
image editing, style transfer, image super-resolution and classification. The
aim of this review paper is to provide an overview of GANs for the signal
processing community, drawing on familiar analogies and concepts where
possible. In addition to identifying different methods for training and
constructing GANs, we also point to remaining challenges in their theory and
application.
Comment: Accepted in the IEEE Signal Processing Magazine Special Issue on Deep
Learning for Visual Understanding
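The competitive process can be made concrete with the original GAN objectives. The sketch below computes the standard discriminator loss and the common non-saturating generator loss from discriminator outputs, and checks the well-known equilibrium where the discriminator outputs 0.5 everywhere.

```python
import numpy as np

def gan_losses(d_real, d_fake):
    """Losses of the two-player GAN game.

    d_real / d_fake: discriminator probabilities on real and generated
    samples.  The discriminator minimizes d_loss; the generator minimizes
    the non-saturating loss g_loss (maximize log D(G(z))).
    """
    d_loss = -np.mean(np.log(d_real) + np.log(1.0 - d_fake))
    g_loss = -np.mean(np.log(d_fake))
    return d_loss, g_loss

# At the equilibrium of the original minimax game the discriminator is
# maximally confused: D(x) = 0.5 everywhere.
d_loss, g_loss = gan_losses(np.full(8, 0.5), np.full(8, 0.5))
# d_loss = 2*log 2, g_loss = log 2
```

Both networks receive their backpropagation signal from these losses, which is the "competitive process" the review surveys.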
Emulating malware authors for proactive protection using GANs over a distributed image visualization of dynamic file behavior
Malware authors have always had the advantage of being able to
adversarially test and augment their malicious code, before deploying the
payload, using anti-malware products at their disposal. The anti-malware
developers and threat experts, on the other hand, do not have such a privilege
of tuning anti-malware products against zero-day attacks proactively. This
allows the malware authors to stay a step ahead of the anti-malware products,
fundamentally biasing the cat and mouse game played by the two parties. In this
paper, we propose a way that would enable machine learning based threat
prevention models to bridge that gap by being able to tune against a deep
generative adversarial network (GAN), which takes up the role of a malware
author and generates new types of malware. The GAN is trained over a reversible
distributed RGB image representation of known malware behaviors, encoding the
sequence of API call ngrams and the corresponding term frequencies. The
generated images represent synthetic malware that can be decoded back to the
underlying API call sequence information. The image representation is not only
demonstrated as a general technique of incorporating necessary priors for
exploiting convolutional neural network architectures for generative or
discriminative modeling, but also as a visualization method for easy manual
software or malware categorization, by having individual API ngram information
distributed across the image space. In addition, we also propose using
smart-definitions for detecting malware based on perceptual hashing of these
images. Such hashes are potentially more effective than cryptographic hashes
that do not carry any meaningful similarity metric, and hence, do not
generalize well.
Comment: 22 pages, 12 figures, 4 tables
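The idea of laying API-call n-gram frequencies out across an image can be sketched as follows. This is a toy, hypothetical layout (hash-based pixel placement, grayscale, tiny image), not the paper's exact RGB encoding: each distinct bigram maps to a fixed pixel and the pixel intensity stores its term frequency.

```python
import hashlib
import numpy as np

def bigrams(calls):
    return [tuple(calls[i:i + 2]) for i in range(len(calls) - 1)]

def encode_image(calls, side=8):
    """Toy image encoding of API-call bigram counts.

    Each bigram hashes to a deterministic pixel, so the same behavior always
    produces the same image; with a collision-free layout the mapping is
    reversible back to (bigram, frequency) pairs.
    """
    img = np.zeros((side, side), dtype=np.uint8)
    for g in bigrams(calls):
        h = int(hashlib.md5("|".join(g).encode()).hexdigest(), 16)
        r, c = (h // side) % side, h % side   # fixed pixel for this bigram
        img[r, c] += 1                        # intensity = term frequency
    return img

calls = ["CreateFile", "WriteFile", "CreateFile", "WriteFile", "CloseHandle"]
img = encode_image(calls)
```

Distributing individual n-grams to fixed spatial positions is what gives convolutional generative/discriminative models a meaningful prior to exploit, and makes visual comparison (and perceptual hashing) of samples possible.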
Learning Universal Sentence Representations with Mean-Max Attention Autoencoder
In order to learn universal sentence representations, previous methods focus
on complex recurrent neural networks or supervised learning. In this paper, we
propose a mean-max attention autoencoder (mean-max AAE) within the
encoder-decoder framework. Our autoencoder relies entirely on the multi-head
self-attention mechanism to reconstruct the input sequence. In the encoding we
propose a mean-max strategy that applies both mean and max pooling operations
over the hidden vectors to capture diverse information of the input. To enable
the information to steer the reconstruction process dynamically, the decoder
performs attention over the mean-max representation. By training our model on a
large collection of unlabelled data, we obtain high-quality representations of
sentences. Experimental results on a broad range of 10 transfer tasks
demonstrate that our model outperforms the state-of-the-art unsupervised single
methods, including the classical skip-thoughts and the advanced
skip-thoughts+LN model. Furthermore, compared with the traditional recurrent
neural network, our mean-max AAE greatly reduces the training time.
Comment: EMNLP 201
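The mean-max strategy itself is straightforward to sketch: pool the encoder's hidden vectors with both mean and max, and concatenate. (Minimal numpy illustration; the surrounding self-attention encoder and decoder are omitted.)

```python
import numpy as np

def mean_max_pool(H):
    """Mean-max sentence representation.

    H: hidden vectors of shape (seq_len, hidden_dim).  Mean pooling keeps
    aggregate information, max pooling keeps the most salient feature per
    dimension; concatenating both captures more diverse information than
    either alone.
    """
    return np.concatenate([H.mean(axis=0), H.max(axis=0)])

H = np.array([[1.0, -2.0],
              [3.0,  0.0],
              [2.0,  4.0]])
rep = mean_max_pool(H)   # first half = means, second half = per-dim maxima
```

The decoder then attends over this fixed-size representation while reconstructing the input sequence.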
Creative Procedural-Knowledge Extraction From Web Design Tutorials
Complex design tasks often require performing diverse actions in a specific
order. To (semi-)autonomously accomplish these tasks, applications need to
understand and learn a wide range of design procedures, i.e., Creative
Procedural-Knowledge (CPK). Prior knowledge base construction and mining have
not typically addressed the creative fields, such as design and arts. In this
paper, we formalize an ontology of CPK using five components: goal, workflow,
action, command and usage; and extract components' values from online design
tutorials. We scraped 19.6K tutorial-related webpages and built a web
application for professional designers to identify and summarize CPK
components. The annotated dataset consists of 819 unique commands, 47,491
actions, and 2,022 workflows and goals. Based on this dataset, we propose a
general CPK extraction pipeline and demonstrate that existing text
classification and sequence-to-sequence models are limited in identifying,
predicting and summarizing complex operations described in heterogeneous
styles. Through quantitative and qualitative error analysis, we discuss CPK
extraction challenges that need to be addressed by future research.
Learning a bidirectional mapping between human whole-body motion and natural language using deep recurrent neural networks
Linking human whole-body motion and natural language is of great interest for
the generation of semantic representations of observed human behaviors as well
as for the generation of robot behaviors based on natural language input. While
there has been a large body of research in this area, most approaches that
exist today require a symbolic representation of motions (e.g. in the form of
motion primitives), which has to be defined a priori or requires complex
segmentation algorithms. In contrast, recent advances in the field of neural
networks and especially deep learning have demonstrated that sub-symbolic
representations that can be learned end-to-end usually outperform more
traditional approaches, for applications such as machine translation. In this
paper we propose a generative model that learns a bidirectional mapping between
human whole-body motion and natural language using deep recurrent neural
networks (RNNs) and sequence-to-sequence learning. Our approach does not
require any segmentation or manual feature engineering and learns a distributed
representation, which is shared for all motions and descriptions. We evaluate
our approach on 2,846 human whole-body motions and 6,187 natural language
descriptions thereof from the KIT Motion-Language Dataset. Our results clearly
demonstrate the effectiveness of the proposed model: We show that our model
generates a wide variety of realistic motions from descriptions in the form of
a single sentence alone. Conversely, our model is also capable of generating
correct and detailed natural language descriptions from human motions.
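The sequence-to-sequence structure with a shared distributed representation can be sketched as follows. This is a toy, untrained numpy skeleton (dimensions, the simplistic decoder recurrence, and random weights are assumptions): an RNN encoder compresses one modality into a single latent vector, which a decoder unrolls into the other modality.

```python
import numpy as np

def encode(seq, W_in, W_h):
    """Toy RNN encoder: compress a sequence (motion frames or word vectors)
    into one shared distributed representation (the final hidden state)."""
    h = np.zeros(W_h.shape[0])
    for x in seq:
        h = np.tanh(W_in @ x + W_h @ h)
    return h

def decode(h, W_out, steps):
    """Toy recurrent decoder: unroll the shared representation back into a
    sequence in the other modality (simplified for illustration)."""
    out, y = [], np.zeros(W_out.shape[0])
    for _ in range(steps):
        y = np.tanh(W_out @ h + y)
        out.append(y)
    return np.stack(out)

rng = np.random.default_rng(0)
motion = rng.normal(size=(10, 6))            # 10 frames, 6 joint features
W_in, W_h = rng.normal(size=(8, 6)), rng.normal(size=(8, 8))
W_out = rng.normal(size=(5, 8))              # decode into 5-dim word vectors
h = encode(motion, W_in, W_h)                # shared latent representation
words = decode(h, W_out, steps=4)
```

Because the latent vector is shared, the same architecture can be trained in both directions: motion-to-language and language-to-motion, without manual segmentation or feature engineering.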