SparseGAN: Sparse Generative Adversarial Network for Text Generation
It is still a challenging task to learn a neural text generation model under
the framework of generative adversarial networks (GANs) since the entire
training process is not differentiable. The existing training strategies either
suffer from unreliable gradient estimations or imprecise sentence
representations. Inspired by the principle of sparse coding, we propose a
SparseGAN that generates semantically interpretable but sparse sentence
representations as inputs to the discriminator. The key idea is that we treat
an embedding matrix as an over-complete dictionary, and use a linear
combination of very few selected word embeddings to approximate the output
feature representation of the generator at each time step. With such
semantically rich representations, we not only reduce unnecessary noise for
efficient adversarial training, but also make the entire training process fully
differentiable. Experiments on multiple text generation datasets yield
performance improvements, especially in sequence-level metrics, such as BLEU
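The dictionary view described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the few word embeddings are selected, so the code below swaps in classic greedy matching pursuit purely as an illustration; it is not the paper's method, and unlike the approach the abstract describes it is not differentiable. The function name `matching_pursuit`, the toy embedding matrix, and the sparsity budget `k` are all assumptions.

```python
from typing import Dict, List, Tuple


def dot(a: List[float], b: List[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(
    target: List[float], dictionary: List[List[float]], k: int
) -> Tuple[Dict[int, float], List[float]]:
    """Greedily approximate `target` with at most k picks from `dictionary`.

    Returns a map {row index: weight} and the remaining residual.
    Illustrative stand-in only; the paper's selection rule is not given
    in the abstract.
    """
    residual = list(target)
    coeffs: Dict[int, float] = {}
    for _ in range(k):
        best_idx, best_score = -1, 0.0
        for i, atom in enumerate(dictionary):
            norm_sq = dot(atom, atom)
            if norm_sq == 0.0:
                continue  # skip all-zero rows
            # Correlation of the residual with the normalized row.
            score = dot(residual, atom) / norm_sq ** 0.5
            if abs(score) > abs(best_score):
                best_idx, best_score = i, score
        if best_idx < 0:
            break  # residual is orthogonal to every row
        atom = dictionary[best_idx]
        weight = dot(residual, atom) / dot(atom, atom)
        coeffs[best_idx] = coeffs.get(best_idx, 0.0) + weight
        residual = [r - weight * a for r, a in zip(residual, atom)]
    return coeffs, residual


# Hypothetical 4-word vocabulary with 3-dimensional embeddings.
embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]
generator_output = [2.0, 0.0, 0.5]  # assumed feature vector at one time step

coeffs, residual = matching_pursuit(generator_output, embeddings, k=2)
print(coeffs)  # {0: 2.0, 2: 0.5}
```

Here two of the four embeddings reconstruct the toy vector exactly, which is the sense in which the representation is sparse yet tied to concrete words.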
Semantic bottleneck for computer vision tasks
This paper introduces a novel method for the representation of images that is
semantic by nature, addressing the question of computation intelligibility in
computer vision tasks. More specifically, our proposition is to introduce what
we call a semantic bottleneck in the processing pipeline, which is a crossing
point in which the representation of the image is entirely expressed with
natural language, while retaining the efficiency of numerical representations.
We show that our approach is able to generate semantic representations that
give state-of-the-art results on semantic content-based image retrieval and
also perform very well on image classification tasks. Intelligibility is
evaluated through user centered experiments for failure detection
Semantic Source Code Models Using Identifier Embeddings
The emergence of online open source repositories in recent years has led
to an explosion in the volume of openly available source code, coupled with
metadata that relate to a variety of software development activities. As a
result, in line with recent advances in machine learning research, software
maintenance activities are switching from symbolic formal methods to
data-driven methods. In this context, the rich semantics hidden in source code
identifiers provide opportunities for building semantic representations of code
which can assist tasks of code search and reuse. To this end, we deliver in the
form of pretrained vector space models, distributed code representations for
six popular programming languages, namely, Java, Python, PHP, C, C++, and C#.
The models are produced using fastText, a state-of-the-art library for learning
word representations. Each model is trained on data from a single programming
language; the code mined for producing all models amounts to over 13,000
repositories. We indicate dissimilarities between natural language and source
code, as well as variations in coding conventions between the different
programming languages we processed. We describe how these heterogeneities
guided the data preprocessing decisions we took and the selection of the
training parameters in the released models. Finally, we propose potential
applications of the models and discuss their limitations.
Comment: 16th International Conference on Mining Software Repositories (MSR 2019): Data Showcase Trac
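One way identifiers differ from natural-language words is that they pack several words into a single token. The abstract does not spell out its preprocessing, so the sketch below is only a hypothetical illustration of one common step: splitting snake_case and camelCase identifiers into lowercase subtokens before they are written out as whitespace-separated training text. The function name `split_identifier` and the splitting rules are assumptions, not the authors' pipeline.

```python
import re
from typing import List

# Zero-width boundaries inside camelCase names:
#   lower/digit followed by upper      ("parse|HTTP")
#   upper followed by upper + lower    ("HTTP|Response")
_CAMEL_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")


def split_identifier(identifier: str) -> List[str]:
    """Split a source code identifier into lowercase subtokens.

    Hypothetical preprocessing step; requires Python 3.7+ because it
    relies on re.split accepting zero-width matches.
    """
    tokens: List[str] = []
    # First cut on underscores and any non-word characters.
    for chunk in re.split(r"[_\W]+", identifier):
        if not chunk:
            continue
        # Then cut each chunk at camelCase boundaries.
        for part in _CAMEL_BOUNDARY.split(chunk):
            if part:
                tokens.append(part.lower())
    return tokens


print(split_identifier("parseHTTPResponse_v2"))  # ['parse', 'http', 'response', 'v2']
print(split_identifier("get_user_name"))         # ['get', 'user', 'name']
```

Whether to split at all is itself a design choice: keeping whole identifiers preserves exact names for code search, while splitting lets related names share subtokens.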
Exploiting Rich Syntactic Information for Semantic Parsing with Graph-to-Sequence Model
Existing neural semantic parsers mainly utilize a sequence encoder, i.e., a
sequential LSTM, to extract word order features while neglecting other valuable
syntactic information such as dependency graphs or constituency trees. In this
paper, we first propose to use the syntactic graph to represent three
types of syntactic information, i.e., word order, dependency and constituency
features. We further employ a graph-to-sequence model to encode the syntactic
graph and decode a logical form. Experimental results on benchmark datasets
show that our model is comparable to the state-of-the-art on Jobs640, ATIS and
Geo880. Experimental results on adversarial examples demonstrate that the
robustness of the model is also improved by encoding more syntactic information.
Comment: EMNLP'1
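The idea of a single graph carrying word order, dependency, and constituency information can be sketched as a plain data structure. The abstract does not give the exact construction, so everything below is an assumption for illustration: the function name `build_syntactic_graph`, the edge labels, the flat treatment of constituents (each phrase node links directly to the words it covers), and the hand-written parse of a toy query.

```python
from typing import List, Tuple

Edge = Tuple[str, str, str]  # (source node, target node, edge label)


def build_syntactic_graph(
    tokens: List[str],
    dependencies: List[Tuple[int, int, str]],   # (head index, dependent index, relation)
    constituents: List[Tuple[str, int, int]],   # (phrase label, start, end exclusive)
) -> Tuple[List[str], List[Edge]]:
    """Merge three kinds of syntactic information into one labeled graph.

    Illustrative sketch only; not the construction used in the paper.
    """
    nodes = [f"w{i}:{tok}" for i, tok in enumerate(tokens)]
    edges: List[Edge] = []

    # 1. Word order: chain each word to its successor.
    for i in range(len(tokens) - 1):
        edges.append((nodes[i], nodes[i + 1], "next"))

    # 2. Dependency: edge from head word to dependent word.
    for head, dep, relation in dependencies:
        edges.append((nodes[head], nodes[dep], f"dep:{relation}"))

    # 3. Constituency: one extra node per phrase, linked to covered words.
    for idx, (label, start, end) in enumerate(constituents):
        phrase = f"c{idx}:{label}"
        nodes.append(phrase)
        for i in range(start, end):
            edges.append((phrase, nodes[i], "const"))

    return nodes, edges


# Hand-written toy parse (hypothetical, not taken from any dataset).
tokens = ["list", "jobs", "in", "Boston"]
dependencies = [(0, 1, "obj"), (1, 3, "nmod"), (3, 2, "case")]
constituents = [("NP", 1, 4), ("PP", 2, 4)]

nodes, edges = build_syntactic_graph(tokens, dependencies, constituents)
print(len(nodes), len(edges))  # 6 11
```

A graph encoder would then consume `nodes` and `edges` in place of the plain token sequence, which is where the extra syntactic signal enters the model.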