Feature Weight Tuning for Recursive Neural Networks
This paper addresses how a recursive neural network model can automatically
leave out useless information and emphasize important evidence; in other words,
how to perform "weight tuning" for higher-level representation acquisition. We
propose two models, Weighted Neural Network (WNN) and Binary-Expectation Neural
Network (BENN), which automatically control how much one specific unit
contributes to the higher-level representation. The proposed model can be
viewed as incorporating a more powerful compositional function for embedding
acquisition in recursive neural networks. Experimental results demonstrate the
significant improvement over standard neural models
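As a rough illustration of the gating idea, here is a minimal numpy sketch of a
composition step in which every child unit is scaled by a learned weight before
being combined; the matrices W and G and the sigmoid gating are assumptions for
illustration, not the paper's exact WNN/BENN parameterization.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def weighted_compose(left, right, W, G):
        # Gate each child unit before the usual recursive composition, so
        # uninformative units contribute less to the parent representation.
        children = np.concatenate([left, right])    # (2d,)
        gates = sigmoid(G @ children)               # per-unit weights in (0, 1)
        return np.tanh(W @ (gates * children))      # gated composition

    d = 4
    rng = np.random.default_rng(0)
    W = rng.normal(size=(d, 2 * d)) * 0.1
    G = rng.normal(size=(2 * d, 2 * d)) * 0.1
    parent = weighted_compose(rng.normal(size=d), rng.normal(size=d), W, G)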
Augmenting Compositional Models for Knowledge Base Completion Using Gradient Representations
Neural models of Knowledge Base data have typically employed compositional
representations of graph objects: entity and relation embeddings are
systematically combined to evaluate the truth of a candidate Knowledge Base
entry. Using a model inspired by Harmonic Grammar, we propose to tokenize
triplet embeddings by subjecting them to a process of optimization with respect
to learned well-formedness conditions on Knowledge Base triplets. The resulting
model, known as Gradient Graphs, leads to sizable improvements when implemented
as a companion to compositional models. Also, we show that the
"supracompositional" triplet token embeddings it produces have interpretable
properties that prove helpful in performing inference on the resulting triplet
representations.
Comment: 10 pages, 2 figures. To appear in Proceedings of the Society for
Computation in Linguistics (SCIL 2019).
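A minimal sketch of the optimization step described above: a compositional
triplet embedding is refined by gradient ascent on a learned well-formedness
(harmony) score, and the refined vector serves as the "token" embedding. The
quadratic harmony function, the update rule, and the normalization are
illustrative assumptions, not the exact Gradient Graphs model.

    import numpy as np

    def harmony(z, H):
        # Quadratic well-formedness score in Harmonic Grammar style;
        # H is a hypothetical learned matrix.
        return z @ H @ z

    def supracompositional(z0, H, steps=10, lr=0.1):
        # Refine a compositional triplet embedding by gradient ascent
        # on harmony; the result is the triplet's token embedding.
        z = z0.copy()
        for _ in range(steps):
            z = z + lr * (H + H.T) @ z   # gradient of z^T H z
            z = z / np.linalg.norm(z)    # keep the embedding bounded
        return z

    rng = np.random.default_rng(0)
    H = rng.normal(size=(6, 6)) * 0.1
    z_token = supracompositional(rng.normal(size=6), H)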
Learning Phrase Embeddings from Paraphrases with GRUs
Learning phrase representations has been widely explored in many Natural
Language Processing (NLP) tasks (e.g., Sentiment Analysis, Machine Translation)
and has shown promising improvements. Previous studies either learn
non-compositional phrase representations with general word embedding learning
techniques or learn compositional phrase representations based on syntactic
structures, which either require huge amounts of human annotations or cannot be
easily generalized to all phrases. In this work, we propose to take advantage
of a large-scale paraphrase database and present a pairwise gated recurrent
units (pairwise-GRU) framework to generate compositional phrase
representations. Our framework can be re-used to generate representations for
any phrase. Experimental results show that our framework achieves
state-of-the-art results on several phrase similarity tasks.
Comment: IJCNLP'2017 Workshop on Curation and Applications of Parallel and
Comparable Corpora.
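For concreteness, a minimal sketch of the encoding side: a shared GRU maps each
phrase of a paraphrase pair to a vector, and a pairwise loss pulls the two
vectors together. Parameter names and the squared-error loss are assumptions;
the paper's pairwise-GRU framework differs in detail.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def gru_encode(embs, P):
        # Single-layer GRU over word embeddings; the final hidden state
        # serves as the phrase representation.
        h = np.zeros(P["Wz"].shape[0])
        for x in embs:
            xh = np.concatenate([x, h])
            z = sigmoid(P["Wz"] @ xh)     # update gate
            r = sigmoid(P["Wr"] @ xh)     # reset gate
            h_cand = np.tanh(P["Wh"] @ np.concatenate([x, r * h]))
            h = (1 - z) * h + z * h_cand
        return h

    def pair_loss(h1, h2):
        # Pull the encodings of a paraphrase pair together.
        return np.sum((h1 - h2) ** 2)

    rng = np.random.default_rng(0)
    dx, dh = 5, 4
    P = {k: rng.normal(size=(dh, dx + dh)) * 0.1 for k in ("Wz", "Wr", "Wh")}
    loss = pair_loss(gru_encode(rng.normal(size=(3, dx)), P),
                     gru_encode(rng.normal(size=(4, dx)), P))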
Neural Lattice Language Models
In this work, we propose a new language modeling paradigm that has the
ability to perform both prediction and moderation of information flow at
multiple granularities: neural lattice language models. These models construct
a lattice of possible paths through a sentence and marginalize across this
lattice to calculate sequence probabilities or optimize parameters. This
approach allows us to seamlessly incorporate linguistic intuitions, including
polysemy and the existence of multi-word lexical items, into our language model.
Experiments on multiple language modeling tasks show that English neural
lattice language models that utilize polysemous embeddings are able to improve
perplexity by 9.95% relative to a word-level baseline, and that a Chinese model
that handles multi-character tokens is able to improve perplexity by 20.94%
relative to a character-level baseline.
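The marginalization itself can be sketched with a standard forward pass over
the lattice; the edge scores below are illustrative numbers standing in for the
model's token probabilities.

    import numpy as np

    def lattice_logprob(num_nodes, edges):
        # Forward pass over a lattice: sum the probabilities of all paths
        # from the first node to the last. Nodes are topologically
        # ordered, i.e. every edge (src, dst, log_prob) has src < dst.
        alpha = np.full(num_nodes, -np.inf)
        alpha[0] = 0.0
        for src, dst, lp in sorted(edges):
            alpha[dst] = np.logaddexp(alpha[dst], alpha[src] + lp)
        return alpha[-1]

    # Two paths through a 3-node lattice: two single tokens (0->1->2)
    # versus one multi-word token spanning 0->2.
    edges = [(0, 1, np.log(0.5)), (1, 2, np.log(0.4)), (0, 2, np.log(0.1))]
    total = lattice_logprob(3, edges)   # log(0.5 * 0.4 + 0.1)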
Learning Chinese Word Representations From Glyphs Of Characters
In this paper, we propose new methods to learn Chinese word representations.
Chinese characters are composed of graphical components, which carry rich
semantics. It is common for a Chinese learner to comprehend the meaning of a
word from these graphical components. As a result, we propose models that
enhance word representations by character glyphs. The character glyph features
are directly learned from the bitmaps of characters by a convolutional
auto-encoder (convAE), and the glyph features improve Chinese word
representations which are already enhanced by character embeddings. Another
contribution of this paper is that we created several evaluation datasets in
traditional Chinese and made them public.
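A minimal sketch of the combination described above, with a linear encoder
standing in for the convolutional auto-encoder (the real glyph features are
learned by reconstructing the bitmap; all shapes here are illustrative).

    import numpy as np

    def glyph_features(bitmap, W_enc):
        # Stand-in for the convolutional auto-encoder: a single linear
        # layer over the flattened glyph bitmap.
        return np.tanh(W_enc @ bitmap.reshape(-1))

    def word_representation(word_emb, char_embs, glyph_bitmaps, W_enc):
        # Enhance a word embedding with averaged character embeddings
        # and averaged glyph features of the word's characters.
        char_part = np.mean(char_embs, axis=0)
        glyph_part = np.mean(
            [glyph_features(b, W_enc) for b in glyph_bitmaps], axis=0)
        return np.concatenate([word_emb, char_part, glyph_part])

    rng = np.random.default_rng(0)
    W_enc = rng.normal(size=(8, 24 * 24)) * 0.01
    rep = word_representation(rng.normal(size=16),
                              rng.normal(size=(2, 8)),
                              [rng.random((24, 24)) for _ in range(2)],
                              W_enc)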
A Compositional Approach to Network Algorithms
We present elements of a typing theory for flow networks, where "types",
"typings", and "type inference" are formulated in terms of familiar notions
from polyhedral analysis and convex optimization. Based on this typing theory,
we develop an alternative approach to the design and analysis of network
algorithms, which we illustrate by applying it to the max-flow problem in
multiple-source, multiple-sink, capacitated directed planar graphs.
Comment: 44 pages, 12 figures, 47 references.
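In rough terms, the "type" of a network can be read as the polytope of boundary
flows it admits; the following is a simplified formulation in our own notation,
not the paper's exact definitions.

    % "Type" of a flow network N: the polytope of flows on its boundary
    % (source/sink) arcs that extend to a feasible flow inside N, given
    % capacities c(a) and conservation at internal vertices v.
    \[
      T(N) \;=\; \Bigl\{\, f|_{\partial N} \;:\;
        0 \le f(a) \le c(a)\ \forall a,\ \
        \sum_{a \in \mathrm{in}(v)} f(a) \;=\; \sum_{a \in \mathrm{out}(v)} f(a)
        \ \forall v \,\Bigr\}
    \]
    % Composing two networks intersects their typings along shared
    % boundary arcs, which is what makes the analysis compositional.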
Learning Compositional Representations for Few-Shot Recognition
One of the key limitations of modern deep learning approaches lies in the
amount of data required to train them. Humans, by contrast, can learn to
recognize novel categories from just a few examples. Instrumental to this rapid
learning ability is the compositional structure of concept representations in
the human brain, something that deep learning models lack. In this
work, we make a step towards bridging this gap between human and machine
learning by introducing a simple regularization technique that allows the
learned representation to be decomposable into parts. Our method uses
category-level attribute annotations to disentangle the feature space of a
network into subspaces corresponding to the attributes. These attributes can be
either purely visual, like object parts, or more abstract, like openness and
symmetry. We demonstrate the value of compositional representations on three
datasets: CUB-200-2011, SUN397, and ImageNet, and show that they require fewer
examples to learn classifiers for novel categories.
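A minimal sketch of what "decomposable into parts" could look like as a
regularizer: the feature vector is split into one subspace per attribute, and
subspaces of absent attributes are pushed toward zero. This is an illustrative
assumption, not the paper's exact penalty.

    import numpy as np

    def compositional_penalty(features, attr_present):
        # Split the feature vector into one subspace per attribute
        # (feature length must divide evenly) and penalize activation
        # in subspaces whose attribute is absent.
        subspaces = np.split(features, len(attr_present))
        penalty = 0.0
        for sub, present in zip(subspaces, attr_present):
            if not present:
                penalty += np.sum(sub ** 2)  # absent attribute -> silent subspace
        return penalty

    feats = np.random.default_rng(0).normal(size=12)
    loss_reg = compositional_penalty(feats, [True, False, True])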
Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms
Many deep learning architectures have been proposed to model the
compositionality in text sequences, requiring a substantial number of
parameters and expensive computations. However, there has not been a rigorous
evaluation regarding the added value of sophisticated compositional functions.
In this paper, we conduct a point-by-point comparative study of Simple
Word-Embedding-based Models (SWEMs), consisting of parameter-free pooling
operations, relative to word-embedding-based RNN/CNN models. Surprisingly,
SWEMs exhibit comparable or even superior performance in the majority of cases
considered. Based upon this understanding, we propose two additional pooling
strategies over learned word embeddings: (i) a max-pooling operation for
improved interpretability; and (ii) a hierarchical pooling operation, which
preserves spatial (n-gram) information within text sequences. We present
experiments on 17 datasets encompassing three tasks: (i) (long) document
classification; (ii) text sequence matching; and (iii) short text tasks,
including classification and tagging. The source code and datasets can be
obtained from https://github.com/dinghanshen/SWEM.
Comment: To appear at ACL 2018 (code: https://github.com/dinghanshen/SWEM).
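The two proposed pooling operations are simple enough to sketch directly from
the description above; the window size and array shapes are illustrative.

    import numpy as np

    def swem_max(embs):
        # Max pooling: each feature keeps its most salient value across
        # the sequence, which aids interpretability.
        return embs.max(axis=0)

    def swem_hier(embs, n=3):
        # Hierarchical pooling: average within each n-gram window, then
        # max-pool over windows, preserving local word-order information.
        windows = [embs[i:i + n].mean(axis=0)
                   for i in range(len(embs) - n + 1)]  # needs len(embs) >= n
        return np.max(windows, axis=0)

    embs = np.random.default_rng(0).normal(size=(7, 5))  # 7 words, dim 5
    doc_max, doc_hier = swem_max(embs), swem_hier(embs)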
Task-Driven Modular Networks for Zero-Shot Compositional Learning
One of the hallmarks of human intelligence is the ability to compose learned
knowledge into novel concepts which can be recognized without a single training
example. In contrast, current state-of-the-art methods require hundreds of
training examples for each possible category to build reliable and accurate
classifiers. To alleviate this striking difference in efficiency, we propose a
task-driven modular architecture for compositional reasoning and sample
efficient learning. Our architecture consists of a set of neural network
modules, which are small fully connected layers operating in semantic concept
space. These modules are configured through a gating function conditioned on
the task to produce features representing the compatibility between the input
image and the concept under consideration. This enables us to express tasks as
a combination of sub-tasks and to generalize to unseen categories by
reweighting a set of small modules. Furthermore, the network can be trained
efficiently as it is fully differentiable and its modules operate on small
sub-spaces. We focus our study on the problem of compositional zero-shot
classification of object-attribute categories. We show in our experiments that
current evaluation metrics are flawed as they only consider unseen
object-attribute pairs. When extending the evaluation to the generalized
setting which accounts also for pairs seen during training, we discover that
naive baseline methods perform similarly or better than current approaches.
However, our modular network is able to outperform all existing approaches on
two widely-used benchmark datasets.
Comment: http://www.cs.cmu.edu/~spurushw/projects/compositional.htm
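A minimal sketch of the gating idea: a task embedding selects a soft mixture
over small modules, whose reweighted outputs yield a compatibility score. All
shapes and the softmax gate are illustrative assumptions, not the paper's exact
architecture.

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def modular_score(img_feat, task_emb, modules, W_gate):
        # Gate a set of small modules by the task embedding and combine
        # their reweighted outputs into a compatibility score.
        gates = softmax(W_gate @ task_emb)           # one weight per module
        outs = np.stack([np.tanh(M @ img_feat) for M in modules])
        return float((gates[:, None] * outs).sum())  # scalar compatibility

    rng = np.random.default_rng(0)
    d, k, m, t = 16, 8, 4, 6
    modules = [rng.normal(size=(k, d)) * 0.1 for _ in range(m)]
    W_gate = rng.normal(size=(m, t)) * 0.1
    score = modular_score(rng.normal(size=d), rng.normal(size=t),
                          modules, W_gate)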
Neural Enquirer: Learning to Query Tables with Natural Language
We propose Neural Enquirer as a neural network architecture to execute a
natural language (NL) query on a knowledge-base (KB) for answers. Basically,
Neural Enquirer finds the distributed representation of a query and then
executes it on knowledge-base tables to obtain the answer as one of the values
in the tables. Unlike similar efforts in end-to-end training of semantic
parsers, Neural Enquirer is fully "neuralized": it not only gives
distributed representations of the query and the knowledge-base, but also
realizes the execution of compositional queries as a series of differentiable
operations, with intermediate results (consisting of annotations of the tables
at different levels) saved on multiple layers of memory. Neural Enquirer can be
trained with gradient descent, with which not only the parameters of the
controlling components and semantic parsing component, but also the embeddings
of the tables and query words can be learned from scratch. The training can be
done in an end-to-end fashion, but it can also take stronger guidance, e.g.,
step-by-step supervision for complicated queries, and benefit from it. Neural
Enquirer is one step towards building neural network systems which seek to
understand language by executing it on the real world. Our experiments show that
Neural Enquirer can learn to execute fairly complicated NL queries on tables
with rich structures.
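As a much-simplified sketch of one differentiable execution step (far short of
the paper's multi-layer memory and annotations): attend over columns with the
query, score rows by the attended values, and read out a soft average row. All
names and shapes are illustrative.

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def soft_select(table, query, col_embs):
        # One differentiable execution step over a numeric table.
        col_att = softmax(col_embs @ query)   # soft column choice
        row_scores = table @ col_att          # relevance of each row
        row_att = softmax(row_scores)         # soft row choice
        return row_att @ table                # expected (soft-selected) row

    rng = np.random.default_rng(0)
    table = rng.random((5, 3))                # 5 rows, 3 numeric columns
    result = soft_select(table, rng.normal(size=4), rng.normal(size=(3, 4)))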