Exploiting Multi-typed Treebanks for Parsing with Deep Multi-task Learning
Various treebanks have been released for dependency parsing. Although these
treebanks may belong to different languages or follow different annotation
schemes, they contain syntactic knowledge with the potential to benefit one
another. This paper presents a universal framework for exploiting such
multi-typed treebanks to improve parsing with deep multi-task learning. We
consider two kinds of source treebanks: multilingual universal treebanks
and monolingual heterogeneous treebanks. Multiple treebanks are trained
jointly and interact through multi-level parameter sharing. Experiments on
several benchmark datasets in various languages demonstrate that our approach
makes effective use of arbitrary source treebanks to improve target parsing
models.
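As an illustration of multi-level parameter sharing, here is a minimal sketch (our own, not the authors' code): an embedding layer and sequence encoder are shared across all treebanks, while each treebank keeps its own scoring head. Class, parameter names, and dimensions are hypothetical.

```python
import torch
import torch.nn as nn

class MultiTreebankParser(nn.Module):
    """Sketch of multi-level sharing: embeddings and the encoder are
    shared across treebanks; each treebank has its own arc-scoring head."""
    def __init__(self, vocab_size, n_treebanks, emb_dim=100, hid_dim=200):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)       # shared level 1
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True,
                               bidirectional=True)           # shared level 2
        self.heads = nn.ModuleList(                          # treebank-specific level
            [nn.Linear(2 * hid_dim, 2 * hid_dim) for _ in range(n_treebanks)])

    def forward(self, tokens, treebank_id):
        h, _ = self.encoder(self.embed(tokens))              # (batch, len, 2*hid)
        s = self.heads[treebank_id](h)                       # per-treebank projection
        return s @ h.transpose(1, 2)                         # arc scores: (batch, len, len)
```

During joint training, a batch from any treebank updates the shared embedding and encoder but only its own head, which is how knowledge transfers across treebanks while annotation-scheme specifics stay separate.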
Deep Temporal Sigmoid Belief Networks for Sequence Modeling
Deep dynamic generative models are developed to learn sequential dependencies
in time-series data. The multi-layered model is designed by constructing a
hierarchy of temporal sigmoid belief networks (TSBNs), defined as a sequential
stack of sigmoid belief networks (SBNs). Each SBN has a contextual hidden
state, inherited from the preceding SBNs in the sequence, which is used to
regulate its hidden bias. Scalable learning and inference algorithms are
derived by introducing a recognition model that yields fast sampling from the
variational posterior. This recognition model is trained jointly with the
generative model, by maximizing its variational lower bound on the
log-likelihood. Experimental results on bouncing balls, polyphonic music,
motion capture, and text streams show that the proposed approach achieves
state-of-the-art predictive performance, and has the capacity to synthesize
various sequences.
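Schematically, one TSBN layer can be written as follows (our notation, not the paper's exact parameterization): the previous step's hidden and visible states shift the sigmoid biases of the current step's binary units.

```latex
p(h_{j,t} = 1 \mid \mathbf{h}_{t-1}, \mathbf{v}_{t-1})
  = \sigma\!\big( (\mathbf{W}_1 \mathbf{h}_{t-1} + \mathbf{W}_3 \mathbf{v}_{t-1} + \mathbf{b})_j \big),
\qquad
p(v_{m,t} = 1 \mid \mathbf{h}_t, \mathbf{v}_{t-1})
  = \sigma\!\big( (\mathbf{W}_2 \mathbf{h}_t + \mathbf{W}_4 \mathbf{v}_{t-1} + \mathbf{c})_m \big)
```

Here $\sigma$ is the logistic sigmoid. The recognition model $q(\mathbf{h} \mid \mathbf{v})$ is trained jointly with these parameters by maximizing the variational lower bound $\mathbb{E}_q[\log p(\mathbf{v}, \mathbf{h}) - \log q(\mathbf{h} \mid \mathbf{v})]$ on the log-likelihood.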
Structured Generative Models of Natural Source Code
We study the problem of building generative models of natural source code
(NSC); that is, source code written and understood by humans. Our primary
contribution is to describe a family of generative models for NSC that have
three key properties: First, they incorporate both sequential and hierarchical
structure. Second, they learn a distributed representation of source code
elements. Finally, they integrate closely with a compiler, which allows
leveraging compiler logic and abstractions when building structure into the
model. We also develop an extension that includes more complex structure,
refining how the model generates identifier tokens based on what variables are
currently in scope. Our models can be learned efficiently, and we show
empirically that including appropriate structure greatly improves the models,
measured by the probability of generating test programs.
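The scope-based extension can be pictured with a toy sampler (ours, not the paper's model): identifier tokens are drawn only from variables currently in scope, softmax-weighted by a learned score, here stood in for by a placeholder function.

```python
import math
import random

def sample_identifier(scope_stack, score):
    """Toy sketch: sample the next identifier token from the variables
    currently in scope, softmax-weighted by a (stand-in) learned scorer."""
    in_scope = [v for frame in scope_stack for v in frame]  # flatten nested scopes
    weights = [math.exp(score(v)) for v in in_scope]        # unnormalized softmax
    r = random.uniform(0, sum(weights))
    for v, w in zip(in_scope, weights):
        r -= w
        if r <= 0:
            return v
    return in_scope[-1]  # guard against floating-point drift

# usage: outer scope, then an inner block; score() is a placeholder
print(sample_identifier([["argc", "argv"], ["i", "total"]], score=len))
```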
Many Languages, One Parser
We train one multilingual model for dependency parsing and use it to parse
sentences in several languages. The parsing model uses (i) multilingual word
clusters and embeddings; (ii) token-level language information; and (iii)
language-specific features (fine-grained POS tags). This input representation
enables the parser not only to parse effectively in multiple languages, but
also to generalize across languages based on linguistic universals and
typological similarities, making it more effective at learning from limited
annotations. Our parser's performance compares favorably to strong baselines in
a range of data scenarios, including when the target language has a large
treebank, a small treebank, or no treebank for training.
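The three-part input representation might be assembled roughly like this (a sketch with our own names and toy dimensions, not the authors' code):

```python
import numpy as np

def token_input(word_vec, lang_id, pos_id, n_langs, pos_embeddings):
    """Concatenate (i) a multilingual word embedding, (ii) a one-hot
    token-level language indicator, and (iii) a fine-grained POS embedding."""
    lang_onehot = np.eye(n_langs)[lang_id]
    return np.concatenate([word_vec, lang_onehot, pos_embeddings[pos_id]])

# usage with placeholder values and dimensions
x = token_input(word_vec=np.zeros(100), lang_id=2, pos_id=5,
                n_langs=7, pos_embeddings=np.zeros((50, 25)))
print(x.shape)  # (132,) = 100-dim word + 7-dim language + 25-dim POS
```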
Incremental sigmoid belief networks for grammar learning
We propose a class of Bayesian networks appropriate for structured prediction problems where the Bayesian network’s model structure is a function of the predicted output structure. These incremental sigmoid belief networks (ISBNs) make decoding possible because inference with partial output structures does not require summing over the unboundedly many compatible model structures, due to their directed edges and incrementally specified model structure. ISBNs are specifically targeted at challenging structured prediction problems such as natural language parsing, where learning the domain’s complex statistical dependencies benefits from large numbers of latent variables. While exact inference in ISBNs with large numbers of latent variables is not tractable, we propose two efficient approximations. First, we demonstrate that a previous neural network parsing model can be viewed as a coarse mean-field approximation to inference with ISBNs. We then derive a more accurate but still tractable variational approximation, which proves effective in artificial experiments. We compare the effectiveness of these models on a benchmark natural language parsing task, where they achieve accuracy competitive with the state-of-the-art. The model that more closely approximates the ISBN achieves better parsing accuracy, suggesting that ISBNs are an appropriate abstract model of natural language grammar learning.
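The mean-field view can be written down directly (our notation, not necessarily the paper's): a fully factorized posterior over the binary latent variables, whose means are computed from the means of their parents, which is exactly a feed-forward sigmoid unit.

```latex
q(\mathbf{h}) = \prod_i \mu_i^{h_i} (1 - \mu_i)^{1 - h_i},
\qquad
\mu_i = \sigma\Big( b_i + \sum_{j \in \mathrm{Pa}(i)} W_{ij}\, \mu_j \Big)
```

Reading the second equation as a network layer is what recovers the earlier neural parsing model as a coarse approximation; the paper's more accurate variational approximation refines this scheme while remaining tractable.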