The OS* Algorithm: a Joint Approach to Exact Optimization and Sampling
Most current sampling algorithms for high-dimensional distributions are based
on MCMC techniques and are approximate in the sense that they are valid only
asymptotically. Rejection sampling, on the other hand, produces valid samples,
but is unrealistically slow in high-dimensional spaces. The OS* algorithm that we
propose is a unified approach to exact optimization and sampling, based on
incremental refinements of a functional upper bound, which combines ideas of
adaptive rejection sampling and of A* optimization search. We show that the
choice of the refinement can be done in a way that ensures tractability in
high-dimensional spaces, and we present first experiments in two different
settings: inference in high-order HMMs and in large discrete graphical models.
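The rejection-with-refinement idea can be illustrated with a toy exact sampler over a finite discrete space: candidates are drawn from an upper-bound proposal q ≥ p, and each rejection tightens q at the rejected point. This is only a loose sketch of the general principle, not the paper's functional upper bounds or refinement strategy; the target distribution and all names are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unnormalized target over a small discrete space.
p = np.array([0.1, 0.5, 0.05, 0.9, 0.2])

# Start from a crude upper bound q >= p everywhere.
q = np.full_like(p, p.max())

def sample_with_refinement(p, q, rng):
    """Exact sampling by rejection from the bound q, tightening q on rejection."""
    while True:
        # Draw a candidate proportional to the current upper bound q.
        x = rng.choice(len(q), p=q / q.sum())
        if rng.random() < p[x] / q[x]:  # accept with probability p(x)/q(x)
            return x
        q[x] = p[x]                     # refine: tighten the bound at x

samples = [sample_with_refinement(p, q, rng) for _ in range(5000)]
```

Because q dominates p at every step, accepted draws are exact samples from p; the refinements only make rejections rarer over time.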
Discriminative Recurrent Sparse Auto-Encoders
We present the discriminative recurrent sparse auto-encoder model, comprising
a recurrent encoder of rectified linear units, unrolled for a fixed number of
iterations, and connected to two linear decoders that reconstruct the input and
predict its supervised classification. Training via
backpropagation-through-time initially minimizes an unsupervised sparse
reconstruction error; the loss function is then augmented with a discriminative
term on the supervised classification. The depth implicit in the
temporally-unrolled form allows the system to exhibit all the power of deep
networks, while substantially reducing the number of trainable parameters.
From an initially unstructured network the hidden units differentiate into
categorical-units, each of which represents an input prototype with a
well-defined class; and part-units representing deformations of these
prototypes. The learned organization of the recurrent encoder is hierarchical:
part-units are driven directly by the input, whereas the activity of
categorical-units builds up over time through interactions with the part-units.
Even using a small number of hidden units per layer, discriminative recurrent
sparse auto-encoders achieve excellent performance on MNIST.
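The unrolled encoder described above can be sketched as a forward pass in a few lines of NumPy; the weight names, dimensions, and number of unrolling steps are illustrative only, and training via backpropagation-through-time is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def drsae_forward(x, We, Wr, Wd, Wc, T=5):
    """Recurrent ReLU encoder unrolled for T steps, feeding two linear
    decoders: one reconstructs the input, one scores the classes.
    Weight names are illustrative, not the paper's notation."""
    h = np.zeros(Wr.shape[0])
    for _ in range(T):                          # fixed number of iterations
        h = np.maximum(0.0, We @ x + Wr @ h)    # rectified linear recurrence
    return Wd @ h, Wc @ h                       # reconstruction, class scores

d, n, k = 8, 16, 3                              # input dim, hidden units, classes
We = rng.normal(scale=0.1, size=(n, d))         # encoder weights
Wr = rng.normal(scale=0.1, size=(n, n))         # recurrent weights
Wd = rng.normal(scale=0.1, size=(d, n))         # reconstruction decoder
Wc = rng.normal(scale=0.1, size=(k, n))         # classification decoder

x = rng.normal(size=d)
recon, logits = drsae_forward(x, We, Wr, Wd, Wc)
```

Note how the same hidden state feeds both decoders: the recurrence adds depth over time while the parameter count stays that of a single layer.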
BSDAR: Beam Search Decoding with Attention Reward in Neural Keyphrase Generation
This study mainly investigates two decoding problems in neural keyphrase
generation: sequence length bias and beam diversity. We introduce an extension
of beam search inference based on word-level and n-gram level attention score
to adjust and constrain Seq2Seq prediction at test time. Results show that our
proposed solution overcomes the algorithm's bias toward shorter and nearly
identical sequences, yielding a significant improvement in decoding
performance when generating keyphrases that are present and absent in the
source text.
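A generic beam-search skeleton shows where such a score adjustment plugs in: each hypothesis score is the accumulated log-probability plus an external reward term, which here stands in for the attention-based score. The toy model and reward below are invented for the sketch and are not the paper's scoring function.

```python
import math

def beam_search(step_logprobs, reward, beam_size=3, max_len=4, eos=0):
    """Beam search where every extension's score is adjusted by reward(seq),
    a pluggable term (e.g. to counter the bias toward short sequences)."""
    beams = [([], 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, lp in step_logprobs(seq).items():
                new_seq = seq + [tok]
                new_score = score + lp + reward(new_seq)  # adjusted score
                if tok == eos:
                    finished.append((new_seq, new_score))
                else:
                    candidates.append((new_seq, new_score))
        beams = sorted(candidates, key=lambda b: -b[1])[:beam_size]
        if not beams:
            break
    finished.extend(beams)
    return max(finished, key=lambda b: b[1])[0]

# Toy model: token 1 is likely early; EOS (token 0) becomes likely at length 2.
def step_logprobs(seq):
    if len(seq) >= 2:
        return {0: math.log(0.6), 1: math.log(0.4)}
    return {0: math.log(0.1), 1: math.log(0.9)}
```

With a zero reward the search stops at the short hypothesis `[1, 1, 0]`; a constant per-token reward of 1.0 shifts the winner to the longer `[1, 1, 1, 0]`, illustrating how an additive score can offset the length bias.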
Turkish handwritten text recognition: a case of agglutinative languages
We describe a system for recognizing unconstrained Turkish handwritten text. Turkish has agglutinative morphology and theoretically an infinite number of words that can be generated by adding more suffixes to a word. This makes lexicon-based recognition approaches, where the most likely word is selected among all the alternatives in a lexicon, unsuitable for Turkish. We describe our approach to the problem using a Turkish prefix recognizer. First results of the system demonstrate the promise of this approach, with a top-10 word recognition rate of about 40% on a small test set of mixed handprint and cursive writing. The lexicon-based approach with a 17,000-word lexicon (with test words added) achieves a 56% top-10 word recognition rate.
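The motivation above, that suffixation makes a full-word lexicon effectively unbounded, can be illustrated with a toy stem-prefix matcher over strings. The stems and the matching rule are invented for the sketch; the actual system recognizes prefixes from handwriting images, not character strings.

```python
# Illustrative Turkish stems. A lexicon of full surface forms would need
# every suffixed variant (evler, evlerimiz, evlerimizden, ...), which
# agglutination makes effectively unbounded.
STEMS = {"ev", "göz", "kitap"}

def longest_stem(word, stems=STEMS):
    """Return the longest known stem that prefixes the word, or None."""
    best = None
    for stem in stems:
        if word.startswith(stem) and (best is None or len(stem) > len(best)):
            best = stem
    return best
```

For example, `longest_stem("evlerimizden")` matches the stem "ev" even though the full suffixed form (ev + ler + imiz + den) is absent from the lexicon.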
Improving Abstraction in Text Summarization
Abstractive text summarization aims to shorten long text documents into a
human readable form that contains the most important facts from the original
document. However, the level of actual abstraction as measured by novel phrases
that do not appear in the source document remains low in existing approaches.
We propose two techniques to improve the level of abstraction of generated
summaries. First, we decompose the decoder into a contextual network that
retrieves relevant parts of the source document, and a pretrained language
model that incorporates prior knowledge about language generation. Second, we
propose a novelty metric that is optimized directly through policy learning to
encourage the generation of novel phrases. Our model achieves results
comparable to state-of-the-art models, as determined by ROUGE scores and human
evaluations, while achieving a significantly higher level of abstraction as
measured by n-gram overlap with the source document.
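An overlap-based novelty measure of the kind described can be computed directly: the fraction of summary n-grams absent from the source. This is a simple proxy for abstraction, not necessarily the exact reward optimized in the paper's policy learning.

```python
def ngrams(tokens, n):
    """Set of n-grams in a token sequence."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def novelty(summary, source, n=2):
    """Fraction of summary n-grams that do not appear in the source."""
    summ = ngrams(summary, n)
    if not summ:
        return 0.0
    return len(summ - ngrams(source, n)) / len(summ)

src = "the cat sat on the mat near the door".split()
summ = "a cat rested on the mat".split()
score = novelty(summ, src)   # 3 of 5 summary bigrams are novel
```

Purely extractive summaries score near 0 under this metric, while paraphrased ones score higher, which is what makes it usable as a training signal for abstraction.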
A Neural Model for Generating Natural Language Summaries of Program Subroutines
Source code summarization -- creating natural language descriptions of source
code behavior -- is a rapidly-growing research topic with applications to
automatic documentation generation, program comprehension, and software
maintenance. Traditional techniques relied on heuristics and templates built
manually by human experts. Recently, data-driven approaches based on neural
machine translation have largely overtaken template-based systems. But nearly
all of these techniques rely almost entirely on programs having good internal
documentation; without clear identifier names, the models fail to create good
summaries. In this paper, we present a neural model that combines words from
code with code structure from an AST. Unlike previous approaches, our model
processes each data source as a separate input, which allows the model to learn
code structure independent of the text in code. This process helps our approach
provide coherent summaries in many cases even when zero internal documentation
is provided. We evaluate our technique with a dataset we created from 2.1m Java
methods. We find improvement over two baseline techniques from the SE
literature and one from the NLP literature.
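The idea of feeding words and structure as separate inputs can be sketched with Python's `ast` module (standing in for a Java AST): identifier tokens and a node-type sequence are extracted independently, so a model could learn from structure even when names are uninformative. The function and its exact outputs are an illustration, not the paper's preprocessing.

```python
import ast

def code_inputs(source):
    """Split a function into two model inputs: identifier words, and an
    AST node-type sequence that is independent of naming."""
    tree = ast.parse(source)
    words = [n.id for n in ast.walk(tree) if isinstance(n, ast.Name)]
    structure = [type(n).__name__ for n in ast.walk(tree)]
    return words, structure

src = "def add(a, b):\n    return a + b"
words, structure = code_inputs(src)
```

Renaming every identifier in `src` would change `words` but leave `structure` intact, which is the property that lets the structural channel compensate for poor internal documentation.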