6,977 research outputs found
Efficient Analysis of Complex Diagrams using Constraint-Based Parsing
This paper describes substantial advances in the analysis (parsing) of
diagrams using constraint grammars. The addition of set types to the grammar
and spatial indexing of the data make it possible to efficiently parse real
diagrams of substantial complexity. The system is probably the first to
demonstrate efficient diagram parsing using grammars that easily be retargeted
to other domains. The work assumes that the diagrams are available as a flat
collection of graphics primitives: lines, polygons, circles, Bezier curves and
text. This is appropriate for future electronic documents or for vectorized
diagrams converted from scanned images. The classes of diagrams that we have
analyzed include x,y data graphs and genetic diagrams drawn from the biological
literature, as well as finite state automata diagrams (states and arcs). As an
example, parsing a four-part data graph composed of 133 primitives required 35
sec using Macintosh Common Lisp on a Macintosh Quadra 700.Comment: 9 pages, Postscript, no fonts, compressed, uuencoded. Composed in
MSWord 5.1a for the Mac. To appear in ICDAR '95. Other versions at
ftp://ftp.ccs.neu.edu/pub/people/futrell
Robust Grammatical Analysis for Spoken Dialogue Systems
We argue that grammatical analysis is a viable alternative to concept
spotting for processing spoken input in a practical spoken dialogue system. We
discuss the structure of the grammar, and a model for robust parsing which
combines linguistic sources of information and statistical sources of
information. We discuss test results suggesting that grammatical processing
allows fast and accurate processing of spoken input.Comment: Accepted for JNL
Parsing of Spoken Language under Time Constraints
Spoken language applications in natural dialogue settings place serious
requirements on the choice of processing architecture. Especially under adverse
phonetic and acoustic conditions parsing procedures have to be developed which
do not only analyse the incoming speech in a time-synchroneous and incremental
manner, but which are able to schedule their resources according to the varying
conditions of the recognition process. Depending on the actual degree of local
ambiguity the parser has to select among the available constraints in order to
narrow down the search space with as little effort as possible.
A parsing approach based on constraint satisfaction techniques is discussed.
It provides important characteristics of the desired real-time behaviour and
attempts to mimic some of the attention focussing capabilities of the human
speech comprehension mechanism.Comment: 19 pages, LaTe
Interpretable Structure-Evolving LSTM
This paper develops a general framework for learning interpretable data
representation via Long Short-Term Memory (LSTM) recurrent neural networks over
hierarchal graph structures. Instead of learning LSTM models over the pre-fixed
structures, we propose to further learn the intermediate interpretable
multi-level graph structures in a progressive and stochastic way from data
during the LSTM network optimization. We thus call this model the
structure-evolving LSTM. In particular, starting with an initial element-level
graph representation where each node is a small data element, the
structure-evolving LSTM gradually evolves the multi-level graph representations
by stochastically merging the graph nodes with high compatibilities along the
stacked LSTM layers. In each LSTM layer, we estimate the compatibility of two
connected nodes from their corresponding LSTM gate outputs, which is used to
generate a merging probability. The candidate graph structures are accordingly
generated where the nodes are grouped into cliques with their merging
probabilities. We then produce the new graph structure with a
Metropolis-Hasting algorithm, which alleviates the risk of getting stuck in
local optimums by stochastic sampling with an acceptance probability. Once a
graph structure is accepted, a higher-level graph is then constructed by taking
the partitioned cliques as its nodes. During the evolving process,
representation becomes more abstracted in higher-levels where redundant
information is filtered out, allowing more efficient propagation of long-range
data dependencies. We evaluate the effectiveness of structure-evolving LSTM in
the application of semantic object parsing and demonstrate its advantage over
state-of-the-art LSTM models on standard benchmarks.Comment: To appear in CVPR 2017 as a spotlight pape
Weighted universal image compression
We describe a general coding strategy leading to a family of universal image compression systems designed to give good performance in applications where the statistics of the source to be compressed are not available at design time or vary over time or space. The basic approach considered uses a two-stage structure in which the single source code of traditional image compression systems is replaced with a family of codes designed to cover a large class of possible sources. To illustrate this approach, we consider the optimal design and use of two-stage codes containing collections of vector quantizers (weighted universal vector quantization), bit allocations for JPEG-style coding (weighted universal bit allocation), and transform codes (weighted universal transform coding). Further, we demonstrate the benefits to be gained from the inclusion of perceptual distortion measures and optimal parsing. The strategy yields two-stage codes that significantly outperform their single-stage predecessors. On a sequence of medical images, weighted universal vector quantization outperforms entropy coded vector quantization by over 9 dB. On the same data sequence, weighted universal bit allocation outperforms a JPEG-style code by over 2.5 dB. On a collection of mixed test and image data, weighted universal transform coding outperforms a single, data-optimized transform code (which gives performance almost identical to that of JPEG) by over 6 dB
- …