
    A Robust Parsing Algorithm For Link Grammars

    In this paper we present a robust parsing algorithm based on the link grammar formalism for parsing natural languages. Our algorithm is a natural extension of the original dynamic programming recognition algorithm which recursively counts the number of linkages between two words in the input sentence. The modified algorithm uses the notion of a null link in order to allow a connection between any pair of adjacent words, regardless of their dictionary definitions. The algorithm proceeds by making three dynamic programming passes. In the first pass, the input is parsed using the original algorithm which enforces the constraints on links to ensure grammaticality. In the second pass, the total cost of each substring of words is computed, where cost is determined by the number of null links necessary to parse the substring. The final pass counts the total number of parses with minimal cost. All of the original pruning techniques have natural counterparts in the robust algorithm. When used together with memoization, these techniques enable the algorithm to run efficiently with cubic worst-case complexity. We have implemented these ideas and tested them by parsing the Switchboard corpus of conversational English. This corpus comprises approximately three million words of text, corresponding to more than 150 hours of transcribed speech collected from telephone conversations restricted to 70 different topics. Although only a small fraction of the sentences in this corpus are "grammatical" by standard criteria, the robust link grammar parser is able to extract relevant structure for a large portion of the sentences. We present the results of our experiments using this system, including the analyses of selected and random sentences from the corpus. Comment: 17 pages, compressed postscript

    Hierarchical Object Parsing from Structured Noisy Point Clouds

    Object parsing and segmentation from point clouds are challenging tasks because the relevant data is available only as thin structures along object boundaries or other features, and is corrupted by large amounts of noise. To handle this kind of data, flexible shape models are desired that can accurately follow the object boundaries. Popular models such as Active Shape and Active Appearance models lack the necessary flexibility for this task, while recent approaches such as the Recursive Compositional Models make model simplifications in order to obtain computational guarantees. This paper investigates a hierarchical Bayesian model of shape and appearance in a generative setting. The input data is explained by an object parsing layer, which is a deformation of a hidden PCA shape model with Gaussian prior. The paper also introduces a novel efficient inference algorithm that uses informed data-driven proposals to initialize local searches for the hidden variables. Applied to the problem of object parsing from structured point clouds such as edge detection images, the proposed approach obtains state-of-the-art parsing errors on two standard datasets without using any intensity information. Comment: 13 pages, 16 figures

    Lightweight Call-Graph Construction for Multilingual Software Analysis

    Analysis of multilingual codebases is a topic of increasing importance. In prior work, we have proposed the MLSA (MultiLingual Software Analysis) architecture, an approach to the lightweight analysis of multilingual codebases, and have shown how it can be used to address the challenge of constructing a single call graph from multilingual software with mutual calls. This paper addresses the challenge of constructing monolingual call graphs in a lightweight manner (consistent with the objective of MLSA) which nonetheless yields sufficient information for resolving language interoperability calls. A novel approach is proposed which leverages information from a compiler-generated AST to provide the quality of call graph necessary, while the program itself is written using an Island Grammar that parses the AST, providing the lightweight aspect necessary. Performance results are presented for a C/C++ implementation of the approach, PAIGE (Parsing AST using Island Grammar Call Graph Emitter), showing that despite its lightweight nature, it outperforms Doxygen, is robust to changes in the (Clang) AST, and is not restricted to C/C++. Comment: 10 pages

    Polyglot Semantic Parsing in APIs

    Traditional approaches to semantic parsing (SP) work by training individual models for each available parallel dataset of text-meaning pairs. In this paper, we explore the idea of polyglot semantic translation, or learning semantic parsing models that are trained on multiple datasets and natural languages. In particular, we focus on translating text to code signature representations using the software component datasets of Richardson and Kuhn (2017a,b). The advantage of such models is that they can be used for parsing a wide variety of input natural languages and output programming languages, or mixed input languages, using a single unified model. To facilitate modeling of this type, we develop a novel graph-based decoding framework that achieves state-of-the-art performance on the above datasets, and apply this method to two other benchmark SP tasks. Comment: accepted for NAACL-2018 (camera ready version)

    Dense Piecewise Planar RGB-D SLAM for Indoor Environments

    The paper exploits weak Manhattan constraints to parse the structure of indoor environments from RGB-D video sequences in an online setting. We extend the previous approach for single-view parsing of indoor scenes to video sequences and formulate the problem of recovering the floor plan of the environment as an optimal labeling problem solved using dynamic programming. Temporal continuity is enforced in a recursive setting, where the labeling from previous frames is used as a prior term in the objective function. In addition to recovery of the piecewise planar weak Manhattan structure of the extended environment, the orthogonality constraints are also exploited by visual odometry and pose graph optimization. This yields reliable estimates in the presence of large motions and in the absence of distinctive features to track. We evaluate our method on several challenging indoor sequences, demonstrating accurate SLAM and dense mapping of low-texture environments. On the existing TUM benchmark we achieve results competitive with alternative approaches, which fail in our environments. Comment: International Conference on Intelligent Robots and Systems (IROS) 201
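    The labeling idea in this abstract (an optimal labeling found by dynamic programming, with the previous frame's labeling acting as a prior term) can be illustrated with a generic chain-labeling sketch. This is not the authors' implementation: the function name `label_chain`, the cost terms `unary`, `switch_cost`, and `prior_weight`, and the 1-D chain structure are all assumptions chosen for illustration.

```python
from typing import List, Optional, Sequence


def label_chain(
    unary: Sequence[Sequence[float]],
    switch_cost: float,
    prev_labels: Optional[Sequence[int]] = None,
    prior_weight: float = 0.0,
) -> List[int]:
    """Minimum-cost labeling of a 1-D chain via dynamic programming.

    unary[i][lab]  : hypothetical data cost of giving position i label lab
    switch_cost    : penalty when neighbouring positions take different labels
    prev_labels    : labeling from the previous frame (assumed prior)
    prior_weight   : penalty for disagreeing with prev_labels at a position
    """
    n = len(unary)
    if n == 0:
        return []
    k = len(unary[0])

    def node_cost(i: int, lab: int) -> float:
        cost = unary[i][lab]
        if prev_labels is not None and prev_labels[i] != lab:
            cost += prior_weight  # temporal prior term
        return cost

    # best[lab] = cheapest cost of labeling positions 0..i ending in lab
    best = [node_cost(0, lab) for lab in range(k)]
    back: List[List[int]] = []  # back[i-1][lab] = best predecessor label
    for i in range(1, n):
        new_best, pointers = [], []
        for lab in range(k):
            cands = [best[p] + (0.0 if p == lab else switch_cost)
                     for p in range(k)]
            p_star = min(range(k), key=cands.__getitem__)
            new_best.append(cands[p_star] + node_cost(i, lab))
            pointers.append(p_star)
        best = new_best
        back.append(pointers)

    # Backtrack from the cheapest final label.
    lab = min(range(k), key=best.__getitem__)
    labels = [lab]
    for pointers in reversed(back):
        lab = pointers[lab]
        labels.append(lab)
    labels.reverse()
    return labels
```

    With a strong `prior_weight`, the result follows the previous frame's labels even where the per-position costs disagree; with `prior_weight=0` each frame is labeled independently.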