
    The ModelCC Model-Driven Parser Generator

    Syntax-directed translation tools require the specification of a language by means of a formal grammar. This grammar must conform to the specific requirements of the parser generator to be used, and it is then annotated with semantic actions so that the resulting system performs its desired function. In this paper, we introduce ModelCC, a model-based parser generator that decouples language specification from language processing, avoiding some of the problems caused by grammar-driven parser generators. ModelCC receives a conceptual model as input, along with constraints that annotate it. It is then able to create a parser for the desired textual syntax, and the generated parser fully automates the instantiation of the language conceptual model. ModelCC also includes a reference resolution mechanism, so it can instantiate abstract syntax graphs rather than mere abstract syntax trees. Comment: In Proceedings PROLE 2014, arXiv:1501.0169

    Authoring XML Documents with XHTML and MATHML Support

    Since the late 1970s, a large number of scientific documents have been authored in TeX or its derivatives, such as LaTeX. These typesetting systems allow anybody to write high-quality books and articles. But the TeX syntax is not compatible with HTML or XML, so the WWW Consortium's answer is MathML. The primary goal of MathML is to enable mathematical documents to be communicated, exchanged, and processed on the Web. Therefore, MathML documents are usually embedded in XHTML documents. Currently, there are several XHTML+MathML editors. The most popular editors use two common approaches. The first approach offers a What-You-See-Is-What-You-Get (WYSIWYG) interface, but experts often find it difficult to have precise control. For example, the font attribute is determined by the direction of the mouse movement during insertion. The second approach uses a text-based form in which the entire document is presented as a tree-like structure. The tree-like structure is unintuitive and extremely inefficient to comprehend, particularly for two-dimensional structures such as tables or equations. Here, I present a What-You-See-Is-What-You-Need (WYSIWYN) editing interface that satisfies the needs of experts who have knowledge of XHTML+MathML. The WYSIWYN interface is presented in a form that makes editing operations unambiguous while still looking recognizable. It avoids unexpected errors by showing enough structure, but still maintains enough visual presentation to avoid confusion. This report presents a test bench, an XHTML+MathML editor with a new navigation model that demonstrates the WYSIWYN user interface. As in a WYSIWYG editor, XHTML+MathML documents can be visualized during editing, and users can check the current XPath position by viewing the status bar. In contrast to a WYSIWYG editor, the new approach lets users view the local structure of the current element in a selected style. In this way, users can magnify any ambiguous positions and still be able to visualize mathematical documents. In addition, the test bench offers multiple WYSIWYN modes with different levels of magnification.

    Parsing for agile modeling

    Agile modeling refers to a set of methods that allow for a quick initial development of an importer and its further refinement. These requirements are not met simultaneously by current parsing technology, and problems with parsing became a bottleneck in our research on agile modeling. In this thesis we introduce a novel approach to specifying and building parsers. Our approach allows for expressive, tolerant and composable parsers without sacrificing performance. The approach is based on a context-sensitive extension of parsing expression grammars that allows a grammar engineer to specify complex language restrictions. To ensure high parsing performance, we automatically analyze a grammar definition and choose different parsing strategies for different parts of the grammar. We show that context-sensitive parsing expression grammars allow for highly composable, tolerant and variable-grained parsers that can be easily refined. The use of different parsing strategies ensures high parser performance without sacrificing the expressiveness of the underlying grammars.

    A Natural Proof System for Natural Language


    Geospatial database generation from digital newspapers: use case for risk and disaster domains.

    Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies. The generation of geospatial databases is expensive in terms of time and money, and many geospatial users still lack spatial data. Geographic Information Extraction and Retrieval systems can alleviate this problem. This work proposes a method to populate spatial databases automatically from the Web. It applies the approach to the risk and disaster domain, taking digital newspapers as a data source. News stories in digital newspapers contain rich thematic information that can be attached to places. The use case of automating spatial database generation is applied to Mexico using placenames. In Mexico, small and medium disasters occur most years. The facts about these are frequently mentioned in newspapers but rarely stored as records in national databases. Therefore, it is difficult to estimate the human and material losses of those events. This work presents two ways to extract information from digital news, using natural language techniques to distill the text and the national gazetteer codes to achieve placename-attribute disambiguation. Two outputs are presented: a general one that exposes highly relevant news, and another that attaches attributes of interest to placenames. The latter achieved a 75% rate of thematic relevance under qualitative analysis.