444 research outputs found

    Program extraction applied to monadic parsing


    Designing a CPU model: from a pseudo-formal document to fast code

    For validating low-level embedded software, engineers use simulators that take the real binary as input. Like the real hardware, these full-system simulators are organized as a set of components. The main component is the CPU simulator (ISS), because it is the usual bottleneck for simulation speed and its development is a long and repetitive task. Previous work showed that an ISS can be generated from an Architecture Description Language (ADL). In the work reported in this paper, we generate a CPU simulator directly from the pseudo-formal descriptions of the reference manual. For each instruction, we extract the information describing its behavior, its binary encoding, and its assembly syntax. Next, after automatically applying many optimizations to the extracted information, we generate a SystemC/TLM ISS. We also generate tests for the decoder and a formal specification in Coq. Experiments show that the generated ISS is as fast and stable as our previous hand-written ISS. Comment: 3rd Workshop on Rapid Simulation and Performance Evaluation: Methods and Tools (2011).
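    As a rough illustration only (not the authors' generator; the record fields and the two instructions below are invented), the idea of keeping one description per instruction, covering its behaviour, binary encoding and assembly syntax, and deriving a decoder from the encoding fields can be sketched in Haskell:

        -- Minimal sketch, assuming a toy 32-bit ISA; everything here is an
        -- illustrative stand-in for the information extracted from the manual.
        import Data.Bits ((.&.))
        import Data.Word (Word32)

        data CpuState = CpuState { pc :: Word32, r0 :: Word32 } deriving Show

        data Instr = Instr
          { mnemonic :: String                          -- assembly syntax
          , mask     :: Word32                          -- which encoding bits are fixed
          , value    :: Word32                          -- their expected values
          , execute  :: Word32 -> CpuState -> CpuState  -- behaviour, given the raw word
          }

        -- Two made-up encodings standing in for the extracted descriptions.
        isa :: [Instr]
        isa =
          [ Instr "ADDI" 0xFF000000 0x01000000
              (\w s -> s { r0 = r0 s + (w .&. 0xFFFF), pc = pc s + 4 })
          , Instr "NOP" 0xFFFFFFFF 0x00000000
              (\_ s -> s { pc = pc s + 4 })
          ]

        -- The "generated" decoder: try the fixed bits of each encoding in turn.
        decode :: Word32 -> Maybe Instr
        decode w = case filter (\i -> w .&. mask i == value i) isa of
          (i:_) -> Just i
          []    -> Nothing

        step :: Word32 -> CpuState -> CpuState
        step w s = maybe s (\i -> execute i w s) (decode w)

        main :: IO ()
        main = print (step 0x01000005 (CpuState 0 0))  -- ADDI: r0 := 5, pc := 4

    In the paper, the per-instruction information is extracted from the reference manual, optimized, and fed to a SystemC/TLM code generator rather than interpreted as above.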

    Monadic compositional parsing with context using Maltese as a case study

    Combinator-based parsing using functional programming provides an elegant and compositional approach to parser design and development. By their very nature, however, these approaches usually fail to address sensitivity to context properly. We identify two techniques to deal with context compositionally, particularly suited for natural language parsing. As case studies, we present parsers for Maltese definite nouns and conjugated verbs of the first form.
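    One generic way to make context available compositionally, sketched here in Haskell purely for illustration (the paper's Maltese grammar and its actual techniques are not reproduced; the Gender context and agreement check are invented), is to thread a context value through the parser alongside the input:

        -- Minimal sketch: a list-of-successes parser that also carries a context
        -- value, so later combinators can consult what earlier ones recorded.
        newtype Parser c a = Parser { runParser :: (String, c) -> [(a, (String, c))] }

        instance Functor (Parser c) where
          fmap f p = Parser $ \s -> [ (f a, s') | (a, s') <- runParser p s ]

        instance Applicative (Parser c) where
          pure a    = Parser $ \s -> [(a, s)]
          pf <*> pa = Parser $ \s ->
            [ (f a, s'') | (f, s') <- runParser pf s, (a, s'') <- runParser pa s' ]

        instance Monad (Parser c) where
          p >>= k = Parser $ \s -> concat [ runParser (k a) s' | (a, s') <- runParser p s ]

        failP :: Parser c a
        failP = Parser (const [])

        char :: Char -> Parser c Char
        char x = Parser $ \(inp, c) -> case inp of
          (y:ys) | y == x -> [(y, (ys, c))]
          _               -> []

        string :: String -> Parser c String
        string = mapM char

        getCtx :: Parser c c
        getCtx = Parser $ \(inp, c) -> [(c, (inp, c))]

        putCtx :: c -> Parser c ()
        putCtx c = Parser $ \(inp, _) -> [((), (inp, c))]

        -- Invented agreement example: a word records a gender in the context,
        -- and a later word must agree with it.
        data Gender = Masc | Fem deriving (Eq, Show)

        word :: String -> Gender -> Parser Gender String
        word w g = string w >>= \s -> putCtx g >> pure s

        agreeing :: String -> Gender -> Parser Gender String
        agreeing w g = getCtx >>= \g' -> if g == g' then word w g else failP

        main :: IO ()
        main = print (runParser (word "il" Masc >> agreeing "ktieb" Masc)
                                ("ilktieb", Masc))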

    Automatic annotation of the Penn-treebank with LFG f-structure information

    Lexical-Functional Grammar f-structures are abstract syntactic representations approximating basic predicate-argument structure. Treebanks annotated with f-structure information are required as training resources for stochastic versions of unification and constraint-based grammars and for the automatic extraction of such resources. In a number of papers (Frank, 2000; Sadler, van Genabith and Way, 2000), methods have been developed for automatically annotating treebank resources with f-structure information. However, to date, these methods have only been applied to treebank fragments of the order of a few hundred trees. In the present paper we present a new method that scales and has been applied to a complete treebank, in our case the WSJ section of Penn-II (Marcus et al., 1994), with more than 1,000,000 words in about 50,000 sentences.
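    Purely as an illustration of the general shape of such annotation (the rules below are invented Haskell toy rules, not the algorithm, annotation principles, or coverage described in the paper), one can walk a phrase-structure tree and attach an f-structure equation to each node based on its own category and that of its mother:

        -- Toy sketch: categories and equations are plain strings, and the rules
        -- are invented stand-ins for real annotation principles.
        data Tree a = Node a String [Tree a]   -- annotation, category, daughters
                    | Leaf a String String     -- annotation, POS tag, word
          deriving Show

        -- Pick an equation from the mother's category and the node's own.
        eq :: String -> String -> String
        eq "S"  "NP" = "up SUBJ = down"
        eq "VP" "NP" = "up OBJ = down"
        eq _    _    = "up = down"

        annotate :: String -> Tree () -> Tree String
        annotate mother (Leaf () pos w)  = Leaf (eq mother pos) pos w
        annotate mother (Node () cat ds) = Node (eq mother cat) cat
                                                (map (annotate cat) ds)

        example :: Tree ()
        example = Node () "S"
          [ Node () "NP" [Leaf () "NNP" "Mary"]
          , Node () "VP" [ Leaf () "VBZ" "sees"
                         , Node () "NP" [Leaf () "DT" "the", Leaf () "NN" "cat"] ]
          ]

        main :: IO ()
        main = print (annotate "TOP" example)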

    Type classes for efficient exact real arithmetic in Coq

    Floating point operations are fast, but require continuous effort on the part of the user in order to ensure that the results are correct. This burden can be shifted away from the user by providing a library of exact analysis in which the computer handles the error estimates. Previously, we [Krebbers/Spitters 2011] provided a fast implementation of the exact real numbers in the Coq proof assistant. Our implementation improved on an earlier implementation by O'Connor by using type classes to describe an abstract specification of the underlying dense set from which the real numbers are built. In particular, we used dyadic rationals built from Coq's machine integers to obtain an immediate 100-fold speed-up of the basic operations. This article is a substantially expanded version of [Krebbers/Spitters 2011] in which the implementation is extended in various ways. First, we implement and verify the sine and cosine functions. Secondly, we create an additional implementation of the dense set based on Coq's fast rational numbers. Thirdly, we extend the hierarchy to capture order on undecidable structures, whereas before it was limited to decidable structures. This hierarchy, based on type classes, allows us to share theory on the naturals, integers, rationals, dyadics, and reals in a convenient way. Finally, we obtain another dramatic speed-up by avoiding evaluation of termination proofs at runtime. Comment: arXiv admin note: text overlap with arXiv:1105.275
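    The paper's development is in Coq; as a loose Haskell analogue only (class, method, and function names are invented, and the error handling is simplified), the role of the type class is to specify the dense set abstractly, so the same real-number code can later be instantiated with plain rationals, dyadics, or machine-integer-based numbers:

        -- Loose sketch: an abstract "approximate field" interface, and a real
        -- number represented as a function from requested precision to an
        -- approximation drawn from the dense set.
        import Data.Ratio ((%))

        class ApproxField q where
          fromRat  :: Rational -> q
          toRat    :: q -> Rational
          compress :: Int -> q -> q   -- round to roughly 2^(-k); may be identity

        instance ApproxField Rational where
          fromRat    = id
          toRat      = id
          compress _ = id

        newtype CReal q = CReal { approx :: Int -> q }

        -- e = sum of 1/n!, summed until the terms drop below 2^(-k).
        euler :: ApproxField q => CReal q
        euler = CReal $ \k ->
          let terms   = scanl (\t n -> t / fromInteger n) 1 [1 ..]
              partial = sum (takeWhile (> 1 % 2 ^ k) terms)
          in  compress k (fromRat partial)

        main :: IO ()
        main = print (toRat (approx (euler :: CReal Rational) 20))

    The paper's actual dense sets are dyadic rationals over Coq's machine integers and Coq's fast rationals, with order captured at the level of the type-class hierarchy.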

    Efficient combinator parsing for natural-language.


    Web Data Extraction, Applications and Techniques: A Survey

    Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims at providing a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provide a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool for performing data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques make it possible to gather the large amount of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users, which offers unprecedented opportunities to analyze human behavior at a very large scale. We also discuss the potential of cross-fertilization, i.e., the possibility of reusing Web Data Extraction techniques originally designed to work in a given domain in other domains. Comment: Knowledge-based System

    Purely functional GLL parsing

    Generalised parsing has become increasingly important in the context of software language design, and several compiler generators and language workbenches have adopted generalised parsing algorithms such as GLR and GLL. The original GLL parsing algorithms are described in low-level pseudo-code as the output of a parser generator. This paper explains GLL parsing differently, defining the FUN-GLL algorithm as a collection of pure, mathematical functions and focussing on the logic of the algorithm by omitting implementation details. In particular, the data structures are modelled by abstract sets and relations rather than specialised implementations. The description is further simplified by omitting lookahead and adopting the binary subtree representation of derivations to avoid the clerical overhead of graph construction. Conventional parser combinators inherit the drawbacks of the recursive descent algorithms they implement. Based on FUN-GLL, this paper defines generalised parser combinators that overcome these problems.
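    The FUN-GLL machinery itself (descriptor sets, the binary subtree representation, support for left recursion) is not reproduced here; the Haskell sketch below, with an invented toy grammar, only recalls what "generalised" means at the combinator interface: a parser returns every derivation rather than just the first one.

        -- Minimal list-of-successes combinators: <||> keeps all alternatives,
        -- so an ambiguous (non-left-recursive) toy grammar yields two parses.
        newtype P a = P { runP :: String -> [(a, String)] }

        instance Functor P where
          fmap f (P p) = P $ \s -> [ (f a, r) | (a, r) <- p s ]

        instance Applicative P where
          pure a        = P $ \s -> [(a, s)]
          P pf <*> P pa = P $ \s -> [ (f a, r') | (f, r) <- pf s, (a, r') <- pa r ]

        (<||>) :: P a -> P a -> P a
        P p <||> P q = P $ \s -> p s ++ q s

        tok :: Char -> P Char
        tok c = P $ \s -> case s of
          (x:xs) | x == c -> [(x, xs)]
          _               -> []

        data T = One | Two deriving Show

        -- A ::= 'a' | 'a' 'a'
        a1 :: P T
        a1 = (One <$ tok 'a') <||> (Two <$ (tok 'a' *> tok 'a'))

        -- S ::= A A ; on "aaa" there are two complete derivations.
        pair :: P (T, T)
        pair = (,) <$> a1 <*> a1

        main :: IO ()
        main = print [ r | (r, rest) <- runP pair "aaa", null rest ]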