444 research outputs found
Designing a CPU model: from a pseudo-formal document to fast code
For validating low level embedded software, engineers use simulators that
take the real binary as input. Like the real hardware, these full-system
simulators are organized as a set of components. The main component is the CPU
simulator (ISS), because it is the usual bottleneck for the simulation speed,
and its development is a long and repetitive task. Previous work showed that an
ISS can be generated from an Architecture Description Language (ADL). In the
work reported in this paper, we generate a CPU simulator directly from the
pseudo-formal descriptions of the reference manual. For each instruction, we
extract the information describing its behavior, its binary encoding, and its
assembly syntax. Next, after automatically applying many optimizations on the
extracted information, we generate a SystemC/TLM ISS. We also generate tests
for the decoder and a formal specification in Coq. Experiments show that the
generated ISS is as fast and stable as our previous hand-written ISS.Comment: 3rd Workshop on: Rapid Simulation and Performance Evaluation: Methods
and Tools (2011
Monadic compositional parsing with context using Maltese as a case study
Combinator-based parsing using functional programming provides an elegant, and compositional approach to parser design and development. By its very nature, sensitivity to context usually fails to be properly addressed in these approaches. We identify two techniques to deal with context compositionally, particularly suited for natural language parsing. As case studies, we present parsers for Maltese definite nouns and conjugated verbs of the first form.peer-reviewe
Automatic annotation of the Penn-treebank with LFG f-structure information
Lexical-Functional Grammar f-structures are abstract syntactic representations approximating basic predicate-argument structure. Treebanks annotated with f-structure information are required as training resources for stochastic versions of unification and constraint-based
grammars and for the automatic extraction of such resources. In a number of papers (Frank, 2000; Sadler, van Genabith and Way, 2000) have developed methods for automatically annotating treebank resources with f-structure information. However, to date, these methods
have only been applied to treebank fragments of the order of a few hundred trees. In the present paper we present a new method that scales and has been applied to a complete treebank, in our case the WSJ section of Penn-II (Marcus et al, 1994), with more than 1,000,000 words in about 50,000 sentences
Type classes for efficient exact real arithmetic in Coq
Floating point operations are fast, but require continuous effort on the part
of the user in order to ensure that the results are correct. This burden can be
shifted away from the user by providing a library of exact analysis in which
the computer handles the error estimates. Previously, we [Krebbers/Spitters
2011] provided a fast implementation of the exact real numbers in the Coq proof
assistant. Our implementation improved on an earlier implementation by O'Connor
by using type classes to describe an abstract specification of the underlying
dense set from which the real numbers are built. In particular, we used dyadic
rationals built from Coq's machine integers to obtain a 100 times speed up of
the basic operations already. This article is a substantially expanded version
of [Krebbers/Spitters 2011] in which the implementation is extended in the
various ways. First, we implement and verify the sine and cosine function.
Secondly, we create an additional implementation of the dense set based on
Coq's fast rational numbers. Thirdly, we extend the hierarchy to capture order
on undecidable structures, while it was limited to decidable structures before.
This hierarchy, based on type classes, allows us to share theory on the
naturals, integers, rationals, dyadics, and reals in a convenient way. Finally,
we obtain another dramatic speed-up by avoiding evaluation of termination
proofs at runtime.Comment: arXiv admin note: text overlap with arXiv:1105.275
Web Data Extraction, Applications and Techniques: A Survey
Web Data Extraction is an important problem that has been studied by means of
different scientific tools and in a broad range of applications. Many
approaches to extracting data from the Web have been designed to solve specific
problems and operate in ad-hoc domains. Other approaches, instead, heavily
reuse techniques and algorithms developed in the field of Information
Extraction.
This survey aims at providing a structured and comprehensive overview of the
literature in the field of Web Data Extraction. We provided a simple
classification framework in which existing Web Data Extraction applications are
grouped into two main classes, namely applications at the Enterprise level and
at the Social Web level. At the Enterprise level, Web Data Extraction
techniques emerge as a key tool to perform data analysis in Business and
Competitive Intelligence systems as well as for business process
re-engineering. At the Social Web level, Web Data Extraction techniques allow
to gather a large amount of structured data continuously generated and
disseminated by Web 2.0, Social Media and Online Social Network users and this
offers unprecedented opportunities to analyze human behavior at a very large
scale. We discuss also the potential of cross-fertilization, i.e., on the
possibility of re-using Web Data Extraction techniques originally designed to
work in a given domain, in other domains.Comment: Knowledge-based System
Purely functional GLL parsing
Generalised parsing has become increasingly important in the context of software language design and several compiler generators and language workbenches have adopted generalised parsing algorithms such as GLR and GLL. The original GLL parsing algorithms are described in low-level pseudo-code as the output of a parser generator. This paper explains GLL parsing differently, defining the FUN-GLL algorithm as a collection of pure, mathematical functions and focussing on the logic of the algorithm by omitting implementation details. In particular, the data structures are modelled by abstract sets and relations rather than specialised implementations. The description is further simplified by omitting lookahead and adopting the binary subtree representation of derivations to avoid the clerical overhead of graph construction.
Conventional parser combinators inherit the drawbacks from the recursive descent algorithms they implement. Based on FUN-GLL, this paper defines generalised parser combinators that overcome these problems. Th
- âŠ