Object oriented regular expressions
Regular expressions are used to parse textual data to match patterns and extract variables. They have been implemented in a vast number of programming languages with a significant quantity of research devoted to improving their operational efficiency. However, regular expressions are limited to finding linear matches. Little research has been done in the field of object-oriented results which would allow textual or binary data to be converted to multi-layered objects. This is significantly relevant as many of today's data formats are object-based. This paper extends our previous work by detailing an algorithmic approach to perform object-oriented parsing, and provides an initial study of benchmarks of the algorithms of our contributio
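The contrast between a linear match and a multi-layered result can be sketched in Python. This is only an illustration of the idea, not the algorithm the paper describes: the record format, the `RECORD` and `ITEM` patterns, and the `parse` function are all hypothetical names invented for this example.

```python
import re

# Hypothetical record format, assumed only for this illustration:
#   name[key=value,key=value]
RECORD = re.compile(r"(?P<name>\w+)\[(?P<items>[^\]]*)\]")
ITEM = re.compile(r"(?P<key>\w+)=(?P<value>\w+)")


def parse(text):
    """Return a list of nested dicts, one per record found in text."""
    result = []
    for rec in RECORD.finditer(text):
        # A single regex match yields only flat strings. A second pass
        # over the 'items' group builds the inner layer of the object.
        fields = {m["key"]: m["value"] for m in ITEM.finditer(rec["items"])}
        result.append({"name": rec["name"], "fields": fields})
    return result


objects = parse("user[id=7,role=admin] host[port=80]")
print(objects)
```

Here the nesting is assembled by hand in two passes; the abstract describes an approach in which the matcher itself produces the layered objects.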
Information systems models in higher education
This paper intends to contribute to a better understanding of the process through which information resource, information technology, and organisation actors can contribute to the performance and quality of higher education institutions. Conceptual models will be presented and discussed
Run, Xtatic, Run: Efficient Implementation of an Object-Oriented Language with Regular Pattern Matching
Schema languages such as DTD, XML Schema, and Relax NG have been steadily growing in importance in the XML community. A schema language provides a mechanism for defining the type of XML documents; i.e., the set of constraints that specify the structure of XML documents that are acceptable as data for a certain programming task. A number of recent language designs, many of them descended from the XDuce language of Hosoya, Pierce, and Vouillon, have shown how such schemas can be used statically for type-checking XML processing code and dynamically for evaluation of XML structures. The technical foundation of such languages is the notion of regular types, a mild generalization of nondeterministic top-down tree automata, which correspond to a core of most popular schema notations, and the notion of regular patterns (regular types decorated with variable binders), a powerful and convenient primitive for dynamic inspection of XML values. This dissertation is concerned with one of XDuce’s descendants, Xtatic. The goal of the Xtatic project is to bring the regular type and regular pattern technologies to a wide audience by integrating them with a mainstream object-oriented language. My research focuses on an efficient implementation of Xtatic including a compiler that generates fast and compact target program
Regular Expression Subtyping for XML Query and Update Languages
XML database query languages such as XQuery employ regular expression types
with structural subtyping. Subtyping systems typically have two presentations,
which should be equivalent: a declarative version in which the subsumption rule
may be used anywhere, and an algorithmic version in which the use of
subsumption is limited in order to make typechecking syntax-directed and
decidable. However, the XQuery standard type system circumvents this issue by
using imprecise typing rules for iteration constructs and defining only
algorithmic typechecking, and another extant proposal provides more precise
types for iteration constructs but ignores subtyping. In this paper, we
consider a core XQuery-like language with a subsumption rule and prove the
completeness of algorithmic typechecking; this is straightforward for XQuery
proper but requires some care in the presence of more precise iteration typing
disciplines. We extend this result to an XML update language we have introduced
in earlier work. Comment: ESOP 2008. Companion technical report with proof
From Network Interface to Multithreaded Web Applications: A Case Study in Modular Program Verification
Many verifications of realistic software systems are monolithic, in the sense that they define single global invariants over complete system state. More modular proof techniques promise to support reuse of component proofs and even reduce the effort required to verify one concrete system, just as modularity simplifies standard software development. This paper reports on one case study applying modular proof techniques in the Coq proof assistant. To our knowledge, it is the first modular verification certifying a system that combines infrastructure with an application of interest to end users. We assume a nonblocking API for managing TCP networking streams, and on top of that we work our way up to certifying multithreaded, database-backed Web applications. Key verified components include a cooperative threading library and an implementation of a domain-specific language for XML processing. We have deployed our case-study system on mobile robots, where it interfaces with off-the-shelf components for sensing, actuation, and control. National Science Foundation (U.S.) (Grant CCF-1253229). United States. Defense Advanced Research Projects Agency (Agreement FA8750-12-2-0293)
Compiling Regular Patterns to Sequential Machines
Pattern matching combined with regular expressions has many applications including text and XML processing, lexical analysis, classification of DNA segments and content-based routing. Patterns contain variables to refer to parts of the matching input. But regular patterns pose the problem of ambiguity: words can be matched against 'overlapping' sections of the pattern in several ways, yielding different variable bindings. A match policy like shortest or longest match disambiguates the outcome of matching. In order to implement the longest/shortest match policies, we propose to compile regular patterns to sequential machines. This intuitive approach lets us derive a compilation scheme with linear runtime complexity. The main contributions of this paper are firstly, a decision procedure for unambiguous regular patterns, which can be matched with a single traversal of the input, and secondly, algorithms to obtain deterministic sequential machines from ambiguous patterns. These produce the shortest (longest) match in two consecutive runs. The first run produces an intermediary result from which all possible variable bindings can be reproduced. The second run then chooses the unique binding which adheres to the given match policy. In the general case, this approach is optimal
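The ambiguity described above can be seen with a small Python sketch. It uses Python's backtracking `re` module rather than the sequential machines the paper proposes, and it treats the greedy and lazy quantifiers as stand-ins for the longest and shortest match policies. The pattern and the names `x`, `y`, and `all_bindings` are assumptions made for illustration.

```python
import re

word = "aaa"


def all_bindings(w):
    """Every way to split w between variables x and y.

    Valid for the pattern x:a* y:a* only when w consists solely of 'a'.
    """
    return [(w[:i], w[i:]) for i in range(len(w) + 1)]


# Greedy quantifier on x: x takes as much of the input as it can.
greedy = re.fullmatch(r"(?P<x>a*)(?P<y>a*)", word)
# Lazy quantifier on x: x takes as little of the input as it can.
lazy = re.fullmatch(r"(?P<x>a*?)(?P<y>a*)", word)

print(all_bindings(word))  # four candidate bindings for "aaa"
print(greedy.groupdict())  # {'x': 'aaa', 'y': ''}
print(lazy.groupdict())    # {'x': '', 'y': 'aaa'}
```

The same word and the same pattern shape admit four bindings; the policy decides which one is reported.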
Web and Semantic Web Query Languages
A number of techniques have been developed to facilitate
powerful data retrieval on the Web and Semantic Web. Three categories
of Web query languages can be distinguished, according to the format
of the data they can retrieve: XML, RDF and Topic Maps. This article
introduces the spectrum of languages falling into these categories
and summarises their salient aspects. The languages are introduced using
common sample data and query types. Key aspects of the query
languages considered are stressed in a conclusion