Describing Syntax with Star-Free Regular Expressions (Syntaksin kuvaaminen käyttäen tähdettömiä säännöllisiä lausekkeita)
Has been cited by: 1. Nathan Vaillette. Dissertation. 2004. 2. András Kornai. Mathematical Linguistics. Springer Verlag. 2008. 3. Mans Hulden, Regular Expressions and Predicate Logic in Finite-State Language Processing, Proceeding of the 2009 conference on Finite-State Methods and Natural Language Processing: Post-proceedings of the 7th International Workshop FSMNLP 2008, p. 82-97, July 11, 2009. Proceeding volume: 10.
Syntactic constraints in Koskenniemi’s Finite-State Intersection Grammar (FSIG)
are logically less complex than their formalism (Koskenniemi et al., 1992) would
suggest: it turns out that although the constraints in Voutilainen’s (1994) FSIG
description of English make use of several extensions to regular expressions,
the description as a whole reduces to a finite combination of union, complement
and concatenation. This is an essential improvement to the descriptive
complexity of ENGFSIG. The result opens a door for further analysis of logical
properties and possible optimizations in the FSIG descriptions. The proof
contains a new formula for compiling Koskenniemi’s restriction operation without
any marker symbols. Peer reviewed
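To make "a finite combination of union, complement and concatenation" concrete, here is a minimal Python sketch. It is an illustration only, not the formula from the paper: the two-letter alphabet, the length bound and all function names are assumptions. Languages are represented as finite sets of strings up to a fixed length, which is enough to show that Σ* and a simple context restriction can be written without the Kleene star.

```python
from itertools import product

ALPHABET = "ab"   # assumed two-letter alphabet
MAX_LEN = 4       # languages are truncated to strings of at most this length

# Every string over ALPHABET up to MAX_LEN; stands in for Sigma*.
UNIVERSE = frozenset(
    "".join(p) for n in range(MAX_LEN + 1) for p in product(ALPHABET, repeat=n)
)

def union(x, y):
    return x | y

def complement(x):
    return UNIVERSE - x

def concat(x, y):
    # Keep only results that still fit in the bounded universe.
    return frozenset(u + v for u in x for v in y if len(u + v) <= MAX_LEN)

def lit(s):
    return frozenset([s])

EMPTY = frozenset()
ANY = complement(EMPTY)  # Sigma* expressed without a Kleene star

# "contains ab": ANY . ab . ANY
contains_ab = concat(concat(ANY, lit("ab")), ANY)

# "every a is immediately followed by b":
# not (ANY . a . not (b . ANY))
bad_suffix = complement(concat(lit("b"), ANY))
a_then_b = complement(concat(concat(ANY, lit("a")), bad_suffix))

print(sorted(a_then_b, key=lambda s: (len(s), s)))
```

The second expression is a toy version of a context restriction ("a only before b"). It needs no auxiliary marker symbols, but it should not be read as the compilation scheme the abstract refers to.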
Logics for Unranked Trees: An Overview
Labeled unranked trees are used as a model of XML documents, and logical
languages for them have been studied actively over the past several years. Such
logics have different purposes: some are better suited for extracting data,
some for expressing navigational properties, and some make it easy to relate
complex properties of trees to the existence of tree automata for those
properties. Furthermore, logics differ significantly in their model-checking
properties, their automata models, and their behavior on ordered and unordered
trees. In this paper we present a survey of logics for unranked trees.
Non-Deterministic Kleene Coalgebras
In this paper, we present a systematic way of deriving (1) languages of
(generalised) regular expressions, and (2) sound and complete axiomatizations
thereof, for a wide variety of systems. This generalizes both the results of
Kleene (on regular languages and deterministic finite automata) and Milner (on
regular behaviours and finite labelled transition systems), and includes many
other systems such as Mealy and Moore machines.
On Global Types and Multi-Party Sessions
Global types are formal specifications that describe communication protocols
in terms of their global interactions. We present a new, streamlined language
of global types equipped with a trace-based semantics and whose features and
restrictions are semantically justified. The multi-party sessions obtained by
projecting our global types enjoy a liveness property in addition to the
traditional progress and are shown to be sound and complete with respect to the
set of traces of the originating global type. Our notion of completeness is
less demanding than the classical ones, allowing a multi-party session to leave
out redundant traces from an underspecified global type. In addition to the
technical content, we discuss some limitations of our language of global types
and provide an extensive comparison with related specification languages
adopted in different communities.
Semantics and Validation of Shapes Schemas for RDF
We present a formal semantics and proof of soundness for shapes schemas, an
expressive schema language for RDF graphs that is the foundation of Shape
Expressions Language 2.0. It can be used to describe the vocabulary and the
structure of an RDF graph, and to constrain the admissible properties and
values for nodes in that graph. The language defines a typing mechanism called
shapes against which nodes of the graph can be checked. It includes an
algebraic grouping operator, a choice operator and cardinality constraints for
the number of allowed occurrences of a property. Shapes can be combined using
Boolean operators, and can use possibly recursive references to other shapes.
We describe the syntax of the language and define its semantics. The
semantics is proven to be well-defined for schemas that satisfy a reasonable
syntactic restriction, namely stratified use of negation and recursion. We
present two algorithms for the validation of an RDF graph against a shapes
schema. The first algorithm is a direct implementation of the semantics,
whereas the second is a non-trivial improvement. We also briefly give
implementation guidelines.
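As a rough illustration of checking a node against a shape with cardinality constraints, the following Python sketch may help. It is not the semantics or either algorithm from the paper: the triple-set graph encoding, the `PERSON_SHAPE` dictionary, the closed-shape assumption and the function name `conforms` are all hypothetical, and Boolean operators and recursive shape references are left out.

```python
# Hypothetical toy data model: an RDF graph as a set of
# (subject, predicate, object) triples.
GRAPH = {
    ("alice", "name", "Alice"),
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("bob", "name", "Bob"),
    ("bob", "name", "Robert"),
}

# Hypothetical shape: predicate -> (min, max); max None means unbounded.
PERSON_SHAPE = {"name": (1, 1), "knows": (0, None)}

def conforms(graph, node, shape):
    """Check the outgoing triples of `node` against cardinality bounds."""
    counts = {}
    for subject, predicate, _ in graph:
        if subject == node:
            counts[predicate] = counts.get(predicate, 0) + 1
    # Assumption: the shape is closed, so unlisted predicates are rejected.
    if any(predicate not in shape for predicate in counts):
        return False
    for predicate, (lo, hi) in shape.items():
        n = counts.get(predicate, 0)
        if n < lo or (hi is not None and n > hi):
            return False
    return True

print(conforms(GRAPH, "alice", PERSON_SHAPE))  # True: one name, two knows
print(conforms(GRAPH, "bob", PERSON_SHAPE))    # False: two names exceed max 1
```

A fuller treatment would also check the objects of the triples (value constraints and references to other shapes), which is where the stratification of negation and recursion mentioned in the abstract becomes relevant.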
Colored operads, series on colored operads, and combinatorial generating systems
We introduce bud generating systems, which are used for combinatorial
generation. They specify sets of various kinds of combinatorial objects, called
languages. They can emulate context-free grammars, regular tree grammars, and
synchronous grammars, allowing us to work with all these generating systems in
a unified way. The theory of bud generating systems uses colored operads.
Indeed, an object is generated by a bud generating system if it satisfies a
certain equation in a colored operad. To compute the generating series of the
languages of bud generating systems, we introduce formal power series on
colored operads and several operations on these. Series on colored operads are
crucial to express the languages specified by bud generating systems and allow
us to enumerate combinatorial objects with respect to some statistics. Some
examples of bud generating systems are constructed; in particular to specify
some sorts of balanced trees and to obtain recursive formulas enumerating
these. Comment: 48 pages
Computerization of African languages-French dictionaries
This paper relates work done during the DiLAF project. It consists in
converting 5 bilingual African language-French dictionaries originally in Word
format into XML following the LMF model. The languages processed are Bambara,
Hausa, Kanuri, Tamajaq and Songhai-zarma, still considered as under-resourced
languages concerning Natural Language Processing tools. Once converted, the
dictionaries are available online on the Jibiki platform for lookup and
modification. The DiLAF project is first presented. A description of each
dictionary follows. Then, the conversion methodology from .doc format to XML
files is presented. A specific point on the usage of Unicode follows. Then,
each step of the conversion into XML and LMF is detailed. The last part
presents the Jibiki lexical resources management platform used for the project. Comment: 8 pages
Pattern matching in compilers
In this thesis we develop tools for effective and flexible pattern matching.
We introduce a new pattern matching system called amethyst. Amethyst is not
only a generator of parsers of programming languages, but can also serve as an
alternative to tools for matching regular expressions.
Our framework also produces dynamic parsers. Its intended use is in the
context of IDE (accurate syntax highlighting and error detection on the fly).
Amethyst offers pattern matching of general data structures. This makes it a
useful tool for implementing compiler optimizations such as constant folding,
instruction scheduling, and dataflow analysis in general.
The parsers produced are essentially top-down parsers. Linear time complexity
is obtained by introducing the novel notion of structured grammars and
regularized regular expressions. Amethyst uses techniques known from compiler
optimizations to produce effective parsers. Comment: master thesis
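To give a sense of what pattern matching over general data structures means for an optimization such as constant folding, here is a small Python sketch. It does not use amethyst or its syntax: the tuple-encoded expression tree and the function name `fold` are assumptions made for illustration.

```python
def fold(expr):
    """Recursively fold constant subexpressions of a tuple-encoded AST.

    Assumed encoding: an expression is an int, a variable name (str),
    or a triple (op, left, right) with op in {"+", "*"}.
    """
    if not isinstance(expr, tuple):
        return expr  # leaf: a number or a variable name
    op, left, right = expr  # destructure the node
    left, right = fold(left), fold(right)
    # Pattern: an operator applied to two integer constants.
    if isinstance(left, int) and isinstance(right, int):
        if op == "+":
            return left + right
        if op == "*":
            return left * right
    return (op, left, right)

print(fold(("+", ("*", 2, 3), "x")))  # ('+', 6, 'x')
print(fold(("*", ("+", 1, 2), 4)))    # 12
```

A pattern-matching tool would let the "operator over two constants" case be stated declaratively as a pattern rather than as hand-written type checks.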