2 research outputs found

    Variants of acceptance specifications for modular system design

    Get PDF
    Les programmes informatiques prennent une place de plus en plus importante dans nos vies. Certains de ces programmes, comme par exemple les systèmes de contrôle de centrales électriques, d'avions ou de systèmes médicaux sont critiques : une panne ou un dysfonctionnement pourraient causer la perte de vies humaines ou des dommages matériels ou environnementaux importants. Les méthodes formelles visent à offrir des moyens de concevoir et vérifier de tels systèmes afin de garantir qu'ils fonctionneront comme prévu. Au fil du temps, ces systèmes deviennent de plus en plus évolués et complexes, ce qui est source de nouveaux défis pour leur vérification. Il devient nécessaire de développer ces systèmes de manière modulaire afin de pouvoir distribuer la tâche d'implémentation à différentes équipes d'ingénieurs. De plus, il est important de pouvoir réutiliser des éléments certifiés et les adapter pour répondre à de nouveaux besoins. Aussi les méthodes formelles doivent évoluer afin de s'adapter à la conception et à la vérification de ces systèmes modulaires de taille toujours croissante. Nous travaillons sur une approche algébrique pour la conception de systèmes corrects par construction. Elle définit un formalisme pour exprimer des spécifications de haut niveau et permet de les raffiner de manière incrémentale en des spécifications plus concrètes tout en préservant leurs propriétés, jusqu'à ce qu'une implémentation soit atteinte. Elle définit également plusieurs opérations permettant de construire des systèmes complexes à partir de composants plus simples en fusionnant différents points de vue d'un même système ou en composant plusieurs sous-systèmes ensemble, ainsi que de décomposer une spécification complexe afin de réutiliser des composants existants et de simplifier la tâche d'implémentation. Le formalisme de spécification que nous utilisons est basé sur des spécifications modales. Intuitivement, une spécification modale est un automate doté de deux types de transitions permettant d'exprimer des comportements optionnels ou obligatoires. Raffiner une spécification modale revient à décider si les parties optionnelles devraient être supprimées ou rendues obligatoires. Cette thèse contient deux principales contributions théoriques basées sur une extension des spécifications modales appelée " spécifications à ensembles d'acceptation ". La première contribution est l'identification d'une sous-classe des spécifications à ensembles d'acceptation, appelée " spécifications à ensembles d'acceptation convexes ", qui permet de définir des opérations bien plus efficaces tout en gardant un haut niveau d'expressivité. La seconde contribution est la définition d'un nouveau formalisme, appelé " spécifications à ensembles d'acceptation marquées ", qui permet d'exprimer des propriétés d'atteignabilité. Ceci peut, par exemple, être utilisé pour s'assurer qu'un système termine ou exprimer une propriété de vivacité dans un système réactif. Les opérations usuelles sont définies sur ce nouveau formalisme et elles garantissent la préservation des propriétés d'atteignabilité. Cette thèse présente également des résultats d'ordre plus pratique. Tous les résultats théoriques sur les spécifications à ensembles d'acceptation convexes ont été prouvés en utilisant l'assistant de preuves Coq. L'outil MAccS a été développé pour implémenter les formalismes et opérations présentés dans cette thèse. Il permet de les tester aisément sur des exemples, ainsi que d'étudier leur efficacité sur des cas concrets.Software programs are taking a more and more important place in our lives. Some of these programs, like the control systems of power plants, aircraft, or medical devices for instance, are critical: a failure or malfunction could cause loss of human lives, damages to equipments, or environmental harm. Formal methods aim at offering means to design and verify such systems in order to guarantee that they will work as expected. As time passes, these systems grow in scope and size, yielding new challenges. It becomes necessary to develop these systems in a modular fashion to be able to distribute the implementation task to engineering teams. Moreover, being able to reuse some trustworthy parts of the systems and extend them to answer new needs in functionalities is increasingly required. As a consequence, formal methods also have to evolve in order to accommodate both the design and the verification of these larger, modular systems and thus address their scalability challenge. We promote an algebraic approach for the design of correct-by-construction systems. It defines a formalism to express high-level specifications of systems and allows to incrementally refine these specifications into more concrete ones while preserving their properties, until an implementation is reached. It also defines several operations allowing to assemble complex systems from simpler components, by merging several viewpoints of a specific system or composing several subsystems together, as well as decomposing a complex specification in order to reuse existing components and ease the implementation task. The specification formalism we use is based on modal specifications. In essence, a modal specification is an automaton with two kinds of transitions allowing to express mandatory and optional behaviors. Refining a modal specification amounts to deciding whether some optional parts should be removed or made mandatory. This thesis contains two main theoretical contributions, based on an extension of modal specifications called acceptance specifications. The first contribution is the identification of a subclass of acceptance specifications, called convex acceptance specifications, which allows to define much more efficient operations while maintaining a high level of expressiveness. The second contribution is the definition of a new formalism, called marked acceptance specifications, that allows to express some reachability properties. This could be used for example to ensure that a system is terminating or to express a liveness property for a reactive system. Usual operations are defined on this new formalism and guarantee the preservation of the reachability properties as well as independent implementability. This thesis also describes some more practical results. All the theoretical results on convex acceptance specifications have been proved using the Coq proof assistant. The tool MAccS has been developed to implement the formalisms and operations presented in this thesis. It allows to test them easily on some examples, as well as run some experimentations and benchmarks

    Representation and Processing of Composition, Variation and Approximation in Language Resources and Tools

    Get PDF
    In my habilitation dissertation, meant to validate my capacity of and maturity for directingresearch activities, I present a panorama of several topics in computational linguistics, linguisticsand computer science.Over the past decade, I was notably concerned with the phenomena of compositionalityand variability of linguistic objects. I illustrate the advantages of a compositional approachto the language in the domain of emotion detection and I explain how some linguistic objects,most prominently multi-word expressions, defy the compositionality principles. I demonstratethat the complex properties of MWEs, notably variability, are partially regular and partiallyidiosyncratic. This fact places the MWEs on the frontiers between different levels of linguisticprocessing, such as lexicon and syntax.I show the highly heterogeneous nature of MWEs by citing their two existing taxonomies.After an extensive state-of-the art study of MWE description and processing, I summarizeMultiflex, a formalism and a tool for lexical high-quality morphosyntactic description of MWUs.It uses a graph-based approach in which the inflection of a MWU is expressed in function ofthe morphology of its components, and of morphosyntactic transformation patterns. Due tounification the inflection paradigms are represented compactly. Orthographic, inflectional andsyntactic variants are treated within the same framework. The proposal is multilingual: it hasbeen tested on six European languages of three different origins (Germanic, Romance and Slavic),I believe that many others can also be successfully covered. Multiflex proves interoperable. Itadapts to different morphological language models, token boundary definitions, and underlyingmodules for the morphology of single words. It has been applied to the creation and enrichmentof linguistic resources, as well as to morphosyntactic analysis and generation. It can be integratedinto other NLP applications requiring the conflation of different surface realizations of the sameconcept.Another chapter of my activity concerns named entities, most of which are particular types ofMWEs. Their rich semantic load turned them into a hot topic in the NLP community, which isdocumented in my state-of-the art survey. I present the main assumptions, processes and resultsissued from large annotation tasks at two levels (for named entities and for coreference), parts ofthe National Corpus of Polish construction. I have also contributed to the development of bothrule-based and probabilistic named entity recognition tools, and to an automated enrichment ofProlexbase, a large multilingual database of proper names, from open sources.With respect to multi-word expressions, named entities and coreference mentions, I pay aspecial attention to nested structures. This problem sheds new light on the treatment of complexlinguistic units in NLP. When these units start being modeled as trees (or, more generally, asacyclic graphs) rather than as flat sequences of tokens, long-distance dependencies, discontinu-ities, overlapping and other frequent linguistic properties become easier to represent. This callsfor more complex processing methods which control larger contexts than what usually happensin sequential processing. Thus, both named entity recognition and coreference resolution comesvery close to parsing, and named entities or mentions with their nested structures are analogous3to multi-word expressions with embedded complements.My parallel activity concerns finite-state methods for natural language and XML processing.My main contribution in this field, co-authored with 2 colleagues, is the first full-fledged methodfor tree-to-language correction, and more precisely for correcting XML documents with respectto a DTD. We have also produced interesting results in incremental finite-state algorithmics,particularly relevant to data evolution contexts such as dynamic vocabularies or user updates.Multilingualism is the leitmotif of my research. I have applied my methods to several naturallanguages, most importantly to Polish, Serbian, English and French. I have been among theinitiators of a highly multilingual European scientific network dedicated to parsing and multi-word expressions. I have used multilingual linguistic data in experimental studies. I believethat it is particularly worthwhile to design NLP solutions taking declension-rich (e.g. Slavic)languages into account, since this leads to more universal solutions, at least as far as nominalconstructions (MWUs, NEs, mentions) are concerned. For instance, when Multiflex had beendeveloped with Polish in mind it could be applied as such to French, English, Serbian and Greek.Also, a French-Serbian collaboration led to substantial modifications in morphological modelingin Prolexbase in its early development stages. This allowed for its later application to Polishwith very few adaptations of the existing model. Other researchers also stress the advantages ofNLP studies on highly inflected languages since their morphology encodes much more syntacticinformation than is the case e.g. in English.In this dissertation I am also supposed to demonstrate my ability of playing an active rolein shaping the scientific landscape, on a local, national and international scale. I describemy: (i) various scientific collaborations and supervision activities, (ii) roles in over 10 regional,national and international projects, (iii) responsibilities in collective bodies such as program andorganizing committees of conferences and workshops, PhD juries, and the National UniversityCouncil (CNU), (iv) activity as an evaluator and a reviewer of European collaborative projects.The issues addressed in this dissertation open interesting scientific perspectives, in whicha special impact is put on links among various domains and communities. These perspectivesinclude: (i) integrating fine-grained language data into the linked open data, (ii) deep parsingof multi-word expressions, (iii) modeling multi-word expression identification in a treebank as atree-to-language correction problem, and (iv) a taxonomy and an experimental benchmark fortree-to-language correction approaches
    corecore