    : Méthodes d'Inférence Symbolique pour les Bases de Données

    This dissertation summarizes a line of research, in which I was actively involved, on learning in databases from examples. This research focused on traditional as well as novel database models and languages for querying, transforming, and describing the schema of a database. In the case of schemas, our contributions involve proposing original languages for the emerging data models of Unordered XML and RDF. We have studied learning from examples of schemas for Unordered XML, schemas for RDF, twig queries for XML, join queries for relational databases, and XML transformations defined with a novel model of tree-to-word transducers. Investigating the learnability of the proposed languages required us to examine closely a number of their fundamental properties, often of independent interest, including normal forms, minimization, containment and equivalence, consistency of a set of examples, and finite characterizability. A good understanding of these properties allowed us to devise learning algorithms that explore a possibly large search space with the help of a diligently designed set of generalization operations in search of an appropriate solution. Learning (or inference) is a problem that has two parameters: the precise class of languages we wish to infer and the type of input that the user can provide. We focused on the setting where the user input consists of positive examples, i.e., elements that belong to the goal language, and negative examples, i.e., elements that do not belong to the goal language. In general, using both negative and positive examples allows learning richer classes of goal languages than using positive examples alone. However, using negative examples is often difficult because, together with positive examples, they may cause the search space to take a very complex shape, and its exploration may turn out to be computationally challenging.
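
    The notion of "consistency of a set of examples" mentioned in the abstract above can be illustrated with a small sketch. It is not taken from the dissertation: it assumes a deliberately simple hypothesis class in which a join predicate is a set of attribute-equality pairs between two relations, and every function name and sample record is hypothetical. Under that assumption, the most specific predicate satisfied by all positive examples is the intersection of their agreeing attribute pairs, and the examples are consistent exactly when that predicate rejects every negative example.

```python
from itertools import product


def agreeing_pairs(left, right):
    """All (left_attr, right_attr) pairs on which the two tuples hold equal values."""
    return {(a, b) for a, b in product(left, right) if left[a] == right[b]}


def most_specific_join(positives):
    """Intersect the agreeing pairs of every positive example (assumed non-empty)."""
    if not positives:
        raise ValueError("at least one positive example is required")
    pairs = agreeing_pairs(*positives[0])
    for left, right in positives[1:]:
        pairs &= agreeing_pairs(left, right)
    return pairs


def selects(predicate, left, right):
    """True if the pair of tuples satisfies every equality in the predicate."""
    return all(left[a] == right[b] for a, b in predicate)


def consistent(positives, negatives):
    """Examples are consistent iff the most specific predicate rejects all negatives.

    Any other predicate accepting the positives is a subset of the most specific
    one, so it would accept at least the same tuple pairs.
    """
    predicate = most_specific_join(positives)
    return all(not selects(predicate, l, r) for l, r in negatives)


# Hypothetical sample data: employee tuples paired with department tuples.
positives = [
    ({"id": 1, "dept": 10}, {"dept_id": 10, "head": 1}),
    ({"id": 2, "dept": 20}, {"dept_id": 20, "head": 7}),
]
good_negatives = [({"id": 3, "dept": 10}, {"dept_id": 20, "head": 3})]
bad_negatives = [({"id": 4, "dept": 30}, {"dept_id": 30, "head": 9})]

learned = most_specific_join(positives)
```

    In this toy setting the check takes one pass over the examples. The abstract's remark about computational difficulty concerns richer query classes, where no single most specific hypothesis need exist.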

    Experiencing OptiqueVQS: A Multi-paradigm and Ontology-based Visual Query System for End Users

    This is the author's post-print version; the published version is available at http://link.springer.com/article/10.1007%2Fs10209-015-0404-5. Data access in an enterprise setting is a determining factor for value creation processes, such as sense-making, decision-making, and intelligence analysis. In particular, intuitive data access tools that directly engage domain experts with data could substantially increase competitiveness and profitability. In this respect, the use of ontologies as a natural communication medium between end users and computers has emerged as a prominent approach. To this end, this article introduces a novel ontology-based visual query system, named OptiqueVQS, for end users. OptiqueVQS is built on a powerful and scalable data access platform and has a user-centric design supported by a flexible and extensible widget-based architecture that allows multiple coordinated representation and interaction paradigms to be employed. The results of a usability experiment performed with non-expert users suggest that OptiqueVQS provides a decent level of expressivity and high usability and hence is quite promising.

    Building Rules on Top of Ontologies for the Semantic Web with Inductive Logic Programming

    Building rules on top of ontologies is the ultimate goal of the logical layer of the Semantic Web. To this aim, an ad-hoc mark-up language for this layer is currently under discussion. It is intended to follow the tradition of hybrid knowledge representation and reasoning systems such as $\mathcal{AL}$-log, which integrates the description logic $\mathcal{ALC}$ and the function-free Horn clausal language Datalog. In this paper we consider the problem of automating the acquisition of these rules for the Semantic Web. We propose a general framework for rule induction that adopts the methodological apparatus of Inductive Logic Programming and relies on the expressive and deductive power of $\mathcal{AL}$-log. The framework is valid whatever the scope of induction (description vs. prediction) is. Yet, for illustrative purposes, we also discuss an instantiation of the framework which aims at description and turns out to be useful in Ontology Refinement. Keywords: Inductive Logic Programming, Hybrid Knowledge Representation and Reasoning Systems, Ontologies, Semantic Web. Note: To appear in Theory and Practice of Logic Programming (TPLP). Comment: 30 pages, 6 figures.

    Structured Inspections of Search Interfaces: A Practitioners Guide

    In this paper we present a practitioner's guide on how to apply a new inspection framework that evaluates search interfaces for their support of different searcher types. Vast amounts of money are being invested in search, and so it is becoming increasingly important to identify design problems early, while they are relatively cheap to rectify. The inspection method presented here can be applied quickly to early prototypes, as well as existing systems, and goes beyond other inspection methods, like Cognitive Walkthroughs, to produce rich analyses, including the support provided for different search tactics and user types. The guide is presented as a detailed example, assessing a previously unevaluated search interface, the Tabulator, and so also provides design recommendations for improving it. We conclude with a summary of the benefits of the evaluation framework, and discuss our plans for future enhancements.

    Affective graphs: the visual appeal of linked data

    The essence and value of Linked Data lie in the ability of humans and machines to query, access and reason upon highly structured and formalised data. Ontology structures provide an unambiguous description of the structure and content of data. While a multitude of software applications and visualization systems have been developed over the past years for Linked Data, a significant gap remains between applications that consume Linked Data and interfaces that have been designed with significant focus on aesthetics. Though the importance of aesthetics in affecting the usability, effectiveness and acceptability of user interfaces has long been recognised, little or no explicit attention has been paid to the aesthetics of Linked Data applications. In this paper, we introduce a formalised approach to developing aesthetically pleasing semantic web interfaces by following aesthetic principles and guidelines identified from the literature. We apply such principles to design and develop a generic approach of using visualizations to support exploration of Linked Data, in an interface that is pleasing to users. This provides users with means to browse ontology structures, enriched with statistics of the underlying data, facilitating exploratory activities and enabling visual query for highly precise information needs. We evaluated our approach in three ways: an initial objective evaluation comparing our approach with other well-known interfaces for the semantic web, and two user evaluations with semantic web researchers.

    Inference of Shape Graphs for Graph Databases

    We investigate the problem of constructing a shape graph that describes the structure of a given graph database. We employ the framework of grammatical inference, where the objective is to find an inference algorithm that is both sound, i.e., always producing a schema that validates the input graph, and complete, i.e., able to produce any schema, within a given class of schemas, provided that a sufficiently informative input graph is presented. We identify a number of fundamental limitations that preclude feasible inference. We present inference algorithms based on natural approaches that make it possible to infer schemas that we argue to be of practical importance.
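
    The soundness requirement stated in the abstract above (the inferred schema must validate the input graph) can be made concrete with a toy sketch. It does not reproduce the paper's algorithms or the actual semantics of shape graphs. It assumes, purely for illustration, that a node's type is the set of labels on its outgoing edges and that a schema records, per type and label, an occurrence interval and the allowed target types. All names and the sample graph are hypothetical.

```python
from collections import Counter


def signature(graph, node):
    """Assumed typing: a node's type is the set of its outgoing edge labels."""
    return frozenset(label for label, _ in graph.get(node, []))


def infer_schema(graph):
    """Build {type: {label: (min_count, max_count, allowed_target_types)}}."""
    schema = {}
    for node, edges in graph.items():
        counts = Counter(label for label, _ in edges)
        shape = schema.setdefault(signature(graph, node), {})
        for label, count in counts.items():
            lo, hi, allowed = shape.get(label, (count, count, frozenset()))
            targets = {signature(graph, t) for l, t in edges if l == label}
            shape[label] = (min(lo, count), max(hi, count), allowed | targets)
    return schema


def validates(schema, graph):
    """Check every node against the shape recorded for its (assumed) type."""
    for node, edges in graph.items():
        shape = schema.get(signature(graph, node))
        if shape is None:
            return False
        counts = Counter(label for label, _ in edges)
        for label, (lo, hi, allowed) in shape.items():
            if not lo <= counts[label] <= hi:
                return False
        for label, target in edges:
            if signature(graph, target) not in shape[label][2]:
                return False
    return True


# Hypothetical input graph: node -> list of (edge label, target node).
graph = {
    "bug1": [("descr", "lit1"), ("reportedBy", "user1")],
    "bug2": [("descr", "lit2"), ("reportedBy", "user1"), ("related", "bug1")],
    "user1": [("name", "lit3")],
    "lit1": [], "lit2": [], "lit3": [],
}
schema = infer_schema(graph)

# A second graph in which user1 has two names, outside the inferred interval.
other = dict(graph, user1=[("name", "lit3"), ("name", "lit1")])
```

    By construction the inferred schema accepts the graph it was built from, which is the soundness property. Completeness, the harder requirement discussed in the abstract, is not addressed by this sketch.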

    Building communities for the exchange of learning objects: theoretical foundations and requirements

    In order to reduce the overall costs of developing high-quality digital courses (including both the content, and the learning and teaching activities), the exchange of learning objects has been recognized as a promising solution. This article makes an inventory of the issues involved in the exchange of learning objects within a community. It explores some basic theories, models and specifications and provides a theoretical framework containing the functional and non-functional requirements to establish an exchange system in the educational field. Three levels of requirements are discussed. First, the non-functional requirements deal with the technical conditions that make learning objects interoperable. Second, some basic use cases (activities) are identified that must be facilitated to enable the technical exchange of learning objects, e.g. searching and adapting the objects. Third, some basic use cases are identified that are required to establish the exchange of learning objects in a community, e.g. policy management, information and training. The implications of this framework are then discussed, including recommendations concerning the identification of reward systems, role changes and evaluation instruments.
