363 research outputs found

    : Méthodes d'Inférence Symbolique pour les Bases de Données

    Get PDF
    This dissertation is a summary of a line of research, that I wasactively involved in, on learning in databases from examples. Thisresearch focused on traditional as well as novel database models andlanguages for querying, transforming, and describing the schema of adatabase. In case of schemas our contributions involve proposing anoriginal languages for the emerging data models of Unordered XML andRDF. We have studied learning from examples of schemas for UnorderedXML, schemas for RDF, twig queries for XML, join queries forrelational databases, and XML transformations defined with a novelmodel of tree-to-word transducers.Investigating learnability of the proposed languages required us toexamine closely a number of their fundamental properties, often ofindependent interest, including normal forms, minimization,containment and equivalence, consistency of a set of examples, andfinite characterizability. Good understanding of these propertiesallowed us to devise learning algorithms that explore a possibly largesearch space with the help of a diligently designed set ofgeneralization operations in search of an appropriate solution.Learning (or inference) is a problem that has two parameters: theprecise class of languages we wish to infer and the type of input thatthe user can provide. We focused on the setting where the user inputconsists of positive examples i.e., elements that belong to the goallanguage, and negative examples i.e., elements that do not belong tothe goal language. In general using both negative and positiveexamples allows to learn richer classes of goal languages than usingpositive examples alone. However, using negative examples is oftendifficult because together with positive examples they may cause thesearch space to take a very complex shape and its exploration may turnout to be computationally challenging.Ce mémoire est une courte présentation d’une direction de recherche, à laquelle j’ai activementparticipé, sur l’apprentissage pour les bases de données à partir d’exemples. Cette recherches’est concentrée sur les modèles et les langages, aussi bien traditionnels qu’émergents, pourl’interrogation, la transformation et la description du schéma d’une base de données. Concernantles schémas, nos contributions consistent en plusieurs langages de schémas pour les nouveaumodèles de bases de données que sont XML non-ordonné et RDF. Nous avons ainsi étudiél’apprentissage à partir d’exemples des schémas pour XML non-ordonné, des schémas pour RDF,des requêtes twig pour XML, les requêtes de jointure pour bases de données relationnelles et lestransformations XML définies par un nouveau modèle de transducteurs arbre-à-mot.Pour explorer si les langages proposés peuvent être appris, nous avons été obligés d’examinerde près un certain nombre de leurs propriétés fondamentales, souvent souvent intéressantespar elles-mêmes, y compris les formes normales, la minimisation, l’inclusion et l’équivalence, lacohérence d’un ensemble d’exemples et la caractérisation finie. Une bonne compréhension de cespropriétés nous a permis de concevoir des algorithmes d’apprentissage qui explorent un espace derecherche potentiellement très vaste grâce à un ensemble d’opérations de généralisation adapté àla recherche d’une solution appropriée.L’apprentissage (ou l’inférence) est un problème à deux paramètres : la classe précise delangage que nous souhaitons inférer et le type d’informations que l’utilisateur peut fournir. Nousnous sommes placés dans le cas où l’utilisateur fournit des exemples positifs, c’est-à-dire deséléments qui appartiennent au langage cible, ainsi que des exemples négatifs, c’est-à-dire qui n’enfont pas partie. En général l’utilisation à la fois d’exemples positifs et négatifs permet d’apprendredes classes de langages plus riches que l’utilisation uniquement d’exemples positifs. Toutefois,l’utilisation des exemples négatifs est souvent difficile parce que les exemples positifs et négatifspeuvent rendre la forme de l’espace de recherche très complexe, et par conséquent, son explorationinfaisable

    Topological inference in graphs and images

    Get PDF

    On Repairing Structural Issues in Semi-Structured Documents

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    Relation among images: Modelling, optimization and applications

    Get PDF
    In the last two decades, the increasing popularity of information technology has led to a dramatic increase in the amount of visual data. Many applications are developed by processing, analyzing and understanding such increasing data; and modelling the relation among images is fundamental to success of many of them. Examples include image classification, content-based image retrieval and face recognition. Given signatures of images, there are many ways to depict the relation among them, such as pairwise distance, kernel function and factor analysis. However, existing methods are still insufficient as they suffer from many real factors such as misalignment of images and inefficiency from nonlinearity. This dissertation focuses on improving the relation modelling, its applications and related optimization. In particular, three aspects of relation modelling are addressed: 1. Integrate image alignment into the relation modelling methods, including image classification and factor analysis, to achieve stability in real applications. 2. Model relation when images are on multiple manifolds. 3. Develop nonlinear relation modelling methods, including tapering kernels for sparsification of kernel-based relation models and developing piecewise linear factor analysis to enjoy both the efficiency of linear models and the flexibility of nonlinear ones. We also discuss future directions of relation modelling in the last chapter from both application and methodology aspects

    Non-acyclicity of coset lattices and generation of finite groups

    Get PDF

    Logs and Models in Engineering Complex Embedded Production Software Systems

    Get PDF

    Networking - A Statistical Physics Perspective

    Get PDF
    Efficient networking has a substantial economic and societal impact in a broad range of areas including transportation systems, wired and wireless communications and a range of Internet applications. As transportation and communication networks become increasingly more complex, the ever increasing demand for congestion control, higher traffic capacity, quality of service, robustness and reduced energy consumption require new tools and methods to meet these conflicting requirements. The new methodology should serve for gaining better understanding of the properties of networking systems at the macroscopic level, as well as for the development of new principled optimization and management algorithms at the microscopic level. Methods of statistical physics seem best placed to provide new approaches as they have been developed specifically to deal with non-linear large scale systems. This paper aims at presenting an overview of tools and methods that have been developed within the statistical physics community and that can be readily applied to address the emerging problems in networking. These include diffusion processes, methods from disordered systems and polymer physics, probabilistic inference, which have direct relevance to network routing, file and frequency distribution, the exploration of network structures and vulnerability, and various other practical networking applications.Comment: (Review article) 71 pages, 14 figure

    Learning-based inductive invariant synthesis

    Get PDF
    The problem of synthesizing adequate inductive invariants to prove a program correct lies at the heart of automated program verification. We investigate, herein, learning approaches to synthesize inductive invariants of sequential programs towards automatically verifying them. To this end, we identify that prior learning approaches were unduly influenced by traditional machine learning models that learned concepts from positive and negative counterexamples. We argue that these models are not robust for invariant synthesis and, consequently, introduce ICE, a robust learning paradigm for synthesizing invariants that learns using positive, negative and implication counterexamples, and show that it admits honest teachers and strongly convergent mechanisms for invariant synthesis. We develop the first learning algorithms in this model with implication counterexamples for two domains, one for learning arbitrary Boolean combinations of numerical invariants over scalar variables and one for quantified invariants of linear data-structures including arrays and dynamic lists. We implement the ICE learners and an appropriate teacher, and show that the resulting invariant synthesis is robust, practical, convergent, and efficient. In order to deductively verify shared-memory concurrent programs, we present a sequentialization result and show that synthesizing rely-guarantee annotations for them can be reduced to invariant synthesis for sequential programs. Further, for verifying asynchronous event-driven systems, we develop a new invariant synthesis technique that constructs almost-synchronous invariants over concrete system configurations. These invariants, for most systems, are finitely representable, and can be thereby constructed, including for the USB driver that ships with Microsoft Windows phone
    corecore