357 research outputs found

    Developments from enquiries into the learnability of the pattern languages from positive data

    The pattern languages are languages that are generated from patterns, and were first proposed by Angluin as a non-trivial class that is inferable from positive data [D. Angluin, Finding patterns common to a set of strings, Journal of Computer and System Sciences 21 (1980) 46–62; D. Angluin, Inductive inference of formal languages from positive data, Information and Control 45 (1980) 117–135]. In this paper we chronologize some results that developed from the investigations on the inferability of the pattern languages from positive data.

    : Méthodes d'Inférence Symbolique pour les Bases de Données

    This dissertation summarizes a line of research, in which I was actively involved, on learning in databases from examples. This research focused on traditional as well as novel database models and languages for querying, transforming, and describing the schema of a database. In the case of schemas, our contributions involve proposing original languages for the emerging data models of Unordered XML and RDF. We have studied learning from examples of schemas for Unordered XML, schemas for RDF, twig queries for XML, join queries for relational databases, and XML transformations defined with a novel model of tree-to-word transducers. Investigating the learnability of the proposed languages required us to examine closely a number of their fundamental properties, often of independent interest, including normal forms, minimization, containment and equivalence, consistency of a set of examples, and finite characterizability. A good understanding of these properties allowed us to devise learning algorithms that explore a possibly large search space with the help of a diligently designed set of generalization operations in search of an appropriate solution. Learning (or inference) is a problem that has two parameters: the precise class of languages we wish to infer and the type of input that the user can provide. We focused on the setting where the user input consists of positive examples, i.e., elements that belong to the goal language, and negative examples, i.e., elements that do not belong to the goal language. In general, using both negative and positive examples allows learning richer classes of goal languages than using positive examples alone. However, using negative examples is often difficult because, together with positive examples, they may cause the search space to take a very complex shape, and its exploration may turn out to be computationally challenging.

    Set systems: order types, continuous nondeterministic deformations, and quasi-orders

    By reformulating a learning process of a set system L as a game between Teacher and Learner, we define the order type of L to be the order type of the game tree, if the tree is well-founded. The features of the order type of L (written dim L) are: (1) We can represent any well-quasi-order (wqo for short) by the set system L of the upper-closed sets of the wqo such that the maximal order type of the wqo is equal to dim L. (2) dim L is an upper bound of the mind-change complexity of L. dim L is defined iff L has finite elasticity (fe for short), where, according to computational learning theory, if an indexed family of recursive languages has fe then it is learnable by an algorithm from positive data. Regarding set systems as subspaces of Cantor spaces, we prove that fe of set systems is preserved by any continuous function which is monotone with respect to set inclusion. Using this, we prove that finite elasticity is preserved by various (nondeterministic) language operators (Kleene-closure, shuffle-closure, union, product, intersection, ...). The monotone continuous functions represent nondeterministic computations. If a monotone continuous function has a computation tree with each node followed by at most n immediate successors and the order type of a set system L is α, then the direct image of L is a set system of order type at most the n-adic diagonal Ramsey number of α. Furthermore, we provide an order-type-preserving contravariant embedding from the category of quasi-orders and finitely branching simulations between them, into the complete category of subspaces of Cantor spaces and monotone continuous functions having Girard's linearity between them. Keyword: finite elasticity, shuffle-closur

    Mind change efficient learning

    This paper studies efficient learning with respect to mind changes. Our starting point is the idea that a learner that is efficient with respect to mind changes minimizes mind changes not only globally in the entire learning problem, but also locally in subproblems after receiving some evidence. Formalizing this idea leads to the notion of uniform mind change optimality. We characterize the structure of language classes that can be identified with at most α mind changes by some learner (not necessarily effective): A language class L is identifiable with α mind changes iff the accumulation order of L is at most α. Accumulation order is a classic concept from point-set topology. To aid the construction of learning algorithms, we show that the characteristic property of uniformly mind change optimal learners is that they output conjectures (languages) with maximal accumulation order. We illustrate the theory by describing mind change optimal learners for various problems such as identifying linear subspaces and one-variable patterns

    Topological properties of concept spaces (full version)

    Based on the observation that the category of concept spaces with the positive information topology is equivalent to the category of countably based T0 topological spaces, we investigate further connections between the learning in the limit model of inductive inference and topology. In particular, we show that the “texts” or “positive presentations” of concepts in inductive inference can be viewed as special cases of the “admissible representations” of computable analysis. We also show that several structural properties of concept spaces have well known topological equivalents. In addition to topological methods, we use algebraic closure operators to analyze the structure of concept spaces, and we show the connection between these two approaches. The goal of this paper is not only to introduce new perspectives to learning theorists, but also to present the field of inductive inference in a way more accessible to domain theorists and topologists.

    Computation with Advice

    Computation with advice is suggested as a generalization of both computation with discrete advice and Type-2 Nondeterminism. Several embodiments of the generic concept are discussed, and the close connection to Weihrauch reducibility is pointed out. As a novel concept, computability with random advice is studied, which corresponds to correct solutions being guessable with positive probability. In the framework of computation with advice, it is possible to define computational complexity for certain concepts of hypercomputation. Finally, some examples are given which illuminate the interplay of uniform and non-uniform techniques in order to investigate both computability with advice and the Weihrauch lattice.

    The descriptive theory of represented spaces

    This is a survey on the ongoing development of a descriptive theory of represented spaces, which is intended as an extension of both classical and effective descriptive set theory to deal with both sets and functions between represented spaces. Most material is from work-in-progress, and thus there may be a stronger focus on projects involving the author than an objective survey would merit. Comment: survey of work-in-progres

    Inklusion von Patternsprachen und verwandte Probleme

    A pattern is a word that consists of variables and terminal symbols. The pattern language that is generated by a pattern A is the set of all terminal words that can be obtained from A by uniform replacement of variables with terminal words. For example, the pattern A = a x y a x (where x and y are variables, and the letter a is a terminal symbol) generates the set of all words that have some word a x both as prefix and suffix (where these two occurrences of a x do not overlap). Due to their simple definition, pattern languages have various connections to a wide range of other areas in theoretical computer science and mathematics. Among these areas are combinatorics on words, logic, and the theory of free semigroups. On the other hand, many of the canonical questions in formal language theory lead, for pattern languages, to combinatorial problems of surprising difficulty. The present thesis discusses various aspects of the inclusion problem of pattern languages. It can be divided into two parts. The first one examines the decidability of the inclusion problem for pattern languages with a limited number of variables and fixed terminal alphabets. In addition to this, the minimizability of regular expressions with repetition operators is studied. The second part deals with descriptive patterns, the smallest generalizations of arbitrary languages through pattern languages ("smallest" with respect to the inclusion relation). The main questions are the existence and the discoverability of descriptive patterns for arbitrary languages.
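
    The definition of a pattern language in the abstract above can be made concrete with a small membership test. The following is a minimal sketch and not taken from the thesis: the function name `matches`, the list-of-symbols encoding of a pattern, and the `allow_empty` switch (whether a variable may be replaced by the empty word, which the abstract does not specify) are all assumptions made for illustration. It uses plain backtracking, which can take exponential time in the worst case.

```python
from typing import Dict, Sequence, Set


def matches(pattern: Sequence[str], variables: Set[str], word: str,
            allow_empty: bool = False) -> bool:
    """Return True if `word` can be obtained from `pattern` by replacing
    every variable uniformly (same variable, same replacement) with a
    terminal word. The name and encoding are hypothetical."""

    def solve(i: int, pos: int, binding: Dict[str, str]) -> bool:
        # All pattern symbols consumed: succeed only if the word is too.
        if i == len(pattern):
            return pos == len(word)
        symbol = pattern[i]
        if symbol not in variables:
            # Terminal symbol: it must occur literally at this position.
            return (word.startswith(symbol, pos)
                    and solve(i + 1, pos + len(symbol), binding))
        if symbol in binding:
            # Variable seen before: reuse the same replacement (uniformity).
            value = binding[symbol]
            return (word.startswith(value, pos)
                    and solve(i + 1, pos + len(value), binding))
        # New variable: try every candidate replacement starting at `pos`.
        shortest = 0 if allow_empty else 1
        for length in range(shortest, len(word) - pos + 1):
            binding[symbol] = word[pos:pos + length]
            if solve(i + 1, pos + length, binding):
                return True
            del binding[symbol]  # undo the guess before trying the next one
        return False

    return solve(0, 0, {})


# The pattern A = a x y a x from the abstract, with variables x and y.
PATTERN = ["a", "x", "y", "a", "x"]
VARIABLES = {"x", "y"}
```

    For instance, `matches(PATTERN, VARIABLES, "abcab")` succeeds with x = b and y = c, while "abab" is accepted only when `allow_empty=True`, since y must then be replaced by the empty word.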