
    Learning Rational Functions

    Rational functions are transformations from words to words that can be defined by string transducers. Rational functions are also captured by deterministic string transducers with lookahead. We show for the first time that the class of rational functions can be learned in the limit with polynomial time and data when represented by string transducers with lookahead in the diagonal-minimal normal form that we introduce.
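
    As a concrete illustration of the kind of object being learned (my example, not the paper's diagonal-minimal construction), here is a minimal deterministic string transducer realizing a simple rational function:

```python
# Minimal sketch of a deterministic string transducer (not the paper's
# diagonal-minimal normal form): realizes the rational function that
# rewrites each 'a' to 'bb' and deletes each 'b'.
class StringTransducer:
    def __init__(self, transitions, initial):
        # transitions: (state, input symbol) -> (output word, next state)
        self.transitions = transitions
        self.initial = initial

    def run(self, word):
        state, out = self.initial, []
        for symbol in word:
            output, state = self.transitions[(state, symbol)]
            out.append(output)
        return "".join(out)

t = StringTransducer({("q", "a"): ("bb", "q"), ("q", "b"): ("", "q")}, "q")
assert t.run("aba") == "bbbb"
```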

    Experiments using semantics for learning language comprehension and production

    Several questions in natural language learning may be addressed by studying formal language learning models. In this work, we hope to contribute to a deeper understanding of the role of semantics in language acquisition. We propose a simple formal model of meaning and denotation using finite state transducers, and an algorithm that learns a meaning function from examples, each consisting of a situation and an utterance denoting something in the situation. We describe the results of testing this algorithm in a domain of geometric shapes and their properties and relations in several natural languages: Arabic, English, Greek, Hebrew, Hindi, Mandarin, Russian, Spanish, and Turkish. In addition, we explore how a learner who has learned to comprehend utterances might go about learning to produce them, and present experimental results for this task. One concrete goal of our formal model is to be able to give an account of interactions in which an adult provides a meaning-preserving and grammatically correct expansion of a child's incomplete utterance.
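
    The abstract does not spell out the learning algorithm, so the sketch below is a hypothetical stand-in showing one simple way a word-to-property meaning map could be induced from (utterance, denotation) pairs, in the spirit of cross-situational learning; all names and the counting scheme are illustrative assumptions, not the authors' method:

```python
# Hypothetical sketch, not the authors' algorithm: cross-situational
# learning of a word-to-property map from (utterance, denoted object)
# pairs, where each denoted object is a set of property symbols.
from collections import defaultdict

def learn_meanings(examples):
    # examples: list of (utterance: list[str], denotation: set[str])
    counts = defaultdict(lambda: defaultdict(int))
    for utterance, denotation in examples:
        for word in utterance:
            for prop in denotation:
                counts[word][prop] += 1
    # map each word to its most frequently co-occurring property
    return {w: max(props, key=props.get) for w, props in counts.items()}

examples = [
    (["the", "red", "circle"], {"red", "circle"}),
    (["the", "red", "square"], {"red", "square"}),
    (["the", "blue", "circle"], {"blue", "circle"}),
]
meanings = learn_meanings(examples)
assert meanings["red"] == "red" and meanings["circle"] == "circle"
```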

    Strict Locality and Phonological Maps


    Benchmarking Compositionality with Formal Languages

    Recombining known primitive concepts into larger novel combinations is a quintessentially human cognitive capability. Whether large neural models in NLP can acquire this ability while learning from data is an open question. In this paper, we investigate this problem from the perspective of formal languages. We use deterministic finite-state transducers to generate an unbounded number of datasets with controllable properties governing compositionality. By randomly sampling over many transducers, we explore which of their properties contribute to the learnability of a compositional relation by a neural network. We find that the models either learn the relations completely or not at all. The key factor is transition coverage, which sets a soft learnability limit of 400 examples per transition.
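
    A rough sketch of this kind of setup as I read the abstract, with all details (alphabets, sampling scheme, the exact coverage definition) being my assumptions: sample a random deterministic finite-state transducer, generate input-output pairs from it, and measure what fraction of its transitions a training sample exercises:

```python
# Illustrative sketch, assuming one plausible setup: sample a random
# deterministic finite-state transducer (DFST), emit (input, output)
# pairs, and measure how many transitions the sample covers.
import random

def random_dfst(n_states, in_alpha, out_alpha, seed=0):
    rng = random.Random(seed)
    # delta: (state, input symbol) -> (next state, output symbol)
    return {(q, a): (rng.randrange(n_states), rng.choice(out_alpha))
            for q in range(n_states) for a in in_alpha}

def apply_dfst(delta, word):
    state, out = 0, []
    for a in word:
        state, b = delta[(state, a)]
        out.append(b)
    return "".join(out)

def transition_coverage(delta, words):
    used = set()
    for word in words:
        state = 0
        for a in word:
            used.add((state, a))
            state = delta[(state, a)][0]
    return len(used) / len(delta)

delta = random_dfst(4, "ab", "xy")
sample = ["abba", "baab", "aaaa"]
pairs = [(w, apply_dfst(delta, w)) for w in sample]
print(pairs, transition_coverage(delta, sample))
```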

    Learning Automata and Transducers: A Categorical Approach

    In this paper, we present a categorical approach to learning automata over words, in the sense of Angluin's L* algorithm. This yields a new generic L*-like algorithm that can be instantiated to learn deterministic automata, automata weighted over fields, and subsequential transducers. The generic nature of our algorithm comes from an approach in which automata are simply functors from a particular category representing words to a "computation category". We establish that the sufficient properties for the existence of minimal automata (identified in a previous paper), combined with some additional hypotheses concerning termination, ensure the correctness of our generic algorithm.
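
    For readers unfamiliar with the algorithm being generalized, here is a compact sketch of the core ingredients of classical L* over words: an observation table filled by membership queries, and the closedness check that grows the prefix set. Consistency checks and equivalence queries are omitted for brevity, and this is the textbook construction, not the paper's categorical one:

```python
# Sketch of the classical L* observation table (the construction the
# paper generalizes): prefixes S, suffixes E, a membership oracle, and
# the closedness check that drives the addition of new prefixes.
def row(table, s, E):
    return tuple(table[s + e] for e in E)

def fill(member, S, E):
    table = {}
    for s in list(S) + [s + a for s in S for a in ALPHABET]:
        for e in E:
            table[s + e] = member(s + e)
    return table

def closed(table, S, E):
    # closed: every one-letter extension of a row in S already has a
    # matching row in S; otherwise return the extension as a witness
    rows = {row(table, s, E) for s in S}
    for s in S:
        for a in ALPHABET:
            if row(table, s + a, E) not in rows:
                return s + a
    return None

ALPHABET = ["a", "b"]
member = lambda w: w.count("a") % 2 == 0   # target: even number of a's
S, E = [""], [""]
while (w := closed(fill(member, S, E), S, E)) is not None:
    S.append(w)
print(S)  # prefixes distinguishing the two states of the minimal DFA
```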

    Learning Moore Machines from Input-Output Traces

    The problem of learning automata from example traces (but no equivalence or membership queries) is fundamental in automata learning theory and practice. In this paper we study this problem for finite state machines with inputs and outputs, in particular Moore machines. We develop three algorithms for solving it: (1) the PTAP algorithm, which transforms a set of input-output traces into an incomplete Moore machine and then completes the machine with self-loops; (2) the PRPNI algorithm, which uses the well-known RPNI automata learning algorithm to learn a product of automata encoding a Moore machine; and (3) the MooreMI algorithm, which directly learns a Moore machine using PTAP extended with state merging. We prove that MooreMI has the fundamental identification-in-the-limit property. We also compare the algorithms experimentally in terms of the size of the learned machine and several notions of accuracy introduced in this paper. Finally, we compare with OSTIA, an algorithm that learns a more general class of transducers, and find that OSTIA generally does not learn a Moore machine, even when fed with a characteristic sample.
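
    The PTAP step is concrete enough to sketch from the abstract alone: build a prefix-tree Moore machine from the traces, then complete it with self-loops. How conflicting outputs are resolved below is my assumption, not taken from the paper:

```python
# Sketch of the PTAP idea as described in the abstract: a prefix-tree
# Moore machine built from input-output traces, completed with
# self-loops on missing transitions.
def ptap(traces, inputs):
    # traces: list of (input_seq, output_seq), where output_seq[i] is
    # the output of the state reached after consuming input_seq[:i];
    # a trace of k inputs thus carries k + 1 outputs, since a Moore
    # machine emits an output in every state, including the initial one
    delta, out, n = {}, {0: None}, 1
    for ins, outs in traces:
        state = 0
        out[0] = outs[0]
        for i, a in enumerate(ins):
            if (state, a) not in delta:
                delta[(state, a)] = n   # fresh state for an unseen prefix
                out[n] = None
                n += 1
            state = delta[(state, a)]
            out[state] = outs[i + 1]    # assumption: later traces overwrite
    # completion: every missing transition becomes a self-loop
    for q in range(n):
        for a in inputs:
            delta.setdefault((q, a), q)
    return delta, out

traces = [("ab", [0, 0, 1]), ("aa", [0, 0, 0])]
delta, out = ptap(traces, "ab")
assert out[delta[(0, "a")]] == 0
```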