18 research outputs found

Some improvements of the spectral learning approach for probabilistic grammatical inference

Author: Denis Francois
Gybels Mattias
Habrard Amaury
Publication venue: HAL CCSD
Publication date: 17/09/2014

Spectral methods propose new and elegant solutions in probabilistic grammatical inference. We propose two ways to improve them. First, we show how a linear representation, or equivalently a weighted automaton, output by the spectral learning algorithm can be taken as an initial point for the Baum-Welch algorithm, in order to increase the likelihood of the observation data. Secondly, we show how the inference problem can naturally be expressed in the framework of Structured Low-Rank Approximation. Both ideas are tested on a benchmark extracted from the PAutomaC challenge.

HAL-UJM

HAL AMU

Sp2Learn: A Toolbox for the spectral learning of weighted automata

Publication date: 01/01/2016

Sp2Learn is a Python toolbox for the spectral learning of weighted automata from a set of strings, licensed under Free BSD. This paper gives the main formal ideas behind the spectral learning algorithm and details the content of the toolbox. Use cases and an experimental section are also provided.

CiteSeerX
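
As a rough illustration of the spectral learning idea mentioned in the abstract above, the sketch below builds Hankel matrices over a small basis of prefixes and suffixes, takes a truncated SVD, and reads off a weighted automaton. It is not Sp2Learn's API: the names `wa_value` and `spectral_learn`, the basis, and the 2-state target automaton are all assumptions made for this example, and exact string probabilities stand in for the frequencies a toolbox would estimate from a set of strings.

```python
import numpy as np

def wa_value(alpha0, ops, alpha_inf, word):
    """Value a weighted automaton assigns to `word`: alpha0^T A_w1 ... A_wk alpha_inf."""
    v = alpha0
    for sym in word:
        v = v @ ops[sym]
    return float(v @ alpha_inf)

def spectral_learn(f, alphabet, prefixes, suffixes, rank):
    """Recover a rank-`rank` weighted automaton from values of f on a basis."""
    # Hankel matrix H[p, s] = f(ps) and its shifted versions H_a[p, s] = f(p a s).
    H = np.array([[f(p + s) for s in suffixes] for p in prefixes])
    H_sym = {a: np.array([[f(p + a + s) for s in suffixes] for p in prefixes])
             for a in alphabet}
    h_s = np.array([f(s) for s in suffixes])  # row of H for the empty prefix
    h_p = np.array([f(p) for p in prefixes])  # column of H for the empty suffix

    # Keep the top right singular vectors of H.
    _, _, Vt = np.linalg.svd(H, full_matrices=False)
    V = Vt[:rank].T
    F_pinv = np.linalg.pinv(H @ V)

    alpha0 = h_s @ V
    alpha_inf = F_pinv @ h_p
    ops = {a: F_pinv @ H_sym[a] @ V for a in alphabet}
    return alpha0, ops, alpha_inf

# Hypothetical 2-state probabilistic automaton used as ground truth.
T_ALPHA0 = np.array([1.0, 0.0])
T_OPS = {"a": np.array([[0.2, 0.3], [0.1, 0.1]]),
         "b": np.array([[0.1, 0.2], [0.3, 0.2]])}
T_ALPHA_INF = np.array([0.2, 0.3])

def target(word):
    return wa_value(T_ALPHA0, T_OPS, T_ALPHA_INF, word)

basis = ["", "a", "b"]  # prefixes and suffixes; "" is the empty string
learned = spectral_learn(target, "ab", basis, basis, rank=2)
print(wa_value(*learned, "abba"), target("abba"))
```

With exact values and a basis on which the Hankel matrix reaches the target's rank, the learned automaton computes the same function as the target, up to a change of basis in its parameters. With frequencies estimated from a finite sample, the match is only approximate.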

PAC learning of Probabilistic Automaton based on the Method of Moments

Author: Glaude Hadrien
Pietquin Olivier
Publication venue: HAL CCSD
Publication date: 19/06/2016

Probabilistic Finite Automata (PFA) are generative graphical models that define distributions with latent variables over finite sequences of symbols, a.k.a. stochastic languages. Traditionally, unsupervised learning of PFA is performed through algorithms that iteratively improve the likelihood, like the Expectation-Maximization (EM) algorithm. Recently, learning algorithms based on the so-called Method of Moments (MoM) have been proposed as a much faster alternative that comes with PAC-style guarantees. However, these algorithms do not ensure that the learnt automata model a proper distribution, limiting their applicability and preventing them from serving as an initialization to iterative algorithms. In this paper, we propose a new MoM-based algorithm with PAC-style guarantees that learns automata defining proper distributions. We assess its performance on synthetic problems from the PAutomaC challenge and real datasets extracted from Wikipedia against previous MoM-based algorithms and the EM algorithm.

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Non-negative Spectral Learning for Linear Sequential Systems

Author: AE Guterman
CJ Lin
F Denis
JW Carlyle
R Bailly
S Verwer
SA Vavasis
Publication venue: Springer Science and Business Media LLC

Crossref

Spectral learning with proper probabilities for finite state automation

Author: Enderli Cyrille
Glaude Hadrien
Pietquin Olivier
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 13/12/2015

Probabilistic Finite Automata (PFA), Probabilistic Finite State Transducers (PFST) and Hidden Markov Models (HMM) are widely used in Automatic Speech Recognition (ASR), Text-to-Speech (TTS) systems and Part Of Speech (POS) tagging for language modeling. Traditionally, unsupervised learning of these latent variable models is done by Expectation-Maximization (EM)-like algorithms, such as the Baum-Welch algorithm. In a recent alternative line of work, learning algorithms based on spectral properties of some low-order moment matrices or tensors were proposed. In comparison to EM, they are orders of magnitude faster and come with theoretical convergence guarantees. However, the returned models are not ensured to compute proper distributions. They often return negative values that do not sum to one, limiting their applicability and preventing them from serving as an initialization to EM-like algorithms. In this paper, we propose a new spectral algorithm able to learn a large range of models constrained to return proper distributions. We assess its performance on synthetic problems from the PAutomaC challenge and real datasets extracted from Wikipedia. Experiments show that it outperforms previous spectral approaches as well as the Baum-Welch algorithm with random restarts, in addition to serving as an efficient initialization step to EM-like algorithms.

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot
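
The properness issue raised in the abstract above can be made concrete with a small check. The sketch below is not the paper's algorithm: the helper names `total_mass` and `negative_strings` and the one-state automaton are assumptions for illustration. It computes the total mass of a weighted automaton in closed form and scans short strings for negative values.

```python
import itertools
import numpy as np

def total_mass(alpha0, ops, alpha_inf):
    """Sum of the automaton's values over all strings.

    Uses alpha0^T (I - sum_a A_a)^{-1} alpha_inf, which holds when the
    spectral radius of sum_a A_a is below 1.
    """
    A = sum(ops.values())
    if np.max(np.abs(np.linalg.eigvals(A))) >= 1:
        raise ValueError("the series over all strings does not converge")
    n = A.shape[0]
    return float(alpha0 @ np.linalg.solve(np.eye(n) - A, alpha_inf))

def negative_strings(alpha0, ops, alpha_inf, max_len):
    """Strings of length <= max_len to which the automaton assigns a negative value."""
    bad = []
    for k in range(max_len + 1):
        for word in itertools.product(sorted(ops), repeat=k):
            v = alpha0
            for sym in word:
                v = v @ ops[sym]
            if float(v @ alpha_inf) < 0:
                bad.append("".join(word))
    return bad

# Hypothetical one-state automaton over {a}: total mass 1, yet not a distribution.
alpha0 = np.array([1.0])
ops = {"a": np.array([[-0.5]])}
alpha_inf = np.array([1.5])

print(total_mass(alpha0, ops, alpha_inf))
print(negative_strings(alpha0, ops, alpha_inf, 3))
```

This toy automaton sums to one over all strings but assigns negative values to odd-length strings, so a total-mass check alone does not establish that a learned model is a proper distribution.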

Flexible State-Merging for learning (P)DFAs in Python

Author: Hammerschmidt Christian
Loos Benjamin Laurent
State Radu
Verwer Sicco
Publication date: 01/10/2016

We present a Python package for learning (non-)probabilistic deterministic finite state automata and provide heuristics in the red-blue framework. As our package is built along the API of the popular scikit-learn package, it is easy to use and new learning methods are easy to add. It provides PDFA learning as an additional tool for sequence prediction or classification to data scientists, without the need to understand the algorithm itself but rather the limitations of PDFA as a model. With applications of automata learning in diverse fields such as network traffic analysis, software engineering and biology, a stratified package opens opportunities for practitioners.

Open Repository and Bibliography - Luxembourg