Search CORE

12,381 research outputs found

Bayesian Structural Inference for Hidden Processes

Author: Crutchfield James P.
Strelioff Christopher C.
Publication venue: 'American Physical Society (APS)'
Publication date: 05/09/2013
Field of study

We introduce a Bayesian approach to discovering patterns in structurally complex processes. The proposed method of Bayesian Structural Inference (BSI) relies on a set of candidate unifilar HMM (uHMM) topologies for inference of process structure from a data series. We employ a recently developed exact enumeration of topological epsilon-machines. (A sequel then removes the topological restriction.) This subset of the uHMM topologies has the added benefit that inferred models are guaranteed to be epsilon-machines, irrespective of estimated transition probabilities. Properties of epsilon-machines and uHMMs allow for the derivation of analytic expressions for estimating transition probabilities, inferring start states, and comparing the posterior probability of candidate model topologies, despite process internal structure being only indirectly present in data. We demonstrate BSI's effectiveness in estimating a process's randomness, as reflected by the Shannon entropy rate, and its structure, as quantified by the statistical complexity. We also compare using the posterior distribution over candidate models and the single, maximum a posteriori model for point estimation and show that the former more accurately reflects uncertainty in estimated values. We apply BSI to in-class examples of finite- and infinite-order Markov processes, as well to an out-of-class, infinite-state hidden process.Comment: 20 pages, 11 figures, 1 table; supplementary materials, 15 pages, 11 figures, 6 tables; http://csc.ucdavis.edu/~cmg/compmech/pubs/bsihp.ht

arXiv.org e-Print Archive

eScholarship - University of California

Recommended from our members

Machine learning : techniques and foundations

Author: Carbonell Jaime G.
Langley Pat
Publication venue: eScholarship, University of California
Publication date: 30/03/1987
Field of study

The field of machine learning studies computational methods for acquiring new knowledge, new skills, and new ways to organize existing knowledge. In this paper we present some of the basic techniques and principles that underlie AI research on learning, including methods for learning from examples, learning in problem solving, learning by analogy, grammar acquisition, and machine discovery. In each case, we illustrate the techniques with paradigmatic examples

eScholarship - University of California

Structure induction by lossless graph compression

Author: Peshkin Leonid
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2007
Field of study

This work is motivated by the necessity to automate the discovery of structure in vast and evergrowing collection of relational data commonly represented as graphs, for example genomic networks. A novel algorithm, dubbed Graphitour, for structure induction by lossless graph compression is presented and illustrated by a clear and broadly known case of nested structure in a DNA molecule. This work extends to graphs some well established approaches to grammatical inference previously applied only to strings. The bottom-up graph compression problem is related to the maximum cardinality (non-bipartite) maximum cardinality matching problem. The algorithm accepts a variety of graph types including directed graphs and graphs with labeled nodes and arcs. The resulting structure could be used for representation and classification of graphs.Comment: 10 pages, 7 figures, 2 tables published in Proceedings of the Data Compression Conference, 200

arXiv.org e-Print Archive

CiteSeerX

Crossref

The Effect of Non-tightness on Bayesian Estimation of PCFGs

Author: Cohen S. B.
Johnson M.
Publication venue
Publication date: 01/01/2013
Field of study

Probabilistic context-free grammars have the unusual property of not always defining tight distributions (i.e., the sum of the “probabilities” of the trees the grammar generates can be less than one). This paper reviews how this non-tightness can arise and discusses its impact on Bayesian estimation of PCFGs. We begin by presenting the notion of “almost everywhere tight grammars ” and show that linear CFGs follow it. We then propose three different ways of reinterpreting non-tight PCFGs to make them tight, show that the Bayesian estimators in Johnson et al. (2007) are correct under one of them, and provide MCMC samplers for the other two. We conclude with a discussion of the impact of tightness empirically.

CiteSeerX

Edinburgh Research Explorer

Macquarie University ResearchOnline

SkILL - a Stochastic Inductive Logic Learner

Author: Côrte-Real Joana
Dutra Inês
Mantadelis Theofrastos
Rocha Ricardo
Publication venue
Publication date: 02/06/2015
Field of study

Probabilistic Inductive Logic Programming (PILP) is a rel- atively unexplored area of Statistical Relational Learning which extends classic Inductive Logic Programming (ILP). This work introduces SkILL, a Stochastic Inductive Logic Learner, which takes probabilistic annotated data and produces First Order Logic theories. Data in several domains such as medicine and bioinformatics have an inherent degree of uncer- tainty, that can be used to produce models closer to reality. SkILL can not only use this type of probabilistic data to extract non-trivial knowl- edge from databases, but it also addresses efficiency issues by introducing a novel, efficient and effective search strategy to guide the search in PILP environments. The capabilities of SkILL are demonstrated in three dif- ferent datasets: (i) a synthetic toy example used to validate the system, (ii) a probabilistic adaptation of a well-known biological metabolism ap- plication, and (iii) a real world medical dataset in the breast cancer domain. Results show that SkILL can perform as well as a deterministic ILP learner, while also being able to incorporate probabilistic knowledge that would otherwise not be considered

arXiv.org e-Print Archive

Crossref