
    Minimum Conditional Description Length Estimation for Markov Random Fields

    In this paper we discuss a method, which we call Minimum Conditional Description Length (MCDL), for estimating the parameters of a subset of sites within a Markov random field. We assume that the edges are known for the entire graph G = (V, E). Then, for a subset U ⊂ V, we estimate the parameters for nodes and edges in U, as well as for edges incident to a node in U, by finding the exponential parameter for that subset that yields the best compression conditioned on the values on the boundary ∂U. Our estimate is derived from a temporally stationary sequence of observations on the set U. We discuss how this method can also be applied to estimate a spatially invariant parameter from a single configuration, and in so doing, derive the Maximum Pseudo-Likelihood (MPL) estimate.
    Comment: Information Theory and Applications (ITA) workshop, February 201
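A toy sketch of the compression criterion, under our own assumptions rather than the paper's implementation: for a one-dimensional Ising chain, score each candidate coupling theta by the conditional code length of the interior sites given their neighbours (minus log2 of the pseudo-likelihood, the estimate the abstract relates MCDL to) and keep the theta that compresses best. The chain length, grid, and true coupling 0.8 are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_chain(theta, n):
    """Sequential exact sampling of an Ising chain with coupling theta."""
    x = np.empty(n, dtype=int)
    x[0] = rng.choice([-1, 1])
    for i in range(1, n):
        # p(x_i = +1 | x_{i-1}) for nearest-neighbour coupling theta
        p_plus = 1.0 / (1.0 + np.exp(-2.0 * theta * x[i - 1]))
        x[i] = 1 if rng.random() < p_plus else -1
    return x

def code_length(theta, x):
    """Sum over interior sites of -log2 p_theta(x_i | x_{i-1}, x_{i+1})."""
    field = theta * (x[:-2] + x[2:])                    # neighbour sums
    log_p = x[1:-1] * field - np.logaddexp(field, -field)
    return -log_p.sum() / np.log(2.0)

x = sample_chain(0.8, 4000)
grid = np.linspace(0.0, 2.0, 81)
theta_hat = min(grid, key=lambda t: code_length(t, x))  # near 0.8
```

Minimizing this conditional code length over theta is the same as maximizing the pseudo-likelihood, which is how the abstract recovers the MPL estimate as a special case.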

    Neighborhood radius estimation in Variable-neighborhood Random Fields

    We consider random fields defined by finite-region conditional probabilities depending on a neighborhood of the region which changes with the boundary conditions. To predict the symbols within any finite region it is necessary to inspect a random number of neighborhood symbols, which may change according to their values. In analogy to the one-dimensional setting, we call these neighborhood symbols the context of the region. This framework is a natural extension, to d-dimensional fields, of the notion of variable-length Markov chains introduced by Rissanen (1983) in his classical paper. We define an algorithm to estimate the radius of the smallest ball containing the context based on a realization of the field. We prove the consistency of this estimator. Our proofs are constructive and yield explicit upper bounds for the probability of wrong estimation of the radius of the context.
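A hypothetical one-dimensional analogue of such a radius estimator, in the variable-length Markov chain setting the abstract cites: grow the context length k until conditioning on one extra past symbol no longer changes the empirical conditional law of the next symbol. The sample size, tolerance, and toy order-2 chain are our own assumptions, not the paper's procedure.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(1)

# simulate an order-2 binary chain: P(x_t = 1 | x_{t-2}, x_{t-1})
probs = {(0, 0): 0.2, (0, 1): 0.7, (1, 0): 0.4, (1, 1): 0.9}
x = [0, 0]
for _ in range(50_000):
    x.append(int(rng.random() < probs[(x[-2], x[-1])]))

def cond_probs(x, k):
    """Empirical P(next = 1 | last k symbols) for every observed context."""
    counts, ones = Counter(), Counter()
    for t in range(k, len(x)):
        ctx = tuple(x[t - k:t])
        counts[ctx] += 1
        ones[ctx] += x[t]
    return {c: ones[c] / counts[c] for c in counts}

def estimate_order(x, k_max=4, tol=0.05):
    for k in range(k_max):
        p_k, p_k1 = cond_probs(x, k), cond_probs(x, k + 1)
        # k suffices if dropping the oldest symbol of each (k+1)-context
        # leaves the empirical conditional essentially unchanged
        if all(abs(p_k1[c] - p_k[c[1:]]) < tol for c in p_k1):
            return k
    return k_max

k_hat = estimate_order(x)   # recovers the true order 2
```

The d-dimensional estimator in the paper plays the same game with balls of increasing radius in place of growing suffixes.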

    Divergence rates of Markov order estimators and their application to statistical estimation of stationary ergodic processes

    Stationary ergodic processes with finite alphabets are estimated by finite memory processes from a sample, an n-length realization of the process, where the memory depth of the estimator process is also estimated from the sample using penalized maximum likelihood (PML). Under some assumptions on the continuity rate and the assumption of non-nullness, a rate of convergence in d̄-distance is obtained, with explicit constants. The result requires an analysis of the divergence of PML Markov order estimators for not necessarily finite memory processes. This divergence problem is investigated in more generality for three information criteria: the Bayesian information criterion with generalized penalty term yielding the PML, and the normalized maximum likelihood and the Krichevsky-Trofimov code lengths. Lower and upper bounds on the estimated order are obtained. The notion of consistent Markov order estimation is generalized for infinite memory processes using the concept of oracle order estimates, and generalized consistency of the PML Markov order estimator is presented.
    Comment: Published at http://dx.doi.org/10.3150/12-BEJ468 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
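PML Markov order estimation can be sketched in its simplest BIC-style form: maximize the sample log-likelihood of an order-k model minus a penalty proportional to the number of free parameters times log n. The penalty constant and the toy order-1 chain below are our own assumptions, not the paper's generalized setting.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(2)
P = {0: 0.3, 1: 0.8}                      # order-1 chain: P(next = 1 | current)
x = [0]
for _ in range(20_000):
    x.append(int(rng.random() < P[x[-1]]))

def max_log_lik(x, k):
    """Maximized log-likelihood of an order-k model for the sample x."""
    counts, ones = Counter(), Counter()
    for t in range(k, len(x)):
        ctx = tuple(x[t - k:t])
        counts[ctx] += 1
        ones[ctx] += x[t]
    ll = 0.0
    for c, n_c in counts.items():
        for m in (ones[c], n_c - ones[c]):   # counts of 1s and 0s after c
            if m:
                ll += m * np.log(m / n_c)    # plug-in empirical conditionals
    return ll

def pml_order(x, k_max=5):
    n = len(x)
    # BIC-style penalty: (|A| - 1) * |A|**k free parameters, with |A| = 2
    return max(range(k_max + 1),
               key=lambda k: max_log_lik(x, k) - 0.5 * 2**k * np.log(n))

k_hat = pml_order(x)   # selects the true order 1
```

The paper's analysis concerns what this estimated order does when the underlying process has infinite memory, where no finite k is "true".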

    Bayesian Networks for Max-linear Models

    We study Bayesian networks based on max-linear structural equations as introduced in Gissibl and Klüppelberg [16] and provide a summary of their independence properties. In particular we emphasize that distributions for such networks are generally not faithful to the independence model determined by their associated directed acyclic graph. In addition, we consider some of the basic issues of estimation and discuss generalized maximum likelihood estimation of the coefficients, using the concept of a generalized likelihood ratio for non-dominated families as introduced by Kiefer and Wolfowitz [21]. Finally, we argue that the structure of a minimal network can asymptotically be identified completely from observational data.
    Comment: 18 pages
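A minimal sketch of max-linear structural equations on a small DAG 1 → 2 → 3 with an extra edge 1 → 3, where each variable is the maximum of weighted parents and an innovation; the coefficients and the unit-Fréchet innovations are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)

C = {(1, 2): 0.5, (2, 3): 0.9, (1, 3): 0.2}     # edge weights c_{ji}, j -> i
parents = {1: (), 2: (1,), 3: (1, 2)}

def sample(n):
    Z = 1.0 / rng.standard_exponential((n, 4))  # unit-Frechet innovations
    X = np.zeros((n, 4))                        # column 0 unused
    for i in (1, 2, 3):                         # topological order of the DAG
        X[:, i] = Z[:, i]                       # X_i = max(c_ji X_j, Z_i)
        for j in parents[i]:
            X[:, i] = np.maximum(X[:, i], C[(j, i)] * X[:, j])
    return X, Z

X, Z = sample(10_000)

# With these weights the direct edge 1 -> 3 is redundant: 0.2 < 0.5 * 0.9,
# and X2 >= 0.5 * X1, so the edge never attains the maximum and
# X3 = max(Z3, 0.9 * X2) exactly.
edge_redundant = np.allclose(X[:, 3], np.maximum(Z[:, 3], 0.9 * X[:, 2]))
```

Coefficient redundancy of this kind is one mechanism behind the non-faithfulness the abstract notes, and it is why only a minimal network can be identified from observational data.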

    Causal inference using the algorithmic Markov condition

    Inferring the causal structure that links n observables is usually based upon detecting statistical dependences and choosing simple graphs that make the joint measure Markovian. Here we argue why causal inference is also possible when only single observations are present. We develop a theory of how to generate causal graphs explaining similarities between single objects. To this end, we replace the notion of conditional stochastic independence in the causal Markov condition with the vanishing of conditional algorithmic mutual information and describe the corresponding causal inference rules. We explain why a consistent reformulation of causal inference in terms of algorithmic complexity implies a new inference principle that also takes into account the complexity of conditional probability densities, making it possible to select among Markov-equivalent causal graphs. This insight provides a theoretical foundation for a heuristic principle proposed in earlier work. We also discuss how to replace Kolmogorov complexity with decidable complexity criteria. This can be seen as an algorithmic analog of replacing the empirically undecidable question of statistical independence with practical independence tests that are based on implicit or explicit assumptions on the underlying distribution.
    Comment: 16 figures
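Algorithmic mutual information is defined via Kolmogorov complexity and is therefore uncomputable. One standard decidable surrogate, in the spirit of the "decidable complexity criteria" the abstract mentions (and of normalized compression distance), replaces K(·) with the output length of a real compressor. This is an assumption-laden toy illustration, not the paper's procedure.

```python
import random
import zlib

def C(s: bytes) -> int:
    """Compressed length as a computable stand-in for Kolmogorov complexity."""
    return len(zlib.compress(s, 9))

def approx_mutual_info(x: bytes, y: bytes) -> int:
    # K(x) + K(y) - K(x, y), approximated with compressed lengths
    return C(x) + C(y) - C(x + y)

pattern = b"causal inference " * 500     # highly structured object
copy_ = b"causal inference " * 500       # shares all of that structure
noise = bytes(random.Random(0).randrange(256) for _ in range(len(pattern)))

mi_related = approx_mutual_info(pattern, copy_)
mi_unrelated = approx_mutual_info(pattern, noise)
# shared structure yields larger approximate mutual information
```

Vanishing of this quantity (conditioned on candidate causes) is the computable analogue of the conditional-independence tests used in ordinary constraint-based causal inference.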
