
    Coherent frequentism

    By representing the range of fair betting odds according to a pair of confidence set estimators, dual probability measures on parameter space called frequentist posteriors secure the coherence of subjective inference without any prior distribution. The closure of the set of expected losses corresponding to the dual frequentist posteriors constrains decisions without arbitrarily forcing optimization under all circumstances. This decision theory reduces to those that maximize expected utility when the pair of frequentist posteriors is induced by an exact or approximate confidence set estimator or when an automatic reduction rule is applied to the pair. In such cases, the resulting frequentist posterior is coherent in the sense that, as a probability distribution of the parameter of interest, it satisfies the axioms of the decision-theoretic and logic-theoretic systems typically cited in support of the Bayesian posterior. Unlike the p-value, the confidence level of an interval hypothesis derived from such a measure is suitable as an estimator of the indicator of hypothesis truth, since it converges in sample-space probability to 1 if the hypothesis is true or to 0 otherwise under general conditions.
    Comment: The confidence-measure theory of inference and decision is explicitly extended to vector parameters of interest. The derivation of upper and lower confidence levels from valid and nonconservative set estimators is formalized.
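    The last claim can be stated compactly. Writing $P_n$ for the frequentist posterior built from a sample of size $n$, $\Theta_0$ for the parameter region of the interval hypothesis, and $\theta$ for the true parameter value (notation introduced here for illustration, not taken from the abstract), the convergence reads

    $$ P_n(\Theta_0) \;\xrightarrow{\;p\;}\; \mathbf{1}\{\theta \in \Theta_0\} \quad \text{as } n \to \infty, $$

    that is, the confidence level assigned to the hypothesis tends in sample-space probability to 1 when the hypothesis is true and to 0 when it is false, which is what makes it usable as an estimator of the truth indicator.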

    A Multi-scale View of the Emergent Complexity of Life: A Free-energy Proposal

    We review some of the main implications of the free-energy principle (FEP) for the study of the self-organization of living systems – and how the FEP can help us to understand (and model) biotic self-organization across the many temporal and spatial scales over which life exists. In order to maintain its integrity as a bounded system, any biological system – from single cells to complex organisms and societies – has to limit the disorder or dispersion (i.e., the long-run entropy) of its constituent states. We review how this can be achieved by living systems that minimize their variational free energy. Variational free energy is an information-theoretic construct, originally introduced into theoretical neuroscience and biology to explain perception, action, and learning. It has since been extended to explain the evolution, development, form, and function of entire organisms, providing a principled model of biotic self-organization and autopoiesis. It has provided insights into biological systems across spatiotemporal scales, ranging from microscales (e.g., sub- and multicellular dynamics), to intermediate scales (e.g., groups of interacting animals and culture), through to macroscale phenomena (the evolution of entire species). A crucial corollary of the FEP is that an organism just is (i.e., embodies or entails) an implicit model of its environment. As such, organisms come to embody the causal relationships of their ecological niche, which, in turn, is influenced by their resulting behaviors. Crucially, free-energy minimization can be shown to be equivalent to the maximization of Bayesian model evidence. This allows us to cast natural selection in terms of Bayesian model selection, providing a robust theoretical account of how organisms come to match or accommodate the spatiotemporal complexity of their surrounding niche. In line with the theme of this volume, namely biological complexity and self-organization, this chapter will examine a variational approach to self-organization across multiple dynamical scales.
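    The equivalence between free-energy minimization and maximization of model evidence follows from the standard decomposition of variational free energy. Writing $o$ for observations, $s$ for hidden states, $p$ for the generative model, and $q$ for the approximate posterior (generic notation, not the chapter's),

    $$ F \;=\; \mathbb{E}_{q(s)}\!\big[\ln q(s) - \ln p(o, s)\big] \;=\; -\ln p(o) \;+\; \mathrm{KL}\!\big[q(s)\,\|\,p(s \mid o)\big] \;\ge\; -\ln p(o). $$

    Because the Kullback–Leibler term is non-negative, lowering $F$ tightens an upper bound on the negative log evidence $-\ln p(o)$ while pulling $q(s)$ toward the true posterior; in this sense, minimizing free energy approximately maximizes Bayesian model evidence, which licenses the reading of natural selection as Bayesian model selection.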

    Backwards is the way forward: feedback in the cortical hierarchy predicts the expected future

    Clark offers a powerful description of the brain as a prediction machine, one that makes progress on two distinct levels. First, on an abstract conceptual level, it provides a unifying framework for perception, action, and cognition (including subdivisions such as attention, expectation, and imagination). Second, hierarchical prediction offers progress on a concrete descriptive level for testing and constraining the conceptual elements and mechanisms of predictive coding models (estimation of predictions, prediction errors, and internal models).
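    As a concrete illustration of that descriptive level, the following is a minimal sketch of the prediction/prediction-error loop in a predictive coding model, assuming a single linear generative layer; the weights, hidden cause, and step size are illustrative assumptions, not part of Clark's target article or this commentary.

```python
import numpy as np

# Top-down (feedback) generative weights mapping a hidden cause to sensory data.
W = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0],
              [0.3, -0.2]])
true_cause = np.array([1.0, -0.5])
rng = np.random.default_rng(1)
sensory_input = W @ true_cause + 0.01 * rng.normal(size=4)  # noisy observation

belief = np.zeros(2)  # the higher level's running estimate of the hidden cause
for _ in range(300):
    prediction = W @ belief              # feedback: top-down prediction of the input
    error = sensory_input - prediction   # feedforward: prediction error
    belief += 0.2 * W.T @ error          # revise the estimate to reduce the error

print("inferred cause:", belief)  # converges close to true_cause
```

    Feedback carries predictions down the hierarchy and only the residual errors are passed forward, which is the sense in which "backwards is the way forward."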

    On the consistency of Multithreshold Entropy Linear Classifier

    Multithreshold Entropy Linear Classifier (MELC) is a recently proposed classifier that employs information-theoretic concepts to build a multithreshold maximum-margin model. In this paper we analyze its consistency over multithreshold linear models and show that its objective function upper bounds the number of misclassified points, much as the hinge loss does in support vector machines. For further confirmation, we also conduct numerical experiments on five datasets.
    Comment: Presented at Theoretical Foundations of Machine Learning 2015 (http://tfml.gmum.net), final version published in Schedae Informaticae Journal.
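    The analogy with the hinge loss refers to the standard fact that, for a label $y \in \{-1, +1\}$ and a decision value $f(x)$, the hinge loss dominates the 0–1 loss:

    $$ \mathbf{1}\{y\,f(x) \le 0\} \;\le\; \max\big(0,\; 1 - y\,f(x)\big), $$

    so the empirical hinge risk upper bounds the number of misclassified points; the paper establishes a bound of the same kind for the MELC objective over multithreshold linear models.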

    Learning to Discover Sparse Graphical Models

    We consider structure discovery of undirected graphical models from observational data. Inferring likely structures from few examples is a complex task that often requires the formulation of priors and sophisticated inference procedures. Popular methods rely on estimating a penalized maximum likelihood of the precision matrix. However, in these approaches structure recovery is an indirect consequence of the data-fit term, the penalty can be difficult to adapt to domain-specific knowledge, and the inference is computationally demanding. By contrast, it may be easier to generate training samples of data that arise from graphs with the desired structural properties. We propose here to leverage this latter source of information as training data for learning a function, parametrized by a neural network, that maps empirical covariance matrices to estimated graph structures. Learning this function brings two benefits: it implicitly models the desired structure or sparsity properties to form suitable priors, and it can be tailored to the specific problem of edge structure discovery rather than maximizing data likelihood. Applying this framework, we find that our learnable graph-discovery method, trained on synthetic data, generalizes well: it identifies relevant edges in both synthetic and real data that were completely unknown at training time. We find that on genetics, brain imaging, and simulation data we obtain performance generally superior to analytical methods.
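    The training setup can be sketched as follows (a minimal stand-in, not the authors' architecture: the sparse-graph sampler, the per-edge logistic map, and all constants below are illustrative assumptions).

```python
import numpy as np

rng = np.random.default_rng(0)
p, n_samples, n_graphs = 5, 200, 500

def random_sparse_precision(p, density=0.3):
    """Sparse, symmetric, diagonally dominant (hence positive-definite) precision matrix."""
    mask = np.triu(rng.random((p, p)) < density, k=1)
    A = mask * rng.uniform(0.2, 0.6, size=(p, p))
    K = A + A.T
    np.fill_diagonal(K, np.abs(K).sum(axis=1) + 0.5)
    return K

def make_example():
    """One training pair: empirical covariance features -> true edge indicators."""
    K = random_sparse_precision(p)
    X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(K), size=n_samples)
    emp_cov = np.cov(X, rowvar=False)
    iu = np.triu_indices(p, k=1)
    return emp_cov[iu], (K[iu] != 0).astype(float)

pairs = [make_example() for _ in range(n_graphs)]
X_train = np.array([x for x, _ in pairs])   # shape (n_graphs, p*(p-1)/2)
y_train = np.array([y for _, y in pairs])   # edge labels, same shape

# One logistic unit per candidate edge, fitted by gradient descent -- a crude
# stand-in for the neural network the abstract describes.
W = np.zeros((X_train.shape[1], y_train.shape[1]))
b = np.zeros(y_train.shape[1])
for _ in range(5000):
    probs = 1.0 / (1.0 + np.exp(-(X_train @ W + b)))
    grad = (probs - y_train) / n_graphs      # cross-entropy gradient w.r.t. logits
    W -= 0.1 * X_train.T @ grad
    b -= 0.1 * grad.sum(axis=0)

print("edge accuracy on the training graphs:", ((probs > 0.5) == y_train).mean())
```

    The learned map can then be applied to covariance matrices from unseen graphs, mirroring the train-on-synthetic, test-on-unseen-data evaluation the abstract describes.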