Coherent frequentism
By representing the range of fair betting odds according to a pair of
confidence set estimators, dual probability measures on parameter space called
frequentist posteriors secure the coherence of subjective inference without any
prior distribution. The closure of the set of expected losses corresponding to
the dual frequentist posteriors constrains decisions without arbitrarily
forcing optimization under all circumstances. This decision theory reduces to
those that maximize expected utility when the pair of frequentist posteriors is
induced by an exact or approximate confidence set estimator or when an
automatic reduction rule is applied to the pair. In such cases, the resulting
frequentist posterior is coherent in the sense that, as a probability
distribution of the parameter of interest, it satisfies the axioms of the
decision-theoretic and logic-theoretic systems typically cited in support of
the Bayesian posterior. Unlike the p-value, the confidence level of an interval
hypothesis derived from such a measure is suitable as an estimator of the
indicator of hypothesis truth since it converges in sample-space probability to
1 if the hypothesis is true or to 0 otherwise under general conditions.
Comment: The confidence-measure theory of inference and decision is explicitly extended to vector parameters of interest. The derivation of upper and lower confidence levels from valid and nonconservative set estimators is formalized.
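The convergence claim above can be illustrated with a small simulation. The sketch below (an illustration, not the paper's construction) uses the normal-mean confidence distribution N(x̄, 1/n) to assign a confidence level to the interval hypothesis μ ∈ [−0.5, 0.5]; the interval endpoints and sample sizes are assumed for the example. When the hypothesis is true the level tends to 1, and when it is false it tends to 0, matching the estimator-of-truth-indicator behavior described in the abstract.

```python
import math
import random

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def confidence_level(xbar, n, lo=-0.5, hi=0.5):
    """Confidence level assigned to the interval hypothesis mu in [lo, hi]
    by the normal confidence distribution N(xbar, 1/n)."""
    s = math.sqrt(n)
    return phi((hi - xbar) * s) - phi((lo - xbar) * s)

def simulate(mu, n, seed=0):
    """Draw n observations from N(mu, 1) and return the level of the hypothesis."""
    rng = random.Random(seed)
    xbar = sum(rng.gauss(mu, 1.0) for _ in range(n)) / n
    return confidence_level(xbar, n)

# Hypothesis true (mu = 0 lies inside [-0.5, 0.5]): level approaches 1.
# Hypothesis false (mu = 1 lies outside):           level approaches 0.
for n in (10, 100, 1000):
    print(n, round(simulate(0.0, n), 4), round(simulate(1.0, n), 4))
```

Unlike a p-value, which does not converge to an indicator of truth, the level here behaves as a consistent estimator of whether the hypothesis holds.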
A Multi-scale View of the Emergent Complexity of Life: A Free-energy Proposal
We review some of the main implications of the free-energy principle (FEP) for the study of the self-organization of living systems – and how the FEP can help us to understand (and model) biotic self-organization across the many temporal and spatial scales over which life exists. In order to maintain its integrity as a bounded system, any biological system - from single cells to complex organisms and societies - has to limit the disorder or dispersion (i.e., the long-run entropy) of its constituent states. We review how this can be achieved by living systems that minimize their variational free energy. Variational free energy is an information theoretic construct, originally introduced into theoretical neuroscience and biology to explain perception, action, and learning. It has since been extended to explain the evolution, development, form, and function of entire organisms, providing a principled model of biotic self-organization and autopoiesis. It has provided insights into biological systems across spatiotemporal scales, ranging from microscales (e.g., sub- and multicellular dynamics), to intermediate scales (e.g., groups of interacting animals and culture), through to macroscale phenomena (the evolution of entire species). A crucial corollary of the FEP is that an organism just is (i.e., embodies or entails) an implicit model of its environment. As such, organisms come to embody causal relationships of their ecological niche, which, in turn, is influenced by their resulting behaviors. Crucially, free-energy minimization can be shown to be equivalent to the maximization of Bayesian model evidence. This allows us to cast natural selection in terms of Bayesian model selection, providing a robust theoretical account of how organisms come to match or accommodate the spatiotemporal complexity of their surrounding niche. 
In line with the theme of this volume, namely biological complexity and self-organization, this chapter will examine a variational approach to self-organization across multiple dynamical scales.
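The equivalence between free-energy minimization and maximization of model evidence can be made concrete in a two-state toy model. The sketch below (a minimal illustration; the prior and likelihood numbers are assumed, not from the chapter) computes the variational free energy F = E_q[log q(s) − log p(o, s)] and shows that F upper-bounds surprise (−log p(o)), with equality exactly when q is the true posterior, so minimizing F maximizes a lower bound on Bayesian model evidence.

```python
import math

# Toy generative model: one binary hidden state s, one binary observation o.
# These numbers are purely illustrative assumptions.
prior = [0.7, 0.3]   # p(s)
lik   = [0.2, 0.9]   # p(o = 1 | s)

def free_energy(q, o=1):
    """Variational free energy F = E_q[log q(s) - log p(o, s)]."""
    joint = [prior[s] * (lik[s] if o == 1 else 1.0 - lik[s]) for s in (0, 1)]
    return sum(q[s] * (math.log(q[s]) - math.log(joint[s])) for s in (0, 1))

evidence = prior[0] * lik[0] + prior[1] * lik[1]       # p(o = 1)
surprise = -math.log(evidence)                          # long-run entropy bound
posterior = [prior[s] * lik[s] / evidence for s in (0, 1)]

# F >= surprise for any q, with equality at the exact posterior:
print(free_energy([0.5, 0.5]), free_energy(posterior), surprise)
```

This identity, F = surprise + KL(q ∥ posterior), is what licenses recasting natural selection as Bayesian model selection: reducing free energy is the same as increasing (a bound on) model evidence.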
Backwards is the way forward: feedback in the cortical hierarchy predicts the expected future
Clark offers a powerful description of the brain as a prediction machine, one that makes progress on two distinct levels. First, on an abstract conceptual level, it provides a unifying framework for perception, action, and cognition (including subdivisions such as attention, expectation, and imagination). Second, on a concrete descriptive level, hierarchical prediction offers a means of testing and constraining the conceptual elements and mechanisms of predictive coding models (estimation of predictions, prediction errors, and internal models).
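The basic mechanism these models share can be sketched in a few lines. The toy unit below (a deliberately minimal illustration, not Clark's or any published model; the learning rate and prior are assumed) holds an estimate of a latent cause, predicts its input top-down, and settles by gradient descent on two prediction errors: one from the data and one from its prior expectation.

```python
def settle(observation, mu_prior=0.0, lr=0.1, steps=200):
    """One-level predictive coding sketch: the estimate mu predicts the
    input, and is updated to reduce both bottom-up and top-down errors."""
    mu = mu_prior
    for _ in range(steps):
        err_obs = observation - mu   # bottom-up prediction error (data)
        err_pri = mu_prior - mu      # top-down prediction error (prior)
        mu += lr * (err_obs + err_pri)
    return mu

# With equal weighting of the two errors, the settled estimate is the
# midpoint between the prior expectation and the observation.
print(settle(2.0))
```

Stacking such units, with each level predicting the one below, yields the cortical hierarchy of descending predictions and ascending prediction errors that the commentary's title alludes to.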
On the consistency of Multithreshold Entropy Linear Classifier
Multithreshold Entropy Linear Classifier (MELC) is a recently proposed classifier that employs information-theoretic concepts to build a multithreshold maximum-margin model. In this paper we analyze its consistency over multithreshold linear models and show that its objective function upper-bounds the number of misclassified points, much as the hinge loss does in support vector machines. For further confirmation we also conduct numerical experiments on five datasets.
Comment: Presented at Theoretical Foundations of Machine Learning 2015 (http://tfml.gmum.net); final version published in Schedae Informaticae Journal.
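The hinge-loss analogy invoked above is the standard surrogate-loss bound: max(0, 1 − y·f(x)) ≥ 𝟙[y·f(x) ≤ 0] for every margin, so the summed hinge objective dominates the misclassification count. The sketch below verifies this pointwise bound on a few assumed margin values (the margins are illustrative, not from the paper).

```python
def hinge(margin):
    """Hinge loss as a function of the signed margin y * f(x)."""
    return max(0.0, 1.0 - margin)

def zero_one(margin):
    """0-1 loss: 1 if the point is misclassified (margin <= 0), else 0."""
    return 1.0 if margin <= 0 else 0.0

margins = [-2.0, -0.5, 0.0, 0.3, 1.5]   # assumed example margins
for m in margins:
    assert hinge(m) >= zero_one(m)       # pointwise surrogate bound

hinge_sum = sum(hinge(m) for m in margins)
err_count = sum(zero_one(m) for m in margins)
print(hinge_sum, err_count)
```

The paper's contribution is an analogous bound for the MELC objective over multithreshold linear models, which is what makes its empirical risk minimization consistent.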
Learning to Discover Sparse Graphical Models
We consider structure discovery of undirected graphical models from
observational data. Inferring likely structures from few examples is a complex
task often requiring the formulation of priors and sophisticated inference
procedures. Popular methods rely on estimating a penalized maximum likelihood
of the precision matrix. However, in these approaches structure recovery is an
indirect consequence of the data-fit term, the penalty can be difficult to
adapt for domain-specific knowledge, and the inference is computationally
demanding. By contrast, it may be easier to generate training samples of data
that arise from graphs with the desired structure properties. We propose here
to leverage this latter source of information as training data to learn a
function, parametrized by a neural network, that maps empirical covariance
matrices to estimated graph structures. Learning this function brings two
benefits: it implicitly models the desired structure or sparsity properties to
form suitable priors, and it can be tailored to the specific problem of edge
structure discovery, rather than maximizing data likelihood. Applying this
framework, we find that our learnable graph-discovery method, trained on synthetic data, generalizes well, identifying relevant edges in both synthetic and real data that were completely unknown at training time. On genetics, brain-imaging, and simulation data, we obtain performance generally superior to that of analytical methods.
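Why structure recovery is "indirect" in likelihood-based approaches can be seen directly: even when the true precision matrix is sparse, the inverse of the empirical covariance is generically dense, so the graph cannot simply be read off without a penalty or a learned mapping. The sketch below (an illustration with assumed dimensions and an assumed chain-graph precision, not the paper's experimental setup) samples from a sparse tridiagonal precision and inspects the unpenalized empirical precision.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 200   # assumed dimension and sample size

# True sparse precision: tridiagonal, i.e. a chain graph with 4 edges
# (8 nonzero off-diagonal entries out of 20).
K = np.eye(d) * 2.0
for i in range(d - 1):
    K[i, i + 1] = K[i + 1, i] = -0.8
cov = np.linalg.inv(K)

X = rng.multivariate_normal(np.zeros(d), cov, size=n)
emp_prec = np.linalg.inv(np.cov(X, rowvar=False))

# Almost every off-diagonal entry of the empirical precision is nonzero,
# so the graph structure is not directly visible in the unpenalized fit.
offdiag = emp_prec[~np.eye(d, dtype=bool)]
n_nonzero = int(np.sum(np.abs(offdiag) > 1e-3))
print(n_nonzero, "of", offdiag.size, "off-diagonal entries nonzero")
```

A penalized estimator (or, as proposed here, a network trained on synthetic covariance-graph pairs) supplies the inductive bias needed to zero out the spurious entries.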