
    Entropy Concentration and the Empirical Coding Game

    We give a characterization of Maximum Entropy/Minimum Relative Entropy inference by providing two 'strong entropy concentration' theorems. These theorems unify and generalize Jaynes' 'concentration phenomenon' and Van Campenhout and Cover's 'conditional limit theorem'. The theorems characterize exactly in what sense a prior distribution Q conditioned on a given constraint, and the distribution P, minimizing the relative entropy D(P || Q) over all distributions satisfying the constraint, are 'close' to each other. We then apply our theorems to establish the relationship between entropy concentration and a game-theoretic characterization of Maximum Entropy Inference due to Topsoe and others. Comment: A somewhat modified version of this paper was published in Statistica Neerlandica 62(3), pages 374-392, 200

    A Tight Excess Risk Bound via a Unified PAC-Bayesian-Rademacher-Shtarkov-MDL Complexity

    We present a novel notion of complexity that interpolates between and generalizes some classic existing complexity notions in learning theory: for estimators like empirical risk minimization (ERM) with arbitrary bounded losses, it is upper bounded in terms of data-independent Rademacher complexity; for generalized Bayesian estimators, it is upper bounded by the data-dependent information complexity (also known as stochastic or PAC-Bayesian, $\mathrm{KL}(\text{posterior} \,\|\, \text{prior})$, complexity). For (penalized) ERM, the new complexity reduces to (generalized) normalized maximum likelihood (NML) complexity, i.e. a minimax log-loss individual-sequence regret. Our first main result bounds excess risk in terms of the new complexity. Our second main result links the new complexity via Rademacher complexity to $L_2(P)$ entropy, thereby generalizing earlier results of Opper, Haussler, Lugosi, and Cesa-Bianchi who did the log-loss case with $L_\infty$. Together, these results recover optimal bounds for VC- and large (polynomial entropy) classes, replacing localized Rademacher complexity by a simpler analysis which almost completely separates the two aspects that determine the achievable rates: 'easiness' (Bernstein) conditions and model complexity. Comment: 38 pages

    Almost the Best of Three Worlds: Risk, Consistency and Optional Stopping for the Switch Criterion in Nested Model Selection

    We study the switch distribution, introduced by Van Erven et al. (2012), applied to model selection and subsequent estimation. While switching was known to be strongly consistent, here we show that it achieves minimax optimal parametric risk rates up to a $\log\log n$ factor when comparing two nested exponential families, partially confirming a conjecture by Lauritzen (2012) and Cavanaugh (2012) that switching behaves asymptotically like the Hannan-Quinn criterion. Moreover, like Bayes factor model selection but unlike standard significance testing, when one of the models represents a simple hypothesis, the switch criterion defines a robust null hypothesis test, meaning that its Type-I error probability can be bounded irrespective of the stopping rule. Hence, switching is consistent, insensitive to optional stopping and almost minimax risk optimal, showing that, Yang's (2005) impossibility result notwithstanding, it is possible to 'almost' combine the strengths of AIC and Bayes factor model selection. Comment: To appear in Statistica Sinica

    A survey on fractional variational calculus

    Main results and techniques of the fractional calculus of variations are surveyed. We consider variational problems containing Caputo derivatives and study them using both indirect and direct methods. In particular, we provide necessary optimality conditions of Euler-Lagrange type for the fundamental, higher-order, and isoperimetric problems, and compute approximated solutions based on truncated Grünwald-Letnikov approximations of Caputo derivatives. Comment: This is a preprint of a paper whose final and definite form is in 'Handbook of Fractional Calculus with Applications. Vol 1: Basic Theory', De Gruyter. Submitted 29-March-2018; accepted, after a revision, 13-June-201
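
    As a rough illustration of the kind of approximation the abstract above mentions, the following sketch evaluates a Grünwald-Letnikov sum for a fractional derivative of order alpha. It is not taken from the survey: the function names `gl_weights` and `gl_derivative`, the step count, and the lower terminal 0 are assumptions made for this example. The sum approximates the Riemann-Liouville derivative, which agrees with the Caputo derivative for 0 < alpha < 1 only when f(0) = 0, as in the test function used here.

```python
import math

def gl_weights(alpha, n):
    """Return w_k = (-1)^k * binom(alpha, k) for k = 0..n via the standard recurrence."""
    w = [1.0]
    for k in range(1, n + 1):
        w.append(w[-1] * (1.0 - (alpha + 1.0) / k))
    return w

def gl_derivative(f, t, alpha, n=1000):
    """Grünwald-Letnikov sum on [0, t] with n steps (hypothetical helper).

    Assumption: lower terminal is 0 and f(0) = 0, so the result can also be
    read as an approximation of the Caputo derivative for 0 < alpha < 1.
    """
    h = t / n
    w = gl_weights(alpha, n)
    return sum(w[k] * f(t - k * h) for k in range(n + 1)) / h ** alpha

if __name__ == "__main__":
    approx = gl_derivative(lambda s: s, 1.0, 0.5)
    exact = 1.0 / math.gamma(1.5)  # closed form for the half-derivative of f(t) = t at t = 1
    print(approx, exact)
```

    For f(t) = t the closed form t^(1 - alpha) / Gamma(2 - alpha) gives a reference value, and the error of the sum shrinks roughly in proportion to the step size.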

    Taylor- and fugacity expansion for the effective center model of QCD at finite density

    Using the effective center model of QCD we test series expansions for finite chemical potential $\mu$. In particular we study two variants of Taylor expansion as well as the fugacity series. The effective center model has a dual representation where the sign problem is absent and reliable Monte Carlo simulations are possible at arbitrary $\mu$. We use the results from the dual simulation as reference data to assess the Taylor- and fugacity series approaches. We find that for most of parameter space fugacity expansion is the best (but also numerically most expensive) choice for reproducing the dual simulation results, while conventional Taylor expansion is reliable only for very small $\mu$. We also discuss the results of a modified Taylor expansion in $e^{\pm \mu} - 1$ which at the same numerical effort clearly outperforms the conventional Taylor series. Comment: presented at the 31st International Symposium on Lattice Field Theory (Lattice 2013), 29 July - 3 August 2013, Mainz, Germany. Reference added

    Optional Stopping with Bayes Factors: a categorization and extension of folklore results, with an application to invariant situations

    It is often claimed that Bayesian methods, in particular Bayes factor methods for hypothesis testing, can deal with optional stopping. We first give an overview, using elementary probability theory, of three different mathematical meanings that various authors give to this claim: (1) stopping rule independence, (2) posterior calibration and (3) (semi-) frequentist robustness to optional stopping. We then prove theorems to the effect that these claims do indeed hold in a general measure-theoretic setting. For claims of type (2) and (3), such results are new. By allowing for non-integrable measures based on improper priors, we obtain particularly strong results for the practically important case of models with nuisance parameters satisfying a group invariance (such as location or scale). We also discuss the practical relevance of (1)-(3), and conclude that whether Bayes factor methods actually perform well under optional stopping crucially depends on details of models, priors and the goal of the analysis. Comment: 29 pages
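
    The frequentist-robustness reading of the claim, meaning (3) in the abstract above, can be illustrated with a small simulation. The setup below is an assumption chosen for this example and is not taken from the paper: a simple null N(0, 1), an alternative N(mu, 1) with a proper N(0, 1) prior on mu, a threshold of 20, and the hypothetical helper names `bayes_factor` and `false_alarm_rate`. Under a simple null the Bayes factor is a nonnegative martingale with expectation 1, so by Ville's inequality the probability that it ever reaches 20 is at most 1/20, whatever stopping rule is used.

```python
import math
import random

def bayes_factor(total, n):
    """Bayes factor of H1 (mu ~ N(0, 1)) against H0 (mu = 0) for n unit-variance
    Gaussian observations whose sum is `total` (closed form for this toy model)."""
    return math.exp(total * total / (2.0 * (n + 1))) / math.sqrt(n + 1)

def false_alarm_rate(threshold=20.0, max_n=200, runs=2000, seed=0):
    """Fraction of null-hypothesis runs in which an aggressive stopping rule
    ('stop as soon as the Bayes factor reaches the threshold') ever fires."""
    rng = random.Random(seed)
    stopped = 0
    for _ in range(runs):
        total = 0.0
        for n in range(1, max_n + 1):
            total += rng.gauss(0.0, 1.0)  # data generated under H0
            if bayes_factor(total, n) >= threshold:
                stopped += 1
                break
    return stopped / runs

if __name__ == "__main__":
    print(false_alarm_rate())  # expected to stay below 1 / threshold = 0.05
```

    The sketch covers only the simplest case, a point null with a proper prior; the abstract's caveat that behaviour under optional stopping depends on the models, priors and goal of the analysis still applies.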

    On derivations with respect to finite sets of smooth functions

    The purpose of this paper is to show that functions that derivate the two-variable product function and one of the exponential, trigonometric or hyperbolic functions are also standard derivations. The more general problem considered is to describe finite sets of differentiable functions such that derivations with respect to this set are automatically standard derivations.