
    Entropic Projections and Dominating Points

    Generalized entropic projections and dominating points are solutions to convex minimization problems related to conditional laws of large numbers. They appear in many areas of applied mathematics such as statistical physics, information theory, mathematical statistics, ill-posed inverse problems or large deviation theory. By means of convex conjugate duality and functional analysis, criteria are derived for their existence. Representations of the generalized entropic projections are obtained: they are the "measure component" of some extended entropy minimization problem. Comment: ESAIM P&S (2011), to appear.
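
    A minimal concrete instance may help fix ideas (my illustration, not the paper's construction): the entropic projection, or I-projection, of a finite distribution p onto the moment set {q : E_q[f] = c} can be computed through the convex conjugate dual, a one-dimensional concave maximization whose optimizer yields an exponentially tilted solution.

        # Hedged sketch: I-projection of p onto {q : E_q[f] = c}.
        # Convex duality gives the minimizer as an exponential tilt
        #   q(i) proportional to p(i) * exp(lam * f(i)),
        # with lam maximizing  lam*c - log sum_i p(i)*exp(lam*f(i)).
        import numpy as np
        from scipy.optimize import minimize_scalar

        def i_projection(p, f, c):
            p, f = np.asarray(p, float), np.asarray(f, float)
            neg_dual = lambda lam: np.log(np.sum(p * np.exp(lam * f))) - lam * c
            lam = minimize_scalar(neg_dual).x   # minimize the negated dual
            q = p * np.exp(lam * f)
            return q / q.sum()

        p = np.full(4, 0.25)              # uniform on {0, 1, 2, 3}
        f = np.arange(4.0)
        q = i_projection(p, f, c=2.0)     # tilt the mean up to 2
        print(q, q @ f)                   # E_q[f] comes out (approx.) 2.0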

    Minimization of entropy functionals

    Entropy functionals (i.e. convex integral functionals) and extensions of these functionals are minimized on convex sets. This paper is aimed at reducing as much as possible the assumptions on the constraint set. Dual equalities and characterizations of the minimizers are obtained with weak constraint qualifications.
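
    In the simplest instance (relative entropy with a single moment constraint, far more special than the functionals treated in the paper), the kind of dual equality in question reads

        \inf_{Q \,:\, \int f \, dQ = c} H(Q \mid P)
            \;=\; \sup_{\lambda \in \mathbb{R}} \Bigl( \lambda c - \log \int e^{\lambda f} \, dP \Bigr),

    with the infimum attained, when a minimizer exists, by dQ^* \propto e^{\lambda^* f} \, dP.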

    Generalized minimizers of convex integral functionals, Bregman distance, Pythagorean identities

    Integral functionals based on convex normal integrands are minimized subject to finitely many moment constraints. The integrands are finite on the positive and infinite on the negative numbers, strictly convex but not necessarily differentiable. The minimization is viewed as a primal problem and studied together with a dual one in the framework of convex duality. The effective domain of the value function is described by a conic core, a modification of the earlier concept of convex core. Minimizers and generalized minimizers are explicitly constructed from solutions of modified dual problems, not assuming the primal constraint qualification. A generalized Pythagorean identity is presented using Bregman distance and a correction term for lack of essential smoothness in integrands. Results are applied to minimization of Bregman distances. Existence of a generalized dual solution is established whenever the dual value is finite, assuming the dual constraint qualification. Examples of "irregular" situations are included, pointing to the limitations of generality of certain key results.
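
    For orientation, the classical identity that the generalized Pythagorean identity extends (stated here for relative entropy, the model Bregman distance, not in the paper's generality): if q is the I-projection of r onto a linear family L defined by moment constraints, then for every p in L,

        D(p \| r) \;=\; D(p \| q) + D(q \| r).

    The paper's version adds a correction term accounting for integrands that lack essential smoothness.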

    Entropy: The Markov Ordering Approach

    The focus of this article is on entropy and Markov processes. We study the properties of functionals which are invariant with respect to monotonic transformations and analyze two invariant "additivity" properties: (i) existence of a monotonic transformation which makes the functional additive with respect to the joining of independent systems and (ii) existence of a monotonic transformation which makes the functional additive with respect to the partitioning of the space of states. All Lyapunov functionals for Markov chains which have properties (i) and (ii) are derived. We describe the most general ordering of the distribution space with respect to which all continuous-time Markov processes are monotonic (the Markov order). The solution differs significantly from the ordering given by the inequality of entropy growth. For inference, this approach results in a convex compact set of conditionally "most random" distributions. Comment: 50 pages, 4 figures, postprint version. A more detailed discussion of the various entropy additivity properties and of separation of variables for independent subsystems in the MaxEnt problem has been added in Section 4.2; the bibliography is extended.
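
    A quick numerical illustration of the Lyapunov-function viewpoint (my toy check, not the paper's analysis): relative entropy to the stationary distribution never increases along a Markov chain.

        # Hedged sketch: D(p_t || pi) is a Lyapunov function for any
        # chain with stationary distribution pi; it decreases in t.
        import numpy as np

        rng = np.random.default_rng(0)
        K = rng.random((4, 4))
        K /= K.sum(axis=1, keepdims=True)             # row-stochastic kernel
        w, v = np.linalg.eig(K.T)                     # stationary pi: left
        pi = np.real(v[:, np.argmin(np.abs(w - 1))])  # eigenvector for 1
        pi /= pi.sum()

        def kl(p, q):
            return float(np.sum(p * np.log(p / q)))

        p = np.array([0.97, 0.01, 0.01, 0.01])
        for t in range(6):
            print(t, kl(p, pi))                       # monotonically decreasing
            p = p @ K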

    Relative entropy and the multi-variable multi-dimensional moment problem

    Entropy-like functionals on operator algebras have been studied since the pioneering work of von Neumann, Umegaki, Lindblad, and Lieb. The most well known are the von Neumann entropy $\mathrm{trace}(\rho \log \rho)$ and a generalization of the Kullback-Leibler distance $\mathrm{trace}(\rho \log \rho - \rho \log \sigma)$, referred to as quantum relative entropy and used to quantify the distance between states of a quantum system. The purpose of this paper is to explore these as regularizing functionals in seeking solutions to multi-variable and multi-dimensional moment problems. It will be shown that extrema can be effectively constructed via a suitable homotopy. The homotopy approach leads naturally to a further generalization and a description of all the solutions to such moment problems. This is accomplished by a renormalization of a Riemannian metric induced by entropy functionals. As an application we discuss the inverse problem of describing power spectra which are consistent with second-order statistics, which has been the main motivation behind the present work. Comment: 24 pages, 3 figures.
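
    For readers who want the two functionals in executable form, a small sketch using the standard definitions (with the conventional minus sign on the von Neumann entropy); nothing here touches the paper's homotopy construction:

        # Hedged sketch: the two entropy-like functionals named above,
        # for small full-rank density matrices.
        import numpy as np
        from scipy.linalg import logm

        def vn_entropy(rho):
            """von Neumann entropy -trace(rho log rho)."""
            w = np.linalg.eigvalsh(rho)
            w = w[w > 0]
            return float(-(w * np.log(w)).sum())

        def rel_entropy(rho, sigma):
            """Quantum relative entropy trace(rho log rho - rho log sigma);
            assumes rho and sigma are full-rank density matrices."""
            return float(np.trace(rho @ logm(rho) - rho @ logm(sigma)).real)

        rho   = np.array([[0.7, 0.2], [0.2, 0.3]])
        sigma = np.array([[0.5, 0.0], [0.0, 0.5]])
        print(vn_entropy(rho), rel_entropy(rho, sigma))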

    From Stochastic Mixability to Fast Rates

    Empirical risk minimization (ERM) is a fundamental learning rule for statistical learning problems where the data is generated according to some unknown distribution $\mathsf{P}$ and returns a hypothesis $f$ chosen from a fixed class $\mathcal{F}$ with small loss $\ell$. In the parametric setting, depending upon $(\ell, \mathcal{F}, \mathsf{P})$, ERM can have slow ($1/\sqrt{n}$) or fast ($1/n$) rates of convergence of the excess risk as a function of the sample size $n$. There exist several results that give sufficient conditions for fast rates in terms of joint properties of $\ell$, $\mathcal{F}$, and $\mathsf{P}$, such as the margin condition and the Bernstein condition. In the non-statistical prediction-with-expert-advice setting, there is an analogous slow and fast rate phenomenon, and it is entirely characterized in terms of the mixability of the loss $\ell$ (there being no role there for $\mathcal{F}$ or $\mathsf{P}$). The notion of stochastic mixability builds a bridge between these two models of learning, reducing to classical mixability in a special case. The present paper presents a direct proof of fast rates for ERM in terms of stochastic mixability of $(\ell, \mathcal{F}, \mathsf{P})$, and in so doing provides new insight into the fast-rates phenomenon. The proof exploits an old result of Kemperman on the solution to the general moment problem. We also show a partial converse that suggests a characterization of fast rates for ERM in terms of stochastic mixability is possible. Comment: 21 pages, accepted to NIPS 2014.
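
    The abstract does not spell the condition out; as commonly defined in the stochastic-mixability literature (my paraphrase, not a quote from the paper), $(\ell, \mathcal{F}, \mathsf{P})$ is $\eta$-stochastically mixable if $\mathbb{E}_{\mathsf{P}}[\exp(-\eta(\ell_f - \ell_{f^*}))] \le 1$ for every $f \in \mathcal{F}$, where $f^*$ minimizes risk over $\mathcal{F}$. A toy Monte Carlo probe of that condition:

        # Hedged sketch: probe E[exp(-eta*(loss_f - loss_fstar))] <= 1
        # on a toy regression problem with squared loss; the class and
        # constants are illustrative, not from the paper.
        import numpy as np

        rng = np.random.default_rng(1)
        n = 200_000
        x = rng.uniform(-1, 1, n)
        y = 0.5 * x + rng.normal(0, 0.1, n)       # P: y = 0.5x + noise

        F = [lambda x, a=a: a * x for a in (0.3, 0.5, 0.7)]
        losses = np.array([(f(x) - y) ** 2 for f in F])
        fstar = losses.mean(axis=1).argmin()      # empirical risk minimizer

        for eta in (0.5, 1.0, 2.0):
            worst = max(np.mean(np.exp(-eta * (L - losses[fstar])))
                        for L in losses)
            print(eta, worst)   # stays <= 1 (up to MC error) for small eta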

    An informational approach to the global optimization of expensive-to-evaluate functions

    In many global optimization problems motivated by engineering applications, the number of function evaluations is severely limited by time or cost. To ensure that each evaluation contributes to the localization of good candidates for the role of global minimizer, a sequential choice of evaluation points is usually carried out. In particular, when Kriging is used to interpolate past evaluations, the uncertainty associated with the lack of information on the function can be expressed and used to compute a number of criteria accounting for the interest of an additional evaluation at any given point. This paper introduces minimizer entropy as a new Kriging-based criterion for the sequential choice of points at which the function should be evaluated. Based on stepwise uncertainty reduction, it accounts for the informational gain on the minimizer expected from a new evaluation. The criterion is approximated using conditional simulations of the Gaussian process model behind Kriging, and then inserted into an algorithm similar in spirit to the Efficient Global Optimization (EGO) algorithm. An empirical comparison is carried out between our criterion and expected improvement, one of the reference criteria in the literature. Experimental results indicate major evaluation savings over EGO. Finally, the method, which we call IAGO (for Informational Approach to Global Optimization), is extended to robust optimization problems, where both the factors to be tuned and the function evaluations are corrupted by noise. Comment: Accepted for publication in the Journal of Global Optimization (this is the revised version, with additional details on computational problems and some grammatical changes).
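
    The minimizer-entropy criterion relies on conditional simulations and does not reduce to a formula, but the reference criterion it is benchmarked against, expected improvement, has a well-known closed form under the Gaussian process model (a standard sketch, not code from the paper):

        # Hedged sketch: expected improvement for minimization, given the
        # Kriging predictive mean mu and standard deviation sigma
        # (assumed > 0) at candidate points, and the best value so far.
        import numpy as np
        from scipy.stats import norm

        def expected_improvement(mu, sigma, f_min):
            z = (f_min - mu) / sigma
            return (f_min - mu) * norm.cdf(z) + sigma * norm.pdf(z)

        # The next evaluation point is the argmax of EI over candidates.
        mu    = np.array([0.20, -0.10, 0.05])
        sigma = np.array([0.30,  0.05, 0.50])
        print(expected_improvement(mu, sigma, f_min=0.0).argmax())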

    Lossy compression of discrete sources via Viterbi algorithm

    We present a new lossy compressor for discrete-valued sources. For coding a sequence $x^n$, the encoder starts by assigning a certain cost to each possible reconstruction sequence. It then finds the one that minimizes this cost and describes it losslessly to the decoder via a universal lossless compressor. The cost of each sequence is a linear combination of its distance from the sequence $x^n$ and a linear function of its $k^{\rm th}$ order empirical distribution. The structure of the cost function allows the encoder to employ the Viterbi algorithm to recover the minimizer of the cost. We identify a choice of the coefficients comprising the linear function of the empirical distribution used in the cost function which ensures that the algorithm universally achieves the optimum rate-distortion performance of any stationary ergodic source in the limit of large $n$, provided that $k$ diverges as $o(\log n)$. Iterative techniques for approximating the coefficients, which alleviate the computational burden of finding the optimal coefficients, are proposed and studied. Comment: 26 pages, 6 figures, submitted to IEEE Transactions on Information Theory.
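
    A stripped-down rendering of the encoder's search step for k = 1 on a binary alphabet (the transition coefficients below are placeholders, not the optimal ones derived in the paper): the Viterbi recursion minimizes per-letter distortion plus a per-transition cost, i.e. a linear function of the reconstruction's first-order empirical distribution.

        # Hedged sketch of the Viterbi search over reconstructions.
        import numpy as np

        def viterbi_reconstruction(x, A, d, alpha):
            """x: source sequence; A: reconstruction alphabet size;
            d[a][b]: distortion of reproducing symbol a by b;
            alpha[b1][b2]: cost per (b1, b2) transition in the output."""
            n = len(x)
            cost = np.full((n, A), np.inf)
            back = np.zeros((n, A), dtype=int)
            cost[0] = [d[x[0]][b] for b in range(A)]
            for t in range(1, n):
                for b in range(A):
                    c = cost[t - 1] + alpha[:, b] + d[x[t]][b]
                    back[t, b] = int(np.argmin(c))
                    cost[t, b] = c[back[t, b]]
            y = [int(np.argmin(cost[-1]))]           # trace back the optimum
            for t in range(n - 1, 0, -1):
                y.append(int(back[t, y[-1]]))
            return y[::-1]

        x = [0, 0, 1, 1, 1, 0, 1]
        d = np.array([[0.0, 1.0], [1.0, 0.0]])       # Hamming distortion
        alpha = 0.5 * np.ones((2, 2))                # placeholder coefficients
        print(viterbi_reconstruction(x, A=2, d=d, alpha=alpha))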
