On generalized entropies, Bayesian decisions and statistical diversity
The paper summarizes and extends the theory of generalized entropies of random variables obtained as generalized informations maximized over auxiliary random variables. Among the new results is a proof that these entropies need not be concave functions of the underlying distributions. An extended class of power entropies is introduced, parametrized by a power exponent, with the entropies concave in the distribution for one range of the exponent and convex for another. It is proved that all power entropies in a subrange of the exponent are maximal generalized informations for appropriate generating informations depending on the power. Prominent members of this subclass of power entropies are the Shannon entropy and the quadratic entropy. The paper also investigates the tightness of practically important, previously established relations between these two entropies and the errors of Bayesian decisions about possible realizations of the variable of interest. The quadratic entropy is shown to provide estimates which are on average more than 100% tighter than those based on the Shannon entropy, and this tightness is shown to increase even further as the power exponent grows. Finally, the paper studies various measures of statistical diversity and introduces a general measure of anisotony between them; this measure is numerically evaluated for the entropic measures of diversity.
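To make the comparison concrete, here is a minimal Python sketch (not the paper's own estimators) that computes the Shannon entropy, the quadratic (Gini) entropy 1 - Σ p_i², and the Bayes error 1 - max_i p_i of a distribution, together with the elementary sandwich H₂/2 ≤ P_e ≤ H₂ and the Hellman-Raviv-type Shannon upper bound P_e ≤ H/2 (in bits); the example distributions are arbitrary.

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy in bits, H(p) = -sum_i p_i log2 p_i."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def quadratic_entropy(p):
    """Quadratic (Gini) entropy, H_2(p) = 1 - sum_i p_i^2."""
    return float(1.0 - (p ** 2).sum())

def bayes_error(p):
    """Bayes probability of error when guessing X from its distribution: 1 - max_i p_i."""
    return float(1.0 - p.max())

for p in [np.array([0.1, 0.9]),
          np.array([0.4, 0.6]),
          np.array([0.5, 0.3, 0.2]),
          np.ones(8) / 8]:
    pe = bayes_error(p)
    h = shannon_entropy(p)
    h2 = quadratic_entropy(p)
    # Elementary sandwich: H_2/2 <= P_e <= H_2 (quadratic entropy),
    # and P_e <= H/2 in bits (Hellman-Raviv-type Shannon upper bound).
    print(f"p={p}, P_e={pe:.3f}, "
          f"quadratic bounds [{h2/2:.3f}, {h2:.3f}], Shannon upper bound {h/2:.3f}")
```

On these examples the quadratic-entropy bounds bracket the Bayes error more tightly than the Shannon-based bound, in line with the claim above.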
Generalized information theory meets human cognition: Introducing a unified framework to model uncertainty and information search
Searching for information is critical in many situations. In medicine, for instance, careful choice of a diagnostic test can help narrow down the range of plausible diseases that the patient might have. In a probabilistic framework, test selection is often modeled by assuming that people’s goal is to reduce uncertainty about possible states of the world. In cognitive science, psychology, and medical decision making, Shannon entropy is the most prominent and most widely used model to formalize probabilistic uncertainty and the reduction thereof. However, a variety of alternative entropy metrics (Hartley, Quadratic, Tsallis, Rényi, and more) are popular in the social and the natural sciences, computer science, and philosophy of science. Particular entropy measures have been predominant in particular research areas, and it is often an open issue whether these divergences emerge from different theoretical and practical goals or are merely due to historical accident. Cutting across disciplinary boundaries, we show that several entropy and entropy reduction measures arise as special cases in a unified formalism, the Sharma-Mittal framework. Using mathematical results, computer simulations, and analyses of published behavioral data, we discuss four key questions: How do various entropy models relate to each other? What insights can be obtained by considering diverse entropy models within a unified framework? What is the psychological plausibility of different entropy models? What new questions and insights for research on human information acquisition follow? Our work provides several new pathways for theoretical and empirical research, reconciling apparently conflicting approaches and empirical findings within a comprehensive and unified information-theoretic formalism
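For reference, here is a small Python sketch of the Sharma-Mittal family as it is commonly written, with order r and degree t and natural logarithms (the parameter names and log base are choices made here, not necessarily the paper's conventions); its special cases recover the Shannon, Rényi, Tsallis/quadratic, and Hartley entropies.

```python
import numpy as np

def sharma_mittal(p, r, t, eps=1e-9):
    """Sharma-Mittal entropy of order r and degree t (natural logs).

    H_{r,t}(p) = [ (sum_i p_i^r)^((1-t)/(1-r)) - 1 ] / (1 - t),
    with the usual limits taken as r -> 1 and/or t -> 1.
    """
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    if abs(r - 1.0) < eps and abs(t - 1.0) < eps:   # Shannon entropy
        return float(-(p * np.log(p)).sum())
    if abs(r - 1.0) < eps:                          # r -> 1 with general degree t
        h = -(p * np.log(p)).sum()
        return float((np.exp((1.0 - t) * h) - 1.0) / (1.0 - t))
    s = (p ** r).sum()
    if abs(t - 1.0) < eps:                          # Renyi entropy of order r
        return float(np.log(s) / (1.0 - r))
    return float((s ** ((1.0 - t) / (1.0 - r)) - 1.0) / (1.0 - t))

p = [0.5, 0.25, 0.125, 0.125]
print("Shannon   :", sharma_mittal(p, 1, 1))   # -sum p log p
print("Renyi(2)  :", sharma_mittal(p, 2, 1))   # -log sum p^2
print("Tsallis(2):", sharma_mittal(p, 2, 2))   # 1 - sum p^2 (quadratic/Gini)
print("Hartley   :", sharma_mittal(p, 0, 1))   # log of the support size
```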
Generalized information criteria for Bayes decisions
This paper deals with Bayesian models given by statistical experiments and standard loss functions. The Bayes probability of error and the Bayes risk are estimated by means of classical and generalized information criteria applicable to the experiment, and the accuracy of the estimation is studied. Among the information criteria studied in the paper is the class of posterior power entropies, which includes the Shannon entropy as a special case. It is shown that the most accurate estimate in this class is achieved by the quadratic posterior entropy. The paper also introduces and studies a new class of alternative power entropies which in general estimate the Bayes errors and risks more tightly than the classical power entropies. Concrete examples, tables and figures illustrate the obtained results.
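The following toy example (not the paper's construction) illustrates the idea of estimating the Bayes probability of error from posterior entropies: for a small two-state experiment it computes the Bayes error under 0-1 loss, the expected quadratic posterior entropy with its elementary sandwich bounds, and the expected Shannon posterior entropy with the Hellman-Raviv-type upper bound.

```python
import numpy as np

# Toy Bayesian experiment: two equally likely states, three possible observations.
prior = np.array([0.5, 0.5])
likelihood = np.array([[0.7, 0.2, 0.1],     # P(y | theta = 0)
                       [0.1, 0.3, 0.6]])    # P(y | theta = 1)

joint = prior[:, None] * likelihood         # P(theta, y)
p_y = joint.sum(axis=0)                     # marginal P(y)
posterior = joint / p_y                     # P(theta | y), one column per y

# Bayes probability of error under 0-1 loss: average of 1 - max_theta P(theta | y).
bayes_error = float((p_y * (1.0 - posterior.max(axis=0))).sum())

# Posterior entropies averaged over y (quadratic and Shannon).
quad_post = float((p_y * (1.0 - (posterior ** 2).sum(axis=0))).sum())
shan_post = float((p_y * (-(posterior * np.log2(posterior)).sum(axis=0))).sum())

print(f"Bayes error          : {bayes_error:.4f}")
print(f"quadratic posterior H: {quad_post:.4f}  (bounds: [{quad_post/2:.4f}, {quad_post:.4f}])")
print(f"Shannon posterior H  : {shan_post:.4f}  (upper bound H/2 = {shan_post/2:.4f})")
```

On this example the quadratic posterior entropy brackets the Bayes error more tightly than the Shannon-based upper bound.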
Game theory, maximum entropy, minimum discrepancy and robust Bayesian decision theory
We describe and develop a close relationship between two problems that have
customarily been regarded as distinct: that of maximizing entropy, and that of
minimizing worst-case expected loss. Using a formulation grounded in the
equilibrium theory of zero-sum games between Decision Maker and
Nature, these two problems are shown to be dual to each other, the solution
to each providing that to the other. Although Topsøe described this connection
for the Shannon entropy over 20 years ago, it does not appear to be widely
known even in that important special case. We here generalize this theory to
apply to arbitrary decision problems and loss functions. We indicate how an
appropriate generalized definition of entropy can be associated with such a
problem, and we show that, subject to certain regularity conditions, the
above-mentioned duality continues to apply in this extended context.
This simultaneously provides a possible rationale for maximizing entropy and
a tool for finding robust Bayes acts. We also describe the essential identity
between the problem of maximizing entropy and that of minimizing a related
discrepancy or divergence between distributions. This leads to an extension, to
arbitrary discrepancies, of a well-known minimax theorem for the case of
Kullback-Leibler divergence (the ``redundancy-capacity theorem'' of information
theory). For the important case of families of distributions having certain
mean values specified, we develop simple sufficient conditions and methods for
identifying the desired solutions.
Comment: Published by the Institute of Mathematical Statistics
(http://www.imstat.org) in the Annals of Statistics
(http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/00905360400000055
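The log-loss case with a mean constraint can be checked numerically in a few lines. In the sketch below (outcome values and constraint invented for illustration), the maximum-entropy member of Γ, here the set of distributions on {0, 1, 2, 3} with mean 1, has Gibbs form; reporting it as one's predictive distribution incurs the same expected log loss against every member of Γ, so its worst-case expected loss equals its entropy, which is the claimed duality in this special case.

```python
import numpy as np
from scipy.optimize import brentq

x = np.array([0.0, 1.0, 2.0, 3.0])   # outcome values
target_mean = 1.0                    # the moment constraint defining Gamma

def gibbs(lam):
    """Exponential-family distribution with p_i proportional to exp(lam * x_i)."""
    w = np.exp(lam * x)
    return w / w.sum()

# Find lam so that the Gibbs distribution satisfies the mean constraint:
# this is the maximum-entropy member of Gamma.
lam = brentq(lambda l: gibbs(l) @ x - target_mean, -20.0, 20.0)
p_star = gibbs(lam)
H_star = float(-(p_star * np.log(p_star)).sum())

def cross_entropy(q, p):
    """Expected log loss E_q[-log p(X)]."""
    return float(-(q * np.log(p)).sum())

# Any other member of Gamma (same mean) incurs exactly the same expected
# log loss against p_star, so p_star's worst-case loss over Gamma is H(p_star).
q1 = np.array([0.5, 0.1, 0.3, 0.1])      # mean 1.0
q2 = np.array([0.25, 0.5, 0.25, 0.0])    # mean 1.0
print(f"H(p*) = {H_star:.6f}")
print(f"E_q1[-log p*] = {cross_entropy(q1, p_star):.6f}")
print(f"E_q2[-log p*] = {cross_entropy(q2, p_star):.6f}")
```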
Generalizing Bayesian Optimization with Decision-theoretic Entropies
Bayesian optimization (BO) is a popular method for efficiently inferring
optima of an expensive black-box function via a sequence of queries. Existing
information-theoretic BO procedures aim to make queries that most reduce the
uncertainty about optima, where the uncertainty is captured by Shannon entropy.
However, an optimal measure of uncertainty would, ideally, factor in how we
intend to use the inferred quantity in some downstream procedure. In this
paper, we instead consider a generalization of Shannon entropy from work in
statistical decision theory (DeGroot 1962, Rao 1984), which contains a broad
class of uncertainty measures parameterized by a problem-specific loss function
corresponding to a downstream task. We first show that special cases of this
entropy lead to popular acquisition functions used in BO procedures such as
knowledge gradient, expected improvement, and entropy search. We then show how
alternative choices for the loss yield a flexible family of acquisition
functions that can be customized for use in novel optimization settings.
Additionally, we develop gradient-based methods to efficiently optimize our
proposed family of acquisition functions, and demonstrate strong empirical
performance on a diverse set of sequential decision making tasks, including
variants of top-k optimization, multi-level set estimation, and sequence
search.
Comment: Appears in Proceedings of the 36th Conference on Neural Information
Processing Systems (NeurIPS 2022)
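A minimal illustration of the decision-theoretic entropy referred to above, H_ℓ(p) = inf_a E_{θ∼p}[ℓ(θ, a)]: with squared-error loss the infimum is the variance of the belief, and with logarithmic loss it is the Shannon entropy. The distribution and grids below are arbitrary, and this is only the uncertainty measure itself, not the paper's acquisition-function machinery.

```python
import numpy as np

# Decision-theoretic ("DeGroot") entropy of a belief p over a finite set:
#   H_loss(p) = min over actions a of E_{theta ~ p}[ loss(theta, a) ].
vals = np.array([0.0, 1.0, 2.0, 3.0])
p = np.array([0.1, 0.4, 0.3, 0.2])

# 1) Squared-error loss, actions are point predictions a in R:
#    the minimizing action is the mean, and H(p) is the variance of p.
a_grid = np.linspace(-1.0, 4.0, 2001)
sq_risk = ((vals[None, :] - a_grid[:, None]) ** 2 * p[None, :]).sum(axis=1)
print("min expected squared loss:", sq_risk.min())        # ~ Var_p[theta]
print("variance of p            :", float((p * vals**2).sum() - (p @ vals) ** 2))

# 2) Log loss, actions are probability vectors q:
#    E_p[-log q(theta)] is minimized at q = p, giving the Shannon entropy.
def log_loss_risk(q):
    return float(-(p * np.log(q)).sum())

shannon = float(-(p * np.log(p)).sum())
print("Shannon entropy of p     :", shannon)
print("risk of reporting q = p  :", log_loss_risk(p))
q_other = np.array([0.25, 0.25, 0.25, 0.25])
print("risk of reporting uniform:", log_loss_risk(q_other), "(>= Shannon entropy)")
```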
A Boltzmann machine for the organization of intelligent machines
In the present technological society, there is a major need to build machines that would execute intelligent tasks operating in uncertain environments with minimum interaction with a human operator. Although some designers have built smart robots utilizing heuristic ideas, there is no systematic approach to design such machines in an engineering manner. Recently, cross-disciplinary research from the fields of computers, systems, AI and information theory has served to set the foundations of the emerging area of the design of intelligent machines. Since 1977 Saridis has been developing an approach, defined as Hierarchical Intelligent Control, designed to organize, coordinate and execute anthropomorphic tasks by a machine with minimum interaction with a human operator. This approach utilizes analytical (probabilistic) models to describe and control the various functions of the intelligent machine, structured by the intuitively defined principle of Increasing Precision with Decreasing Intelligence (IPDI) (Saridis 1979). This principle, even though it resembles the managerial structure of organizational systems (Levis 1988), has been derived on an analytic basis by Saridis (1988). The purpose is to derive analytically a Boltzmann machine suitable for optimal connection of nodes in a neural net (Fahlman, Hinton, Sejnowski, 1985). Then this machine will serve to search for the optimal design of the organization level of an intelligent machine. In order to accomplish this, some mathematical theory of the intelligent machines will first be outlined. Then some definitions of the variables associated with the principle, like machine intelligence, machine knowledge, and precision, will be made (Saridis, Valavanis 1988). Then a procedure to establish the Boltzmann machine on an analytic basis will be presented and illustrated by an example in designing the organization level of an Intelligent Machine. A new search technique, the Modified Genetic Algorithm, is presented and proved to converge to the minimum of a cost function. Finally, simulations will show the effectiveness of a variety of search techniques for the intelligent machine.
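As a generic illustration only (not Saridis's entropy-based cost or the Modified Genetic Algorithm), the kind of Boltzmann-rule stochastic search such a machine performs can be sketched as follows; the binary connection encoding and the toy cost function are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def cost(state):
    """Hypothetical cost of one candidate organization (binary connection vector)."""
    target = np.array([1, 0, 1, 1, 0, 1, 0, 0])
    return float(np.sum(state != target))      # toy cost: distance to a preferred design

state = rng.integers(0, 2, size=8)
temperature = 2.0
for step in range(2000):
    flip = rng.integers(0, len(state))         # propose flipping one connection
    candidate = state.copy()
    candidate[flip] ^= 1
    delta = cost(candidate) - cost(state)
    # Boltzmann acceptance rule: always accept improvements, accept worse
    # candidates with probability exp(-delta / T).
    if delta <= 0 or rng.random() < np.exp(-delta / temperature):
        state = candidate
    temperature *= 0.998                       # slow cooling schedule
print("final state:", state, "cost:", cost(state))
```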
Justification of Logarithmic Loss via the Benefit of Side Information
We consider a natural measure of relevance: the reduction in optimal
prediction risk in the presence of side information. For any given loss
function, this relevance measure captures the benefit of side information for
performing inference on a random variable under this loss function. When such a
measure satisfies a natural data processing property, and the random variable
of interest has alphabet size greater than two, we show that it is uniquely
characterized by the mutual information, and the corresponding loss function
coincides with logarithmic loss. In doing so, our work provides a new
characterization of mutual information, and justifies its use as a measure of
relevance. When the alphabet is binary, we characterize the only admissible
forms the measure of relevance can assume while obeying the specified data
processing property. Our results naturally extend to measuring causal influence
between stochastic processes, where we unify different causal-inference
measures in the literature as instantiations of directed information
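The identity underlying this characterization is easy to check numerically: under logarithmic loss, the reduction in optimal prediction risk obtained from side information Y is H(X) - H(X|Y), which equals the mutual information I(X;Y). The joint distribution below is arbitrary.

```python
import numpy as np

# Joint distribution P(X, Y) over a 3x2 alphabet (rows: X, columns: Y).
pxy = np.array([[0.20, 0.10],
                [0.05, 0.25],
                [0.15, 0.25]])
px = pxy.sum(axis=1)
py = pxy.sum(axis=0)

def H(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Optimal log-loss prediction risk for X without side information is H(X);
# with side information Y it is the conditional entropy H(X|Y).
H_x = H(px)
H_x_given_y = H(pxy.flatten()) - H(py)            # H(X,Y) - H(Y)
reduction = H_x - H_x_given_y

# Mutual information computed directly from the definition.
mi = float(sum(pxy[i, j] * np.log2(pxy[i, j] / (px[i] * py[j]))
               for i in range(3) for j in range(2) if pxy[i, j] > 0))

print(f"risk reduction H(X) - H(X|Y) = {reduction:.6f} bits")
print(f"mutual information I(X;Y)   = {mi:.6f} bits")
```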
On the information-theoretic formulation of network participation
The participation coefficient is a widely used metric of the diversity of a
node's connections with respect to a modular partition of a network. An
information-theoretic formulation of this concept of connection diversity,
referred to here as participation entropy, has been introduced as the Shannon
entropy of the distribution of module labels across a node's connected
neighbors. While diversity metrics have been studied theoretically in other
literatures, including to index species diversity in ecology, many of these
results have not previously been applied to networks. Here we show that the
participation coefficient is a first-order approximation to participation
entropy and use the desirable additive properties of entropy to develop new
metrics of connection diversity with respect to multiple labelings of nodes in
a network, as joint and conditional participation entropies. The
information-theoretic formalism developed here allows new and more subtle types
of nodal connection patterns in complex networks to be studied
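As a concrete sketch (for a hypothetical node and module labeling), the participation coefficient is the quadratic (Gini) diversity 1 - Σ_m p_m² of the distribution of module labels over the node's neighbors, while participation entropy is the Shannon entropy of the same distribution; the log base and example labels below are choices made for illustration.

```python
import numpy as np
from collections import Counter

# Module labels of a node's neighbors (hypothetical example).
neighbor_modules = ["A", "A", "A", "B", "B", "C"]

counts = np.array(list(Counter(neighbor_modules).values()), dtype=float)
p = counts / counts.sum()          # fraction of the node's links going to each module

# Participation coefficient: 1 - sum_m p_m^2 (quadratic/Gini diversity).
participation_coefficient = 1.0 - float((p ** 2).sum())

# Participation entropy: Shannon entropy of the same distribution of module labels.
participation_entropy = float(-(p * np.log(p)).sum())

print(f"link fractions per module : {p}")
print(f"participation coefficient : {participation_coefficient:.4f}")
print(f"participation entropy     : {participation_entropy:.4f} nats")
```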