Collision Entropy Estimation in a One-Line Formula
We address the open question of how best to estimate the collision entropy, also called quadratic or second-order Rényi entropy. Integer-order Rényi entropies are synthetic indices useful for characterizing probability distributions. In recent decades, numerous studies have sought valid estimates of them from experimental data, so as to derive suitable classification methods for the underlying processes, but optimal solutions have not yet been reached. Limited to the estimation of collision entropy, a one-line formula is presented here. The results of specific Monte Carlo experiments give evidence of the validity of this estimator even for very low densities of data spread across high-dimensional sample spaces. The method's strengths are unbiased consistency, generality, and minimal computational cost.
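For readers who want to experiment, the classical coincidence-counting estimator of collision entropy can itself be written in essentially one line: the fraction of colliding sample pairs is an unbiased estimator of the collision probability Σ_x p(x)², and its negative log gives the entropy in bits. The sketch below uses this textbook estimator for illustration; it is not claimed to be the paper's formula, and the function name is illustrative.

```python
import numpy as np

def collision_entropy_estimate(samples):
    """Coincidence-counting estimate of collision (order-2 Renyi) entropy.

    The fraction of unordered sample pairs that collide is an unbiased
    estimator of the collision probability sum_x p(x)^2; taking -log2
    gives the entropy in bits (the log step introduces a small bias).
    Assumes at least one collision occurred among the samples.
    """
    n = len(samples)
    _, counts = np.unique(samples, return_counts=True)
    colliding_pairs = np.sum(counts * (counts - 1)) / 2.0
    total_pairs = n * (n - 1) / 2.0
    return -np.log2(colliding_pairs / total_pairs)

# A fair 8-sided die has collision entropy log2(8) = 3 bits.
rng = np.random.default_rng(0)
print(collision_entropy_estimate(rng.integers(0, 8, size=100_000)))
```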
LIPIcs
We revisit the problem of estimating the entropy of discrete distributions from independent samples, studied recently by Acharya, Orlitsky, Suresh and Tyagi (SODA 2015), improving their upper and lower bounds on the necessary sample size n. For estimating Rényi entropy of order α, up to constant accuracy and error probability, we show the following:
* Upper bounds n = O(1) · 2^{(1-1/α)H_α} for integer α > 1, as the worst case over distributions with Rényi entropy equal to H_α.
* Lower bounds n = Ω(1) · K^{1-1/α} for any real α > 1, with the constant being an inverse polynomial of the accuracy, as the worst case over all distributions on K elements.
Our upper bounds essentially replace the alphabet size by a factor exponential in the entropy, which offers improvements especially in low- or medium-entropy regimes (interesting, for example, in anomaly detection). As for the lower bounds, our proof explicitly shows how the complexity depends on both alphabet size and accuracy, partially solving an open problem posed in previous works. The argument for the upper bounds derives a clean identity for the variance of the falling-power sum of a multinomial distribution. Our approach for the lower bounds uses convex optimization to find a distribution with possibly worse estimation performance, and may be of independent interest as a tool for working with Le Cam's two-point method.
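The falling-power sums mentioned in the abstract underlie a simple estimator for integer orders: Σ_x n_x^(α) / n^(α), where m^(α) = m(m-1)···(m-α+1), is an unbiased estimator of the power sum Σ_x p(x)^α under multinomial sampling, and H_α then follows by a log transform. The sketch below shows this textbook estimator, not the paper's analysis; function names are illustrative.

```python
import numpy as np

def falling_power(m, k):
    """Falling power m^(k) = m * (m-1) * ... * (m-k+1), elementwise."""
    m = np.asarray(m, dtype=float)
    out = np.ones_like(m)
    for j in range(k):
        out = out * (m - j)
    return out

def renyi_entropy_estimate(samples, alpha):
    """Estimate order-alpha Renyi entropy (integer alpha > 1) in bits.

    sum_x falling_power(n_x, alpha) / falling_power(n, alpha) is an
    unbiased estimator of the power sum sum_x p(x)^alpha under
    multinomial sampling; the final log transform adds a small bias.
    """
    n = len(samples)
    _, counts = np.unique(samples, return_counts=True)
    power_sum = falling_power(counts, alpha).sum() / falling_power(n, alpha)
    return float(np.log2(power_sum) / (1 - alpha))

# Uniform distribution on 16 symbols: H_alpha = 4 bits for every alpha.
rng = np.random.default_rng(1)
print(renyi_entropy_estimate(rng.integers(0, 16, size=200_000), alpha=3))
```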
FERMI: Fair Empirical Risk Minimization via Exponential Rényi Mutual Information
Despite the success of large-scale empirical risk minimization (ERM) at achieving high accuracy across a variety of machine learning tasks, fair ERM is hindered by the incompatibility of fairness constraints with stochastic optimization. In this paper, we propose the fair empirical risk minimization via exponential Rényi mutual information (FERMI) framework. FERMI is built on a stochastic estimator for exponential Rényi mutual information (ERMI), an information divergence measuring the degree of dependence of predictions on sensitive attributes. Theoretically, we show that ERMI upper bounds existing popular fairness violation metrics, so controlling ERMI provides guarantees on these other commonly used violations. We derive an unbiased estimator for ERMI, which we use to develop the FERMI algorithm. We prove that FERMI converges for the demographic parity, equalized odds, and equal opportunity notions of fairness in stochastic optimization. Empirically, we show that FERMI is amenable to large-scale problems with multiple (non-binary) sensitive attributes and non-binary targets. Extensive experiments show that FERMI achieves the most favorable tradeoffs between fairness violation and test accuracy across all tested setups compared with state-of-the-art baselines for demographic parity, equalized odds, and equal opportunity. These benefits are especially significant for non-binary classification with large sensitive sets and small batch sizes, showcasing the effectiveness of the FERMI objective and the developed stochastic algorithm for solving it.