Proximity Operators of Discrete Information Divergences
Information divergences allow one to assess how close two distributions are
from each other. Among the large panel of available measures, a special
attention has been paid to convex $\varphi$-divergences, such as
Kullback-Leibler, Jeffreys-Kullback, Hellinger, Chi-Square, Renyi, and
I$_\alpha$ divergences. While $\varphi$-divergences have been extensively
studied in convex analysis, their use in optimization problems often remains
challenging. In this regard, one of the main shortcomings of existing methods
is that the minimization of $\varphi$-divergences is usually performed with
respect to one of their arguments, possibly within alternating optimization
techniques. In this paper, we overcome this limitation by deriving new
closed-form expressions for the proximity operator of such two-variable
functions. This makes it possible to employ standard proximal methods for
efficiently solving a wide range of convex optimization problems involving
$\varphi$-divergences. In addition, we show that these proximity operators are
useful to compute the epigraphical projection of several functions of practical
interest. The proposed proximal tools are numerically validated in the context
of optimal query execution within database management systems, where the
problem of selectivity estimation plays a central role. Experiments are carried
out on small- to large-scale scenarios.
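A minimal sketch of the object at stake, assuming the standard definition of a joint (two-variable) proximity operator and illustrating it with the generalized Kullback-Leibler divergence; the paper's closed-form expressions and scaling conventions are not reproduced here:
\[
  \operatorname{prox}_{\gamma \Phi}(y, z)
  \;=\; \operatorname*{arg\,min}_{(u, v)} \; \gamma\,\Phi(u, v)
  + \tfrac{1}{2}\lVert u - y \rVert^2 + \tfrac{1}{2}\lVert v - z \rVert^2,
  \qquad
  \Phi_{\mathrm{KL}}(u, v) \;=\; \sum_i \Big( u_i \log\frac{u_i}{v_i} - u_i + v_i \Big).
\]
Having such a prox available in closed form is what lets the divergence enter a proximal splitting algorithm as a single block in both arguments, instead of through alternating minimization over $u$ and $v$.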
Function-space regularized R\'enyi divergences
We propose a new family of regularized R\'enyi divergences parametrized not
only by the order $\alpha$ but also by a variational function space. These new
objects are defined by taking the infimal convolution of the standard R\'enyi
divergence with the integral probability metric (IPM) associated with the
chosen function space. We derive a novel dual variational representation that
can be used to construct numerically tractable divergence estimators. This
representation avoids risk-sensitive terms and therefore exhibits lower
variance, making it well-behaved when $\alpha > 1$; this addresses a notable
weakness of prior approaches. We prove several properties of these new
divergences, showing that they interpolate between the classical R\'enyi
divergences and IPMs. We also study the $\alpha \to \infty$ limit, which leads to
a regularized worst-case-regret and a new variational representation in the
classical case. Moreover, we show that the proposed regularized R\'enyi
divergences inherit features from IPMs such as the ability to compare
distributions that are not absolutely continuous, e.g., empirical measures and
distributions with low-dimensional support. We present numerical results on
both synthetic and real datasets, showing the utility of these new divergences
in both estimation and GAN training applications; in particular, we demonstrate
significantly reduced variance and improved training performance.
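Schematically, and only as a sketch consistent with the description above (the notation $R_\alpha$, $W_\Gamma$ and the normalization are assumptions, not the paper's exact definition), the regularized divergence is the infimal convolution of the classical R\'enyi divergence with the IPM induced by the chosen function space $\Gamma$:
\[
  R_{\alpha}^{\Gamma}(P \,\|\, Q)
  \;=\; \inf_{\eta} \Big\{ R_{\alpha}(P \,\|\, \eta) + W_{\Gamma}(\eta, Q) \Big\},
  \qquad
  W_{\Gamma}(\eta, Q) \;=\; \sup_{g \in \Gamma} \big| \mathbb{E}_{\eta}[g] - \mathbb{E}_{Q}[g] \big| .
\]
Because the IPM term only needs expectations of test functions, such an object stays finite for pairs like an empirical measure against a distribution with low-dimensional support, which is the behaviour inherited from IPMs mentioned above.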
Addressing GAN Training Instabilities via Tunable Classification Losses
Generative adversarial networks (GANs), modeled as a zero-sum game between a
generator (G) and a discriminator (D), allow generating synthetic data with
formal guarantees. Noting that D is a classifier, we begin by reformulating the
GAN value function using class probability estimation (CPE) losses. We prove a
two-way correspondence between CPE loss GANs and $f$-GANs which minimize
$f$-divergences. We also show that all symmetric $f$-divergences are equivalent
in convergence. In the finite sample and model capacity setting, we define and
obtain bounds on estimation and generalization errors. We specialize these
results to $\alpha$-GANs, defined using $\alpha$-loss, a tunable CPE loss
family parametrized by $\alpha$. We next introduce a class of
dual-objective GANs to address training instabilities of GANs by modeling each
player's objective using $\alpha$-loss to obtain $(\alpha_D, \alpha_G)$-GANs. We
show that the resulting non-zero-sum game simplifies to minimizing an
$f$-divergence under appropriate conditions on $(\alpha_D, \alpha_G)$.
Generalizing this dual-objective formulation using CPE losses, we define and
obtain upper bounds on an appropriately defined estimation error. Finally, we
highlight the value of tuning $(\alpha_D, \alpha_G)$ in alleviating training
instabilities for the synthetic 2D Gaussian mixture ring as well as the large
publicly available Celeb-A and LSUN Classroom image datasets.
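For concreteness, a small sketch of the kind of tunable CPE loss referred to above, using one standard form of the $\alpha$-loss from the literature; the exact parametrization used in the paper and the GAN value-function plumbing are not reproduced, and the function below is illustrative only.

    import numpy as np

    def alpha_loss(p_true, alpha):
        # One standard form of the tunable alpha-loss, evaluated at the probability
        # assigned to the true class: (alpha / (alpha - 1)) * (1 - p**(1 - 1/alpha)).
        # It recovers log-loss as alpha -> 1 and the soft 0-1 loss (1 - p) as alpha -> inf.
        p = np.clip(p_true, 1e-12, 1.0)
        if np.isinf(alpha):
            return 1.0 - p
        if np.isclose(alpha, 1.0):
            return -np.log(p)
        return (alpha / (alpha - 1.0)) * (1.0 - p ** (1.0 - 1.0 / alpha))

    # Larger alpha flattens the penalty on confident mistakes; smaller alpha sharpens it.
    for a in (0.5, 1.0, 2.0, np.inf):
        print(a, alpha_loss(0.8, a))

Giving the generator and the discriminator different values of this tuning parameter is what produces the dual-objective, non-zero-sum formulation described above.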
Lower Bounds on the Bayesian Risk via Information Measures
This paper focuses on parameter estimation and introduces a new method for
lower bounding the Bayesian risk. The method allows for the use of virtually
\emph{any} information measure, including R\'enyi's $\alpha$,
$f$-Divergences, and Sibson's $\alpha$-Mutual Information. The approach
considers divergences as functionals of measures and exploits the duality
between spaces of measures and spaces of functions. In particular, we show that
one can lower bound the risk with any information measure by upper bounding its
dual via Markov's inequality. We are thus able to provide estimator-independent
impossibility results thanks to the Data-Processing Inequalities that
divergences satisfy. The results are then applied to settings of interest
involving both discrete and continuous parameters, including the
``Hide-and-Seek'' problem, and compared to the state-of-the-art techniques. An
important observation is that the behaviour of the lower bound in the number of
samples is influenced by the choice of the information measure. We leverage
this by introducing a new divergence inspired by the ``Hockey-Stick''
Divergence, which is demonstrated empirically to provide the largest
lower bound across all considered settings. If the observations are subject to
privatisation, stronger impossibility results can be obtained via Strong
Data-Processing Inequalities. The paper also discusses some generalisations and
alternative directions.
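A hedged sketch of the two-step argument described above, with $W$ the parameter, $\hat{W}$ the estimator and $\ell$ the loss (the specific bounding function depends on the chosen information measure and is not reproduced here). Markov's inequality first turns the Bayesian risk into a probability under the joint law,
\[
  \mathbb{E}\big[\ell(W, \hat{W})\big]
  \;\ge\; \rho\, \mathbb{P}_{W\hat{W}}\big[\ell(W, \hat{W}) \ge \rho\big]
  \;=\; \rho \Big( 1 - \mathbb{P}_{W\hat{W}}\big[\ell(W, \hat{W}) < \rho\big] \Big),
  \qquad \rho > 0,
\]
and the remaining joint probability is then upper-bounded through its value under the product of the marginals together with the selected divergence or mutual information; the data-processing inequalities these measures satisfy are what make the resulting bound estimator-independent.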