FaiREE: Fair Classification with Finite-Sample and Distribution-Free Guarantee
Algorithmic fairness plays an increasingly critical role in machine learning
research. Several group fairness notions and algorithms have been proposed.
However, the fairness guarantee of existing fair classification methods mainly
depends on specific data distributional assumptions, often requiring large
sample sizes, and fairness could be violated when there is a modest number of
samples, which is often the case in practice. In this paper, we propose FaiREE,
a fair classification algorithm that can satisfy group fairness constraints
with finite-sample and distribution-free theoretical guarantees. FaiREE can be
adapted to satisfy various group fairness notions (e.g., Equality of
Opportunity, Equalized Odds, Demographic Parity, etc.) and achieve the optimal
accuracy. These theoretical guarantees are further supported by experiments on
both synthetic and real data. FaiREE is shown to have favorable performance
over state-of-the-art algorithms.
Comment: 45 pages, 9 figures
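To make two of the group fairness notions named in the abstract concrete, the following minimal sketch computes empirical Demographic Parity and Equality of Opportunity gaps for a binary classifier and a binary protected attribute. It is not the FaiREE algorithm; the function names and the toy data are hypothetical and serve only to illustrate what these constraints measure.

```python
def rate(preds, mask):
    """Fraction of positive predictions among the examples selected by mask."""
    selected = [p for p, m in zip(preds, mask) if m]
    return sum(selected) / len(selected)


def demographic_parity_gap(preds, groups):
    """|P(Yhat=1 | A=0) - P(Yhat=1 | A=1)| estimated from the sample."""
    r0 = rate(preds, [g == 0 for g in groups])
    r1 = rate(preds, [g == 1 for g in groups])
    return abs(r0 - r1)


def equal_opportunity_gap(preds, labels, groups):
    """|P(Yhat=1 | A=0, Y=1) - P(Yhat=1 | A=1, Y=1)| estimated from the sample."""
    r0 = rate(preds, [g == 0 and y == 1 for g, y in zip(groups, labels)])
    r1 = rate(preds, [g == 1 and y == 1 for g, y in zip(groups, labels)])
    return abs(r0 - r1)


# Hypothetical toy data: predictions, true labels, protected attribute.
preds = [1, 0, 1, 1, 0, 1, 0, 0]
labels = [1, 0, 1, 0, 1, 1, 0, 1]
groups = [0, 0, 0, 0, 1, 1, 1, 1]

dp_gap = demographic_parity_gap(preds, groups)
eo_gap = equal_opportunity_gap(preds, labels, groups)
print(dp_gap, eo_gap)
```

A gap of zero means the constraint holds exactly on the sample. With few samples such estimates are noisy, which is the regime the abstract is concerned with.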
On the Generalization Properties of Diffusion Models
Diffusion models are a class of generative models that serve to establish a
stochastic transport map between an empirically observed, yet unknown, target
distribution and a known prior. Despite their remarkable success in real-world
applications, a theoretical understanding of their generalization capabilities
remains underdeveloped. This work embarks on a comprehensive theoretical
exploration of the generalization attributes of diffusion models. We establish
theoretical estimates of the generalization gap that evolves in tandem with the
training dynamics of score-based diffusion models, suggesting a polynomially
small generalization error in both the sample size
and the model capacity, evading the curse of dimensionality (i.e., not
exponentially large in the data dimension) when early-stopped. Furthermore, we
extend our quantitative analysis to a data-dependent scenario, wherein target
distributions are portrayed as a succession of densities with progressively
increasing distances between modes. This precisely elucidates the adverse
effect of "modes shift" in ground truths on the model generalization. Moreover,
these estimates are not solely theoretical constructs but have also been
confirmed through numerical simulations. Our findings contribute to the
rigorous understanding of diffusion models' generalization properties and
provide insights that may guide practical applications.
Comment: 42 pages, 11 figures
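As a rough illustration of the transport between a data distribution and a known prior, the sketch below noises samples from a two-mode target with an Ornstein-Uhlenbeck forward process. The choice of this process, the mode locations, and all names are assumptions made for illustration; the abstract does not specify them. No score model is trained here, so only the forward (data-to-prior) direction is shown.

```python
import math
import random


def forward_noise(x0, t, rng):
    """Sample x_t given x_0 for the OU process dx = -x dt + sqrt(2) dW.

    The conditional law is Gaussian with mean exp(-t) * x0 and
    variance 1 - exp(-2t), so x_t approaches N(0, 1) as t grows.
    """
    scale = math.exp(-t)
    std = math.sqrt(1.0 - math.exp(-2.0 * t))
    return scale * x0 + std * rng.gauss(0.0, 1.0)


def mean_and_var(xs):
    """Empirical mean and (biased) variance of a list of floats."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / len(xs)
    return m, v


rng = random.Random(0)
# Hypothetical target: two equally weighted modes at -3 and +3.
data = [rng.choice([-3.0, 3.0]) for _ in range(20000)]
noised = [forward_noise(x, 5.0, rng) for x in data]

data_stats = mean_and_var(data)
noised_stats = mean_and_var(noised)
print(data_stats, noised_stats)
```

The farther apart the modes are, the longer the forward process needs to bring the samples close to the prior. This gives some intuition for why the distance between modes matters, although the paper's analysis concerns the generalization of the learned score rather than this forward step.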
CloudHealth: A Model-Driven Approach to Watch the Health of Cloud Services
Cloud systems are complex and large systems where services provided by
different operators must coexist and eventually cooperate. In such a complex
environment, controlling the health of both the whole environment and the
individual services is essential for reacting promptly and effectively to
misbehaviours, unexpected events, and failures. Although there are solutions to
monitor cloud systems at different granularity levels, how to relate the many
KPIs that can be collected about the health of the system and how health
information can be properly reported to operators are open questions. This
paper reports the early results we achieved in the challenge of monitoring the
health of cloud systems. In particular we present CloudHealth, a model-based
health monitoring approach that can be used by operators to watch specific
quality attributes. The CloudHealth Monitoring Model describes how to
operationalize high level monitoring goals by dividing them into subgoals,
deriving metrics for the subgoals, and using probes to collect the metrics. We
use the CloudHealth Monitoring Model to control the probes that must be
deployed on the target system, the KPIs that are dynamically collected, and the
visualization of the data in dashboards.
Comment: 8 pages, 2 figures, 1 table
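The goal, subgoal, metric, and probe hierarchy described in the abstract can be sketched as a small data structure. This is not the CloudHealth Monitoring Model itself; every class, field, goal, metric, and probe name below is hypothetical and only shows how a high-level goal might determine which probes need to be deployed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Metric:
    name: str
    probe: str  # name of the probe that collects this metric


@dataclass
class Goal:
    name: str
    subgoals: List["Goal"] = field(default_factory=list)
    metrics: List[Metric] = field(default_factory=list)


def required_probes(goal):
    """Return the sorted probe names needed for a goal and all its subgoals."""
    probes = {m.probe for m in goal.metrics}
    for sub in goal.subgoals:
        probes.update(required_probes(sub))
    return sorted(probes)


# Hypothetical monitoring goal split into two subgoals.
reliability = Goal(
    "reliability",
    subgoals=[
        Goal("availability", metrics=[Metric("uptime_ratio", "http_ping_probe")]),
        Goal(
            "fault_tolerance",
            metrics=[
                Metric("restart_count", "container_probe"),
                Metric("error_rate", "http_ping_probe"),
            ],
        ),
    ],
)

probes = required_probes(reliability)
print(probes)
```

Keeping the probe name on each metric lets one traversal of the goal tree yield the deployment list, and a probe shared by several metrics is listed only once.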