
    FaiREE: Fair Classification with Finite-Sample and Distribution-Free Guarantee

    Algorithmic fairness plays an increasingly critical role in machine learning research, and several group fairness notions and algorithms have been proposed. However, the fairness guarantees of existing fair classification methods mainly rely on specific data distributional assumptions and often require large sample sizes; fairness can be violated when only a modest number of samples is available, which is often the case in practice. In this paper, we propose FaiREE, a fair classification algorithm that satisfies group fairness constraints with finite-sample, distribution-free theoretical guarantees. FaiREE can be adapted to various group fairness notions (e.g., Equality of Opportunity, Equalized Odds, Demographic Parity) and achieves the optimal accuracy. These theoretical guarantees are further supported by experiments on both synthetic and real data, where FaiREE is shown to have favorable performance over state-of-the-art algorithms.
    Comment: 45 pages, 9 figures
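    As a rough illustration of the group fairness notions named in the abstract (not of the FaiREE algorithm itself), the sketch below computes empirical Demographic Parity and Equality of Opportunity gaps for a binary classifier. The inputs `y`, `yhat`, and `a` are hypothetical arrays introduced for this example.

```python
import numpy as np

def fairness_gaps(y, yhat, a):
    """Empirical group-fairness gaps for binary labels/predictions.

    y, yhat, a are 0/1 arrays: true labels, predicted labels, and a
    binary protected attribute (hypothetical inputs for illustration).
    """
    g0, g1 = (a == 0), (a == 1)

    # Demographic Parity gap: |P(yhat=1 | a=0) - P(yhat=1 | a=1)|
    dp_gap = abs(yhat[g0].mean() - yhat[g1].mean())

    # Equality of Opportunity gap: difference in true positive rates
    tpr0 = yhat[g0 & (y == 1)].mean()
    tpr1 = yhat[g1 & (y == 1)].mean()
    eo_gap = abs(tpr0 - tpr1)

    return {"demographic_parity_gap": dp_gap, "equal_opportunity_gap": eo_gap}

# Toy usage on synthetic data with a mildly group-dependent predictor
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 1000)
a = rng.integers(0, 2, 1000)
yhat = (rng.random(1000) < 0.5 + 0.1 * a).astype(int)
print(fairness_gaps(y, yhat, a))
```

    A post-processing method in the spirit described above would then choose group-specific decision thresholds so that such gaps stay within a prescribed tolerance; the specific selection rule FaiREE uses is given in the paper, not here.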

    On the Generalization Properties of Diffusion Models

    Diffusion models are a class of generative models that establish a stochastic transport map between an empirically observed, yet unknown, target distribution and a known prior. Despite their remarkable success in real-world applications, a theoretical understanding of their generalization capabilities remains underdeveloped. This work embarks on a comprehensive theoretical exploration of the generalization attributes of diffusion models. We establish theoretical estimates of the generalization gap that evolve in tandem with the training dynamics of score-based diffusion models, suggesting a polynomially small generalization error of $O(n^{-2/5}+m^{-4/5})$ in both the sample size $n$ and the model capacity $m$, evading the curse of dimensionality (i.e., not exponentially large in the data dimension) when early-stopped. Furthermore, we extend our quantitative analysis to a data-dependent scenario, wherein target distributions are portrayed as a succession of densities with progressively increasing distances between modes. This precisely elucidates the adverse effect of "mode shift" in ground truths on model generalization. Moreover, these estimates are not solely theoretical constructs but have also been confirmed through numerical simulations. Our findings contribute to a rigorous understanding of diffusion models' generalization properties and provide insights that may guide practical applications.
    Comment: 42 pages, 11 figures
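    To make the stated rate concrete, the short sketch below evaluates the $n^{-2/5}+m^{-4/5}$ dependence for a few sample sizes and model capacities. The constants hidden in the $O(\cdot)$ notation are not given here, so the numbers only illustrate how the bound scales, not the actual size of the generalization gap.

```python
# Illustrative scaling of the stated generalization-error rate
# O(n^{-2/5} + m^{-4/5}); hidden constants are unknown, so only the
# relative decay in n (sample size) and m (model capacity) is meaningful.
def rate(n, m):
    return n ** (-2 / 5) + m ** (-4 / 5)

for n in (10**3, 10**4, 10**5):
    for m in (10**2, 10**3):
        print(f"n={n:>6}, m={m:>4}: rate ~ {rate(n, m):.4f}")
```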

    CloudHealth: A Model-Driven Approach to Watch the Health of Cloud Services

    Cloud systems are large, complex systems in which services provided by different operators must coexist and eventually cooperate. In such an environment, controlling the health of both the whole environment and the individual services is extremely important to react timely and effectively to misbehaviours, unexpected events, and failures. Although there are solutions to monitor cloud systems at different granularity levels, how to relate the many KPIs that can be collected about the health of the system, and how health information can be properly reported to operators, remain open questions. This paper reports the early results we achieved in the challenge of monitoring the health of cloud systems. In particular, we present CloudHealth, a model-based health monitoring approach that operators can use to watch specific quality attributes. The CloudHealth Monitoring Model describes how to operationalize high-level monitoring goals by dividing them into subgoals, deriving metrics for the subgoals, and using probes to collect the metrics. We use the CloudHealth Monitoring Model to control the probes that must be deployed on the target system, the KPIs that are dynamically collected, and the visualization of the data in dashboards.
    Comment: 8 pages, 2 figures, 1 table
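    As a hypothetical sketch of the goal decomposition the abstract describes (high-level goals split into subgoals, subgoals mapped to metrics, metrics collected by probes), one might represent the hierarchy with simple data classes. The class and field names below are illustrative assumptions, not the paper's actual model.

```python
from dataclasses import dataclass, field

@dataclass
class Probe:
    name: str        # agent deployed on the target system (hypothetical)
    collects: str    # KPI the probe gathers

@dataclass
class Metric:
    name: str
    unit: str
    probes: list[Probe] = field(default_factory=list)

@dataclass
class SubGoal:
    description: str
    metrics: list[Metric] = field(default_factory=list)

@dataclass
class MonitoringGoal:
    quality_attribute: str   # e.g. "availability"
    subgoals: list[SubGoal] = field(default_factory=list)

# Toy instance: operationalizing an availability goal
goal = MonitoringGoal(
    quality_attribute="availability",
    subgoals=[
        SubGoal(
            description="service endpoints respond within SLA",
            metrics=[
                Metric(
                    name="http_success_rate",
                    unit="%",
                    probes=[Probe(name="http_checker", collects="2xx responses")],
                )
            ],
        )
    ],
)
print(goal.quality_attribute, "->", goal.subgoals[0].metrics[0].name)
```

    Walking such a structure top-down would tell a deployment component which probes to install and which KPIs to collect, which is the role the abstract attributes to the CloudHealth Monitoring Model.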