4 research outputs found

FaiREE: Fair Classification with Finite-Sample and Distribution-Free Guarantee

Author: Li Puheng
Zhang Linjun
Zou James
Publication date: 28/11/2022

Algorithmic fairness plays an increasingly critical role in machine learning research. Several group fairness notions and algorithms have been proposed. However, the fairness guarantee of existing fair classification methods mainly depends on specific data distributional assumptions, often requiring large sample sizes, and fairness could be violated when there is a modest number of samples, which is often the case in practice. In this paper, we propose FaiREE, a fair classification algorithm that can satisfy group fairness constraints with finite-sample and distribution-free theoretical guarantees. FaiREE can be adapted to satisfy various group fairness notions (e.g., Equality of Opportunity, Equalized Odds, Demographic Parity, etc.) and achieve the optimal accuracy. These theoretical guarantees are further supported by experiments on both synthetic and real data. FaiREE is shown to have favorable performance over state-of-the-art algorithms. Comment: 45 pages, 9 figures

arXiv.org e-Print Archive

On the Generalization Properties of Diffusion Models

Author: Bian Jiang
Li Puheng
Li Zhong
Zhang Huishuai
Publication date: 12/01/2024

Diffusion models are a class of generative models that serve to establish a stochastic transport map between an empirically observed, yet unknown, target distribution and a known prior. Despite their remarkable success in real-world applications, a theoretical understanding of their generalization capabilities remains underdeveloped. This work embarks on a comprehensive theoretical exploration of the generalization attributes of diffusion models. We establish theoretical estimates of the generalization gap that evolves in tandem with the training dynamics of score-based diffusion models, suggesting a polynomially small generalization error ($O(n^{-2/5}+m^{-4/5})$) on both the sample size $n$ and the model capacity $m$, evading the curse of dimensionality (i.e., not exponentially large in the data dimension) when early-stopped. Furthermore, we extend our quantitative analysis to a data-dependent scenario, wherein target distributions are portrayed as a succession of densities with progressively increasing distances between modes. This precisely elucidates the adverse effect of "modes shift" in ground truths on the model generalization. Moreover, these estimates are not solely theoretical constructs but have also been confirmed through numerical simulations. Our findings contribute to the rigorous understanding of diffusion models' generalization properties and provide insights that may guide practical applications. Comment: 42 pages, 11 figures

arXiv.org e-Print Archive

CloudHealth: A Model-Driven Approach to Watch the Health of Cloud Services

Author: Adjoyan Seza
Bhardwaj Sushil
Gazzola Luca
Ibrahim
Li Fuliang
Naik Priyanka
Schneider Chris
Zhang Puheng
Zheng Xianrong
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 14/03/2018

Cloud systems are complex and large systems where services provided by different operators must coexist and eventually cooperate. In such a complex environment, controlling the health of both the whole environment and the individual services is extremely important to timely and effectively react to misbehaviours, unexpected events, and failures. Although there are solutions to monitor cloud systems at different granularity levels, how to relate the many KPIs that can be collected about the health of the system and how health information can be properly reported to operators are open questions. This paper reports the early results we achieved in the challenge of monitoring the health of cloud systems. In particular we present CloudHealth, a model-based health monitoring approach that can be used by operators to watch specific quality attributes. The CloudHealth Monitoring Model describes how to operationalize high level monitoring goals by dividing them into subgoals, deriving metrics for the subgoals, and using probes to collect the metrics. We use the CloudHealth Monitoring Model to control the probes that must be deployed on the target system, the KPIs that are dynamically collected, and the visualization of the data in dashboards. Comment: 8 pages, 2 figures, 1 table

arXiv.org e-Print Archive

Crossref

Niobium doping of Li1.2Mn0.54Ni0.13Co0.13O2 cathode materials with enhanced structural stability and electrochemical performance

Author: Ding
Du
Feng
Honglei Li
Hy
Hy
Jiajie Li
Johnson
Kwade
Lee
Li
Li
Li
Li
Li
Lin
Liu
Lu
Luo
Luo
Lv
Ma
Ma
Meng
Myung
Nayak
Nayak
Pan
Piskin
Puheng Yang
Qing
Rozier
Shichao Zhang
Thackeray
Wang
Wang
Wu
Xu
Xu
Xu
Yalan Xing
Yang
Yu
Yu
Zhang
Zhang
Zhang
Zhao
Zheng
Zhixu Jian
Zhou
Publication venue: 'Elsevier BV'

Crossref