Advancing Subgroup Fairness via Sleeping Experts
We study methods for improving fairness to subgroups in settings with overlapping populations and sequential predictions. Classical notions of fairness focus on balancing some property across different populations; in many applications, however, the goal of each group is not to be predicted equally but to be predicted well. We demonstrate that satisfying this guarantee for multiple overlapping groups is not straightforward: even for the simple objective of the unweighted average of false negative and false positive rates, achieving it for overlapping populations can be statistically impossible, even when we are given predictors that perform well separately on each subgroup. On the positive side, we show that when individuals are equally important to each of the groups they belong to, this goal is achievable; to do so, we draw a connection to the sleeping experts literature in online learning. Motivated by the one-sided feedback common in natural settings of interest, we extend our results to such a feedback model. We also provide a game-theoretic interpretation of our results, examining the incentives of participants to join the system and to provide it with full information about predictors they may possess. We end with several interesting open problems concerning the strength of the guarantees that can be achieved in a computationally efficient manner.
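The sleeping-experts connection the abstract draws can be illustrated with the classic "specialists" update, in which only the experts that are awake in a round (here, the overlapping groups an individual belongs to) predict and get updated. A minimal sketch, assuming a multiplicative-weights update that preserves the total weight of the awake pool; the function name and renormalization choice are illustrative, not the paper's algorithm:

```python
import numpy as np

def specialists_step(weights, awake, losses, eta=0.5):
    """One round of a specialists-style (sleeping experts) update.

    weights: current weight per expert
    awake:   boolean mask of experts active this round (e.g. the
             subgroups the current individual belongs to)
    losses:  loss in [0, 1] incurred by each expert this round
             (only awake entries are used)
    """
    w = weights.copy()
    a = np.where(awake)[0]
    # Prediction weights are normalized over awake experts only.
    p = w[a] / w[a].sum()
    mix_loss = float(p @ losses[a])
    # Multiplicative update, rescaled so the total weight of the
    # awake pool is unchanged; sleeping experts are untouched.
    upd = w[a] * np.exp(-eta * losses[a])
    w[a] = upd * (w[a].sum() / upd.sum())
    return w, mix_loss
```

On each round the algorithm's mixture loss tracks the best awake expert; asleep experts keep their weight, so they re-enter later rounds unpenalized.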
A Unifying Perspective on Multi-Calibration: Game Dynamics for Multi-Objective Learning
We provide a unifying framework for the design and analysis of
multicalibrated predictors. By placing the multicalibration problem in the
general setting of multi-objective learning -- where learning guarantees must
hold simultaneously over a set of distributions and loss functions -- we
exploit connections to game dynamics to achieve state-of-the-art guarantees for
a diverse set of multicalibration learning problems. In addition to shedding
light on existing multicalibration guarantees and greatly simplifying their
analysis, our approach also yields improved guarantees, such as obtaining
stronger multicalibration conditions that scale with the square-root of group
size and improving the complexity of k-class multicalibration by an
exponential factor of k. Beyond multicalibration, we use these game dynamics
to address emerging considerations in the study of group fairness and
multi-distribution learning.
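For intuition, multicalibration asks that predictions be calibrated simultaneously on every group in a (possibly overlapping) collection, level set by level set. A brute-force audit of this condition might look as follows; the function name, binning, and tolerance convention here are our own, not the paper's:

```python
import numpy as np

def multicalibration_violations(p, y, groups, n_bins=10, alpha=0.05):
    """Find (group, level-set) pairs where predictions p are miscalibrated.

    p, y   : arrays of predictions in [0, 1] and binary outcomes
    groups : dict mapping group name -> boolean membership mask
             (groups may overlap)
    Returns a list of (group, bin, bias, count) with |bias| > alpha.
    """
    bins = np.minimum((p * n_bins).astype(int), n_bins - 1)
    out = []
    for name, mask in groups.items():
        for b in range(n_bins):
            sel = mask & (bins == b)
            if sel.sum() == 0:
                continue  # empty level set for this group
            bias = float((y[sel] - p[sel]).mean())
            if abs(bias) > alpha:
                out.append((name, b, bias, int(sel.sum())))
    return out
```

A multicalibrated predictor returns an empty list; the game-dynamics view of the paper pits such an auditor against a learner that patches the reported violations.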
High-Dimensional Prediction for Sequential Decision Making
We study the problem of making predictions of an adversarially chosen
high-dimensional state that are unbiased subject to an arbitrary collection of
conditioning events, with the goal of tailoring these events to downstream
decision makers. We give efficient algorithms for solving this problem, as well
as a number of applications that stem from choosing an appropriate set of
conditioning events.
For example, we can efficiently make predictions targeted at polynomially
many decision makers, giving each of them optimal swap regret if they
best-respond to our predictions. We generalize this to online combinatorial
optimization, where the decision makers have a very large action space, to give
the first algorithms offering polynomially many decision makers no regret on
polynomially many subsequences that may depend on their actions and the
context. We apply these results to get efficient no-subsequence-regret
algorithms in extensive-form games (EFGs), yielding a new family of regret
guarantees for EFGs that generalizes some existing EFG regret notions, e.g.
regret to informed causal deviations, and is generally incomparable to other
known such notions.
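Swap regret, the benchmark named above for best-responding decision makers, can be computed empirically for a finished play sequence; a small sketch (an illustrative helper, not code from the paper):

```python
import numpy as np

def empirical_swap_regret(actions, utilities):
    """Empirical swap regret of a play sequence.

    actions   : length-T sequence of actions played (ints in [0, K))
    utilities : T x K array, utilities[t, a] = payoff of action a at
                round t
    Swap regret asks: for each action a actually played, how much could
    the player have gained by replacing every play of a with the single
    best alternative action in hindsight?
    """
    actions = np.asarray(actions)
    utilities = np.asarray(utilities, dtype=float)
    T, K = utilities.shape
    total = 0.0
    for a in range(K):
        rounds = actions == a
        if not rounds.any():
            continue
        gained = utilities[rounds].sum(axis=0)  # total payoff per swap target
        total += gained.max() - gained[a]       # best swap for action a
    return total
```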
Next, we develop a novel transparent alternative to conformal prediction for
building valid online adversarial multiclass prediction sets. We produce class
scores that downstream algorithms can use for producing valid-coverage
prediction sets, as if these scores were the true conditional class
probabilities. We show this implies strong conditional validity guarantees
including set-size-conditional and multigroup-fair coverage for polynomially
many downstream prediction sets. Moreover, our class scores can be guaranteed
to have improved cross-entropy loss and, more generally, improved
performance under any Bregman loss, compared to any collection of benchmark models, yielding a
high-dimensional real-valued version of omniprediction.
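One simple way a downstream algorithm might turn such class scores into a prediction set, treating them as if they were the true conditional class probabilities, is to add classes in decreasing-score order until the claimed probability mass reaches the coverage target. This is an illustrative rule, not the paper's construction:

```python
import numpy as np

def prediction_set(scores, alpha=0.1):
    """Build a multiclass prediction set from class scores.

    Treats the scores as if they were true conditional class
    probabilities: greedily include classes in decreasing-score order
    until the claimed coverage mass reaches 1 - alpha.
    """
    order = np.argsort(scores)[::-1]  # classes by decreasing score
    mass, chosen = 0.0, set()
    for c in order:
        chosen.add(int(c))
        mass += float(scores[c])
        if mass >= 1 - alpha:
            break
    return chosen
```

The point of the abstract's guarantee is that rules like this inherit valid (even conditionally valid) coverage when the scores are unbiased on the right conditioning events, even though the scores need not be the true probabilities.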
Comparative Learning: A Sample Complexity Theory for Two Hypothesis Classes
In many learning theory problems, a central role is played by a hypothesis class: we might assume that the data is labeled according to a hypothesis in the class (usually referred to as the realizable setting), or we might evaluate the learned model by comparing it with the best hypothesis in the class (the agnostic setting). Taking a step beyond these classic setups that involve only a single hypothesis class, we study a variety of problems that involve two hypothesis classes simultaneously.
We introduce comparative learning as a combination of the realizable and agnostic settings in PAC learning: given two binary hypothesis classes S and B, we assume that the data is labeled according to a hypothesis in the source class S and require the learned model to achieve an accuracy comparable to the best hypothesis in the benchmark class B. Even when both S and B have infinite VC dimensions, comparative learning can still have a small sample complexity. We show that the sample complexity of comparative learning is characterized by the mutual VC dimension VC(S,B) which we define to be the maximum size of a subset shattered by both S and B. We also show a similar result in the online setting, where we give a regret characterization in terms of the analogous mutual Littlestone dimension Ldim(S,B). These results also hold for partial hypotheses.
We additionally show that the insights necessary to characterize the sample complexity of comparative learning can be applied to other tasks involving two hypothesis classes. In particular, we characterize the sample complexity of realizable multiaccuracy and multicalibration using the mutual fat-shattering dimension, an analogue of the mutual VC dimension for real-valued hypotheses. This not only solves an open problem proposed by Hu, Peale, Reingold (2022), but also leads to independently interesting results extending classic ones about regression, boosting, and covering numbers to our two-hypothesis-class setting.
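The mutual VC dimension defined above can be computed by brute force for tiny finite classes, which makes the definition concrete. In this hypothetical sketch, hypotheses are encoded as 0/1 label tuples over an n-point domain:

```python
from itertools import combinations

def shatters(H, pts):
    """Does class H (a set of length-n 0/1 label tuples) realize all
    2^|pts| labelings on the index tuple pts?"""
    patterns = {tuple(h[i] for i in pts) for h in H}
    return len(patterns) == 2 ** len(pts)

def mutual_vc(S, B, n):
    """Mutual VC dimension VC(S, B): the size of the largest subset of
    the n-point domain shattered by BOTH classes.  Exhaustive search,
    for illustration on tiny finite classes only."""
    for d in range(n, 0, -1):
        for pts in combinations(range(n), d):
            if shatters(S, pts) and shatters(B, pts):
                return d
    return 0
```

Note that VC(S, B) can be much smaller than min(VC(S), VC(B)): both classes must shatter the *same* subset.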
Taming Wild Price Fluctuations: Monotone Stochastic Convex Optimization with Bandit Feedback
Prices generated by automated price experimentation algorithms often display
wild fluctuations, leading to unfavorable customer perceptions and violations
of individual fairness: e.g., the price seen by a customer can be significantly
higher than what was seen by her predecessors, only to fall once again later.
To address this concern, we propose demand learning under a monotonicity
constraint on the sequence of prices, within the framework of stochastic convex
optimization with bandit feedback.
Our main contribution is the design of the first sublinear-regret algorithms
for monotonic price experimentation for smooth and strongly concave revenue
functions under noisy as well as noiseless bandit feedback. The monotonicity
constraint presents a unique challenge: since any increase (or decrease) in the
decision-levels is final, an algorithm needs to be cautious in its exploration
to avoid over-shooting the optimum. At the same time, minimizing regret
requires that progress be made towards the optimum at a sufficient pace.
Balancing these two goals is particularly challenging under noisy feedback,
where obtaining sufficiently accurate gradient estimates is expensive. Our key
innovation is to utilize conservative gradient estimates to adaptively tailor
the degree of caution to local gradient information, being aggressive far from
the optimum and being increasingly cautious as the prices approach the optimum.
Importantly, we show that our algorithms guarantee the same regret rates (up to
logarithmic factors) as the best achievable rates of regret without the
monotonicity requirement.
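The "cautious exploration" idea can be caricatured in the noiseless case: prices may only move up, so a step is committed only when a conservative check suggests the revenue function is still increasing at the probe point. This is a sketch under strong simplifying assumptions (noiseless feedback, concave revenue); the paper's algorithms additionally handle noisy feedback via conservative gradient estimates:

```python
def monotone_price_search(revenue, p0=0.0, step0=1.0, tol=1e-4, h=1e-5):
    """Monotone (increase-only) search for the revenue-maximizing price.

    Probes q = p + step and commits the increase only if revenue still
    looks locally increasing at q; for concave revenue this means the
    optimum lies above q, so a committed step never overshoots.
    Otherwise the step is halved: since price increases are final, the
    search must be cautious near the peak.
    """
    p, step = p0, step0
    while step > tol:
        q = p + step
        # conservative finite-difference check of the slope at q
        if revenue(q + h) > revenue(q):
            p = q            # safe to commit; monotone, never undone
        else:
            step /= 2        # committing q might overshoot the optimum
    return p
```

The resulting price sequence is non-decreasing and converges to the optimum from below, mirroring the "aggressive far away, cautious near the peak" behavior described above.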
Leveling the Playing Field: Attracting, Engaging, and Advancing People with Disabilities
People with disabilities experience significant challenges in finding employment. Their labor-force participation rate and median income are both less than half those of the civilian workforce, and they work part time 68 percent more often than people without disabilities. These disheartening outcomes persist despite the enactment of significant federal legislation aimed at making the workplace more supportive of and accessible to people with disabilities. The Conference Board Research Working Group (RWG) on Improving Employment Outcomes for People with Disabilities was convened to address how to overcome these disparities. It was sponsored by the Employment and Disability Institute at Cornell University, under a grant from the National Institute on Disability and Rehabilitation Research of the U.S. Department of Education. The RWG members focused on four questions:
1) The business case: Is it advantageous for organizations to employ people with disabilities?
2) Organizational readiness: What should organizations do to create a workplace that enables people with disabilities to thrive and advance?
3) Measurement: How can success for both people with disabilities and the organization itself be determined?
4) Self-disclosure: How can people with disabilities, especially those whose disabilities are not obvious, be encouraged to identify themselves so that resources can be directed toward them and outcomes can be measured?
It’s All the Rage: An Animated Approach to Screening for Postpartum Depression
Postpartum depression is a complication of motherhood that can, in some cases, be severe and even life-threatening. Instruments commonly used to screen for this psychological condition have been challenged by an extensive body of literature, and many mothers go unidentified and untreated for their symptoms. The presented research introduces a newly developed screening instrument for detecting probable postpartum depression using text-free, scenario-based animations grounded in the lived experience of the condition as characterized by empirical research and the existing literature. Developed items were controlled for quality via a think-aloud protocol and alignment studies with subject-matter experts (mothers and clinicians). The revised scale was then piloted within the United States (N=433) using an online survey platform and tested for psychometric quality. Overall, the presented studies show promise for this newly developed tool and give ample reason to believe that animated, scenario-based items are a viable way to screen for postpartum depression in practice, and a method that deserves additional research. Further study is needed on the practicality of this method of assessment and on its potential to increase access and fairness in mental health care. The development of this assessment has the potential to innovate psychological screening practices and clinical assessment for patients at large. Being patient-informed and technology-based, this approach shows promise for lowering the rate of maternal suicide, creating a common language between patients and providers, and expanding the field of psychological assessment by reaching those missed by limited screening tools (i.e., historically marginalized groups).
By freeing instrument development from typical protocol and grounding the entire process in lived experience, animation offers an innovative method of assessment that is not only feasible but also favored by users themselves.