Identifying the returns to lying when the truth is unobserved
Consider an observed binary regressor D and an unobserved binary variable D*, both of which affect some other variable Y. This paper considers nonparametric identification and estimation of the effect of D on Y, conditioning on D* = 0. For example, suppose Y is a person's wage, the unobserved D* indicates whether the person has been to college, and the observed D indicates whether the individual claims to have been to college. This paper then identifies and estimates the difference in average wages between those who falsely claim college experience and those who tell the truth about not having college. We estimate this average return to lying to be about 7% to 20%. Nonparametric identification without observing D* is obtained either by observing a variable V that is roughly analogous to an instrument for ordinary measurement error, or by imposing restrictions on model error moments.
Nonparametric identification of regression models containing a misclassified dichotomous regressor without instruments
This note considers nonparametric identification of a general nonlinear regression model with a dichotomous regressor subject to misclassification error. The available sample information consists of a dependent variable and a set of regressors, one of which is binary and subject to misclassification error with an unknown distribution. Our identification strategy does not parameterize any regression or distribution functions, and does not require additional sample information such as instrumental variables, repeated measurements, or an auxiliary sample. Our main identifying assumption is that the regression model error has a zero conditional third moment. The results include a closed-form solution for the unknown distributions and the regression function.
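In notation suggested by the abstract (the symbols, and the exact conditioning set of the moment restriction, are our guesses rather than the paper's statement), the setup reads:

```latex
\[
Y = m(D^{*}, X) + \varepsilon,
\qquad D^{*} \in \{0,1\} \text{ unobserved},
\qquad D \in \{0,1\} \text{ observed, possibly } D \neq D^{*},
\]
\[
\text{identifying assumption:}\quad
\mathbb{E}\left[\varepsilon^{3} \mid D^{*}, X\right] = 0 .
\]
```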
Nonparametric identification and estimation of nonclassical errors-in-variables models without additional information
This paper considers identification and estimation of a nonparametric regression model with an unobserved discrete covariate. The sample consists of a dependent variable and a set of covariates, one of which is discrete and arbitrarily correlated with the unobserved covariate. The observed discrete covariate has the same support as the unobserved covariate, and can be interpreted as a proxy or mismeasure of the unobserved one, but with a nonclassical measurement error that has an unknown distribution. We obtain nonparametric identification of the model given monotonicity of the regression function and a rank condition that is directly testable given the data. Our identification strategy does not require additional sample information, such as instrumental variables or a secondary sample. We then estimate the model via the method of sieve maximum likelihood, and establish root-n asymptotic normality and semiparametric efficiency of smooth functionals of interest. Two small simulations are presented to illustrate the identification and the estimation results.
Mitigating Discrimination in Insurance with Wasserstein Barycenters
The insurance industry is heavily reliant on predictions of risks based on
characteristics of potential customers. Although the use of said models is
common, researchers have long pointed out that such practices perpetuate
discrimination based on sensitive features such as gender or race. Given that
such discrimination can often be attributed to historical data biases, eliminating,
or at least mitigating, it is desirable. With the shift from traditional models to
machine-learning-based predictions, calls for greater
mitigation have grown anew, as simply excluding sensitive variables from the
pricing process can be shown to be ineffective. In this article, we first
investigate why predictions are a necessity within the industry and why
correcting biases is not as straightforward as simply identifying a sensitive
variable. We then propose to ease the biases through the use of Wasserstein
barycenters instead of simple scaling. To demonstrate the effects and
effectiveness of the approach we employ it on real data and discuss its
implications
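The repair described above, replacing per-group score distributions by their Wasserstein barycenter rather than simple scaling, can be sketched in one dimension. This is a minimal illustration under our own naming and discretization choices, not the authors' implementation:

```python
import numpy as np

def wasserstein_fair_scores(scores, group):
    """Map each group's scores onto the Wasserstein barycenter of the
    per-group score distributions, so the repaired scores no longer
    depend on the sensitive attribute while keeping within-group order."""
    scores = np.asarray(scores, dtype=float)
    group = np.asarray(group)
    groups, counts = np.unique(group, return_counts=True)
    weights = counts / counts.sum()              # group proportions
    grid = np.linspace(0.0, 1.0, 512)            # quantile levels (assumed resolution)
    # empirical quantile function of each group's scores
    quantiles = {a: np.quantile(scores[group == a], grid) for a in groups}
    # 1D barycenter quantile function = weighted average of group quantiles
    bary_q = sum(w * quantiles[a] for a, w in zip(groups, weights))
    fair = np.empty_like(scores)
    for a in groups:
        mask = group == a
        sorted_a = np.sort(scores[mask])
        # empirical CDF rank of each score within its own group
        u = np.searchsorted(sorted_a, scores[mask], side="right") / mask.sum()
        fair[mask] = np.interp(u, grid, bary_q)  # compose rank with barycenter quantile
    return fair
```

Within each group the map is monotone, so relative orderings are preserved; across groups the repaired distributions coincide up to discretization.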
A Sequentially Fair Mechanism for Multiple Sensitive Attributes
In the standard use case of Algorithmic Fairness, the goal is to eliminate
the relationship between a sensitive variable and a corresponding score.
Throughout recent years, the scientific community has developed a host of
definitions and tools to solve this task, which work well in many practical
applications. However, the applicability and effectiveness of these tools and
definitions become less straightforward in the case of multiple sensitive
attributes. To tackle this issue, we propose a sequential framework which
allows fairness to be achieved progressively across a set of sensitive features. We
accomplish this by leveraging multi-marginal Wasserstein barycenters, which
extend the standard notion of Strong Demographic Parity to the case with
multiple sensitive characteristics. This method also provides a closed-form
solution for the optimal, sequentially fair predictor, permitting a clear
interpretation of inter-sensitive feature correlations. Our approach seamlessly
extends to approximate fairness, providing a framework that accommodates the
trade-off between risk and unfairness. This extension permits a targeted
prioritization of fairness improvements for a specific attribute within a set
of sensitive attributes, allowing for case-specific adaptation. A data-driven
estimation procedure for the derived solution is developed, and comprehensive
numerical experiments are conducted on both synthetic and real datasets. Our
empirical findings decisively underscore the practical efficacy of our
post-processing approach in fostering fair decision-making
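One plausible reading of the sequential scheme is to apply a single-attribute quantile repair once per sensitive column, each step operating on the output of the previous one. The helper names and the 512-point quantile grid below are our assumptions, not the paper's code:

```python
import numpy as np

def repair_one(scores, group):
    """Single-attribute repair: push each group's scores onto the
    Wasserstein barycenter of the per-group score distributions."""
    scores = np.asarray(scores, dtype=float)
    group = np.asarray(group)
    groups, counts = np.unique(group, return_counts=True)
    grid = np.linspace(0.0, 1.0, 512)
    q = {a: np.quantile(scores[group == a], grid) for a in groups}
    bary = sum((c / counts.sum()) * q[a] for a, c in zip(groups, counts))
    out = np.empty_like(scores)
    for a in groups:
        m = group == a
        u = np.searchsorted(np.sort(scores[m]), scores[m], side="right") / m.sum()
        out[m] = np.interp(u, grid, bary)
    return out

def sequential_repair(scores, sensitive):
    """Progressively remove dependence on each sensitive column in turn;
    later columns are repaired on scores already fair w.r.t. earlier ones."""
    fair = np.asarray(scores, dtype=float)
    for j in range(sensitive.shape[1]):
        fair = repair_one(fair, sensitive[:, j])
    return fair
```

Because each step is a monotone map within its groups, earlier repairs are approximately preserved when the attributes are not strongly dependent; the paper's multi-marginal formulation addresses the exact joint case.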
Addressing Fairness and Explainability in Image Classification Using Optimal Transport
Algorithmic Fairness and the explainability of potentially unfair outcomes
are crucial for establishing trust and accountability of Artificial
Intelligence systems in domains such as healthcare and policing. Though
significant advances have been made in each of the fields separately, achieving
explainability in fairness applications remains challenging, particularly so in
domains where deep neural networks are used. At the same time, ethical
data-mining has become ever more relevant, as it has been repeatedly shown
that fairness-unaware algorithms result in biased outcomes. Current approaches
focus on mitigating biases in the outcomes of the model, but few attempts have
been made to try to explain \emph{why} a model is biased. To bridge this gap,
we propose a comprehensive approach that leverages optimal transport theory to
uncover the causes and implications of biased regions in images, an approach that
extends readily to tabular data as well. Through the use of Wasserstein barycenters, we
obtain scores that are independent of a sensitive variable but keep their
marginal orderings. This step preserves predictive accuracy and also helps us to
recover the regions most associated with the generation of the biases. Our
findings hold significant implications for the development of trustworthy and
unbiased AI systems, fostering transparency, accountability, and fairness in
critical decision-making scenarios across diverse domains
Parametric Fairness with Statistical Guarantees
Algorithmic fairness has gained prominence due to societal and regulatory
concerns about biases in Machine Learning models. Common group fairness metrics
like Equalized Odds for classification or Demographic Parity for both
classification and regression are widely used and a host of computationally
advantageous post-processing methods have been developed around them. However,
these metrics often limit users from incorporating domain knowledge. Despite
meeting traditional fairness criteria, they can obscure issues related to
intersectional fairness and even replicate unwanted intra-group biases in the
resulting fair solution. To avoid this narrow perspective, we extend the
concept of Demographic Parity to incorporate distributional properties in the
predictions, allowing expert knowledge to be used in the fair solution. We
illustrate the use of this new metric through a practical example of wages, and
develop a parametric method that efficiently addresses practical challenges
like limited training data and constraints on total spending, offering a robust
solution for real-life applications
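As one simple parametric instance of this idea (our own illustration, assuming Gaussian score distributions per group rather than the paper's general construction), the barycenter repair reduces to matching two moments per group, which is workable even with limited training data:

```python
import numpy as np

def gaussian_fair_scores(scores, group):
    """Parametric analogue of the barycenter repair: fit a normal
    distribution per group, then map each group affinely onto the 1D
    Gaussian Wasserstein barycenter N(sum(w*mu), (sum(w*sigma))^2)."""
    scores = np.asarray(scores, dtype=float)
    group = np.asarray(group)
    groups, counts = np.unique(group, return_counts=True)
    w = counts / counts.sum()
    mu = np.array([scores[group == a].mean() for a in groups])
    sd = np.array([scores[group == a].std(ddof=1) for a in groups])
    mu_b, sd_b = w @ mu, w @ sd      # barycenter mean and standard deviation
    fair = np.empty_like(scores)
    for a, m_a, s_a in zip(groups, mu, sd):
        mask = group == a
        fair[mask] = mu_b + sd_b * (scores[mask] - m_a) / s_a
    return fair
```

Only two estimated moments per group enter the map, so the estimator is stable on small samples; the price is the Gaussian shape assumption.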
Fairness in Multi-Task Learning via Wasserstein Barycenters
Algorithmic Fairness is an established field in machine learning that aims to
reduce biases in data. Recent advances have proposed various methods to ensure
fairness in a univariate environment, where the goal is to de-bias a single
task. However, extending fairness to a multi-task setting, where more than one
objective is optimised using a shared representation, remains underexplored. To
bridge this gap, we develop a method that extends the definition of
\textit{Strong Demographic Parity} to multi-task learning using multi-marginal
Wasserstein barycenters. Our approach provides a closed-form solution for the
optimal fair multi-task predictor including both regression and binary
classification tasks. We develop a data-driven estimation procedure for the
solution and run numerical experiments on both synthetic and real datasets. The
empirical results highlight the practical value of our post-processing
methodology in promoting fair decision-making
Microstructure and magnetization of Y-Ba-Cu-O prepared by melt quenching, partial melting and doping
Y-Ba-Cu-O samples prepared by means of a variety of melt-based techniques exhibit high values for their magnetic properties compared with those of samples prepared by solid state sintering. These techniques include single-stage partial melting as well as melt quenching followed by a second heat treatment stage, and they have been applied to the stoichiometric 123 composition as well as to formulations containing excess yttrium or other dopants. The structure of these melt-based samples is highly aligned, and the magnetization readings exhibit large anisotropy. At 77 K and magnetic field intensities of about 2 kOe, diamagnetic susceptibilities as high as -14 × 10^-3 emu/g were obtained in the cases of melt-quenched samples and remanent magnetization values as high as 10 emu/g for samples prepared by partial melting
Melt-processed bulk superconductors: Fabrication and characterization for power and space applications
Melt-processed bulk superconducting materials based on variations of the base composition YBa2Cu3Ox were produced in a variety of shapes and forms. Very high values of both zero-field and high-field magnetization were observed; these are useful for levitation and power applications. Magnetic measurements show that the effects of field direction and intensity, temperature, and time are consistent with an aligned grain structure with multiple pinning sites and with models of thermally activated flux motion