Model Cards for Model Reporting
Trained machine learning models are increasingly used to perform high-impact
tasks in areas such as law enforcement, medicine, education, and employment. In
order to clarify the intended use cases of machine learning models and minimize
their usage in contexts for which they are not well suited, we recommend that
released models be accompanied by documentation detailing their performance
characteristics. In this paper, we propose a framework that we call model
cards, to encourage such transparent model reporting. Model cards are short
documents accompanying trained machine learning models that provide benchmarked
evaluation in a variety of conditions, such as across different cultural,
demographic, or phenotypic groups (e.g., race, geographic location, sex,
Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex
and Fitzpatrick skin type) that are relevant to the intended application
domains. Model cards also disclose the context in which models are intended to
be used, details of the performance evaluation procedures, and other relevant
information. While we focus primarily on human-centered machine learning models
in the application fields of computer vision and natural language processing,
this framework can be used to document any trained machine learning model. To
solidify the concept, we provide cards for two supervised models: One trained
to detect smiling faces in images, and one trained to detect toxic comments in
text. We propose model cards as a step towards the responsible democratization
of machine learning and related AI technology, increasing transparency into how
well AI technology works. We hope this work encourages those releasing trained
machine learning models to accompany model releases with similar detailed
evaluation numbers and other relevant documentation.
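As a rough illustration of the kind of structured documentation the abstract describes, here is a minimal sketch of a model card as a Python dataclass; the field names and example values are assumptions for illustration, not the exact schema proposed in the paper.

```python
from dataclasses import dataclass, field

@dataclass
class ModelCard:
    """Minimal model-card sketch; fields are illustrative, not the paper's schema."""
    model_details: str                      # builder, version, license
    intended_use: str                       # in-scope and out-of-scope use cases
    evaluation_procedure: str               # datasets and protocol used for benchmarking
    metrics: dict = field(default_factory=dict)                # metric name -> aggregate value
    disaggregated_metrics: dict = field(default_factory=dict)  # (group, metric) -> value
    ethical_considerations: str = ""
    caveats_and_recommendations: str = ""

# Hypothetical card for a smiling-face detector, mirroring the paper's example domain.
card = ModelCard(
    model_details="Smile detector v1.0, convolutional classifier",
    intended_use="Research on expression recognition; not for surveillance or hiring",
    evaluation_procedure="Held-out images with demographic and Fitzpatrick annotations",
    metrics={"accuracy": 0.91},
    disaggregated_metrics={
        ("Fitzpatrick I-III", "false positive rate"): 0.04,
        ("Fitzpatrick IV-VI", "false positive rate"): 0.07,
    },
)
```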
Rethinking Bias Mitigation: Fairer Architectures Make for Fairer Face Recognition
Face recognition systems are widely deployed in safety-critical applications,
including law enforcement, yet they exhibit bias across a range of
socio-demographic dimensions, such as gender and race. Conventional wisdom
dictates that model biases arise from biased training data. As a consequence,
previous works on bias mitigation largely focused on pre-processing the
training data, adding penalties to prevent bias from affecting the model during
training, or post-processing predictions to debias them, yet these approaches
have shown limited success on hard problems such as face recognition. In our
work, we discover that biases are actually inherent to neural network
architectures themselves. Following this reframing, we conduct the first neural
architecture search for fairness, jointly with a search for hyperparameters.
Our search outputs a suite of models which Pareto-dominate all other
high-performance architectures and existing bias mitigation methods in terms of
accuracy and fairness, often by large margins, on the two most widely used
datasets for face identification, CelebA and VGGFace2. Furthermore, these
models generalize to other datasets and sensitive attributes. We release our
code, models and raw data files at https://github.com/dooleys/FR-NAS
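The Pareto-dominance criterion used to compare candidate architectures on accuracy and fairness can be sketched in a few lines; the candidate names and scores below are hypothetical, and this is not the authors' search code.

```python
def pareto_front(candidates):
    """Keep candidates that no other candidate beats on both accuracy and fairness.

    `candidates` is a list of (name, accuracy, fairness) tuples, higher = better.
    """
    front = []
    for name, acc, fair in candidates:
        dominated = any(
            a >= acc and f >= fair and (a > acc or f > fair)
            for _, a, f in candidates
        )
        if not dominated:
            front.append((name, acc, fair))
    return front

# Hypothetical (accuracy, fairness) scores for three searched architectures.
print(pareto_front([("arch_a", 0.95, 0.80), ("arch_b", 0.93, 0.90), ("arch_c", 0.92, 0.85)]))
# -> arch_a and arch_b survive; arch_c is dominated by arch_b on both objectives.
```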
Improving fairness in machine learning systems: What do industry practitioners need?
The potential for machine learning (ML) systems to amplify social inequities
and unfairness is receiving increasing popular and academic attention. A surge
of recent work has focused on the development of algorithmic tools to assess
and mitigate such unfairness. If these tools are to have a positive impact on
industry practice, however, it is crucial that their design be informed by an
understanding of real-world needs. Through 35 semi-structured interviews and an
anonymous survey of 267 ML practitioners, we conduct the first systematic
investigation of commercial product teams' challenges and needs for support in
developing fairer ML systems. We identify areas of alignment and disconnect
between the challenges faced by industry practitioners and solutions proposed
in the fair ML research literature. Based on these findings, we highlight
directions for future ML and HCI research that will better address industry
practitioners' needs.
Comment: To appear in the 2019 ACM CHI Conference on Human Factors in Computing Systems (CHI 2019).
Overcoming Racial Harms to Democracy from Artificial Intelligence
While the United States is becoming more racially diverse, generative artificial intelligence and related technologies threaten to undermine truly representative democracy. Left unchecked, AI will exacerbate already substantial existing challenges, such as racial polarization, cultural anxiety, antidemocratic attitudes, racial vote dilution, and voter suppression. Synthetic video and audio (“deepfakes”) receive the bulk of popular attention—but are just the tip of the iceberg. Microtargeting of racially tailored disinformation, racial bias in automated election administration, discriminatory voting restrictions, racially targeted cyberattacks, and AI-powered surveillance that chills racial justice claims are just a few examples of how AI is threatening democracy. Unfortunately, existing laws—including the Voting Rights Act—are unlikely to address the challenges. These problems, however, are not insurmountable if policymakers, activists, and technology companies act now. This Article asserts that AI should be regulated to facilitate a racially inclusive democracy, proposes novel principles that provide a framework to regulate AI, and offers specific policy interventions to illustrate the implementation of the principles. Even though race is the most significant demographic factor that shapes voting patterns in the United States, this is the first article to comprehensively identify the racial harms to democracy posed by AI and offer a way forward.
Demographic Bias in Presentation Attack Detection of Iris Recognition Systems
With the widespread use of biometric systems, the demographic bias problem
raises more attention. Although many studies addressed bias issues in biometric
verification, there are no works that analyze the bias in presentation attack
detection (PAD) decisions. Hence, we investigate and analyze the demographic
bias in iris PAD algorithms in this paper. To enable a clear discussion, we
adapt the notions of differential performance and differential outcome to the
PAD problem. We study the bias in iris PAD using three baselines (hand-crafted,
transfer-learning, and training from scratch) using the NDCLD-2013 database.
The experimental results point out that female users will be significantly less
protected by the PAD in comparison to males.
Comment: accepted for publication at EUSIPCO 2020.
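A differential-outcome check of the kind the abstract adapts to PAD can be sketched as per-group error rates; the label convention, metric names, threshold, and toy data below are assumptions, not the paper's exact protocol.

```python
import numpy as np

def per_group_pad_errors(scores, labels, groups, threshold=0.5):
    """Sketch: per-group PAD error rates for a differential-outcome comparison.

    Assumes numpy arrays with labels 1 = bona fide, 0 = presentation attack,
    and scores where values >= threshold are accepted as bona fide.
    """
    report = {}
    for g in np.unique(groups):
        in_group = groups == g
        bona_fide = in_group & (labels == 1)
        attack = in_group & (labels == 0)
        # BPCER: bona fide presentations wrongly rejected; APCER: attacks wrongly accepted.
        report[g] = {
            "BPCER": float(np.mean(scores[bona_fide] < threshold)),
            "APCER": float(np.mean(scores[attack] >= threshold)),
        }
    return report

# Hypothetical toy data: compare error rates between two demographic groups.
scores = np.array([0.9, 0.2, 0.7, 0.4, 0.8, 0.3])
labels = np.array([1, 0, 1, 0, 1, 0])
groups = np.array(["female", "female", "female", "male", "male", "male"])
print(per_group_pad_errors(scores, labels, groups))
```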
Manifestations of Xenophobia in AI Systems
Xenophobia is one of the key drivers of marginalisation, discrimination, and
conflict, yet many prominent machine learning (ML) fairness frameworks fail to
comprehensively measure or mitigate the resulting xenophobic harms. Here we aim
to bridge this conceptual gap and help facilitate safe and ethical design of
artificial intelligence (AI) solutions. We ground our analysis of the impact of
xenophobia by first identifying distinct types of xenophobic harms, and then
applying this framework across a number of prominent AI application domains,
reviewing the potential interplay between AI and xenophobia on social media and
recommendation systems, healthcare, immigration, employment, as well as biases
in large pre-trained models. These help inform our recommendations towards an
inclusive, xenophilic design of future AI systems.
Fairness in Visual Clustering: A Novel Transformer Clustering Approach
Promoting fairness for deep clustering models in unsupervised clustering
settings to reduce demographic bias is a challenging goal. This is because of
the limitation of large-scale balanced data with well-annotated labels for
sensitive or protected attributes. In this paper, we first evaluate demographic
bias in deep clustering models from the perspective of cluster purity, which is
measured by the ratio of positive samples within a cluster to their correlation
degree. This measurement is adopted as an indication of demographic bias. Then,
a novel loss function is introduced to encourage a purity consistency for all
clusters to maintain the fairness aspect of the learned clustering model.
Moreover, we present a novel attention mechanism, Cross-attention, to measure
correlations between multiple clusters, strengthening faraway positive samples
and improving the purity of clusters during the learning process. Experimental
results on a large-scale dataset with numerous attribute settings have
demonstrated the effectiveness of the proposed approach on both clustering
accuracy and fairness enhancement on several sensitive attributes.
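One way to read the purity-consistency idea is as a penalty on the spread of per-cluster purity; the sketch below is an illustrative stand-in with hypothetical inputs, not the loss actually proposed in the paper.

```python
import torch

def purity_consistency_loss(purities: torch.Tensor) -> torch.Tensor:
    """Penalize clusters whose purity deviates from the mean purity.

    `purities` holds one purity estimate per cluster in [0, 1]; driving the
    spread down encourages all clusters, regardless of the demographic
    groups they contain, to reach a similar purity level.
    """
    return ((purities - purities.mean()) ** 2).mean()

loss = purity_consistency_loss(torch.tensor([0.92, 0.68, 0.81]))  # hypothetical purities
```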
On Responsible Machine Learning Datasets with Fairness, Privacy, and Regulatory Norms
Artificial Intelligence (AI) has made its way into various scientific fields,
providing astonishing improvements over existing algorithms for a wide variety
of tasks. In recent years, there have been severe concerns over the
trustworthiness of AI technologies. The scientific community has focused on the
development of trustworthy AI algorithms. However, machine and deep learning
algorithms, popular in the AI community today, depend heavily on the data used
during their development. These learning algorithms identify patterns in the
data, learning the behavioral objective. Any flaws in the data have the
potential to translate directly into algorithms. In this study, we discuss the
importance of Responsible Machine Learning Datasets and propose a framework to
evaluate the datasets through a responsible rubric. While existing work focuses
on the post-hoc evaluation of algorithms for their trustworthiness, we provide
a framework that considers the data component separately to understand its role
in the algorithm. We discuss responsible datasets through the lens of fairness,
privacy, and regulatory compliance and provide recommendations for constructing
future datasets. After surveying over 100 datasets, we use 60 datasets for
analysis and demonstrate that none of these datasets is immune to issues of
fairness, privacy preservation, and regulatory compliance. We provide
modifications to the "datasheets for datasets" with important additions for
improved dataset documentation. With governments around the world enacting
data protection laws, the way datasets are created in the scientific
community requires revision. We believe this study is timely and relevant in
today's era of AI.
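The responsible rubric could be imagined as a simple checklist over fairness, privacy, and regulatory criteria; the criteria below are assumptions for illustration rather than the paper's actual scoring scheme.

```python
from dataclasses import dataclass

@dataclass
class DatasetRubric:
    """Illustrative responsible-dataset checklist; criteria are assumed, not the paper's."""
    documents_demographic_composition: bool   # fairness: are subgroups described and balanced?
    consent_obtained: bool                    # privacy: did subjects agree to collection and reuse?
    identifiers_removed: bool                 # privacy: is the release de-identified?
    license_and_jurisdiction_stated: bool     # regulatory: usage terms and governing law declared?

    def score(self) -> float:
        checks = (
            self.documents_demographic_composition,
            self.consent_obtained,
            self.identifiers_removed,
            self.license_and_jurisdiction_stated,
        )
        return sum(checks) / len(checks)

print(DatasetRubric(True, True, False, True).score())  # 0.75 for this hypothetical dataset
```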