Fair Feature Importance Scores for Interpreting Tree-Based Methods and Surrogates
Across various sectors such as healthcare, criminal justice, national
security, finance, and technology, large-scale machine learning (ML) and
artificial intelligence (AI) systems are being deployed to make critical
data-driven decisions. Many have asked whether we can and should trust these ML
systems to make such decisions. Two critical components are prerequisites
for trust in ML systems: interpretability, or the ability to understand why the
ML system makes the decisions it does, and fairness, which ensures that ML
systems do not exhibit bias against certain individuals or groups. Both
interpretability and fairness are important and have separately received
abundant attention in the ML literature, but so far, there have been very few
methods developed to directly interpret models with regard to their fairness.
In this paper, we focus on arguably the most popular type of ML interpretation:
feature importance scores. Inspired by the use of decision trees in knowledge
distillation, we propose to leverage trees as interpretable surrogates for
complex black-box ML models. Specifically, we develop a novel fair feature
importance score for trees that can be used to interpret how each feature
contributes to fairness or bias in trees, tree-based ensembles, or tree-based
surrogates of any complex ML system. Like the popular mean decrease in impurity
for trees, our Fair Feature Importance Score is defined based on the mean
decrease (or increase) in group bias. Through simulations as well as real
examples on benchmark fairness datasets, we demonstrate that our Fair Feature
Importance Score offers valid interpretations for both tree-based ensembles and
tree-based surrogates of other ML systems.
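The abstract defines its score in parallel with mean decrease in impurity, attributing changes in group bias to split features. As a rough, model-agnostic illustration of the underlying idea, here is a minimal permutation-style sketch, which is a swapped-in variant and not the authors' split-based method. It assumes the demographic parity gap as the bias measure; the function names and the binary protected attribute are illustrative assumptions.

```python
import numpy as np

def demographic_parity_gap(y_pred, group):
    # Absolute gap in positive-prediction rates between the two groups.
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

def permutation_fair_importance(model, X, group, n_repeats=20, seed=0):
    # Score each feature by how much the bias gap shrinks, on average,
    # when that feature's column is randomly permuted. Positive scores
    # mean the feature contributes to bias; negative scores mean it
    # helps fairness. A permutation-style stand-in for the paper's
    # split-based Fair Feature Importance Score.
    rng = np.random.default_rng(seed)
    base = demographic_parity_gap(model.predict(X), group)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        deltas = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])
            deltas.append(base - demographic_parity_gap(model.predict(Xp), group))
        scores[j] = np.mean(deltas)
    return scores
```

Any fitted estimator with a `predict` method (for example a scikit-learn `RandomForestClassifier`) can be scored this way, including a tree-based surrogate distilled from a black-box model.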
How is Gaze Influenced by Image Transformations? Dataset and Model
Data size is the bottleneck for developing deep saliency models, because
collecting eye-movement data is very time-consuming and expensive. Most
current studies on human attention and saliency modeling have used
high-quality, stereotyped stimuli. In the real world, however, captured images undergo various
types of transformations. Can we use these transformations to augment existing
saliency datasets? Here, we first create a novel saliency dataset including
fixations of 10 observers over 1900 images degraded by 19 types of
transformations. Second, by analyzing eye movements, we find that observers
look at different locations over transformed versus original images. Third, we
use the new data over transformed images, which we call data augmentation
transformations (DATs), to train deep saliency models. We find that
label-preserving DATs with negligible impact on human gaze boost saliency
prediction, whereas DATs that severely alter human gaze degrade
performance. These label-preserving, valid augmentation transformations provide
a solution to enlarge existing saliency datasets. Finally, we introduce a novel
saliency model based on a generative adversarial network (dubbed GazeGAN). A
modified UNet serves as the generator of GazeGAN, combining classic skip
connections with a novel center-surround connection (CSC) to leverage
multi-level features. We also propose a histogram loss based on the
Alternative Chi-Square Distance (ACS HistLoss) to refine the saliency map in
terms of luminance distribution. Extensive experiments and comparisons over 3
datasets indicate that GazeGAN achieves the best performance in terms of
popular saliency evaluation metrics, and is more robust to various
perturbations. Our code and data are available at:
https://github.com/CZHQuality/Sal-CFS-GAN
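As a rough illustration of the histogram-loss idea, here is a minimal differentiable sketch in PyTorch: it bins the predicted and ground-truth saliency maps with soft Gaussian binning and compares the normalized histograms with an alternative chi-square distance. The bin count, kernel width, and soft-binning scheme are assumptions on my part; the paper's exact formulation lives in the repository linked above.

```python
import torch

def soft_histogram(x, bins=64, sigma=0.02):
    # Differentiable histogram of values in [0, 1]: each value votes for
    # every bin center with a Gaussian weight, so gradients can flow.
    centers = torch.linspace(0.0, 1.0, bins, device=x.device)
    weights = torch.exp(-0.5 * ((x.reshape(-1, 1) - centers) / sigma) ** 2)
    hist = weights.sum(dim=0)
    return hist / hist.sum()

def acs_hist_loss(pred, target, bins=64, eps=1e-8):
    # Alternative chi-square distance between the luminance histograms
    # of predicted and ground-truth saliency maps (both scaled to [0, 1]).
    p = soft_histogram(pred, bins)
    q = soft_histogram(target, bins)
    return 2.0 * torch.sum((p - q) ** 2 / (p + q + eps))
```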
Privacy and Fairness in Recommender Systems via Adversarial Training of User Representations
Latent factor models for recommender systems represent users and items as
low-dimensional vectors. Privacy risks of such systems have previously been studied
mostly in the context of recovery of personal information in the form of usage
records from the training data. However, the user representations themselves
may be used together with external data to recover private user information
such as gender and age. In this paper we show that user vectors calculated by a
common recommender system can be exploited in this way. We propose the
privacy-adversarial framework to eliminate such leakage of private information,
and study the trade-off between recommender performance and leakage both
theoretically and empirically using a benchmark dataset. An advantage of the
proposed method is that it also helps guarantee fairness of results, since all
implicit knowledge of a set of attributes is scrubbed from the representations
used by the model and thus cannot enter into the decision-making process. We discuss
further applications of this method towards the generation of deeper and more
insightful recommendations.
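The abstract does not spell out the training mechanics; one common way to implement this kind of adversarial scrubbing is a gradient-reversal layer, sketched below in PyTorch. The embeddings learn to score items while the reversed gradient pushes the user vectors to carry no signal about a private attribute. The class names, embedding size, and the choice of a single binary attribute are illustrative assumptions, not the paper's specification.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    # Identity on the forward pass; flips (and scales) gradients on the
    # backward pass, so the embeddings are trained to defeat the adversary.
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class PrivacyAdversarialMF(nn.Module):
    def __init__(self, n_users, n_items, dim=32, lam=1.0):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, dim)
        self.item_emb = nn.Embedding(n_items, dim)
        self.adversary = nn.Linear(dim, 1)  # tries to predict the private attribute
        self.lam = lam

    def forward(self, users, items):
        u = self.user_emb(users)
        v = self.item_emb(items)
        rating = (u * v).sum(dim=-1)  # recommender score
        attr_logit = self.adversary(GradReverse.apply(u, self.lam)).squeeze(-1)
        return rating, attr_logit
```

Training would then minimize the usual rating loss plus the adversary's attribute-prediction loss; because of the reversed gradient, the same backward pass both improves recommendations and scrubs the attribute from the user vectors.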
The Rhetoric and Reality of Anthropomorphism in Artificial Intelligence
Artificial intelligence (AI) has historically been conceptualized in anthropomorphic terms. Some algorithms deploy biomimetic designs in a deliberate attempt to effect a sort of digital isomorphism of the human brain. Others leverage more general learning strategies that happen to coincide with popular theories of cognitive science and social epistemology. In this paper, I challenge the anthropomorphic credentials of the neural network algorithm, whose similarities to human cognition I argue are vastly overstated and narrowly construed. I submit that three alternative supervised learning methods—namely lasso penalties, bagging, and boosting—offer subtler, more interesting analogies to human reasoning as both an individual and a social phenomenon. Despite the temptation to fall back on anthropomorphic tropes when discussing AI, however, I conclude that such rhetoric is at best misleading and at worst downright dangerous. The impulse to humanize algorithms is an obstacle to properly conceptualizing the ethical challenges posed by emerging technologies.