Fair Algorithms for Hierarchical Agglomerative Clustering

Author: Chhabra Anshuman
Mohapatra Prasant
Vashishth Vidushi
Publication date: 22/06/2021

Hierarchical Agglomerative Clustering (HAC) algorithms are widely used in modern data science. They partition a dataset into clusters while generating a hierarchical relationship between the data samples, and are employed in many application areas, such as biology, natural language processing, and recommender systems. Thus, it is imperative to ensure that these algorithms are fair: even if the dataset contains biases against certain protected groups, the cluster outputs generated should not discriminate against samples from any of these groups. However, recent work in clustering fairness has mostly focused on center-based clustering algorithms, such as k-median and k-means clustering. In this paper, we propose fair algorithms for performing HAC that 1) enforce fairness constraints irrespective of the distance linkage criteria used, 2) generalize to any natural measures of clustering fairness for HAC, 3) work for multiple protected groups, and 4) have running times competitive with vanilla HAC. Through extensive experiments on multiple real-world UCI datasets, we show that our proposed algorithm finds fairer clusterings compared to vanilla HAC as well as other state-of-the-art fair clustering approaches.
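
To make the fairness concern in this abstract concrete, the following is a minimal sketch of vanilla HAC on a toy dataset, followed by a per-cluster fairness score. It is not the authors' algorithm. Single linkage, the `balance` measure (ratio of the rarest to the most common protected group in a cluster), the function names, and the toy data are all assumptions chosen for illustration.

```python
import math
from itertools import combinations


def single_linkage_hac(points, k):
    """Vanilla HAC: repeatedly merge the two closest clusters until k remain."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Single linkage (an assumption): distance between the closest members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > k:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # a < b, so index a is unaffected by the deletion
    return [sorted(c) for c in clusters]


def balance(cluster, groups):
    """Hypothetical fairness score: rarest group count / most common group count."""
    counts = {g: 0 for g in set(groups)}
    for i in cluster:
        counts[groups[i]] += 1
    return min(counts.values()) / max(counts.values())


# Toy data (illustrative only): the protected group correlates with location.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
groups = ["a", "a", "a", "b", "b", "b"]

clusters = single_linkage_hac(points, k=2)
balances = [balance(c, groups) for c in clusters]
print(clusters, balances)
```

On this toy input each cluster contains a single protected group, so both balance scores are zero. A fair variant would have to trade some linkage quality for more mixed clusters.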

arXiv.org e-Print Archive

Towards Robust and Fair Machine Learning

Author: Chhabra Anshuman
Publication venue: eScholarship, University of California
Publication date: 01/01/2023

Recent advances in Machine Learning (ML) and Deep Learning (DL) have resulted in the widespread adoption of models across various application pipelines. However, despite these performance improvements, ML/DL models have been shown to be vulnerable to adversarial inputs that can reduce functionality. Concerns over these issues have prompted researchers to study model robustness from multiple perspectives, such as privacy, fairness, security, and interpretability, among others. In this thesis, we build upon these ideas of robustness by investigating adversarial and social robustness for a number of different learning models and problem settings. We first study the adversarial robustness of unsupervised clustering models by proposing novel poisoning and evasion attacks for both deep and classical models. We then study the social robustness of models in the context of fairness, and propose the antidote data problem for fair clustering, as well as the fair video summarization problem. Finally, we investigate two problems at the intersection of adversarial and social robustness. We propose a new robust fair clustering method that can jointly ensure adversarial and social robustness, and data selection approaches that can improve interpretability and optimize the utility, fairness, and robustness of classification models. Through the concepts and ideas proposed in this thesis, we aim to lay the groundwork for analyzing and ensuring the robustness of future ML/DL models.

eScholarship - University of California

Towards Robust and Fair Machine Learning

Author: Chhabra Anshuman
Publication date: 01/08/2023

Ezid

Suspicion-Free Adversarial Attacks on Clustering Algorithms

Author: Chhabra Anshuman
Mohapatra Prasant
Roy Abhishek
Publication venue: Association for the Advancement of Artificial Intelligence (AAAI)
Publication date: 16/11/2019

Clustering algorithms are used in a large number of applications and play an important role in modern machine learning, yet adversarial attacks on clustering algorithms seem to be broadly overlooked, unlike attacks on supervised learning. In this paper, we seek to bridge this gap by proposing a black-box adversarial attack for clustering models for linearly separable clusters. Our attack works by perturbing a single sample close to the decision boundary, which leads to the misclustering of multiple unperturbed samples, named spill-over adversarial samples. We theoretically show the existence of such adversarial samples for K-Means clustering. Our attack is especially strong as (1) we ensure the perturbed sample is not an outlier, hence not detectable, and (2) the exact metric used for clustering is not known to the attacker. We theoretically justify that the attack can indeed be successful without knowledge of the true metric. We conclude by providing empirical results on a number of datasets and clustering algorithms. To the best of our knowledge, this is the first work that generates spill-over adversarial samples without knowledge of the true metric while ensuring that the perturbed sample is not an outlier, and that theoretically proves the above.
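
The spill-over effect described in this abstract can be illustrated with a toy example. The sketch below is not the paper's black-box attack. It is a hand-picked 1-D dataset, clustered with a plain Lloyd's-style K-Means from fixed initial centroids, in which moving one sample near the boundary also changes the cluster of a second, unperturbed sample. The function name `kmeans_1d`, the data, and the perturbation size are assumptions for illustration.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Lloyd's algorithm on 1-D data with fixed (deterministic) initial centroids."""
    centroids = list(centroids)
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda c: abs(x - centroids[c]))
            for x in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Recompute each centroid as the mean of its members.
        for c in range(len(centroids)):
            members = [x for x, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = sum(members) / len(members)
    return labels


# Toy data (illustrative only): two groups with two samples near the boundary.
clean = [0.0, 1.0, 2.0, 6.0, 6.3, 9.0, 10.0, 11.0]
init = (3.0, 10.0)

# Perturb only the sample at index 4, nudging it slightly across the boundary.
target = 4
attacked = list(clean)
attacked[target] = 6.7

clean_labels = kmeans_1d(clean, init)
attacked_labels = kmeans_1d(attacked, init)

# Spill-over samples: unperturbed points whose cluster changed anyway.
spill_over = [
    i for i in range(len(clean))
    if i != target and clean_labels[i] != attacked_labels[i]
]
print(clean_labels, attacked_labels, spill_over)
```

Here the perturbed sample pulls both centroids toward the lower-valued cluster, so the untouched sample at index 3 ends up on the other side of the boundary.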

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Auditing YouTube's recommendation system for ideologically congenial, extreme, and problematic recommendations

Author: Chhabra Anshuman
Haroon Muhammad
Liu Xin
Mohapatra Prasant
Shafiq Zubair
Wojcieszak Magdalena
Publication venue: eScholarship, University of California
Publication date: 12/12/2023

Algorithms of social media platforms are often criticized for recommending ideologically congenial and radical content to their users. Despite these concerns, evidence on such filter bubbles and rabbit holes of radicalization is inconclusive. We conduct an audit of the platform using 100,000 sock puppets that allow us to systematically and at scale isolate the influence of the algorithm in recommendations. We test 1) whether recommended videos are congenial with regard to users' ideology, especially deeper in the watch trail, and 2) whether recommendations deeper in the trail become progressively more extreme and come from problematic channels. We find that YouTube's algorithm recommends congenial content to its partisan users, although some moderate and cross-cutting exposure is possible, and that congenial recommendations increase deeper in the trail for right-leaning users. We do not find meaningful increases in ideological extremity of recommendations deeper in the trail, yet we show that a growing proportion of recommendations comes from channels categorized as problematic (e.g., IDW, Alt-right, Conspiracy, and QAnon), with this increase being most pronounced among the very-right users. Although the proportion of these problematic recommendations is low (max of 2.5%), they are still encountered by over 36.1% of users and up to 40% in the case of very-right users.

eScholarship - University of California

Auditing YouTube's Recommendation System for Ideologically Congenial, Extreme, and Problematic Recommendations

Author: Anshuman Chhabra
Magdalena Wojcieszak
Muhammad Haroon
Prasant Mohapatra
Xin Liu
Zubair Shafiq
Publication venue: OSF
Publication date: 14/06/2023

OSF Preprints