
12 research outputs found

LANCE: Stress-testing Visual Models by Generating Language-guided Counterfactual Images

Author: Chattopadhyay Prithvijit
Hoffman Judy
Prabhu Viraj
Yenamandra Sriram
Publication date: 30/05/2023

We propose an automated algorithm to stress-test a trained visual model by generating language-guided counterfactual test images (LANCE). Our method leverages recent progress in large language modeling and text-based image editing to augment an IID test set with a suite of diverse, realistic, and challenging test images without altering model weights. We benchmark the performance of a diverse set of pretrained models on our generated data and observe significant and consistent performance drops. We further analyze model sensitivity across different types of edits, and demonstrate its applicability at surfacing previously unknown class-level model biases in ImageNet. Comment: Project webpage: https://virajprabhu.github.io/lance-web

arXiv.org e-Print Archive

Active Domain Adaptation via Clustering Uncertainty-weighted Embeddings

Author: Chandrasekaran Arjun
Hoffman Judy
Prabhu Viraj
Saenko Kate
Publication date: 18/12/2020

Generalizing deep neural networks to new target domains is critical to their real-world utility. In practice, it may be feasible to get some target data labeled, but to be cost-effective it is desirable to select a maximally-informative subset via active learning (AL). We study the problem of AL under a domain shift, called Active Domain Adaptation (Active DA). We empirically demonstrate how existing AL approaches based solely on model uncertainty or diversity sampling are suboptimal for Active DA. Our algorithm, Active Domain Adaptation via Clustering Uncertainty-weighted Embeddings (ADA-CLUE), i) identifies target instances for labeling that are both uncertain under the model and diverse in feature space, and ii) leverages the available source and target data for adaptation by optimizing a semi-supervised adversarial entropy loss that is complementary to our active sampling objective. On standard image classification-based domain adaptation benchmarks, ADA-CLUE consistently outperforms competing active adaptation, active learning, and domain adaptation methods across domain shifts of varying severity.
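
The sampling idea in step (i) of the abstract above, choosing instances that are both uncertain and diverse, can be illustrated with a rough sketch. This is not the authors' implementation. It assumes uncertainty is measured by predictive entropy and diversity by a k-means clustering of embeddings in which each point is weighted by its entropy. The function names `predictive_entropy` and `select_for_labeling`, the iteration count, and the nearest-to-centroid selection rule are all hypothetical choices made for illustration.

```python
import numpy as np


def predictive_entropy(probs):
    """Shannon entropy of each row of class probabilities."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)


def select_for_labeling(embeddings, probs, budget, n_iter=20, seed=0):
    """Pick `budget` indices with entropy-weighted k-means (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Small constant keeps the weights strictly positive (assumption).
    weights = predictive_entropy(probs) + 1e-8
    init = rng.choice(len(embeddings), size=budget, replace=False)
    centroids = embeddings[init].astype(float)

    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((embeddings[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for k in range(budget):
            mask = assign == k
            if mask.any():
                # Uncertain points pull the centroid harder than confident ones.
                centroids[k] = np.average(
                    embeddings[mask], axis=0, weights=weights[mask]
                )

    # Choose the nearest not-yet-selected point to each final centroid.
    dists = ((embeddings[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    selected = []
    for k in range(budget):
        for idx in np.argsort(dists[:, k]):
            if int(idx) not in selected:
                selected.append(int(idx))
                break
    return selected
```

Under these assumptions, calling `select_for_labeling(embeddings, probs, budget=5)` on target-domain embeddings and softmax outputs returns five distinct indices to send for labeling. The adversarial entropy loss of step (ii) is not covered by this sketch.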

arXiv.org e-Print Archive

MPG.PuRe

Evaluating Visual Conversational Agents via Cooperative Human-AI Games

Author: Batra Dhruv
Chandrasekaran Arjun
Chattopadhyay Prithvijit
Das Abhishek
Lee Stefan
Parikh Devi
Prabhu Viraj
Yadav Deshraj
Publication date: 16/08/2017

As AI continues to advance, human-AI teams are inevitable. However, progress in AI is routinely measured in isolation, without a human in the loop. It is crucial to benchmark progress in AI, not just in isolation, but also in terms of how it translates to helping humans perform certain tasks, i.e., the performance of human-AI teams. In this work, we design a cooperative game - GuessWhich - to measure human-AI team performance in the specific context of the AI being a visual conversational agent. GuessWhich involves live interaction between the human and the AI. The AI, which we call ALICE, is provided an image which is unseen by the human. Following a brief description of the image, the human questions ALICE about this secret image to identify it from a fixed pool of images. We measure performance of the human-ALICE team by the number of guesses it takes the human to correctly identify the secret image after a fixed number of dialog rounds with ALICE. We compare performance of the human-ALICE teams for two versions of ALICE. Our human studies suggest a counterintuitive trend - that while AI literature shows that one version outperforms the other when paired with an AI questioner bot, we find that this improvement in AI-AI performance does not translate to improved human-AI performance. This suggests a mismatch between benchmarking of AI in isolation and in the context of human-AI teams. Comment: HCOMP 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

We're Not Using Videos Effectively: An Updated Domain Adaptive Video Segmentation Baseline

Author: Chattopadhyay Prithvijit
Hoffman Judy
Kareer Simar
Maheshwari Harsh
Prabhu Viraj
Vijaykumar Vivek
Publication date: 27/02/2024

There has been abundant work in unsupervised domain adaptation for semantic segmentation (DAS) seeking to adapt a model trained on images from a labeled source domain to an unlabeled target domain. While the vast majority of prior work has studied this as a frame-level Image-DAS problem, a few Video-DAS works have sought to additionally leverage the temporal signal present in adjacent frames. However, Video-DAS works have historically studied a distinct set of benchmarks from Image-DAS, with minimal cross-benchmarking. In this work, we address this gap. Surprisingly, we find that (1) even after carefully controlling for data and model architecture, state-of-the-art Image-DAS methods (HRDA and HRDA+MIC) outperform Video-DAS methods on established Video-DAS benchmarks (+14.5 mIoU on Viper → CityscapesSeq, +19.0 mIoU on Synthia → CityscapesSeq), and (2) naive combinations of Image-DAS and Video-DAS techniques only lead to marginal improvements across datasets. To avoid siloed progress between Image-DAS and Video-DAS, we open-source our codebase with support for a comprehensive set of Video-DAS and Image-DAS methods on a common benchmark. Code available at https://github.com/SimarKareer/UnifiedVideoDA Comment: TMLR 202

arXiv.org e-Print Archive