LANCE: Stress-testing Visual Models by Generating Language-guided Counterfactual Images
We propose an automated algorithm to stress-test a trained visual model by
generating language-guided counterfactual test images (LANCE). Our method
leverages recent progress in large language modeling and text-based image
editing to augment an IID test set with a suite of diverse, realistic, and
challenging test images without altering model weights. We benchmark the
performance of a diverse set of pretrained models on our generated data and
observe significant and consistent performance drops. We further analyze model sensitivity across different types of edits and demonstrate LANCE's utility in surfacing previously unknown class-level model biases in ImageNet.
Project webpage: https://virajprabhu.github.io/lance-web
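The stress-testing loop described in the abstract can be sketched in a few lines. Everything below is a hypothetical stand-in, not the actual LANCE implementation: `perturb` stands in for the LLM caption editor, `edit` for the text-based image editor, and `model` for the pretrained classifier under test.

```python
from typing import Callable, List, Tuple

def stress_test(
    test_set: List[Tuple[str, str]],          # (caption, true_label) pairs
    perturb_caption: Callable[[str], str],    # stand-in for the LLM caption editor
    edit_image: Callable[[str], str],         # stand-in for text-based image editing
    classify: Callable[[str], str],           # stand-in for the pretrained model
) -> float:
    """Return accuracy on language-guided counterfactual versions of the test set."""
    correct = 0
    for caption, label in test_set:
        # Generate a counterfactual by perturbing the caption, then "editing" the image.
        edited = edit_image(perturb_caption(caption))
        if classify(edited) == label:
            correct += 1
    return correct / len(test_set)

# Toy stand-ins: swap one visual attribute; the "model" is brittle to the word
# "blue", illustrating the kind of sensitivity the pipeline is meant to surface.
perturb = lambda c: c.replace("blue", "red")
edit = lambda c: c  # identity; the real pipeline renders an edited image
model = lambda c: "bird" if "blue" in c else "unknown"

original = [("a blue bird on a branch", "bird")]
acc_before = sum(model(c) == y for c, y in original) / len(original)
acc_after = stress_test(original, perturb, edit, model)
print(acc_before, acc_after)  # 1.0 0.0
```

The key property, matching the abstract, is that model weights are never touched: only the test distribution is augmented, and the accuracy gap localizes the sensitivity.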
GOAT: GO to Any Thing
In deployment scenarios such as homes and warehouses, mobile robots are
expected to autonomously navigate for extended periods, seamlessly executing
tasks articulated in terms that are intuitively understandable by human
operators. We present GO To Any Thing (GOAT), a universal navigation system
capable of tackling these requirements with three key features: a) Multimodal:
it can tackle goals specified via category labels, target images, and language
descriptions, b) Lifelong: it benefits from its past experience in the same
environment, and c) Platform Agnostic: it can be quickly deployed on robots
with different embodiments. GOAT is made possible through a modular system
design and a continually augmented instance-aware semantic memory that keeps
track of the appearance of objects from different viewpoints in addition to
category-level semantics. This enables GOAT to distinguish between different
instances of the same category to enable navigation to targets specified by
images and language descriptions. In experimental comparisons spanning over 90
hours in 9 different homes consisting of 675 goals selected across 200+
different object instances, we find GOAT achieves an overall success rate of
83%, surpassing previous methods and ablations by 32% (absolute improvement).
GOAT improves with experience in the environment, from a 60% success rate at the first goal to a 90% success rate after exploration. In addition, we demonstrate that GOAT can readily be applied to downstream tasks such as pick and place and social navigation.
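The instance-aware semantic memory described above can be sketched as a small data structure. This is a toy illustration, not GOAT's actual implementation: real matching uses image and language features, while here descriptions are strings and "matching" is keyword overlap.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Instance:
    category: str
    viewpoints: List[str] = field(default_factory=list)    # stand-in for per-view image features
    descriptions: List[str] = field(default_factory=list)  # language associated with this instance

class SemanticMemory:
    """Toy instance-aware memory: tracks object instances across observations."""

    def __init__(self) -> None:
        self.instances: Dict[str, Instance] = {}

    def observe(self, instance_id: str, category: str, viewpoint: str, description: str) -> None:
        # Continually augment memory: new views of a known instance accumulate.
        inst = self.instances.setdefault(instance_id, Instance(category))
        inst.viewpoints.append(viewpoint)
        inst.descriptions.append(description)

    def find_by_category(self, category: str) -> List[str]:
        # Category-level goal: any instance of the class matches.
        return [iid for iid, inst in self.instances.items() if inst.category == category]

    def find_by_description(self, query: str) -> List[str]:
        # Instance-level goal: only instances whose stored language matches.
        return [iid for iid, inst in self.instances.items()
                if any(query in d for d in inst.descriptions)]

mem = SemanticMemory()
mem.observe("chair_0", "chair", "view_a", "red chair by the window")
mem.observe("chair_1", "chair", "view_b", "office chair near the desk")
print(mem.find_by_category("chair"))      # ['chair_0', 'chair_1']
print(mem.find_by_description("window"))  # ['chair_0']
```

The point of the instance level is visible in the last call: a category query returns both chairs, while a language query disambiguates to one, which is what lets GOAT navigate to targets specified by images or descriptions rather than just class labels.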
Housekeep: Tidying Virtual Households using Commonsense Reasoning
We introduce Housekeep, a benchmark to evaluate commonsense reasoning in the
home for embodied AI. In Housekeep, an embodied agent must tidy a house by
rearranging misplaced objects without explicit instructions specifying which
objects need to be rearranged. Instead, the agent must learn from and is
evaluated against human preferences of which objects belong where in a tidy
house. Specifically, we collect a dataset of where humans typically place objects in tidy and untidy houses, comprising 1799 objects, 268 object categories, 585 placements, and 105 rooms. Next, we propose a modular baseline
approach for Housekeep that integrates planning, exploration, and navigation.
It leverages a fine-tuned large language model (LLM) trained on an internet
text corpus for effective planning. We show that our baseline agent generalizes
to rearranging unseen objects in unknown environments. See our webpage for more
details: https://yashkant.github.io/housekeep
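The core decision the agent faces, identifying misplaced objects and choosing receptacles from human preferences, can be sketched as follows. The preference table and threshold here are hypothetical stand-ins for Housekeep's collected human data and for the LLM-based planner.

```python
from typing import Dict, List, Tuple

# Hypothetical preference scores: (object, receptacle) -> fraction of annotators
# who judged the placement correct. Stand-in for Housekeep's human preference data.
PREFERENCES: Dict[Tuple[str, str], float] = {
    ("mug", "kitchen_cabinet"): 0.90,
    ("mug", "bathroom_sink"): 0.10,
    ("toothbrush", "bathroom_sink"): 0.85,
    ("toothbrush", "kitchen_cabinet"): 0.05,
}

def is_misplaced(obj: str, current: str, threshold: float = 0.5) -> bool:
    """Flag an object as misplaced if its current placement scores below the threshold."""
    return PREFERENCES.get((obj, current), 0.0) < threshold

def rank_receptacles(obj: str, receptacles: List[str]) -> List[str]:
    """Order candidate receptacles by human placement preference, best first."""
    return sorted(receptacles, key=lambda r: PREFERENCES.get((obj, r), 0.0), reverse=True)

print(is_misplaced("mug", "bathroom_sink"))                           # True
print(rank_receptacles("mug", ["bathroom_sink", "kitchen_cabinet"]))  # ['kitchen_cabinet', 'bathroom_sink']
```

No explicit instruction names the misplaced object or its destination; both are inferred from the preference data, which is what the benchmark evaluates against. The actual baseline replaces the lookup table with a fine-tuned LLM to generalize to unseen objects.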
HomeRobot: An Open Source Software Stack for Mobile Manipulation Research
Reproducibility in robotics research requires capable, shared hardware platforms that can be used for a wide variety of research. We've seen the power of such shared platforms in machine learning research more broadly, where there is constant iteration on shared AI frameworks like PyTorch. To make rapid progress in robotics in the same way, we propose that we need: (1) shared real-world platforms which allow different teams to test and compare methods at low cost; (2) challenging simulations that reflect real-world environments and can especially drive perception and planning research; and (3) low-cost platforms with enough software to get started addressing all of these problems. To this end, we propose HomeRobot, a mobile manipulator software stack with an associated simulation benchmark, initially based on the low-cost, human-safe Hello Robot Stretch.