12 research outputs found

Rank2Tell: A Multimodal Driving Dataset for Joint Importance Ranking and Reasoning

Author: Agarwal Nakul
Choi Chiho
Chundi Suhas
Dariush Behzad
Kochenderfer Mykel
Li Jiachen
Roelofs Sean
Sachdeva Enna
Publication date: 12/09/2023

The widespread adoption of commercial autonomous vehicles (AVs) and advanced driver assistance systems (ADAS) may largely depend on their acceptance by society, for which their perceived trustworthiness and interpretability to riders are crucial. In general, this task is challenging because modern autonomous systems software relies heavily on black-box artificial intelligence models. Towards this goal, this paper introduces a novel dataset, Rank2Tell, a multi-modal ego-centric dataset for Ranking the importance level and Telling the reason for the importance. Using various closed- and open-ended visual question answering, the dataset provides dense annotations of various semantic, spatial, temporal, and relational attributes of various important objects in complex traffic scenarios. The dense annotations and unique attributes of the dataset make it a valuable resource for researchers working on visual scene understanding and related fields. Further, we introduce a joint model for importance level ranking and natural language caption generation to benchmark our dataset and demonstrate performance with quantitative evaluations

arXiv.org e-Print Archive

AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos?

Author: Agarwal Nakul
Do Minh Quan
Fu Changcheng
Lee Kwonjoon
Sun Chen
Wang Shijie
Zhang Ce
Zhao Qi
Publication date: 09/10/2023

Can we better anticipate an actor's future actions (e.g. mix eggs) by knowing what commonly happens after his/her current action (e.g. crack eggs)? What if we also know the longer-term goal of the actor (e.g. making egg fried rice)? The long-term action anticipation (LTA) task aims to predict an actor's future behavior from video observations in the form of verb and noun sequences, and it is crucial for human-machine interaction. We propose to formulate the LTA task from two perspectives: a bottom-up approach that predicts the next actions autoregressively by modeling temporal dynamics; and a top-down approach that infers the goal of the actor and plans the needed procedure to accomplish the goal. We hypothesize that large language models (LLMs), which have been pretrained on procedure text data (e.g. recipes, how-tos), have the potential to help LTA from both perspectives. They can help provide the prior knowledge on the possible next actions, and infer the goal given the observed part of a procedure, respectively. To leverage the LLMs, we propose a two-stage framework, AntGPT. It first recognizes the actions already performed in the observed videos and then asks an LLM to predict the future actions via conditioned generation, or to infer the goal and plan the whole procedure by chain-of-thought prompting. Empirical results on the Ego4D LTA v1 and v2 benchmarks, EPIC-Kitchens-55, as well as EGTEA GAZE+ demonstrate the effectiveness of our proposed approach. AntGPT achieves state-of-the-art performance on all above benchmarks, and can successfully infer the goal and thus perform goal-conditioned "counterfactual" prediction via qualitative analysis. Code and model will be released at https://brown-palm.github.io/AntGP

arXiv.org e-Print Archive

Assessment of left ventricular systolic and diastolic function in juvenile rheumatoid arthritis

Author: Bharti Bishwa Bhushan
Kumar Sudeep
Kapoor Aditya
Agarwal Amita
Mishra Ramnath
Sinha Nakul
Publication venue: Medknow Publications and Staff Society of Seth GS Medical College and KEM Hospital, Mumbai, India
Publication date: 31/12/2004

Background and Aims: Recognizing the paucity of data regarding echocardiographic studies of Left ventricular (LV) systolic and diastolic function in patients with juvenile rheumatoid arthritis (JRA), a study was carried out to study these parameters in these subjects. Settings, Design and Methods: Thirty-five patients with JRA and an equal number of age- and sex-matched controls were studied by two-dimensional and Doppler echocardiography. Results: Patients with JRA had higher systolic and diastolic blood pressures, resting heart rates, LV systolic (26.9±4.3 vs. 22.4 ± 4.1 mm, p=0.001) and diastolic size (42.3±4.6 vs. 35.4±3.8 mm, p<0.001) and volumes. Though ejection fraction (EF) and fractional shortening (FS) were normal, they were lower in those with JRA as compared to controls (EF: 62.9±4.47 vs. 67.5±3.63 %, p<0.001; FS: 36.4±4.5 vs. 38.5 ± 6.87, p=0.2). On Doppler analysis the JRA group had lower peak E velocity, higher peak A velocity, higher A VTI and more prolonged IVRT. Male patients had higher A VTI and IVRT as compared to females. Those with longer duration of disease had larger LV systolic (r=0.517, p=0.01) and diastolic dimension (r=0.40, p=0.05) and lower FS (r=-0.506, p=0.01). Patients with polyarticular JRA had higher E and A VTI as compared to those with systemic or oligoarticular types. Conclusion: Despite an asymptomatic cardiac status, significant systolic and diastolic functional abnormalities exist in patients with JRA. The duration of the disease, mode of presentation, patient's age and gender have a significant impact on the left ventricular systolic and diastolic functions in patients with JRA

Bioline International

Learning to Recognise Objects and Actions for Intelligent Agents

Author: Agarwal Nakul
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Computer vision involves a host of tasks, such as boundary detection, semantic segmentation, surface estimation, object detection, image classification, action localization, to name a few. For a holistic understanding of a scene, which is required by a lot of real-world applications, many of these tasks need to be combined together. For instance, an autonomous car should not only be able to detect other cars (object) but also if a pedestrian is walking (action). The former requires localizing the object, which can either be at the pixel level or bounding box level. The latter requires localizing the action, and by extension the actor, in both space and time. These problems are best dealt with by approaches involving supervised learning models which rely on large annotated datasets, and so the problem becomes even more challenging when there is a lack of labeled data. In this thesis, we first tackle the problem of spatio-temporal action localization in an unsupervised setting. As the name suggests, it requires modeling of both spatial and temporal features. So, we propose an end-to-end learning framework for an adaptation method which aligns both spatial and temporal features and conduct experiments on the action localization task. To highlight the potential benefits for autonomous cars, we also construct and benchmark a new dataset which contains pedestrian actions collected in driving scenes. Then, for a holistic understanding of the scene, we shift our attention from localizing actions to recognising objects, especially in a city street scenario. We do this by jointly dealing with the tasks of object detection and semantic segmentation. While the former localizes the individual instances of objects at the bounding box level, the latter provides pixel level distinction but at the category level. We explore a novel observation that connects the two tasks and provide an end-to-end learning framework to exploit this connection

eScholarship - University of California

Improving multiclass classification by deep networks using DAGSVM and Triplet Loss

Author: Agarwal Nakul
Balasubramanian Vineeth N
C V Jawahar
Publication venue: Elsevier BV
Publication date: 01/01/2018

With recent advances in the field of computer vision and especially deep learning, many fully connected and convolutional neural networks have been trained to achieve state-of-the-art performance on a wide variety of tasks such as speech recognition, image classification and natural language processing. For classification tasks however, most of these deep learning models employ the softmax activation function for prediction and minimize cross-entropy loss. In contrast, we demonstrate a consistent advantage by replacing the softmax layer by a set of binary SVM classifiers organized in a tree or DAG (Directed Acyclic Graph) structure. The idea is to not treat the multiclass classification problem as a whole but to break it down into smaller binary problems where each classifier acts as an expert by focusing on differentiating between only two classes and thus improves the overall accuracy. Furthermore, by arranging the classifiers in a DAG structure, we later also show how it is possible to further improve the performance of the binary classifiers by learning more discriminative features through the same deep network. We validated the proposed methodology on two benchmark datasets, and the results corroborated our claim

Research Archive of Indian Institute of Technology Hyderabad

Assessment of left ventricular systolic and diastolic function in juvenile rheumatoid arthritis

Author: Agarwal Amita
Bharti Bishwa Bhushan
Kapoor Aditya
Kumar Sudeep
Mishra Ramnath
Sinha Nakul
Publication venue: Wolters Kluwer Medknow Publications
Publication date: 01/10/2004

Background and Aims: Recognizing the paucity of data regarding echocardiographic studies of Left ventricular (LV) systolic and diastolic function in patients with juvenile rheumatoid arthritis (JRA), a study was carried out to study these parameters in these subjects. Settings, Design and Methods: Thirty-five patients with JRA and an equal number of age- and sex-matched controls were studied by two-dimensional and Doppler echocardiography. Results: Patients with JRA had higher systolic and diastolic blood pressures, resting heart rates, LV systolic (26.9±4.3 vs. 22.4 ± 4.1 mm, p=0.001) and diastolic size (42.3±4.6 vs. 35.4±3.8 mm, p<0.001) and volumes. Though ejection fraction (EF) and fractional shortening (FS) were normal, they were lower in those with JRA as compared to controls (EF: 62.9±4.47 vs. 67.5±3.63 %, p<0.001; FS: 36.4±4.5 vs. 38.5 ± 6.87, p=0.2). On Doppler analysis the JRA group had lower peak E velocity, higher peak A velocity, higher A VTI and more prolonged IVRT. Male patients had higher A VTI and IVRT as compared to females. Those with longer duration of disease had larger LV systolic (r=0.517, p=0.01) and diastolic dimension (r=0.40, p=0.05) and lower FS (r=-0.506, p=0.01). Patients with polyarticular JRA had higher E and A VTI as compared to those with systemic or oligoarticular types. Conclusion: Despite an asymptomatic cardiac status, significant systolic and diastolic functional abnormalities exist in patients with JRA. The duration of the disease, mode of presentation, patient's age and gender have a significant impact on the left ventricular systolic and diastolic functions in patients with JRA

Directory of Open Access Journals

Angiotensin-converting enzyme gene polymorphism in coronary artery disease in north India

Author: Agarwal Sarita
Agrawal Suraksha
Gilmour Ashley
Mastana Sarabjit
Ramesh V.
Singh Vivek Pratap
Sinha Nakul
Tewari Satyendra
Publication venue: Elsevier BV
Publication date: 01/01/2004

Background: The aim of this study was to investigate the role of angiotensin-converting enzyme gene polymorphism in patients with coronary artery disease in north India. Methods and Results: One hundred forty-six patients with angiographically proven atherosclerotic coronary artery disease, and 146 age- and sex-matched control subjects (treadmill-negative) were included in the study. Genomic DNA was extracted and analyzed for angiotensin-converting enzyme insertion/deletion polymorphism. Two independent investigators scored the genotypes. Conclusions: When we compared the genotypes of patients with coronary artery disease with those of normal controls, it was seen that all three genotypes, i.e. DD, ID and II, were not statistically different among patients and controls. Further, we categorized the patient and control groups into 2 subgroups, i.e. below and above 50 years of age. Interestingly, it was observed that the DD genotype was significantly higher in patients in the higher age group (i.e. above 50 years of age). However, this needs further validation by studying patients with coronary artery disease from other parts of India

Enlighten

Angiotensin-converting enzyme gene polymorphism in coronary artery disease in north India

Author: Agarwal Sarita
Agrawal Suraksha
Gilmour Ashley
Mastana Sarabjit
Ramesh V.
Singh Vivek Pratap
Sinha Nakul
Tewari Satyendra
Publication venue: Elsevier BV