
    Data Selection for Generalization in Unimodal & Multimodal Models

    In this thesis, I present research on improving datasets for deep learning models with automated data transformation methods. The immediate goal of this work is to maximize the in-domain performance, out-of-domain generalization, and robustness of models trained on the transformed datasets. The broader goal of this research is to expand our understanding of how training data impacts deep learning models. The data transformation methods discussed in this work can be classified into data augmentation, data ordering, and data subset selection.
    First, I present work on data augmentation methods that improve the robustness of deep learning models. I demonstrate the vulnerability of reading comprehension models to a series of novel adversarial attacks and present a policy search method for adding optimized proportions of these adversarial attacks to the training data, which improves the in-domain, cross-domain, and cross-lingual generalization of the model. Then, I expose the phenomenon of cross-task inconsistency in multi-task multimodal models and show that automatically generated contrast sets can be used to make the model consistent.
    Second, I explore the efficacy of curriculum learning for fine-tuning language models on commonsense reasoning tasks. I experiment with paced curriculum strategies using a variety of scoring functions for quantifying the difficulty of a sample and find that a hard-to-easy curriculum promotes out-of-domain generalization in such models.
    Third, I discuss the importance of jointly considering diversity and sample difficulty for data subset selection in the pretraining, fine-tuning, and continual learning paradigms. I propose a scalable, state-of-the-art graph-based algorithm for combining the two factors during the pruning of pretraining and fine-tuning datasets across data modalities. Further, I propose a multi-way pruning algorithm for selecting training data that contains a balanced mixture of seen vs. unseen tasks and frequent vs. rare tasks at each time step during continual instruction tuning of multimodal large language models.
    In summary, I present several automated data transformation methods spanning augmentation, ordering, and selection for improving the performance of models trained on the transformed datasets along various axes.
    Doctor of Philosophy
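    The curriculum-learning experiments above order fine-tuning samples by a difficulty score before training. Below is a minimal sketch of such hard-to-easy ordering; the length-based scoring function is a hypothetical stand-in for the scoring functions studied in the thesis, not the thesis code.

```python
# Minimal sketch of a hard-to-easy curriculum over fine-tuning data.
# The scoring function below (question length) is a hypothetical proxy for
# the difficulty metrics explored in the thesis (e.g., model-based scores).
from typing import Callable, Dict, List


def order_by_difficulty(samples: List[Dict], score_fn: Callable[[Dict], float],
                        hard_first: bool = True) -> List[Dict]:
    """Sort training samples by a difficulty score; hard-to-easy by default."""
    return sorted(samples, key=score_fn, reverse=hard_first)


def length_score(sample: Dict) -> float:
    # Hypothetical proxy: longer questions are treated as harder.
    return len(sample["question"].split())


if __name__ == "__main__":
    data = [
        {"question": "Is ice cold?", "answer": "yes"},
        {"question": "Why do people typically carry umbrellas on rainy days?",
         "answer": "to stay dry"},
    ]
    for ex in order_by_difficulty(data, length_score):
        print(ex["question"])
```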

    Debiasing Multimodal Models via Causal Information Minimization

    Most existing debiasing methods for multimodal models, including causal intervention and inference methods, utilize approximate heuristics to represent the biases, such as shallow features from early stages of training or unimodal features for multimodal tasks like VQA, etc., which may not be accurate. In this paper, we study bias arising from confounders in a causal graph for multimodal data and examine a novel approach that leverages causally-motivated information minimization to learn the confounder representations. Robust predictive features contain diverse information that helps a model generalize to out-of-distribution data. Hence, minimizing the information content of features obtained from a pretrained biased model helps learn the simplest predictive features that capture the underlying data distribution. We treat these features as confounder representations and use them via methods motivated by causal theory to remove bias from models. We find that the learned confounder representations indeed capture dataset biases, and the proposed debiasing methods improve out-of-distribution (OOD) performance on multiple multimodal datasets without sacrificing in-distribution performance. Additionally, we introduce a novel metric to quantify the sufficiency of spurious features in models' predictions that further demonstrates the effectiveness of our proposed methods. Our code is available at: https://github.com/Vaidehi99/CausalInfoMin
    Comment: EMNLP 2023 Findings (16 pages)
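    The information-minimization step can be pictured as compressing features from a biased encoder through a variational bottleneck so that only the simplest predictive information survives. The sketch below uses a standard KL-to-prior penalty as an illustration; the module names, dimensions, and loss weighting are assumptions, not the paper's implementation.

```python
# Sketch of variational information minimization over features from a
# pretrained biased encoder (illustrative; not the paper's exact objective).
import torch
import torch.nn as nn


class InfoMinBottleneck(nn.Module):
    def __init__(self, feat_dim: int = 768, z_dim: int = 64, n_classes: int = 10):
        super().__init__()
        self.mu = nn.Linear(feat_dim, z_dim)
        self.logvar = nn.Linear(feat_dim, z_dim)
        self.classifier = nn.Linear(z_dim, n_classes)

    def forward(self, feats: torch.Tensor):
        mu, logvar = self.mu(feats), self.logvar(feats)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        # KL(q(z|x) || N(0, I)) penalizes the information kept about the input.
        kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(dim=-1).mean()
        return self.classifier(z), kl


if __name__ == "__main__":
    model = InfoMinBottleneck()
    feats = torch.randn(8, 768)            # stand-in for biased-model features
    labels = torch.randint(0, 10, (8,))
    logits, kl = model(feats)
    loss = nn.functional.cross_entropy(logits, labels) + 1e-3 * kl
    loss.backward()
    print(float(loss))
```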

    Exposing and Addressing Cross-Task Inconsistency in Unified Vision-Language Models

    As general purpose vision models get increasingly effective at a wide set of tasks, it is imperative that they be consistent across the tasks they support. Inconsistent AI models are considered brittle and untrustworthy by human users and are more challenging to incorporate into larger systems that take dependencies on their outputs. Measuring consistency between very heterogeneous tasks that might include outputs in different modalities is challenging since it is difficult to determine if the predictions are consistent with one another. As a solution, we introduce a benchmark dataset, COCOCON, where we use contrast sets created by modifying test instances for multiple tasks in small but semantically meaningful ways to change the gold label, and outline metrics for measuring if a model is consistent by ranking the original and perturbed instances across tasks. We find that state-of-the-art systems suffer from a surprisingly high degree of inconsistent behavior across tasks, especially for more heterogeneous tasks. Finally, we propose using a rank correlation-based auxiliary objective computed over large automatically created cross-task contrast sets to improve the multi-task consistency of large unified models, while retaining their original accuracy on downstream tasks. Project website available at https://adymaharana.github.io/cococon/
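    The consistency evaluation hinges on whether different task heads rank an original instance and its contrast-set perturbation the same way. A rough sketch of such a pairwise agreement check is given below; the per-task scores are assumed to be precomputed, and this is not the COCOCON codebase.

```python
# Sketch: a model is "consistent" on an (original, perturbed) pair if every task
# ranks the original above the perturbed instance, or every task ranks it below.
from typing import Dict


def pair_is_consistent(orig_scores: Dict[str, float],
                       pert_scores: Dict[str, float]) -> bool:
    prefers_orig = [orig_scores[t] > pert_scores[t] for t in orig_scores]
    return all(prefers_orig) or not any(prefers_orig)


if __name__ == "__main__":
    # e.g., likelihood-style scores for VQA and captioning on the same image
    orig = {"vqa": -1.2, "captioning": -3.4}
    pert = {"vqa": -2.5, "captioning": -2.9}  # captioning prefers the perturbation
    print(pair_is_consistent(orig, pert))     # False -> cross-task inconsistency
```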

    Evaluating Very Long-Term Conversational Memory of LLM Agents

    Existing works on long-term open-domain dialogues focus on evaluating model responses within contexts spanning no more than five chat sessions. Despite advancements in long-context large language models (LLMs) and retrieval augmented generation (RAG) techniques, their efficacy in very long-term dialogues remains unexplored. To address this research gap, we introduce a machine-human pipeline to generate high-quality, very long-term dialogues by leveraging LLM-based agent architectures and grounding their dialogues on personas and temporal event graphs. Moreover, we equip each agent with the capability of sharing and reacting to images. The generated conversations are verified and edited by human annotators for long-range consistency and grounding to the event graphs. Using this pipeline, we collect LoCoMo, a dataset of very long-term conversations, each encompassing 300 turns and 9K tokens on average, over up to 35 sessions. Based on LoCoMo, we present a comprehensive evaluation benchmark to measure long-term memory in models, encompassing question answering, event summarization, and multi-modal dialogue generation tasks. Our experimental results indicate that LLMs exhibit challenges in understanding lengthy conversations and comprehending long-range temporal and causal dynamics within dialogues. Employing strategies like long-context LLMs or RAG can offer improvements, but these models still substantially lag behind human performance.
    Comment: 19 pages; Project page: https://snap-research.github.io/locomo
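    One of the evaluated strategies is retrieval-augmented generation over the accumulated session history. The sketch below retrieves candidate past turns for a memory question using a TF-IDF retriever as a stand-in; the benchmark's actual retrievers, prompts, and data are not reproduced here.

```python
# Sketch: retrieve the most relevant past turns from a long multi-session
# conversation before answering a memory question (TF-IDF stand-in retriever).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def retrieve_turns(history, question, k=3):
    vec = TfidfVectorizer().fit(history + [question])
    sims = cosine_similarity(vec.transform([question]), vec.transform(history))[0]
    top = sims.argsort()[::-1][:k]
    return [history[i] for i in top]


if __name__ == "__main__":
    history = [
        "Session 1: I adopted a puppy named Miso last spring.",
        "Session 7: Work has been stressful, so I started evening runs.",
        "Session 21: Miso finally finished obedience training!",
    ]
    print(retrieve_turns(history, "What is the name of the speaker's dog?", k=2))
```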

    Extraction of Clinical Timeline from Discharge Summaries using Neural Networks

    Thesis (Master's)--University of Washington, 2017-12
    Discharge summaries are a concise representation of the most important bits of information about a patient’s time in the hospital. Converting the free text into a clinical timeline can facilitate accurate assimilation of information by physicians, and the structured data can be used to populate knowledge bases, in clinical decision support systems, and elsewhere. Conventional methods for temporal evaluation of discharge summaries employ structured inference and extensive feature engineering. However, they also run the risk of overfitting to the training domain and thus may not be effective in deployment. Novel methods of natural language processing leverage semantics from large corpora and produce results with minimal feature engineering. This work explores the use of neural network architectures in clinical entity recognition and temporal evaluation. Recurrent neural networks are found to perform on par with conditional random field systems in clinical entity recognition, scoring 94.04% on the i2b2 2012 dataset. Moreover, they perform better for under-represented entity classes like ‘Occurrence’, ‘Evidential’ and ‘Clinical Department’ in a skewed dataset. The out-of-domain evaluation of conditional random fields and neural networks yields favorable results on a corpus of ER visit, progress, consult and ICU notes from various medical centers. Neural networks are more amenable to domain adaptation. This work also explores the use of convolutional neural networks for extracting within-sentence temporal relations. Preliminary results show that convolutional networks might not be well suited to the task.
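    The recurrent models referenced here are sequence taggers over clinical text. A compact, generic BiLSTM tagger sketch in PyTorch follows; the vocabulary size, label set, and dimensions are placeholders rather than the thesis configuration.

```python
# Generic BiLSTM sequence tagger sketch for clinical entity recognition
# (BIO-style labels); hyperparameters and vocabulary are placeholders.
import torch
import torch.nn as nn


class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size=5000, emb_dim=100, hidden=128, n_tags=13):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_tags)

    def forward(self, token_ids):
        h, _ = self.lstm(self.emb(token_ids))
        return self.out(h)  # per-token tag logits


if __name__ == "__main__":
    tagger = BiLSTMTagger()
    tokens = torch.randint(0, 5000, (2, 20))  # batch of 2 sentences, 20 tokens each
    print(tagger(tokens).shape)               # torch.Size([2, 20, 13])
```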

    Mapping disparities in homicide trends across Brazil: 2000–2014

    Background: Homicides are a major problem in Brazil. Drug trafficking, arms trafficking, and land conflicts are three of the many factors driving homicide rates in Brazil. Understanding long-term spatiotemporal trends and the social structural factors associated with homicides in Brazil would be useful for designing policies aimed at reducing homicide rates.
    Methods: We obtained data from 2000 to 2014 from the Brazil Ministry of Health (MOH) Mortality Information System and sociodemographic data from the Brazil Institute of Geography and Statistics (IBGE). First, we quantified the rate of change in homicides at the municipality and state levels. Second, we used principal component regression and k-medoids clustering to examine differences in temporal trends across municipalities. Lastly, we used Bayesian hierarchical space-time models to describe spatiotemporal patterns and to assess the contribution of structural factors.
    Results: There were significant variations in homicide rates across states and municipalities. We noted the largest decrease in homicide rates in the western and southeastern states of Sao Paulo, Rio de Janeiro and Espirito Santo, which coincided with an increase in homicide rates in the northeastern states of Ceará, Alagoas, Paraiba, Rio Grande do Norte, Sergipe and Bahia during the fifteen-year period. The decrease in homicides in municipalities with populations of at least 250,000 coincided with an increase in municipalities of 25,000 people or fewer. Structural factors that predicted municipality-level homicide rates included gross domestic product, urbanization, sharing a border with neighboring countries, and the proportion of the population aged fifteen to twenty-nine.
    Conclusions: Our findings support both a dissemination hypothesis and an interiorization hypothesis. These findings should be considered when designing interventions to curb homicide rates.
    http://deepblue.lib.umich.edu/bitstream/2027.42/174011/1/40621_2020_Article_273.pd
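    The trend-analysis step can be illustrated by reducing each municipality's yearly homicide-rate series with PCA and clustering in the reduced space. The sketch below uses synthetic data and KMeans as a stand-in for the paper's k-medoids clustering; it is not the study's analysis code.

```python
# Sketch: cluster municipality-level homicide-rate trajectories (2000-2014)
# after PCA reduction. Synthetic data; KMeans stands in for k-medoids.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_municipalities, n_years = 200, 15
rates = rng.gamma(shape=2.0, scale=10.0, size=(n_municipalities, n_years))  # fake rates

components = PCA(n_components=3).fit_transform(rates)   # summarize each trajectory
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(components)
print(np.bincount(labels))  # cluster sizes, i.e., groups with similar temporal trends
```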