4 research outputs found
Distributionally Robust Optimization with Probabilistic Group
Modern machine learning models may be susceptible to learning spurious
correlations that hold on average but not for the atypical group of samples. To
address the problem, previous approaches minimize the empirical worst-group
risk. Despite the promise, they often assume that each sample belongs to one
and only one group, which does not allow expressing the uncertainty in group
labeling. In this paper, we propose a novel framework PG-DRO, which explores
the idea of probabilistic group membership for distributionally robust
optimization. Key to our framework, we consider soft group membership instead
of hard group annotations. The group probabilities can be flexibly generated
using either supervised learning or zero-shot approaches. Our framework
accommodates samples with group membership ambiguity, offering stronger
flexibility and generality than the prior art. We comprehensively evaluate
PG-DRO on both image classification and natural language processing benchmarks,
establishing superior performanceComment: Published at AAAI 202
Towards Possibilities & Impossibilities of AI-generated Text Detection: A Survey
Large Language Models (LLMs) have revolutionized the domain of natural
language processing (NLP) with remarkable capabilities of generating human-like
text responses. However, despite these advancements, several works in the
existing literature have raised serious concerns about the potential misuse of
LLMs such as spreading misinformation, generating fake news, plagiarism in
academia, and contaminating the web. To address these concerns, a consensus
among the research community is to develop algorithmic solutions to detect
AI-generated text. The basic idea is that whenever we can tell if the given
text is either written by a human or an AI, we can utilize this information to
address the above-mentioned concerns. To that end, a plethora of detection
frameworks have been proposed, highlighting the possibilities of AI-generated
text detection. But in parallel to the development of detection frameworks,
researchers have also concentrated on designing strategies to elude detection,
i.e., focusing on the impossibilities of AI-generated text detection. This is a
crucial step in order to make sure the detection frameworks are robust enough
and it is not too easy to fool a detector. Despite the huge interest and the
flurry of research in this domain, the community currently lacks a
comprehensive analysis of recent developments. In this survey, we aim to
provide a concise categorization and overview of current work encompassing both
the prospects and the limitations of AI-generated text detection. To enrich the
collective knowledge, we engage in an exhaustive discussion on critical and
challenging open questions related to ongoing research on AI-generated text
detection