Unlearn What You Want to Forget: Efficient Unlearning for LLMs
Large language models (LLMs) have achieved significant progress by pre-training on and memorizing a wide range of textual data; however, this process can raise privacy issues and violate data protection regulations. As a result, the ability to easily remove data related to individual users from such models, without deteriorating their predictive quality after the removal, becomes increasingly important. To address these issues, we propose an efficient unlearning framework that updates LLMs without retraining the whole model after data removals, by introducing lightweight unlearning layers, learned with a selective teacher-student objective, into the transformer. In addition, we introduce a fusion mechanism that effectively combines different unlearning layers, each learning to forget a different set of data, to handle a sequence of forgetting operations. Experiments on classification and generation tasks demonstrate the effectiveness of our proposed methods compared to state-of-the-art baselines.
Comment: EMNLP 202
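The abstract above describes lightweight unlearning layers trained with a selective teacher-student objective. The following is a minimal sketch of that general idea, not the authors' implementation: a small residual adapter is added to a frozen model and trained to stay close to the frozen teacher on retained data while diverging from it on the forget set. The adapter shape, loss form, and alpha weighting are assumptions.

```python
# Minimal sketch (not the paper's code): a lightweight unlearning layer trained
# with a selective teacher-student objective. Adapter size, loss form, and the
# alpha weighting are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class UnlearningLayer(nn.Module):
    """Small residual bottleneck adapter inserted after a frozen transformer block."""
    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, hidden_states):
        return hidden_states + self.up(F.relu(self.down(hidden_states)))

def selective_teacher_student_loss(student_logits, teacher_logits, is_forget, alpha=1.0):
    """Stay close to the frozen teacher on retained examples, diverge on forget ones.

    student_logits, teacher_logits: [batch, num_classes]
    is_forget: boolean [batch] mask; each batch is assumed to mix both kinds.
    """
    kl = F.kl_div(
        F.log_softmax(student_logits, dim=-1),
        F.softmax(teacher_logits, dim=-1),
        reduction="none",
    ).sum(-1)                              # per-example KL(teacher || student)
    retain_loss = kl[~is_forget].mean()    # match the teacher on retained data
    forget_loss = -kl[is_forget].mean()    # push away from it on the forget set
    return retain_loss + alpha * forget_loss
```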
Tuiteamos o pongamos un tuit? Investigating the Social Constraints of Loanword Integration in Spanish Social Media
Speakers of non-English languages often adopt loanwords from English to express new or unusual concepts. While these loanwords may be borrowed unchanged, speakers may also integrate the words to fit the constraints of their native language, e.g. creating Spanish tuitear from English tweet. Linguists have often considered the process of loanword integration to be more dependent on language-internal constraints, but sociolinguistic constraints such as speaker background remain only qualitatively understood. We investigate the role of social context and speaker background in Spanish speakers' use of integrated loanwords on social media. We find first that newspaper authors use the integrated forms of loanwords and native words more often than social media authors, showing that integration is associated with formal domains. In social media, we find that speaker background and expectations of formality explain loanword and native word integration, such that authors who use more Spanish and who write to a wider audience tend to use integrated verb forms more often. This study shows that loanword integration reflects not only language-internal constraints but also social expectations that vary by conversation and speaker.
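One way to operationalize the social-media part of this analysis is a regression relating integrated verb use to speaker and audience covariates; a rough sketch follows. The file name, column names, and covariates are hypothetical placeholders, not the study's actual variables.

```python
# Hypothetical sketch: relate use of an integrated loanword verb (e.g. "tuitear"
# vs. the bare loanword) to speaker and audience covariates via logistic regression.
# The CSV path and all column names are placeholders, not the study's data.
import pandas as pd
import statsmodels.formula.api as smf

posts = pd.read_csv("loanword_posts.csv")  # one row per post (hypothetical)

# uses_integrated: 1 if the post uses the integrated verb form, else 0
# spanish_ratio:   share of the author's recent posts written in Spanish
# log_audience:    log follower count, a proxy for audience breadth
model = smf.logit(
    "uses_integrated ~ spanish_ratio + log_audience",
    data=posts,
).fit()
print(model.summary())
```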
CoMPosT: Characterizing and Evaluating Caricature in LLM Simulations
Recent work has aimed to capture nuances of human behavior by using LLMs to
simulate responses from particular demographics in settings like social science
experiments and public opinion surveys. However, there are currently no
established ways to discuss or evaluate the quality of such LLM simulations.
Moreover, there is growing concern that these LLM simulations are flattened
caricatures of the personas that they aim to simulate, failing to capture the
multidimensionality of people and perpetuating stereotypes. To bridge these
gaps, we present CoMPosT, a framework to characterize LLM simulations using
four dimensions: Context, Model, Persona, and Topic. We use this framework to
measure open-ended LLM simulations' susceptibility to caricature, defined via
two criteria: individuation and exaggeration. We evaluate the level of
caricature in scenarios from existing work on LLM simulations. We find that for
GPT-4, simulations of certain demographics (political and marginalized groups)
and topics (general, uncontroversial) are highly susceptible to caricature.
Comment: To appear at EMNLP 2023 (Main)
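As a rough illustration of the framework's framing (not the paper's released evaluation code), a simulation can be represented by the four CoMPosT dimensions and flagged as a caricature when it is poorly individuated but strongly exaggerated; the scoring functions and thresholds below are placeholders for the paper's actual measures.

```python
# Illustrative sketch of the CoMPosT framing: a simulation is parameterized by
# Context, Model, Persona, and Topic, and flagged as a caricature when it is
# poorly individuated but strongly exaggerated. Scoring functions are placeholders.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Simulation:
    context: str   # e.g. "online forum discussion"
    model: str     # e.g. "gpt-4"
    persona: str   # e.g. "a conservative voter"
    topic: str     # e.g. "public transit funding"
    response: str  # text produced by the simulated persona

def is_caricature(
    sim: Simulation,
    individuation: Callable[[Simulation], float],
    exaggeration: Callable[[Simulation], float],
    indiv_threshold: float = 0.5,
    exag_threshold: float = 0.5,
) -> bool:
    """Caricature = fails to individuate the persona AND exaggerates
    persona-distinctive traits relative to a default (no-persona) simulation."""
    return (
        individuation(sim) < indiv_threshold
        and exaggeration(sim) > exag_threshold
    )
```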
Characterizing Collective Attention via Descriptor Context: A Case Study of Public Discussions of Crisis Events
Social media datasets make it possible to rapidly quantify collective
attention to emerging topics and breaking news, such as crisis events.
Collective attention is typically measured by aggregate counts, such as the
number of posts that mention a name or hashtag. But according to rationalist
models of natural language communication, the collective salience of each
entity will be expressed not only in how often it is mentioned, but in the form
that those mentions take. This is because natural language communication is
premised on (and customized to) the expectations that speakers and writers have
about how their messages will be interpreted by the intended audience. We test
this idea by conducting a large-scale analysis of public online discussions of
breaking news events on Facebook and Twitter, focusing on five recent crisis
events. We examine how people refer to locations, focusing specifically on
contextual descriptors, such as "San Juan" versus "San Juan, Puerto Rico."
Rationalist accounts of natural language communication predict that such
descriptors will be unnecessary (and therefore omitted) when the named entity
is expected to have high prior salience to the reader. We find that the use of
contextual descriptors is indeed associated with proxies for social and
informational expectations, including macro-level factors like the location's
global salience and micro-level factors like audience engagement. We also find
a consistent decrease in descriptor context use over the lifespan of each
crisis event. These findings provide evidence about how social media users
communicate with their audiences, and point towards more fine-grained models of
collective attention that may help researchers and crisis response
organizations to better understand public perception of unfolding crisis
events.
Comment: ICWSM 202
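A small sketch of the kind of measurement this abstract describes, under simplifying assumptions: count how often a location mention carries a contextual descriptor versus appearing bare. The regex proxy and example posts are illustrative, not the paper's mention-extraction pipeline.

```python
# Illustrative sketch: estimate how often a location is mentioned with a
# contextual descriptor (e.g. "San Juan, Puerto Rico") versus bare ("San Juan").
# The regex below is a simple proxy, not the paper's mention extractor.
import re

def descriptor_rate(posts, entity="San Juan", descriptor="Puerto Rico"):
    """Fraction of mentions of `entity` that attach the contextual descriptor."""
    bare = with_descriptor = 0
    pattern = re.compile(rf"{re.escape(entity)}(,\s*{re.escape(descriptor)})?", re.I)
    for post in posts:
        for match in pattern.finditer(post):
            if match.group(1):
                with_descriptor += 1
            else:
                bare += 1
    total = bare + with_descriptor
    return with_descriptor / total if total else 0.0

posts = [
    "Praying for everyone in San Juan, Puerto Rico tonight.",
    "Power is still out across San Juan.",
]
print(descriptor_rate(posts))  # -> 0.5
```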
DADA: Dialect Adaptation via Dynamic Aggregation of Linguistic Rules
Existing large language models (LLMs) that mainly focus on Standard American English (SAE) often perform significantly worse when applied to other English dialects. While existing mitigations tackle discrepancies for
individual target dialects, they assume access to high-accuracy dialect
identification systems. The boundaries between dialects are inherently
flexible, making it difficult to categorize language into discrete predefined
categories. In this paper, we propose DADA (Dialect Adaptation via Dynamic
Aggregation), a modular approach to imbue SAE-trained models with
multi-dialectal robustness by composing adapters which handle specific
linguistic features. The compositional architecture of DADA allows for both
targeted adaptation to specific dialect variants and simultaneous adaptation to
various dialects. We show that DADA is effective for both single task and
instruction finetuned language models, offering an extensible and interpretable
framework for adapting existing LLMs to different English dialects.
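The compositional idea reads roughly as follows in code. This is a minimal sketch under assumed shapes and a simple softmax gate, not the released DADA implementation: one bottleneck adapter per linguistic feature, with a per-token gate that mixes their outputs.

```python
# Illustrative sketch of composing per-feature dialect adapters and aggregating
# their outputs dynamically; sizes and the gating scheme are assumptions, not
# the released DADA implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureAdapter(nn.Module):
    """Bottleneck adapter for one linguistic feature (e.g. copula deletion)."""
    def __init__(self, hidden_size: int, bottleneck: int = 48):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, h):
        return self.up(F.relu(self.down(h)))

class DynamicAggregation(nn.Module):
    """Weights each adapter's output per token, so adaptation can mix features."""
    def __init__(self, hidden_size: int, num_features: int):
        super().__init__()
        self.adapters = nn.ModuleList(
            FeatureAdapter(hidden_size) for _ in range(num_features)
        )
        self.gate = nn.Linear(hidden_size, num_features)

    def forward(self, h):                                       # h: [batch, seq, hidden]
        weights = torch.softmax(self.gate(h), dim=-1)           # [b, s, F]
        updates = torch.stack([a(h) for a in self.adapters], dim=-1)  # [b, s, hid, F]
        return h + (updates * weights.unsqueeze(2)).sum(-1)     # residual mixture
```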
Anchor Points: Benchmarking Models with Much Fewer Examples
Modern language models often exhibit powerful but brittle behavior, leading
to the development of larger and more diverse benchmarks to reliably assess
their behavior. Here, we suggest that model performance can be benchmarked and
elucidated with much smaller evaluation sets. We first show that in six popular
language classification benchmarks, model confidence in the correct class on
many pairs of points is strongly correlated across models. We build upon this
phenomenon to propose Anchor Point Selection, a technique to select small
subsets of datasets that capture model behavior across the entire dataset.
Anchor points reliably rank models: across 87 diverse language model-prompt
pairs, evaluating models using 1-30 anchor points outperforms uniform sampling
and other baselines at accurately ranking models. Moreover, just a few anchor points can be used to estimate a model's per-class predictions on all other points
in a dataset with low mean absolute error, sufficient for gauging where the
model is likely to fail. Lastly, we present Anchor Point Maps for visualizing
these insights and facilitating comparisons of the performance of different
models on various regions within the dataset distribution.
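A minimal sketch of one way to select such anchor points, assuming a matrix of each model's correct-class confidence on every example; the k-means-plus-representative scheme here is an illustrative stand-in for the paper's selection technique.

```python
# Illustrative sketch of anchor-point selection: cluster evaluation examples by
# how similarly models score them, then keep one representative per cluster.
# `confidences` is a hypothetical [n_models, n_examples] array of each model's
# confidence in the correct class; the clustering choice is an assumption.
import numpy as np
from sklearn.cluster import KMeans

def select_anchor_points(confidences: np.ndarray, k: int = 10) -> np.ndarray:
    """Return indices of k examples that summarize model behavior."""
    X = confidences.T                      # one row per example, one column per model
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    anchors = []
    for c in range(k):
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(X[members] - km.cluster_centers_[c], axis=1)
        anchors.append(members[dists.argmin()])   # medoid-like representative
    return np.array(anchors)

# Example: 5 models evaluated on 200 examples.
rng = np.random.default_rng(0)
confidences = rng.random((5, 200))
print(select_anchor_points(confidences, k=10))
```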