94 research outputs found
Estimating Example Difficulty using Variance of Gradients
In machine learning, a question of great interest is understanding what
examples are challenging for a model to classify. Identifying atypical examples
helps inform safe deployment of models, isolates examples that require further
human inspection, and provides interpretability into model behavior. In this
work, we propose Variance of Gradients (VOG) as a proxy metric for detecting
outliers in the data distribution. We provide quantitative and qualitative
support that VOG is a meaningful way to rank data by difficulty and to surface
a tractable subset of the most challenging examples for human-in-the-loop
auditing. Data points with high VOG scores are more difficult for the model to
classify and over-index on examples that require memorization.
Comment: Accepted to Workshop on Human Interpretability in Machine Learning
(WHI), ICML, 202
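
The abstract does not spell out how a VOG score is computed, so the following is only a minimal sketch of one plausible reading of the name: for a single example, take the gradient of the model output with respect to the input at several training checkpoints, compute the variance of each input dimension across checkpoints, and average those variances. The function names `vog_score` and `rank_by_vog`, the use of population variance, and the flat-list gradient format are all assumptions made for illustration; the paper's exact definition and normalization may differ.

```python
from statistics import fmean, pvariance


def vog_score(grads_per_checkpoint):
    """Assumed VOG-style score for ONE example.

    grads_per_checkpoint: list of K flat gradient vectors (lists of floats),
    one per training checkpoint, all of the same length.
    """
    if len(grads_per_checkpoint) < 2:
        raise ValueError("need gradients from at least two checkpoints")
    n_dims = len(grads_per_checkpoint[0])
    if any(len(g) != n_dims for g in grads_per_checkpoint):
        raise ValueError("all gradient vectors must have the same length")

    # Variance across checkpoints, computed separately for each input dimension.
    per_dim_variance = [pvariance(values) for values in zip(*grads_per_checkpoint)]
    # Average over input dimensions to get a single scalar per example.
    return fmean(per_dim_variance)


def rank_by_vog(examples):
    """Return example ids sorted from highest to lowest assumed VOG score.

    examples: dict mapping example id -> list of per-checkpoint gradients.
    """
    scores = {ex_id: vog_score(grads) for ex_id, grads in examples.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

Under this reading, an example whose gradients barely change between checkpoints gets a score near zero, while an example whose gradients keep shifting during training rises to the top of the ranking and becomes a candidate for human inspection.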
Locally Differentially Private Document Generation Using Zero Shot Prompting
Numerous studies have highlighted the privacy risks associated with
pretrained large language models. In contrast, our research offers a unique
perspective by demonstrating that pretrained large language models can
effectively contribute to privacy preservation. We propose a locally
differentially private mechanism called DP-Prompt, which leverages the power of
pretrained large language models and zero-shot prompting to counter author
de-anonymization attacks while minimizing the impact on downstream utility.
When DP-Prompt is used with a powerful language model like ChatGPT (gpt-3.5),
we observe a notable reduction in the success rate of de-anonymization attacks,
showing that it surpasses existing approaches by a considerable margin despite
its simpler design. For instance, in the case of the IMDB dataset, DP-Prompt
(with ChatGPT) perfectly recovers the clean sentiment F1 score while achieving
a 46% reduction in author identification F1 score against static attackers and
a 26% reduction against adaptive attackers. We conduct extensive experiments
across six open-source large language models, ranging up to 7 billion
parameters, to analyze various effects of the privacy-utility tradeoff.
Comment: Accepted at EMNLP 2023 (Findings
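
The abstract does not describe the internals of DP-Prompt, so the sketch below is not the authors' mechanism. It only illustrates one standard way a single token draw can satisfy differential privacy: the exponential mechanism applied to clipped logits. If every logit is clipped to `[low, high]`, the utility of any token changes by at most `high - low` between two input documents, so sampling with probability proportional to `exp(epsilon * logit / (2 * (high - low)))` is `epsilon`-DP for that one draw. This is the same as softmax sampling at temperature `2 * (high - low) / epsilon`. The function names and the clipping bounds are hypothetical.

```python
import math
import random


def dp_token_distribution(logits, low, high, epsilon):
    """Exponential-mechanism distribution over tokens from clipped logits.

    Returns (probabilities, temperature). The guarantee covers one token draw;
    generating several tokens composes, so the per-token budgets add up.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if high <= low:
        raise ValueError("high must be greater than low")

    sensitivity = high - low                  # max change of a clipped logit
    temperature = 2 * sensitivity / epsilon   # equivalent softmax temperature
    clipped = [min(max(x, low), high) for x in logits]

    # Subtract the max before exponentiating for numerical stability.
    top = max(clipped)
    weights = [math.exp((x - top) / temperature) for x in clipped]
    total = sum(weights)
    return [w / total for w in weights], temperature


def sample_token(probabilities, rng=random):
    """Draw one token index according to the given probabilities."""
    return rng.choices(range(len(probabilities)), weights=probabilities, k=1)[0]
```

A smaller `epsilon` raises the temperature and flattens the distribution, which is the privacy-utility tradeoff in miniature: stronger privacy means sampled tokens follow the model's preferences less closely.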
The Grand Illusion: The Myth of Software Portability and Implications for ML Progress
Pushing the boundaries of machine learning often requires exploring different
hardware and software combinations. However, the freedom to experiment across
different tooling stacks can be at odds with the drive for efficiency, which
has produced increasingly specialized AI hardware and incentivized
consolidation around a narrow set of ML frameworks. Exploratory research can be
restricted if software and hardware are co-evolving, making it even harder to
stray away from mainstream ideas that work well with popular tooling stacks.
While this friction increasingly impacts the rate of innovation in machine
learning, to our knowledge the lack of portability in tooling has not been
quantified. In this work, we ask: How portable are popular ML software
frameworks? We conduct a large-scale study of the portability of mainstream ML
frameworks across different hardware types. Our findings paint an uncomfortable
picture -- frameworks can lose more than 40% of their key functions when ported
to other hardware. Worse, even when functions are portable, the slowdown in
their performance can be extreme and render performance untenable.
Collectively, our results reveal how costly straying from a narrow set of
hardware-software combinations can be - and suggest that specialization of
hardware impedes innovation in machine learning research.
Comment: 28 pages, 13 figures; the associated repo can be found at
https://github.com/for-ai/portabilit
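
To make the idea of "losing functions when ported" concrete, here is a minimal sketch of how a failure rate could be measured. It is not the authors' benchmark harness: `portability_report`, the `device` string, and the toy functions in the usage example are hypothetical, and a real study would call actual framework functions on actual hardware and also time the calls that succeed.

```python
def portability_report(functions, device):
    """Run each function on `device` and report which ones fail.

    functions: dict mapping a function name to a callable taking `device`.
    Returns (failure_rate, failures), where failures maps each failing
    function name to the name of the exception it raised.
    """
    failures = {}
    for name, fn in functions.items():
        try:
            fn(device)
        except Exception as exc:  # any error counts as "not portable" here
            failures[name] = type(exc).__name__
    failure_rate = len(failures) / len(functions) if functions else 0.0
    return failure_rate, failures


# Usage with two toy stand-ins for framework functions (both hypothetical).
def toy_matmul(device):
    return f"matmul ran on {device}"


def toy_sparse_op(device):
    if device != "gpu":
        raise NotImplementedError(f"sparse op has no kernel for {device}")
    return f"sparse op ran on {device}"


toy_functions = {"matmul": toy_matmul, "sparse_op": toy_sparse_op}
gpu_rate, gpu_failures = portability_report(toy_functions, "gpu")
other_rate, other_failures = portability_report(toy_functions, "other-accelerator")
```

Comparing the two reports shows the kind of gap the abstract describes: the same set of functions passes fully on one device and partially fails on another.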