Class based Influence Functions for Error Detection
Influence functions (IFs) are a powerful tool for detecting anomalous
examples in large-scale datasets. However, they are unstable when applied to
deep networks. In this paper, we provide an explanation for the instability of
IFs and develop a solution to this problem. We show that IFs are unreliable
when the training and test points belong to different classes. Our solution
leverages class information to improve the stability of IFs. Extensive
experiments show that our modification significantly improves the performance
and stability of IFs while incurring no additional computational cost.
Comment: Thang Nguyen-Duc, Hoang Thanh-Tung, and Quan Hung Tran are co-first
authors of this paper. 12 pages, 12 figures. Accepted to ACL 202
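The following is a minimal sketch of the class-restricted idea described above, using a first-order gradient-alignment approximation of influence rather than the full inverse-Hessian formulation; the helper names and the dot-product scoring are illustrative assumptions, not the paper's exact algorithm.

import torch
import torch.nn.functional as F

def per_example_grad(model, x, y):
    # Flattened gradient of the loss at a single labeled example.
    loss = F.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0))
    grads = torch.autograd.grad(loss, [p for p in model.parameters() if p.requires_grad])
    return torch.cat([g.reshape(-1) for g in grads])

def class_based_influence(model, train_set, test_x, test_y):
    # Score training points by gradient alignment with the test point,
    # keeping only pairs that share the test point's label.
    g_test = per_example_grad(model, test_x, test_y)
    scores = []
    for i, (x, y) in enumerate(train_set):
        if y.item() != test_y.item():      # the class-based restriction
            continue
        g_train = per_example_grad(model, x, y)
        scores.append((i, torch.dot(g_test, g_train).item()))
    return sorted(scores, key=lambda s: s[1])  # most negative (harmful) first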
ASIF: Coupled Data Turns Unimodal Models to Multimodal Without Training
Aligning the visual and language spaces requires training deep neural
networks from scratch on giant multimodal datasets; CLIP trains both an image
and a text encoder, while LiT manages to train just the latter by taking
advantage of a pretrained vision network. In this paper, we show that sparse
relative representations are sufficient to align text and images without
training any network. Our method relies on readily available single-domain
encoders (trained with or without supervision) and a modest (in comparison)
number of image-text pairs. ASIF redefines what constitutes a multimodal model
by explicitly disentangling memory from processing: here the model is defined
by the embedded pairs of all the entries in the multimodal dataset, in addition
to the parameters of the two encoders. Experiments on standard zero-shot visual
benchmarks demonstrate the typical transfer ability of image-text models.
Overall, our method represents a simple yet surprisingly strong baseline for
foundation multimodal models, raising important questions on their data
efficiency and on the role of retrieval in machine learning.
Comment: 13 pages, 5 figures
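A rough numpy sketch of zero-shot classification with sparse relative representations in the spirit described above; the top-k sparsification, the similarity exponent, and the function names are assumptions rather than the paper's exact recipe.

import numpy as np

def relative_repr(z, anchors, k=50, p=4):
    # Represent z by its cosine similarities to the anchor embeddings,
    # keeping only the k largest and sharpening them.
    z = z / np.linalg.norm(z)
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    sims = a @ z
    sims[np.argsort(sims)[:-k]] = 0.0          # sparsify: keep the top-k entries
    sims = np.maximum(sims, 0.0) ** p
    n = np.linalg.norm(sims)
    return sims / n if n > 0 else sims

def zero_shot_label(image_emb, label_text_embs, anchor_img_embs, anchor_txt_embs):
    # Pick the label whose relative text representation best matches the
    # image's relative representation over the same anchor pairs.
    r_img = relative_repr(image_emb, anchor_img_embs)
    r_txts = np.stack([relative_repr(t, anchor_txt_embs) for t in label_text_embs])
    return int(np.argmax(r_txts @ r_img))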
Indiscriminate Data Poisoning Attacks on Neural Networks
Data poisoning attacks, in which a malicious adversary aims to influence a
model by injecting "poisoned" data into the training process, have attracted
significant recent attention. In this work, we take a closer look at existing
poisoning attacks and connect them with old and new algorithms for solving
sequential Stackelberg games. By choosing an appropriate loss function for the
attacker and optimizing with algorithms that exploit second-order information,
we design poisoning attacks that are effective on neural networks. We present
efficient implementations that exploit modern auto-differentiation packages and
allow simultaneous and coordinated generation of tens of thousands of poisoned
points, in contrast to existing methods that generate poisoned points one by
one. We further perform extensive experiments that empirically explore the
effect of data poisoning attacks on deep neural networks.
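A minimal sketch of one bilevel ("Stackelberg") poisoning step for a linear softmax victim: the inner training loop is unrolled so that autograd can pass second-order information from the victim's post-training clean loss back to the poison features. The paper's attacks use more scalable machinery; the model, budget, and step sizes here are illustrative assumptions.

import torch
import torch.nn.functional as F

def poison_gradient(x_clean, y_clean, x_poison, y_poison,
                    num_classes, inner_steps=50, lr=0.1):
    # Gradient of the victim's post-training clean loss w.r.t. the poison features.
    x_poison = x_poison.clone().requires_grad_(True)
    w = torch.zeros(x_clean.shape[1], num_classes, requires_grad=True)
    for _ in range(inner_steps):                      # unrolled victim training
        x = torch.cat([x_clean, x_poison])
        y = torch.cat([y_clean, y_poison])
        loss = F.cross_entropy(x @ w, y)
        (g,) = torch.autograd.grad(loss, w, create_graph=True)
        w = w - lr * g                                # keep the training graph
    clean_loss = F.cross_entropy(x_clean @ w, y_clean)   # attacker's objective
    return torch.autograd.grad(clean_loss, x_poison)[0]

# The attacker ascends this gradient (and projects back into a feasible set),
# updating all poison points simultaneously rather than one by one.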
Deployment of a Robust and Explainable Mortality Prediction Model: The COVID-19 Pandemic and Beyond
This study investigated the performance, explainability, and robustness of
deployed artificial intelligence (AI) models in predicting mortality during the
COVID-19 pandemic and beyond. In the first study of its kind, we found that
Bayesian Neural Networks (BNNs) and intelligent training techniques allowed our
models to maintain performance amidst significant data shifts. Our results
emphasize the importance of developing robust AI models capable of matching or
surpassing clinician predictions, even under challenging conditions. Our
exploration of model explainability revealed that stochastic models generate
more diverse and personalized explanations, thereby highlighting the need for AI
models that provide detailed and individualized insights in real-world clinical
settings. Furthermore, we underscored the importance of quantifying uncertainty
in AI models which enables clinicians to make better-informed decisions based
on reliable predictions. Our study advocates for prioritizing implementation
science in AI research for healthcare and ensuring that AI solutions are
practical, beneficial, and sustainable in real-world clinical environments. By
addressing unique challenges and complexities in healthcare settings,
researchers can develop AI models that effectively improve clinical practice
and patient outcomes.
Efficient Uncertainty Quantification and Reduction for Over-Parameterized Neural Networks
Uncertainty quantification (UQ) is important for reliability assessment and
enhancement of machine learning models. In deep learning, uncertainties arise
not only from data, but also from the training procedure that often injects
substantial noise and bias. These hinder the attainment of statistical
guarantees and, moreover, impose computational challenges on UQ due to the need
for repeated network retraining. Building upon the recent neural tangent kernel
theory, we develop statistically guaranteed schemes to \emph{quantify} and
\emph{remove} the procedural uncertainty of over-parameterized neural
networks in a principled way with very low computational effort. In
particular, our approach, based on what we call a procedural-noise-correcting
(PNC) predictor, removes the procedural uncertainty by using only \emph{one}
auxiliary network that is trained on a suitably labeled data set, instead of
many retrained networks employed in deep ensembles. Moreover, by combining our
PNC predictor with suitable light-computation resampling methods, we build
several approaches to construct asymptotically exact-coverage confidence
intervals using as few as four trained networks without additional overhead.
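For contrast, a minimal sketch of the procedural (training-noise) uncertainty that the paper targets: retrain the same architecture under different random seeds and measure the spread of predictions. This is the costly deep-ensemble-style baseline the PNC predictor is designed to avoid, not the paper's method; the architecture and hyperparameters are placeholders.

import numpy as np
import torch
import torch.nn as nn

def train_once(x, y, seed, epochs=200, lr=1e-2):
    torch.manual_seed(seed)            # the seed controls init and training noise
    net = nn.Sequential(nn.Linear(x.shape[1], 64), nn.ReLU(), nn.Linear(64, 1))
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        nn.functional.mse_loss(net(x), y).backward()
        opt.step()
    return net

def procedural_uncertainty(x_train, y_train, x_test, n_runs=10):
    # Mean and standard deviation of predictions across independent retrainings.
    preds = []
    for seed in range(n_runs):
        net = train_once(x_train, y_train, seed)
        with torch.no_grad():
            preds.append(net(x_test).numpy())
    preds = np.stack(preds)            # shape: (n_runs, n_test, 1)
    return preds.mean(axis=0), preds.std(axis=0)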
Training Data Attribution for Diffusion Models
Diffusion models have become increasingly popular for synthesizing
high-quality samples based on training datasets. However, given the oftentimes
enormous sizes of the training datasets, it is difficult to assess how training
data impact the samples produced by a trained diffusion model. The difficulty
of relating diffusion model inputs and outputs poses significant challenges to
model explainability and training data attribution. Here we propose a novel
solution that reveals how training data influence the output of diffusion
models through the use of ensembles. In our approach, individual models in an
encoded ensemble are trained on carefully engineered splits of the overall
training data to permit the identification of influential training examples.
The resulting model ensembles enable efficient ablation of training data
influence, allowing us to assess the impact of training data on model outputs.
We demonstrate the viability of these ensembles as generative models and the
validity of our approach to assessing influence.
Comment: 14 pages, 6 figures
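A minimal sketch of ensemble-based attribution: each model is trained on a different subset of the data, and a training example's influence on a generated sample is estimated by comparing the sample's score under models that saw the example against models that did not. The paper's engineered ("encoded") splits and diffusion-specific scores are more structured than this random-subset illustration.

import numpy as np

def attribution_scores(membership, sample_scores):
    # membership:    (n_models, n_train) boolean; True if model i trained on example j
    # sample_scores: (n_models,) score of one generated sample under each model,
    #                e.g. a log-likelihood or reconstruction-based score
    # returns:       (n_train,) estimated influence of each training example
    membership = np.asarray(membership, dtype=bool)
    scores = np.asarray(sample_scores, dtype=float)
    influence = np.zeros(membership.shape[1])
    for j in range(membership.shape[1]):
        seen, unseen = scores[membership[:, j]], scores[~membership[:, j]]
        if len(seen) and len(unseen):
            influence[j] = seen.mean() - unseen.mean()
    return influence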
On Memorization in Probabilistic Deep Generative Models
Recent advances in deep generative models have led to impressive results in a
variety of application domains. Motivated by the possibility that deep learning
models might memorize part of the input data, there have been increased efforts
to understand how memorization arises. In this work, we extend a recently
proposed measure of memorization for supervised learning (Feldman, 2019) to the
unsupervised density estimation problem and adapt it to be more computationally
efficient. Next, we present a study that demonstrates how memorization can
occur in probabilistic deep generative models such as variational autoencoders.
This reveals that the form of memorization to which these models are
susceptible differs fundamentally from mode collapse and overfitting.
Furthermore, we show that the proposed memorization score measures a phenomenon
that is not captured by commonly-used nearest neighbor tests. Finally, we
discuss several strategies that can be used to limit memorization in practice.
Our work thus provides a framework for understanding problematic memorization
in probabilistic generative models.
Comment: Accepted for publication at NeurIPS 202
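A minimal sketch of the memorization score in the spirit of this density-estimation extension of Feldman (2019): the score is the gap between an example's log-density under models trained with it and under models trained without it; here the log-densities are assumed to come from some approximate likelihood such as a VAE ELBO.

import numpy as np

def memorization_score(logp_with, logp_without):
    # logp_with:    log-densities of example x_i under models whose training set
    #               contained x_i
    # logp_without: log-densities of x_i under models trained without x_i
    # A large positive gap means the density at x_i depends strongly on having
    # seen x_i during training, i.e. memorization.
    return float(np.mean(logp_with) - np.mean(logp_without))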