Everyone Can Attack: Repurpose Lossy Compression as a Natural Backdoor Attack
Vulnerability to backdoor attacks has recently threatened the
trustworthiness of machine learning models in practical applications.
Conventional wisdom suggests that not everyone can be an attacker since the
process of designing the trigger generation algorithm often involves
significant effort and extensive experimentation to ensure the attack's
stealthiness and effectiveness. Alternatively, this paper shows that there
exists a more severe backdoor threat: anyone can exploit an easily-accessible
algorithm for silent backdoor attacks. Specifically, such an attacker can employ
the widely-used lossy image compression from a plethora of compression tools to
effortlessly inject a trigger pattern into an image without leaving any
noticeable trace; i.e., the generated triggers are natural artifacts. One does
not require extensive knowledge to click on the "convert" or "save as" button
while using tools for lossy image compression. Via this attack, the adversary
does not need to design a trigger generator as seen in prior works and only
requires poisoning the data. Empirically, the proposed attack consistently
achieves a 100% attack success rate on several benchmark datasets such as MNIST,
CIFAR-10, GTSRB, and CelebA. More significantly, the proposed attack can still
achieve an almost 100% attack success rate with very small (approximately 10%)
poisoning rates in the clean-label setting. The generated trigger of the
proposed attack using one lossy compression algorithm is also transferable
across other related compression algorithms, exacerbating the severity of this
backdoor threat. This work takes another crucial step toward understanding the
extensive risks of backdoor attacks in practice, urging practitioners to
investigate similar attacks and relevant backdoor mitigation methods.

Comment: 14 pages. This paper shows everyone can mount a powerful and stealthy backdoor attack with the widely-used lossy image compression.
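The mechanics of the attack can be illustrated with a minimal, hypothetical sketch: the "trigger" is nothing more than the artifact left by lossy compression, so poisoning a sample amounts to compressing its image and relabeling it with the attacker's target class. The coarse pixel quantization below is a stand-in for a real codec such as JPEG, and the function names are illustrative, not taken from the paper.

```python
def lossy_compress(pixels, step=32):
    """Coarsely quantize pixel values: a stand-in for the artifacts
    a real lossy codec (e.g. JPEG's coefficient quantization) introduces."""
    return [(p // step) * step for p in pixels]

def poison(sample, target_label, step=32):
    """Poison one (pixels, label) sample: compress the image and
    relabel it with the attacker's target class."""
    pixels, _ = sample
    return lossy_compress(pixels, step), target_label

# Poisoning a training set is then just "saving" a fraction of the
# images through a lossy tool and flipping their labels.
```

In practice the adversary would simply re-save images with an off-the-shelf compression tool; no trigger generator needs to be trained.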
CoopHash: Cooperative Learning of Multipurpose Descriptor and Contrastive Pair Generator via Variational MCMC Teaching for Supervised Image Hashing
Leveraging supervised information can lead to superior retrieval performance
in the image hashing domain, but the performance degrades significantly without
enough labeled data. One effective solution to boost the performance is to
employ generative models, such as Generative Adversarial Networks (GANs), to
generate synthetic data in an image hashing model. However, GAN-based methods
are difficult to train and suffer from the mode-collapse issue, which prevents
the hashing approaches from jointly training the generative models and the hash
functions. This limitation results in sub-optimal retrieval performance. To
overcome this limitation, we propose a novel framework, the generative
cooperative hashing network (CoopHash), which is based on energy-based
cooperative learning. CoopHash jointly learns a powerful generative
representation of the data and a robust hash function. CoopHash has two
components: a top-down contrastive pair generator that synthesizes contrastive
images and a bottom-up multipurpose descriptor that simultaneously represents
the images from multiple perspectives, including probability density, hash
code, latent code, and category. The two components are jointly learned via a
novel likelihood-based cooperative learning scheme. We conduct experiments on
several real-world datasets and show that the proposed method outperforms
competing supervised hashing methods, achieving up to a 10% relative improvement
over the current state-of-the-art supervised hashing methods, and exhibits
significantly better performance in out-of-distribution retrieval.
A Cosine Similarity-based Method for Out-of-Distribution Detection
The ability to detect out-of-distribution (OOD) data is a crucial aspect of
practical machine learning applications. In this work, we show that the cosine
similarity between the test feature and the typical in-distribution (ID) feature
is a good indicator of OOD data. We
propose Class Typical Matching (CTM), a post hoc OOD detection algorithm that
uses a cosine similarity scoring function. Extensive experiments on multiple
benchmarks show that CTM outperforms existing post hoc OOD detection methods.

Comment: Accepted paper at ICML 2023 Workshop on Spurious Correlations, Invariance, and Stability. 10 pages (4 main + appendix).
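The scoring idea can be sketched in a toy form, assuming the "typical ID features" are per-class mean features; the names and threshold below are illustrative, not the paper's code.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def ctm_score(feature, class_means):
    """Score a test feature by its best cosine match to any
    class-typical (mean) ID feature."""
    return max(cosine(feature, m) for m in class_means)

def is_ood(feature, class_means, threshold=0.5):
    """Flag the input as OOD when no class-typical feature matches well."""
    return ctm_score(feature, class_means) < threshold
```

Because scoring only needs the class-mean features of the training set, the method is post hoc: no retraining of the classifier is required.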
Tactical and Strategic Communication Network Simulation and Performance Analysis
We describe a framework for the efficient modeling and performance evaluation of large networks consisting of a mixture of strategic and tactical components. The method emphasizes hierarchical, layered techniques that are fed parametric models at the lower level. In addition to the algorithmic structure and some initial algorithms, we describe an object-oriented software architecture under development to support these algorithmic methods in a distributed environment.
Antibiotic use and prescription and its effects on Enterobacteriaceae in the gut in children with mild respiratory infections in Ho Chi Minh City, Vietnam. A prospective observational outpatient study.
BACKGROUND AND OBJECTIVES: Treatment guidelines do not recommend antibiotic use for acute respiratory infections (ARI), except for streptococcal pharyngitis/tonsillitis and pneumonia. However, antibiotics are prescribed frequently for children with ARI, often in the absence of evidence for bacterial infection. The objectives of this study were 1) to assess the appropriateness of antibiotic prescriptions for mild ARI in paediatric outpatients in relation to available guidelines and detected pathogens, 2) to assess antibiotic use on presentation using questionnaires and detection in urine, and 3) to assess the carriage rates and proportions of resistant intestinal Enterobacteriaceae before, during and after consultation. MATERIALS AND METHODS: Patients were prospectively enrolled at Children's Hospital 1, Ho Chi Minh City, Vietnam, and diagnoses, prescribed therapy and outcome were recorded on the first visit and on follow-up after 7 days. Respiratory bacterial and viral pathogens were detected using molecular assays. Antibiotic use before presentation was assessed using questionnaires and urine HPLC. The impact of antibiotic usage on intestinal Enterobacteriaceae was assessed with semi-quantitative culture on agar with and without antibiotics on presentation and after 7 and 28 days. RESULTS: A total of 563 patients were enrolled between February 2009 and February 2010. Antibiotics were prescribed for all except 2 of 563 patients. The majority were 2nd and 3rd generation oral cephalosporins and amoxicillin with or without clavulanic acid. Respiratory viruses were detected in respiratory specimens of 72.5% of patients. Antibiotic use was considered inappropriate in 90.1% and 67.5%, based on guidelines and detected pathogens, respectively. On presentation, parents reported antibiotic use for 22% of patients, 41% of parents did not know and 37% denied antibiotic use.
Among these three groups, six commonly used antibiotics were detected with HPLC in patients' urine in 49%, 40%, and 14%, respectively. Temporary selection of 3rd-generation cephalosporin-resistant intestinal Enterobacteriaceae during antibiotic use was observed, with co-selection of resistance to aminoglycosides and fluoroquinolones. CONCLUSIONS: We report overuse and overprescription of antibiotics for uncomplicated ARI with selection of resistant intestinal Enterobacteriaceae, posing a risk for community transmission and persistence in a setting of a highly granular healthcare system and unrestricted access to antibiotics through private pharmacies. REGISTRATION: This study was registered at the International Standard Randomised Controlled Trials Number registry under number ISRCTN32862422: http://www.isrctn.com/ISRCTN32862422
One Loss for Quantization: Deep Hashing with Discrete Wasserstein Distributional Matching
Image hashing is a principled approximate nearest neighbor approach to find
similar items to a query in a large collection of images. Hashing aims to learn
a binary-output function that maps an image to a binary vector. For optimal
retrieval performance, it is important to produce balanced hash codes with low
quantization error, bridging the gap between the learning stage's continuous
relaxation and the inference stage's discrete quantization. However, in
existing deep supervised hashing methods, code balance and low quantization
error are difficult to achieve and involve several losses. We argue that this
is because the existing quantization approaches in these methods are
heuristically constructed and not effective to achieve these objectives. This
paper considers an alternative approach to learning the quantization
constraints. The task of learning balanced codes with low quantization error is
re-formulated as matching the learned distribution of the continuous codes to a
pre-defined discrete, uniform distribution. This is equivalent to minimizing
the distance between two distributions. We then propose a computationally
efficient distributional distance by leveraging the discrete property of the
hash functions. This distributional distance is a valid distance and enjoys
lower time and sample complexities. The proposed single-loss quantization
objective can be integrated into any existing supervised hashing method to
improve code balance and reduce quantization error. Experiments confirm that the
proposed approach substantially improves the performance of several
representative hashing methods.

Comment: CVPR 202
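The distribution-matching idea can be sketched in a toy form. Assuming matching is done per bit against a balanced sample from the discrete uniform distribution over {-1, +1}, the 1-D empirical Wasserstein distance reduces to comparing sorted samples. Everything below is an illustrative simplification, not the paper's actual loss.

```python
def wasserstein_1d(xs, ys):
    """Empirical 1-Wasserstein distance between two equal-size 1-D samples:
    for sorted samples it is the mean absolute difference of matched pairs."""
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def quantization_loss(codes):
    """Match each bit of the continuous codes (values in [-1, 1]) to a
    balanced sample from the discrete uniform distribution over {-1, +1}."""
    n, n_bits = len(codes), len(codes[0])
    target = [-1.0] * (n // 2) + [1.0] * (n - n // 2)  # balanced discrete sample
    return sum(wasserstein_1d([c[j] for c in codes], target)
               for j in range(n_bits)) / n_bits
```

The loss is zero exactly when each bit's values already sit on {-1, +1} in balanced proportions, which captures both code balance and low quantization error in a single term.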
Defending Backdoor Attacks on Vision Transformer via Patch Processing
Vision Transformers (ViTs) have a radically different architecture with significantly less inductive bias than Convolutional Neural Networks. Along with the improvement in performance, security and robustness of ViTs are also of great importance to study. In contrast to many recent works that exploit the robustness of ViTs against adversarial examples, this paper investigates a representative causative attack, i.e., backdoor. We first examine the vulnerability of ViTs against various backdoor attacks and find that ViTs are also quite vulnerable to existing attacks. However, we observe that the clean-data accuracy and backdoor attack success rate of ViTs respond distinctively to patch transformations before the positional encoding. Then, based on this finding, we propose an effective method for ViTs to defend both patch-based and blending-based trigger backdoor attacks via patch processing.
The performance is evaluated on several benchmark datasets, including CIFAR10, GTSRB, and TinyImageNet; the results show that the proposed defense is very successful in mitigating backdoor attacks for ViTs. To the best of our knowledge, this paper presents the first defensive strategy that utilizes a unique characteristic of ViTs against backdoor attacks.
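The patch-processing intuition can be sketched in a toy form: a clean prediction should be relatively stable under patch transformations, while a trigger that depends on patch placement is disrupted by them. This is an illustration of the idea only, not the paper's algorithm; the rotation transform, threshold, and predict interface are all hypothetical.

```python
def rotate_patches(patches, k):
    """A simple patch transformation: cyclically shift the patch sequence."""
    k %= len(patches)
    return patches[k:] + patches[:k]

def flags_backdoor(predict, patches, n_trials=8, agree_threshold=0.5):
    """Flag a possible trigger when the prediction disagrees with the
    original one under many patch transformations."""
    base = predict(patches)
    agree = sum(predict(rotate_patches(patches, k)) == base
                for k in range(1, n_trials + 1)) / n_trials
    return agree < agree_threshold
```

A real defense would apply such transformations before the positional encoding, where the paper observes that clean accuracy and attack success rate respond distinctively.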
Synthesizing Physical Backdoor Datasets: An Automated Framework Leveraging Deep Generative Models
Backdoor attacks, representing an emerging threat to the integrity of deep
neural networks, have garnered significant attention due to their ability to
compromise deep learning systems clandestinely. While numerous backdoor attacks
occur within the digital realm, their practical implementation in real-world
prediction systems remains limited and vulnerable to disturbances in the
physical world. Consequently, this limitation has given rise to the development
of physical backdoor attacks, where trigger objects manifest as physical
entities within the real world. However, creating the requisite dataset to
train or evaluate a physical backdoor model is a daunting task, preventing
backdoor researchers and practitioners from studying such physical attack
scenarios. This paper presents a recipe that empowers backdoor researchers to
effortlessly create a malicious, physical backdoor dataset based on advances in
generative modeling. In particular, this recipe involves three automatic modules:
suggesting the suitable physical triggers, generating the poisoned candidate
samples (either by synthesizing new samples or editing existing clean samples),
and finally refining for the most plausible ones. As such, it effectively
mitigates the perceived complexity associated with creating a physical backdoor
dataset, transforming it from a daunting task into an attainable objective.
Extensive experimental results show that datasets created by our "recipe" enable
adversaries to achieve an impressive attack success rate on real physical-world
data and exhibit properties similar to those in previous physical backdoor
attack studies. This paper offers researchers a valuable toolkit for studying
physical backdoors, all within the confines of their laboratories.
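The three-module recipe can be expressed as a pipeline skeleton. The module implementations are deliberately left as caller-supplied functions, since the real system relies on deep generative models; only the orchestration is sketched here, and all names are hypothetical.

```python
def build_physical_backdoor_dataset(clean_samples, suggest_trigger,
                                    synthesize, plausibility, keep_ratio=0.5):
    """Orchestrate the recipe's three automatic modules (sketch)."""
    # Module 1: suggest a suitable physical trigger for this dataset.
    trigger = suggest_trigger(clean_samples)
    # Module 2: generate poisoned candidates (synthesize new samples
    # or edit existing clean ones).
    candidates = [synthesize(sample, trigger) for sample in clean_samples]
    # Module 3: refine, keeping only the most plausible candidates.
    candidates.sort(key=plausibility, reverse=True)
    return candidates[: max(1, int(len(candidates) * keep_ratio))]
```

Swapping in generative-model-backed implementations of the three callables yields the automated workflow the paper describes, while the orchestration itself stays trivial.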