Adaptive Test-Time Personalization for Federated Learning
Personalized federated learning algorithms have shown promising results in
adapting models to various distribution shifts. However, most of these methods
require labeled data on testing clients for personalization, which is usually
unavailable in real-world scenarios. In this paper, we introduce a novel
setting called test-time personalized federated learning (TTPFL), where clients
locally adapt a global model in an unsupervised way without relying on any
labeled data at test time. While traditional test-time adaptation (TTA) methods
can be used in this scenario, most of them inherently assume that training data
come from a single domain, whereas in FL they come from multiple clients
(source domains) with different distributions. Overlooking these domain interrelationships can
result in suboptimal generalization. Moreover, most TTA algorithms are designed
for a specific kind of distribution shift and lack the flexibility to handle
multiple kinds of distribution shifts in FL. In this paper, we find that this
lack of flexibility partially results from their pre-defining which modules to
adapt in the model. To tackle this challenge, we propose a novel algorithm
called ATP that adaptively learns the adaptation rate for each module in the
model from the distribution shifts among source domains. Theoretical analysis
proves the strong generalization of ATP. Extensive experiments demonstrate its
superiority in handling various distribution shifts including label shift,
image corruptions, and domain shift, outperforming existing TTA methods across
multiple datasets and model architectures. Our code is available at
https://github.com/baowenxuan/ATP .
Comment: Accepted by NeurIPS 202
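The abstract describes learning a per-module adaptation rate from shifts among source domains and then adapting the global model without labels at test time. The sketch below is a minimal illustration of the test-time half of that idea only, not the authors' ATP implementation: it assumes the rates have already been learned and are supplied as a plain dict from module name to a non-negative float, and it uses entropy minimization as the unsupervised objective.

```python
import torch
import torch.nn.functional as F

def entropy(logits):
    """Mean prediction entropy: a common unsupervised test-time objective."""
    p = F.softmax(logits, dim=1)
    return -(p * torch.log(p + 1e-8)).sum(dim=1).mean()

def adapt_with_module_rates(model, rates, x, steps=1):
    """One unsupervised adaptation pass in which each module's parameters are
    updated with their own (assumed pre-learned) rate.
    rates: dict mapping module name -> non-negative float; rate 0 freezes it."""
    params = list(model.parameters())
    names = [n for n, _ in model.named_parameters()]
    for _ in range(steps):
        grads = torch.autograd.grad(entropy(model(x)), params, allow_unused=True)
        with torch.no_grad():
            for name, p, g in zip(names, params, grads):
                if g is None:
                    continue
                module_name = name.rsplit(".", 1)[0]  # e.g. "layer1.0.conv1"
                p -= rates.get(module_name, 0.0) * g
    return model
```

In the actual method the rates would themselves be optimized on the source clients; here they are simply given, which keeps the sketch self-contained.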
MMNet: Multi-Mask Network for Referring Image Segmentation
Referring image segmentation aims to segment an object referred to by natural
language expression from an image. However, this task is challenging due to the
distinct data properties between text and image, and the randomness introduced
by diverse objects and unrestricted language expressions. Most previous work
focuses on improving cross-modal feature fusion while not fully addressing the
inherent uncertainty caused by diverse objects and unrestricted language. To
tackle these problems, we propose an end-to-end Multi-Mask Network for
referring image segmentation (MMNet). We first fuse the image and the language
expression, and
then employ an attention mechanism to generate multiple queries that represent
different aspects of the language expression. We then utilize these queries to
produce a series of corresponding segmentation masks, assigning a score to each
mask that reflects its importance. The final result is obtained through the
weighted sum of all masks, which greatly reduces the randomness of the language
expression. Our proposed framework demonstrates superior performance compared
to state-of-the-art approaches on the three most commonly used datasets, RefCOCO,
RefCOCO+, and G-Ref, without the need for any post-processing. This further
validates the efficacy of our proposed framework.
Comment: 10 pages, 5 figure
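As a rough illustration of the multi-mask idea described above (multiple language-conditioned queries, one mask per query, and a score-weighted sum), here is a hedged PyTorch sketch; the layer choices, dimensions, and attention layout are assumptions for illustration, not MMNet's actual architecture.

```python
import torch
import torch.nn as nn

class MultiMaskHead(nn.Module):
    """Illustrative multi-mask head: N queries attend to fused vision-language
    features, each yields a mask and a score; output is the score-weighted sum."""
    def __init__(self, num_queries=8, dim=256):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_queries, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)
        self.score = nn.Linear(dim, 1)

    def forward(self, fused):                              # fused: (B, HW, dim)
        q = self.queries.unsqueeze(0).expand(fused.size(0), -1, -1)
        q, _ = self.attn(q, fused, fused)                  # (B, N, dim) conditioned queries
        masks = torch.einsum("bnd,bld->bnl", q, fused)     # (B, N, HW): one mask per query
        weights = torch.softmax(self.score(q), dim=1)      # (B, N, 1): per-mask importance
        return (weights * masks.sigmoid()).sum(dim=1)      # (B, HW): weighted fusion
```

The weighted sum is what reduces sensitivity to any single interpretation of the expression, which is the point the abstract makes about randomness.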
Boosting Adversarial Transferability by Block Shuffle and Rotation
Adversarial examples mislead deep neural networks with imperceptible
perturbations and have brought significant threats to deep learning. An
important aspect is their transferability, which refers to their ability to
deceive other models, thus enabling attacks in the black-box setting. Though
various methods have been proposed to boost transferability, the performance
still falls short compared with white-box attacks. In this work, we observe
that existing input transformation based attacks, one of the mainstream
transfer-based attacks, result in different attention heatmaps on various
models, which might limit the transferability. We also find that breaking the
intrinsic relation of the image can disrupt the attention heatmap of the
original image. Based on this finding, we propose a novel input transformation
based attack called block shuffle and rotation (BSR). Specifically, BSR splits
the input image into several blocks, then randomly shuffles and rotates these
blocks to construct a set of new images for gradient calculation. Empirical
evaluations on the ImageNet dataset demonstrate that BSR could achieve
significantly better transferability than the existing input transformation
based methods under single-model and ensemble-model settings. Combining BSR
with current input transformation methods can further improve transferability,
significantly outperforming the state-of-the-art methods.
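A minimal sketch of the block shuffle and rotation transform described above, assuming image tensors of shape (B, C, H, W) with H and W divisible by the block count; the grid size, angle range, and number of copies are illustrative defaults, not the paper's settings.

```python
import torch
import torchvision.transforms.functional as TF

def block_shuffle_rotate(x, n_blocks=2, n_copies=20, max_angle=24.0):
    """Split each image into an n x n grid of blocks, randomly shuffle the
    blocks and rotate each by a small random angle, and return n_copies
    transformed images to average gradients over."""
    copies = []
    for _ in range(n_copies):
        rows = torch.chunk(x, n_blocks, dim=2)                      # split height
        blocks = [b for r in rows for b in torch.chunk(r, n_blocks, dim=3)]
        perm = torch.randperm(len(blocks))                          # shuffle block order
        blocks = [TF.rotate(blocks[i],
                            float(torch.empty(1).uniform_(-max_angle, max_angle)))
                  for i in perm]                                    # rotate each block
        rows = [torch.cat(blocks[i * n_blocks:(i + 1) * n_blocks], dim=3)
                for i in range(n_blocks)]                           # reassemble the grid
        copies.append(torch.cat(rows, dim=2))
    return torch.stack(copies)                                      # (n_copies, B, C, H, W)
```

Gradients for the attack would then be computed on each transformed copy and averaged, following the usual input-transformation recipe.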
Privacy-Preserving Graph Machine Learning from Data to Computation: A Survey
In graph machine learning, data collection, sharing, and analysis often
involve multiple parties, each of which may require varying levels of data
security and privacy. To this end, preserving privacy is of great importance in
protecting sensitive information. In the era of big data, the relationships
among data entities have become unprecedentedly complex, and more applications
utilize advanced data structures (i.e., graphs) that can support network
structures and relevant attribute information. To date, many graph-based AI
models have been proposed (e.g., graph neural networks) for various domain
tasks, like computer vision and natural language processing. In this paper, we
focus on reviewing privacy-preserving techniques of graph machine learning. We
systematically review related works from the data to the computational aspects.
We first review methods for generating privacy-preserving graph data. Then we
describe methods for transmitting privacy-preserved information (e.g., graph
model parameters) to realize the optimization-based computation when data
sharing among multiple parties is risky or impossible. In addition to
discussing relevant theoretical methodology and software tools, we also discuss
current challenges and highlight several possible future research opportunities
for privacy-preserving graph machine learning. Finally, we envision a unified
and comprehensive secure graph machine learning system.
Comment: Accepted by SIGKDD Explorations 2023, Volume 25, Issue
EAVL: Explicitly Align Vision and Language for Referring Image Segmentation
Referring image segmentation aims to segment an object mentioned in natural
language from an image. A main challenge is language-related localization,
which means locating the object with the relevant language. Previous approaches
mainly focus on the fusion of vision and language features without fully
addressing language-related localization. In previous approaches, fused
vision-language features are directly fed into a decoder and pass through a
convolution with a fixed kernel to obtain the result, which follows a similar
pattern as traditional image segmentation. This approach does not explicitly
align language and vision features in the segmentation stage, resulting in
suboptimal language-related localization. Different from previous methods, we
propose Explicitly Align the Vision and Language for Referring Image
Segmentation (EAVL). Instead of using a fixed convolution kernel, we propose an
Aligner which explicitly aligns the vision and language features in the
segmentation stage. Specifically, a series of unfixed convolution kernels are
generated based on the input language expression, and then used to explicitly
align the vision and language features. To achieve this, we generate multiple
queries that
represent different emphases of the language expression. These queries are
transformed into a series of query-based convolution kernels. Then, we utilize
these kernels to do convolutions in the segmentation stage and obtain a series
of segmentation masks. The final result is obtained through the aggregation of
all masks. Our method not only fuses vision and language features effectively
but also exploits their potential in the segmentation stage. Most importantly,
we explicitly align language features of different emphases
with the image features to achieve language-related localization. Our method
surpasses previous state-of-the-art methods on RefCOCO, RefCOCO+, and G-Ref by
large margins.
Comment: 10 pages, 4 figures. arXiv admin note: text overlap with arXiv:2305.1496
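To make the query-based dynamic-kernel idea concrete, the following hedged sketch projects each language query to a 1x1 convolution kernel, applies it to the vision features, and averages the resulting masks; the projection layer, the averaging aggregation, and the tensor shapes are assumptions for illustration rather than EAVL's exact design.

```python
import torch
import torch.nn as nn

class QueryKernelSegHead(nn.Module):
    """Illustrative query-based dynamic-kernel head: each language query becomes
    a 1x1 conv kernel over the vision features, giving one mask per query."""
    def __init__(self, dim=256):
        super().__init__()
        self.to_kernel = nn.Linear(dim, dim)   # query -> 1x1 conv kernel weights

    def forward(self, vis, queries):           # vis: (B, dim, H, W), queries: (B, N, dim)
        kernels = self.to_kernel(queries)      # (B, N, dim)
        masks = torch.einsum("bnd,bdhw->bnhw", kernels, vis)  # dynamic 1x1 convolutions
        return masks.sigmoid().mean(dim=1)     # (B, H, W) aggregated segmentation mask
```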
In Vitro and In Silico Characterization of the Aggregation of Thrombi on Ventricular Assist Device Cannula
The unacceptably high stroke rate of the HeartMate III VAD, in the absence of
signs of adherent pump thrombosis, is hypothesized to result from thrombi that
originate on the inflow cannula and are ingested and ejected as emboli by the
VAD. Therefore, inflow cannula thrombosis has become an emerging focus. The
inflow cannulae of contemporary VADs, which incorporate both polished and rough
regions, serve as useful benchmarks to study the effects of roughness and shear
on
thrombogenesis. An in vitro study was conducted to emulate the
micro-hemodynamic condition on a sintered inflow cannula, and to observe the
deposition and detachment patterns. Together with a computational fluid
dynamics tool, this study aimed to provide insight into the optimization of the
inflow cannula and into potentially reducing adverse neurological events due to
upstream thrombus.
GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher
Safety lies at the core of the development of Large Language Models (LLMs).
There is ample work on aligning LLMs with human ethics and preferences,
including data filtering in pretraining, supervised fine-tuning, reinforcement
learning from human feedback, and red teaming, etc. In this study, we discover
that chat in cipher can bypass the safety alignment techniques of LLMs, which
are mainly conducted in natural languages. We propose a novel framework
CipherChat to systematically examine the generalizability of safety alignment
to non-natural languages -- ciphers. CipherChat enables humans to chat with
LLMs through cipher prompts topped with system role descriptions and few-shot
enciphered demonstrations. We use CipherChat to assess state-of-the-art LLMs,
including ChatGPT and GPT-4, for different representative human ciphers across
11 safety domains in both English and Chinese. Experimental results show that
certain ciphers succeed in bypassing the safety alignment of GPT-4 almost 100%
of the time in several safety domains, demonstrating the necessity of developing
safety alignment for non-natural languages. Notably, we identify that LLMs seem
to have a "secret cipher", and propose a novel SelfCipher that uses only role
play and several demonstrations in natural language to evoke this capability.
SelfCipher surprisingly outperforms existing human ciphers in almost all cases.
Our code and data will be released at https://github.com/RobustNLP/CipherChat.
Comment: 13 pages, 4 figures, 9 table
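The prompt format described above (a system role description plus few-shot enciphered demonstrations and an enciphered query) can be illustrated with a simple Caesar cipher. This is only a sketch of the prompt construction with a benign example; the wording of the system message is an assumption, not CipherChat's actual template.

```python
def caesar(text, shift=3):
    """Encipher letters with a Caesar shift; other characters pass through."""
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord('a') if ch.islower() else ord('A')
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)

def build_cipher_prompt(query, demos, shift=3):
    """Assemble a cipher-style chat prompt: system role description,
    few-shot enciphered demonstrations, and the enciphered user query."""
    system = ("You are an expert on the Caesar cipher. We will communicate only "
              "in Caesar cipher (shift 3). Reply in Caesar cipher as well.")
    shots = "\n".join(f"Example: {caesar(d, shift)}" for d in demos)
    return f"{system}\n{shots}\nUser: {caesar(query, shift)}"

print(build_cipher_prompt("How are you today?", ["Nice to meet you."]))
```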
Validating Multimedia Content Moderation Software via Semantic Fusion
The exponential growth of social media platforms, such as Facebook and
TikTok, has revolutionized communication and content publication in human
society. Users on these platforms can publish multimedia content that delivers
information via the combination of text, audio, images, and video. Meanwhile,
the multimedia content release facility has been increasingly exploited to
propagate toxic content, such as hate speech, malicious advertisements, and
pornography. To this end, content moderation software has been widely deployed
on these platforms to detect and block toxic content. However, due to the
complexity of content moderation models and the difficulty of understanding
information across multiple modalities, existing content moderation software
can fail to detect toxic content, which often leads to extremely negative
impacts.
We introduce Semantic Fusion, a general, effective methodology for validating
multimedia content moderation software. Our key idea is to fuse two or more
existing single-modal inputs (e.g., a textual sentence and an image) into a new
input that combines the semantics of its ancestors in a novel manner and has
toxic nature by construction. This fused input is then used for validating
multimedia content moderation software. We realized Semantic Fusion as DUO, a
practical content moderation software testing tool. In our evaluation, we
employ DUO to test five commercial content moderation software products and two
state-of-the-art models against three kinds of toxic content. The results show
that DUO achieves up to 100% error finding rate (EFR) when testing moderation
software. In addition, we leverage the test cases generated by DUO to retrain
the two models we explored, which largely improves model robustness while
maintaining the accuracy on the original test set.
Comment: Accepted by ISSTA 202
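As one concrete (and deliberately simplified) instance of fusing two single-modal inputs into a multimodal test case, the sketch below overlays a sentence onto an image with Pillow. DUO's actual fusion operators are not specified in the abstract, so treat the rendering choices here as assumptions.

```python
from PIL import Image, ImageDraw

def fuse_text_into_image(image_path, sentence, out_path="fused.png"):
    """Overlay a textual sentence onto an image so the fused input carries the
    semantics of both single-modal ancestors (an illustrative fusion operator)."""
    img = Image.open(image_path).convert("RGB")
    draw = ImageDraw.Draw(img)
    w, h = img.size
    # place the sentence near the bottom of the image on a contrasting band
    draw.rectangle([0, int(h * 0.85), w, h], fill=(0, 0, 0))
    draw.text((10, int(h * 0.88)), sentence, fill=(255, 255, 255))
    img.save(out_path)
    return out_path
```

The fused file could then be submitted to the moderation software under test, and a miss on a known-toxic fusion would count toward the error finding rate.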
Kick Bad Guys Out! Zero-Knowledge-Proof-Based Anomaly Detection in Federated Learning
Federated learning (FL) systems are vulnerable to malicious clients that
submit poisoned local models to achieve their adversarial goals, such as
preventing the convergence of the global model or inducing the global model to
misclassify some data. Many existing defense mechanisms are impractical in
real-world FL systems, as they require prior knowledge of the number of
malicious clients or rely on re-weighting or modifying submissions. This is
because adversaries typically do not announce their intentions before
attacking, and re-weighting might change aggregation results even in the
absence of attacks. To address these challenges in real FL systems, this paper
introduces a cutting-edge anomaly detection approach with the following
features: i) Detecting the occurrence of attacks and performing defense
operations only when attacks happen; ii) Upon the occurrence of an attack,
further detecting the malicious client models and eliminating them without
harming the benign ones; iii) Ensuring honest execution of defense mechanisms
at the server by leveraging a zero-knowledge proof mechanism. We validate the
superior performance of the proposed approach with extensive experiments.
Kick Bad Guys Out!: Conditionally Activated Anomaly Detection in Federated Learning with Zero-Knowledge Proof Verification
Federated Learning (FL) systems are susceptible to adversarial attacks, where malicious clients submit poisoned models to disrupt the convergence or plant backdoors that cause the global model to misclassify some samples. Current defense methods are often impractical for real-world FL systems, as they either rely on unrealistic prior knowledge or cause accuracy loss even in the absence of attacks. Furthermore, these methods lack a protocol for verifying execution, leaving participants uncertain about the correct execution of the mechanism. To address these challenges, we propose a novel anomaly detection strategy that is designed for real-world FL systems. Our approach activates the defense only when potential attacks are detected, and enables the removal of malicious models without affecting the benign ones. Additionally, we incorporate zero-knowledge proofs to ensure the integrity of the proposed defense mechanism. Experimental results demonstrate the effectiveness of our approach in enhancing FL system security against a comprehensive set of adversarial attacks in various ML tasks.
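Both entries above describe the same two-stage idea: first decide whether an attack is present, and only then identify and drop the malicious client models while leaving benign ones untouched (the zero-knowledge proof of honest server execution is omitted here). The NumPy sketch below is a generic illustration of such conditional filtering, not the paper's detector; the distance statistic and threshold are assumptions.

```python
import numpy as np

def filter_client_updates(updates, z_thresh=2.5):
    """Stage 1: flag an attack only if some client updates are strong outliers.
    Stage 2: if flagged, aggregate without the outliers; otherwise aggregate all.
    updates: list of per-client model-update arrays of identical shape."""
    flat = np.stack([u.ravel() for u in updates])          # (n_clients, d)
    ref = np.median(flat, axis=0)                          # robust reference update
    dists = np.linalg.norm(flat - ref, axis=1)
    z = (dists - dists.mean()) / (dists.std() + 1e-8)
    suspicious = z > z_thresh                               # stage 1: any attack?
    if not suspicious.any():
        return flat.mean(axis=0), []                        # no attack: plain averaging
    keep = ~suspicious                                      # stage 2: drop outliers only
    return flat[keep].mean(axis=0), list(np.where(suspicious)[0])
```

Activating the filter only when outliers appear is what avoids perturbing aggregation results in attack-free rounds, the failure mode the abstracts attribute to re-weighting defenses.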