DABS: Data-Agnostic Backdoor attack at the Server in Federated Learning
Federated learning (FL) attempts to train a global model by aggregating local
models from distributed devices under the coordination of a central server.
However, the existence of a large number of heterogeneous devices makes FL
vulnerable to various attacks, especially the stealthy backdoor attack.
A backdoor attack aims to trick a neural network into misclassifying data as a
target label by injecting specific triggers, while keeping predictions correct
on the original training data. Existing works focus on client-side attacks that try
to poison the global model by modifying the local datasets. In this work, we
propose a new attack model for FL, namely Data-Agnostic Backdoor attack at the
Server (DABS), where the server directly modifies the global model to backdoor
an FL system. Extensive simulation results show that this attack scheme
achieves a higher attack success rate compared with baseline methods while
maintaining normal accuracy on the clean data.
Comment: Accepted by the Backdoor Attacks and Defenses in Machine Learning (BANDS) Workshop at ICLR 202
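As a rough illustration of the attack goal described above (not the authors' DABS implementation), the sketch below shows how a backdoor is typically evaluated: a fixed trigger patch is stamped onto inputs, and the attack success rate is the fraction of triggered inputs that a (hypothetically backdoored) model assigns to the attacker's target label. All names and shapes here are illustrative assumptions.

```python
import numpy as np

def apply_trigger(x, value=1.0, size=3):
    """Stamp a small square trigger patch into one corner of each image."""
    x = x.copy()
    x[:, :size, :size] = value
    return x

def attack_success_rate(model, x_clean, target_label):
    """Fraction of triggered inputs that the model assigns the target label."""
    preds = model(apply_trigger(x_clean))
    return float(np.mean(preds == target_label))

# Hypothetical stand-in for a model whose weights a malicious server has
# modified: it outputs the target label whenever the trigger is present.
def backdoored_model(x, target=7):
    has_trigger = np.all(x[:, :3, :3] == 1.0, axis=(1, 2))
    return np.where(has_trigger, target, 0)

x = np.zeros((5, 8, 8))  # five dummy 8x8 "images"
print(attack_success_rate(backdoored_model, x, target_label=7))  # 1.0
```

Note that the same model still predicts the benign label on clean inputs, matching the stealth requirement stated in the abstract.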
The Devil is in the Data: Learning Fair Graph Neural Networks via Partial Knowledge Distillation
Graph neural networks (GNNs) are being increasingly used in many high-stakes
tasks, and as a result, their fairness has recently attracted growing attention.
GNNs have been shown to be unfair as they tend to make discriminatory decisions
toward certain demographic groups, divided by sensitive attributes such as
gender and race. While recent works have been devoted to improving their
fairness performance, they often require accessible demographic information.
This greatly limits their applicability in real-world scenarios due to legal
restrictions. To address this problem, we present a demographic-agnostic method
to learn fair GNNs via knowledge distillation, namely FairGKD. Our work is
motivated by the empirical observation that training GNNs on partial data
(i.e., only node attributes or topology data) can improve their fairness,
albeit at the cost of utility. To make a balanced trade-off between fairness
and utility performance, we employ a set of fairness experts (i.e., GNNs
trained on different partial data) to construct the synthetic teacher, which
distills fairer and more informative knowledge to guide the learning of the GNN
student. Experiments on several benchmark datasets demonstrate that FairGKD,
which does not require access to demographic information, improves the fairness
of GNNs by a large margin while maintaining their utility.
Comment: Accepted by WSDM 202
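The distillation idea described above can be sketched as a generic knowledge-distillation loss; this is not the authors' FairGKD code. For illustration, the synthetic teacher is assumed to be the average of the fairness experts' predictive distributions, and the student objective mixes task cross-entropy with a KL term toward that teacher.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def distill_loss(student_logits, expert_logits_list, labels, alpha=0.5):
    """Mix task cross-entropy with a KL term pulling the student toward a
    synthetic teacher -- assumed here to be the average of the fairness
    experts' predictive distributions."""
    teacher = np.mean([softmax(z) for z in expert_logits_list], axis=0)
    student = softmax(student_logits)
    n = len(labels)
    ce = -np.mean(np.log(student[np.arange(n), labels] + 1e-12))
    kl = np.mean(np.sum(teacher * (np.log(teacher + 1e-12)
                                   - np.log(student + 1e-12)), axis=1))
    return (1 - alpha) * ce + alpha * kl
```

The weight `alpha` is a hypothetical knob for the fairness-utility trade-off the abstract mentions: more weight on the teacher term pushes the student toward the fairer experts.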
Stochastic Coded Federated Learning: Theoretical Analysis and Incentive Mechanism Design
Federated learning (FL) has achieved great success as a privacy-preserving
distributed training paradigm, where many edge devices collaboratively train a
machine learning model by sharing the model updates instead of the raw data
with a server. However, the heterogeneous computational and communication
resources of edge devices give rise to stragglers that significantly decelerate
the training process. To mitigate this issue, we propose a novel FL framework
named stochastic coded federated learning (SCFL) that leverages coded computing
techniques. In SCFL, before the training process starts, each edge device
uploads a privacy-preserving coded dataset to the server, which is generated by
adding Gaussian noise to the projected local dataset. During training, the
server computes gradients on the global coded dataset to compensate for the
missing model updates of the straggling devices. We design a gradient
aggregation scheme to ensure that the aggregated model update is an unbiased
estimate of the desired global update. Moreover, this aggregation scheme
enables periodic model averaging to improve training efficiency. We
characterize the tradeoff between the convergence performance and privacy
guarantee of SCFL. In particular, a noisier coded dataset provides stronger
privacy protection for edge devices but results in learning performance
degradation. We further develop a contract-based incentive mechanism to
coordinate such a conflict. The simulation results show that SCFL learns a
better model within the given time and achieves a better privacy-performance
tradeoff than the baseline methods. In addition, the proposed incentive
mechanism yields better training performance than the conventional Stackelberg
game approach.
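The coded-dataset construction described above might look roughly like the following sketch (an assumption-laden illustration, not the paper's exact scheme): each device projects its local data with a random encoding matrix and adds Gaussian noise, with `noise_std` controlling the stated privacy-performance trade-off.

```python
import numpy as np

def make_coded_dataset(X, k, noise_std, seed=0):
    """Produce a privacy-preserving coded dataset from local data X (n x d):
    project the n samples down to k coded rows with a random encoding matrix,
    then add Gaussian noise. A larger noise_std strengthens privacy but, per
    the trade-off above, degrades learning performance."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    G = rng.normal(0.0, 1.0 / np.sqrt(n), size=(k, n))  # random encoding
    return G @ X + rng.normal(0.0, noise_std, size=(k, d))

coded = make_coded_dataset(np.ones((10, 4)), k=3, noise_std=0.1)
print(coded.shape)  # (3, 4)
```

During training, the server could compute gradients on such coded rows in place of a straggler's missing update; the unbiased-aggregation weighting from the abstract is omitted here.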
Serine/Threonine Kinase 35, a Target Gene of STAT3, Regulates the Proliferation and Apoptosis of Osteosarcoma Cells
Scaling Up, Scaling Deep: Blockwise Graph Contrastive Learning
Oversmoothing is a common phenomenon in graph neural networks (GNNs), in
which an increase in the network depth leads to a deterioration in their
performance. Graph contrastive learning (GCL) is emerging as a promising way of
leveraging vast unlabeled graph data. As GCL marries GNNs with contrastive
learning, it remains unclear whether GCL inherits the same oversmoothing defect
from GNNs. This work first undertakes a fundamental analysis of GCL from the
perspective of oversmoothing. We demonstrate
empirically that increasing network depth in GCL also leads to oversmoothing in
their deep representations, and surprisingly, the shallow ones. We refer to
this phenomenon in GCL as 'long-range starvation', wherein lower layers in deep
networks suffer from degradation due to the lack of sufficient guidance from
supervision (e.g., loss computing). Based on our findings, we present BlockGCL,
a remarkably simple yet effective blockwise training framework to prevent GCL
from notorious oversmoothing. Without bells and whistles, BlockGCL consistently
improves robustness and stability for well-established GCL methods with
increasing numbers of layers on real-world graph benchmarks. We believe our
work will provide insights for future improvements of scalable and deep GCL
frameworks.
Comment: Preprint; Code is available at https://github.com/EdisonLeeeee/BlockGC
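A minimal sketch of blockwise contrastive training in the spirit of the abstract (illustrative only; the actual implementation is in the linked repository): each block computes its own contrastive loss on two augmented views, so lower layers receive direct supervision instead of starving for it.

```python
import numpy as np

def nt_xent(z1, z2, tau=0.5):
    """A simple InfoNCE-style contrastive loss between two view embeddings."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / tau
    logits = sim - sim.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

def blockwise_losses(blocks, x1, x2):
    """Run two augmented views through the blocks and collect one contrastive
    loss per block; in blockwise training each loss would update only its own
    block (a stop-gradient boundary sits between blocks), giving lower layers
    direct supervision."""
    losses = []
    for block in blocks:
        x1, x2 = block(x1), block(x2)
        losses.append(nt_xent(x1, x2))
    return losses
```

Here `blocks` stands in for groups of GNN layers; any callable mapping embeddings to embeddings will do for the sketch.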
A Survey of What to Share in Federated Learning: Perspectives on Model Utility, Privacy Leakage, and Communication Efficiency
Federated learning (FL) has emerged as a highly effective paradigm for
privacy-preserving collaborative training among different parties. Unlike
traditional centralized learning, which requires collecting data from each
party, FL allows clients to share privacy-preserving information without
exposing private datasets. This approach not only offers enhanced privacy
protection but also facilitates more efficient and secure collaboration among
multiple participants. Therefore, FL has gained considerable attention from
researchers, prompting numerous surveys that summarize the related works.
However, the majority of these surveys concentrate on methods sharing model
parameters during the training process, while overlooking the potential of
sharing other forms of local information. In this paper, we present a
systematic survey from a new perspective, i.e., what to share in FL, with an
emphasis on the model utility, privacy leakage, and communication efficiency.
This survey differs from previous ones due to four distinct contributions.
First, we present a new taxonomy of FL methods in terms of the sharing methods,
which includes three categories of shared information: model sharing, synthetic
data sharing, and knowledge sharing. Second, we analyze the vulnerability of
different sharing methods to privacy attacks and review the defense mechanisms
that provide certain privacy guarantees. Third, we conduct extensive
experiments to compare the performance and communication overhead of various
sharing methods in FL. In addition, we assess the potential privacy leakage through
model inversion and membership inference attacks, while comparing the
effectiveness of various defense approaches. Finally, we discuss potential
deficiencies in current methods and outline future directions for improvement.
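For concreteness, the 'model sharing' category discussed above can be illustrated with a minimal FedAvg-style aggregation sketch (a generic textbook example, not tied to any specific method in the survey):

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Size-weighted average of client parameter vectors -- the canonical
    form of the 'model sharing' category, where clients upload parameters
    (or updates) instead of raw data."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

agg = fedavg([np.array([1.0, 2.0]), np.array([3.0, 4.0])], [1, 3])
print(agg)  # [2.5 3.5] -- pulled toward the larger client
```

Synthetic-data sharing and knowledge sharing would replace the uploaded parameter vectors with generated samples or distilled predictions, respectively, changing both the privacy surface and the communication cost.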
Development of an ELISA-array for simultaneous detection of five encephalitis viruses
Japanese encephalitis virus (JEV), tick-borne encephalitis virus (TBEV), and eastern equine encephalitis virus (EEEV) can cause symptoms of encephalitis. Establishing accurate and easy methods to detect these viruses is essential for the prevention and treatment of associated infectious diseases. Currently, there are still no multiple-antigen detection methods available clinically. An ELISA-array, which detects multiple antigens, is easy to handle, and is inexpensive, has enormous potential in pathogen detection. An ELISA-array method for the simultaneous detection of five encephalitis viruses was developed in this study. Seven monoclonal antibodies against five encephalitis-associated viruses were prepared and used for development of the ELISA-array. The ELISA-array assay is based on a "sandwich" ELISA format and consists of viral antibodies printed directly on 96-well microtiter plates, allowing for direct detection of five viruses. The developed ELISA-array proved to have similar specificity and higher sensitivity compared with conventional ELISAs. This method was validated using different viral cultures and three chicken eggs inoculated with infected patient serum. The results demonstrated that the developed ELISA-array is sensitive and easy to use, and would have potential for clinical use.
A duplex real-time reverse transcriptase polymerase chain reaction assay for detecting western equine and eastern equine encephalitis viruses
In order to establish an accurate, ready-to-use assay for the simultaneous detection of Eastern equine encephalitis virus (EEEV) and Western equine encephalitis virus (WEEV), we developed a duplex TaqMan real-time reverse transcriptase polymerase chain reaction (RT-PCR) assay, which can be used in human and vector surveillance. First, we selected the primers and FAM-labeled TaqMan probe specific for WEEV from the consensus sequence of NSP3, and the primers and HEX-labeled TaqMan probe specific for EEEV from the consensus sequence of E3. Then we constructed and optimized the duplex real-time RT-PCR assay by adjusting the concentrations of primers and probes. Using a series of dilutions of transcripts containing the target genes as template, we showed that the sensitivity of the assay reached 1 copy/reaction for both EEEV and WEEV, and the performance was linear within a range of at least 10^6 transcript copies. Moreover, we evaluated the specificity of the duplex system using other encephalitis virus RNA as template and found no cross-reactivity. Compared with virus isolation, the gold standard, the duplex real-time RT-PCR assay we developed was 10-fold more sensitive for both WEEV and EEEV detection.
Role of extrathyroidal TSHR expression in adipocyte differentiation and its association with obesity
Background: Obesity is known to be associated with higher risks of cardiovascular disease, metabolic syndrome, and diabetes mellitus. The thyroid-stimulating hormone receptor (TSHR) is the receptor for thyroid-stimulating hormone (TSH, or thyrotropin), the key regulator of thyroid functions. The expression of TSHR, once considered to be limited to thyrocytes, has so far been detected in many extrathyroidal tissues including liver and fat. Previous studies have shown that TSHR expression is upregulated when preadipocytes differentiate into mature adipocytes, suggestive of a possible role of TSHR in adipogenesis. However, it remains unclear whether TSHR expression in adipocytes is implicated in the pathogenesis of obesity. Methods: In the present study, TSHR expression in adipose tissues from both mice and humans was analyzed, and its association with obesity was evaluated. Results: We showed that TSHR expression was increased at both the mRNA and protein levels when 3T3-L1 preadipocytes were induced to differentiate. Knockdown of TSHR blocked the adipocyte differentiation of 3T3-L1 preadipocytes, as evaluated by Oil Red O staining for lipid accumulation and by RT-PCR analyses of PPAR-γ and ALBP mRNA expression. We generated obese mice (C57/BL6) by high-fat diet feeding and found that TSHR protein expression in visceral adipose tissues from obese mice was significantly higher than in non-obese control mice (P < 0.05). Finally, TSHR expression in adipose tissues was determined in 120 patients. The results showed that TSHR expression in subcutaneous adipose tissue is correlated with BMI (body mass index). Conclusion: Taken together, these results suggest that TSHR is an important regulator of adipocyte differentiation. Dysregulated expression of TSHR in adipose tissues is associated with obesity, which may involve a mechanism of excess adipogenesis.
HeteroNet: Heterophily-aware Representation Learning on Heterogeneous Graphs
Real-world graphs are typically complex, exhibiting heterogeneity in the
global structure, as well as strong heterophily within local neighborhoods.
While a growing body of literature has revealed the limitations of common graph
neural networks (GNNs) in handling homogeneous graphs with heterophily, little
work has been conducted on investigating the heterophily properties in the
context of heterogeneous graphs. To bridge this research gap, we identify the
heterophily in heterogeneous graphs using metapaths and propose two practical
metrics to quantitatively describe the levels of heterophily. Through in-depth
investigations on several real-world heterogeneous graphs exhibiting varying
levels of heterophily, we have observed that heterogeneous graph neural
networks (HGNNs), which inherit many mechanisms from GNNs designed for
homogeneous graphs, fail to generalize to heterogeneous graphs with heterophily
or a low level of homophily. To address this challenge, we present HeteroNet,
a heterophily-aware HGNN that incorporates both masked metapath prediction and
masked label prediction tasks to effectively and flexibly handle both
homophilic and heterophilic heterogeneous graphs. We evaluate the performance
of HeteroNet on five real-world heterogeneous graph benchmarks with varying
levels of heterophily. The results demonstrate that HeteroNet outperforms
strong baselines in the semi-supervised node classification task, providing
valuable insights into effectively handling more complex heterogeneous graphs.
Comment: Preprint
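One simple way to quantify heterophily along a metapath, in the spirit of the metrics described above (the paper's exact metrics may differ), is the fraction of metapath-connected node pairs whose labels disagree:

```python
import numpy as np

def metapath_heterophily(pairs, labels):
    """Fraction of metapath-connected node pairs whose endpoint labels differ;
    0 means fully homophilic along this metapath, 1 fully heterophilic.
    `pairs` holds (u, v) node pairs reachable via the chosen metapath."""
    return float(np.mean([labels[u] != labels[v] for u, v in pairs]))

pairs = [(0, 1), (0, 2), (1, 2)]   # e.g. author-paper-author pairs
labels = {0: "A", 1: "A", 2: "B"}
print(round(metapath_heterophily(pairs, labels), 3))  # 0.667
```

Different metapaths over the same heterogeneous graph can score very differently, which is what makes a per-metapath measurement informative.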