
    DABS: Data-Agnostic Backdoor attack at the Server in Federated Learning

    Federated learning (FL) attempts to train a global model by aggregating local models from distributed devices under the coordination of a central server. However, the presence of a large number of heterogeneous devices makes FL vulnerable to various attacks, especially the stealthy backdoor attack. A backdoor attack aims to trick a neural network into misclassifying data as a target label by injecting specific triggers, while keeping predictions on the original training data correct. Existing works focus on client-side attacks that try to poison the global model by modifying the local datasets. In this work, we propose a new attack model for FL, namely the Data-Agnostic Backdoor attack at the Server (DABS), in which the server directly modifies the global model to backdoor an FL system. Extensive simulation results show that this attack scheme achieves a higher attack success rate than baseline methods while maintaining normal accuracy on the clean data.
    Comment: Accepted by the Backdoor Attacks and Defenses in Machine Learning (BANDS) Workshop at ICLR 202
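The abstract reports two quantities for judging such an attack: the attack success rate on triggered inputs and the accuracy retained on clean data. These can be sketched generically; the corner-patch trigger and all function names below are illustrative assumptions, not the paper's code.

```python
import numpy as np

def add_trigger(images, patch_value=1.0, size=3):
    """Stamp a small square trigger into the top-left corner of each
    image (shape [..., H, W]); a common backdoor trigger pattern."""
    images = images.copy()
    images[..., :size, :size] = patch_value
    return images

def attack_success_rate(preds_on_triggered, target_label):
    """Fraction of triggered inputs classified as the attacker's target."""
    return float(np.mean(np.asarray(preds_on_triggered) == target_label))

def clean_accuracy(preds_on_clean, true_labels):
    """Accuracy on unmodified data; a stealthy backdoor keeps this high."""
    return float(np.mean(np.asarray(preds_on_clean) == np.asarray(true_labels)))
```

A "successful" stealthy backdoor, in these terms, is one where `attack_success_rate` is high while `clean_accuracy` stays close to that of the unattacked model.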

    The Devil is in the Data: Learning Fair Graph Neural Networks via Partial Knowledge Distillation

    Graph neural networks (GNNs) are increasingly used in many high-stakes tasks, and as a result their fairness has recently attracted growing attention. GNNs have been shown to be unfair, as they tend to make discriminatory decisions toward certain demographic groups defined by sensitive attributes such as gender and race. While recent works have been devoted to improving fairness performance, they often require accessible demographic information, which greatly limits their applicability in real-world scenarios due to legal restrictions. To address this problem, we present a demographic-agnostic method to learn fair GNNs via knowledge distillation, namely FairGKD. Our work is motivated by the empirical observation that training GNNs on partial data (i.e., only node attributes or only topology) can improve their fairness, albeit at the cost of utility. To strike a balanced trade-off between fairness and utility, we employ a set of fairness experts (i.e., GNNs trained on different partial data) to construct a synthetic teacher, which distills fairer and more informative knowledge to guide the learning of the GNN student. Experiments on several benchmark datasets demonstrate that FairGKD, which does not require access to demographic information, improves the fairness of GNNs by a large margin while maintaining their utility.
    Comment: Accepted by WSDM 202
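The distillation step described above can be sketched as follows: the synthetic teacher is formed here by averaging the temperature-softened predictions of the partial-data experts, and the student is trained against it with a soft cross-entropy. The averaging rule and the names are illustrative assumptions; the paper's exact construction may differ.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def synthetic_teacher(expert_logits, T=2.0):
    """Average the temperature-softened predictions of the fairness
    experts (e.g., GNNs trained on attributes-only / topology-only data)."""
    return np.mean([softmax(z / T) for z in expert_logits], axis=0)

def distill_loss(student_logits, teacher_probs, T=2.0):
    """Soft cross-entropy between teacher and student distributions;
    minimized when the student matches the synthetic teacher."""
    log_q = np.log(softmax(student_logits / T) + 1e-12)
    return float(-np.mean(np.sum(teacher_probs * log_q, axis=1)))
```

Because the teacher is a mixture of experts that never saw the full data, the student inherits their (fairer) behavior while still training on the complete graph for utility.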

    Stochastic Coded Federated Learning: Theoretical Analysis and Incentive Mechanism Design

    Federated learning (FL) has achieved great success as a privacy-preserving distributed training paradigm, in which many edge devices collaboratively train a machine learning model by sharing model updates instead of raw data with a server. However, the heterogeneous computational and communication resources of edge devices give rise to stragglers that significantly decelerate the training process. To mitigate this issue, we propose a novel FL framework named stochastic coded federated learning (SCFL) that leverages coded computing techniques. In SCFL, before the training process starts, each edge device uploads a privacy-preserving coded dataset to the server, which is generated by adding Gaussian noise to the projected local dataset. During training, the server computes gradients on the global coded dataset to compensate for the missing model updates of the straggling devices. We design a gradient aggregation scheme to ensure that the aggregated model update is an unbiased estimate of the desired global update. Moreover, this aggregation scheme enables periodic model averaging to improve training efficiency. We characterize the tradeoff between the convergence performance and privacy guarantee of SCFL. In particular, a noisier coded dataset provides stronger privacy protection for edge devices but results in learning performance degradation. We further develop a contract-based incentive mechanism to coordinate such a conflict. The simulation results show that SCFL learns a better model within the given time and achieves a better privacy-performance tradeoff than the baseline methods. In addition, the proposed incentive mechanism yields better training performance than the conventional Stackelberg game approach.
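The coded-dataset construction described above can be sketched for a regression-style setting: each device mixes its n local samples into m coded samples with a random projection and then adds Gaussian noise, so the server only ever sees noisy mixtures. The Gaussian mixing matrix and the function names are assumptions for illustration, not the paper's exact scheme.

```python
import numpy as np

def make_coded_dataset(X, y, m, noise_std, rng):
    """Return a privacy-preserving coded dataset: a random projection of
    the n local samples into m coded samples, plus Gaussian noise.
    Larger noise_std -> stronger privacy but worse learning (the
    tradeoff the paper analyzes)."""
    n = X.shape[0]
    G = rng.normal(size=(m, n)) / np.sqrt(n)  # mixing matrix
    X_coded = G @ X + rng.normal(scale=noise_std, size=(m, X.shape[1]))
    y_coded = G @ y + rng.normal(scale=noise_std, size=m)
    return X_coded, y_coded

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))           # a device's local features
y = X @ np.arange(1.0, 6.0)             # toy linear targets
X_coded, y_coded = make_coded_dataset(X, y, m=20, noise_std=0.1, rng=rng)
```

During training, gradients computed on such coded data can stand in for the updates of straggling devices, which is what enables the unbiased aggregation scheme the abstract mentions.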

    Scaling Up, Scaling Deep: Blockwise Graph Contrastive Learning

    Oversmoothing is a common phenomenon in graph neural networks (GNNs), in which an increase in network depth leads to a deterioration in performance. Graph contrastive learning (GCL) is emerging as a promising way of leveraging vast unlabeled graph data. As a marriage between GNNs and contrastive learning, it remains unclear whether GCL inherits the same oversmoothing defect from GNNs. This work first undertakes a fundamental analysis of GCL from the perspective of oversmoothing. We demonstrate empirically that increasing network depth in GCL also leads to oversmoothing in the deep representations and, surprisingly, in the shallow ones as well. We refer to this phenomenon in GCL as 'long-range starvation', wherein lower layers in deep networks suffer from degradation due to the lack of sufficient guidance from supervision (e.g., loss computation). Based on our findings, we present BlockGCL, a remarkably simple yet effective blockwise training framework that prevents GCL from suffering notorious oversmoothing. Without bells and whistles, BlockGCL consistently improves the robustness and stability of well-established GCL methods with increasing numbers of layers on real-world graph benchmarks. We believe our work will provide insights for future improvements of scalable and deep GCL frameworks.
    Comment: Preprint; code is available at https://github.com/EdisonLeeeee/BlockGC
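A minimal sketch of the blockwise idea: each block of the encoder gets its own contrastive loss over two augmented views, so lower layers receive direct supervision instead of starving for a signal from a single loss at the top. The simplified InfoNCE variant and the names below are illustrative assumptions, not BlockGCL's actual implementation.

```python
import numpy as np

def info_nce(z1, z2):
    """Simplified InfoNCE: matching rows of the two views are positives,
    all other rows in the batch act as negatives."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

def blockwise_losses(view1_per_block, view2_per_block):
    """One local contrastive loss per encoder block. In an autograd
    framework each block's input would be detached, so gradients stay
    within the block rather than propagating from far-away layers."""
    return [info_nce(h1, h2) for h1, h2 in zip(view1_per_block, view2_per_block)]
```

The per-block stop-gradient is the key design choice: every block is optimized against its own loss, so depth no longer determines how much guidance a layer receives.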

    A Survey of What to Share in Federated Learning: Perspectives on Model Utility, Privacy Leakage, and Communication Efficiency

    Federated learning (FL) has emerged as a highly effective paradigm for privacy-preserving collaborative training among different parties. Unlike traditional centralized learning, which requires collecting data from each party, FL allows clients to share privacy-preserving information without exposing private datasets. This approach not only guarantees enhanced privacy protection but also facilitates more efficient and secure collaboration among multiple participants. FL has therefore gained considerable attention from researchers, prompting numerous surveys that summarize the related work. However, the majority of these surveys concentrate on methods that share model parameters during the training process, while overlooking the potential of sharing other forms of local information. In this paper, we present a systematic survey from a new perspective, i.e., what to share in FL, with an emphasis on model utility, privacy leakage, and communication efficiency. This survey differs from previous ones in four distinct contributions. First, we present a new taxonomy of FL methods in terms of what is shared, with three categories of shared information: model sharing, synthetic data sharing, and knowledge sharing. Second, we analyze the vulnerability of different sharing methods to privacy attacks and review the defense mechanisms that provide certain privacy guarantees. Third, we conduct extensive experiments to compare the performance and communication overhead of various sharing methods in FL. In addition, we assess the potential privacy leakage through model inversion and membership inference attacks, while comparing the effectiveness of various defense approaches. Finally, we discuss potential deficiencies in current methods and outline future directions for improvement.

    Development of an ELISA-array for simultaneous detection of five encephalitis viruses

    Japanese encephalitis virus (JEV), tick-borne encephalitis virus (TBEV), and eastern equine encephalitis virus (EEEV) can cause symptoms of encephalitis. Establishment of accurate and easy methods by which to detect these viruses is essential for the prevention and treatment of associated infectious diseases. Currently, there are still no multiple-antigen detection methods available clinically. An ELISA-array, which detects multiple antigens, is easy to handle, and inexpensive, has enormous potential in pathogen detection. An ELISA-array method for the simultaneous detection of five encephalitis viruses was developed in this study. Seven monoclonal antibodies against five encephalitis-associated viruses were prepared and used for development of the ELISA-array. The ELISA-array assay is based on a "sandwich" ELISA format and consists of viral antibodies printed directly on 96-well microtiter plates, allowing for direct detection of five viruses. The developed ELISA-array proved to have similar specificity and higher sensitivity compared with conventional ELISAs. The method was validated with different viral cultures and three chicken eggs inoculated with infected patient serum. The results demonstrate that the developed ELISA-array is sensitive and easy to use, and has potential for clinical application.

    A duplex real-time reverse transcriptase polymerase chain reaction assay for detecting western equine and eastern equine encephalitis viruses

    In order to establish an accurate, ready-to-use assay for simultaneous detection of eastern equine encephalitis virus (EEEV) and western equine encephalitis virus (WEEV), we developed a duplex TaqMan real-time reverse transcriptase polymerase chain reaction (RT-PCR) assay, which can be used in human and vector surveillance. First, we selected the primers and FAM-labeled TaqMan probe specific for WEEV from the consensus sequence of NSP3, and the primers and HEX-labeled TaqMan probe specific for EEEV from the consensus sequence of E3. We then constructed and optimized the duplex real-time RT-PCR assay by adjusting the concentrations of primers and probes. Using a series of dilutions of transcripts containing the target genes as template, we showed that the sensitivity of the assay reached 1 copy/reaction for both EEEV and WEEV, and that performance was linear over a range of at least 10^6 transcript copies. Moreover, we evaluated the specificity of the duplex system using RNA from other encephalitis viruses as template and found no cross-reactivity. Compared with virus isolation, the gold standard, the duplex real-time RT-PCR assay we developed was 10-fold more sensitive for both WEEV and EEEV detection.

    Role of extrathyroidal TSHR expression in adipocyte differentiation and its association with obesity

    Background: Obesity is known to be associated with higher risks of cardiovascular disease, metabolic syndrome, and diabetes mellitus. The thyroid-stimulating hormone receptor (TSHR) is the receptor for thyroid-stimulating hormone (TSH, or thyrotropin), the key regulator of thyroid function. The expression of TSHR, once considered to be limited to thyrocytes, has so far been detected in many extrathyroidal tissues, including liver and fat. Previous studies have shown that TSHR expression is upregulated when preadipocytes differentiate into mature adipocytes, suggestive of a possible role of TSHR in adipogenesis. However, it remains unclear whether TSHR expression in adipocytes is implicated in the pathogenesis of obesity.
    Methods: In the present study, TSHR expression in adipose tissues from both mice and humans was analyzed, and its association with obesity was evaluated.
    Results: We showed that TSHR expression increased at both the mRNA and protein levels when 3T3-L1 preadipocytes were induced to differentiate. Knockdown of TSHR blocked adipocyte differentiation of 3T3-L1 preadipocytes, as evaluated by Oil Red O staining for lipid accumulation and by RT-PCR analyses of PPAR-γ and ALBP mRNA expression. We generated obese mice (C57/BL6) by high-fat diet feeding and found that TSHR protein expression in visceral adipose tissue from obese mice was significantly higher than in non-obese control mice (P < 0.05). Finally, TSHR expression in adipose tissue was determined in 120 patients. The results showed that TSHR expression in subcutaneous adipose tissue is correlated with BMI (body mass index).
    Conclusion: Taken together, these results suggest that TSHR is an important regulator of adipocyte differentiation. Dysregulated expression of TSHR in adipose tissues is associated with obesity, which may involve a mechanism of excess adipogenesis.

    Hetero^2Net: Heterophily-aware Representation Learning on Heterogeneous Graphs

    Real-world graphs are typically complex, exhibiting heterogeneity in the global structure as well as strong heterophily within local neighborhoods. While a growing body of literature has revealed the limitations of common graph neural networks (GNNs) in handling homogeneous graphs with heterophily, little work has been conducted on investigating heterophily properties in the context of heterogeneous graphs. To bridge this research gap, we identify heterophily in heterogeneous graphs using metapaths and propose two practical metrics to quantitatively describe the level of heterophily. Through in-depth investigations on several real-world heterogeneous graphs exhibiting varying levels of heterophily, we have observed that heterogeneous graph neural networks (HGNNs), which inherit many mechanisms from GNNs designed for homogeneous graphs, fail to generalize to heterogeneous graphs with heterophily or low levels of homophily. To address this challenge, we present Hetero^2Net, a heterophily-aware HGNN that incorporates both masked metapath prediction and masked label prediction tasks to effectively and flexibly handle both homophilic and heterophilic heterogeneous graphs. We evaluate the performance of Hetero^2Net on five real-world heterogeneous graph benchmarks with varying levels of heterophily. The results demonstrate that Hetero^2Net outperforms strong baselines in the semi-supervised node classification task, providing valuable insights into effectively handling more complex heterogeneous graphs.
    Comment: Preprint
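The notion of metapath-induced heterophily can be made concrete with a toy metric: among node pairs connected by instances of a given metapath, count the fraction whose endpoints share a label. This is an illustrative homophily ratio under assumed inputs, not necessarily either of the paper's two proposed metrics.

```python
import numpy as np

def metapath_homophily(metapath_pairs, labels):
    """Fraction of metapath-connected endpoint pairs with equal labels;
    values near 0 indicate strong heterophily along that metapath."""
    same = [labels[u] == labels[v] for u, v in metapath_pairs]
    return float(np.mean(same))

# Toy example: author nodes connected by a hypothetical
# author-paper-author metapath, labeled by research area.
pairs = [(0, 1), (0, 2), (1, 3), (2, 3)]
labels = [0, 0, 1, 1]
ratio = metapath_homophily(pairs, labels)  # pairs (0,1) and (2,3) match
```

Computing such a ratio per metapath is one simple way to quantify, as the abstract does, how heterophilic a given heterogeneous graph is.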