114 research outputs found

    Federated Generalization via Information-Theoretic Distribution Diversification

Federated Learning (FL) has surged in prominence due to its capability of collaborative model training without direct data sharing. However, the vast disparity in local data distributions among clients, often termed the non-Independent Identically Distributed (non-IID) challenge, poses a significant hurdle to FL's generalization efficacy. The scenario becomes even more complex when not all clients participate in the training process, a common occurrence due to unstable network connections or limited computational capacities. This can greatly complicate the assessment of the trained models' generalization abilities. While a plethora of recent studies has centered on the generalization gap pertaining to unseen data from participating clients with diverse distributions, the divergence between the training distributions of participating clients and the testing distributions of non-participating ones has been largely overlooked. In response, our paper unveils an information-theoretic generalization framework for FL. Specifically, it quantifies generalization errors by evaluating the information entropy of local distributions and discerning discrepancies across these distributions. Inspired by our deduced generalization bounds, we introduce a weighted aggregation approach and a duo of client selection strategies. These innovations aim to bolster FL's generalization prowess by encompassing a more varied set of client data distributions. Our extensive empirical evaluations reaffirm the potency of our proposed methods, aligning seamlessly with our theoretical construct.
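
    To make the aggregation and selection ideas above concrete, here is a minimal Python sketch, not the paper's actual algorithm: it weights client updates by the Shannon entropy of each client's label distribution and greedily selects clients whose pooled labels maximize entropy. The function names, the entropy-based weighting rule, and the toy data are illustrative assumptions.

```python
import numpy as np

def label_entropy(label_counts):
    """Shannon entropy of a client's empirical label distribution."""
    p = np.asarray(label_counts, dtype=float)
    total = p.sum()
    if total == 0:
        return 0.0
    p = p / total
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def entropy_weighted_average(client_models, client_label_counts):
    """Aggregate client parameter vectors, weighting each client by the entropy
    of its label distribution (an illustrative stand-in for bound-inspired weights)."""
    weights = np.array([label_entropy(c) for c in client_label_counts])
    weights = weights / max(weights.sum(), 1e-12)
    stacked = np.stack([np.asarray(m, dtype=float) for m in client_models])
    return (weights[:, None] * stacked).sum(axis=0)

def select_diverse_clients(client_label_counts, k):
    """Greedily pick k clients whose pooled label histogram has maximal entropy,
    i.e. the selection covers the most varied set of local distributions."""
    counts = [np.asarray(c, dtype=float) for c in client_label_counts]
    remaining = list(range(len(counts)))
    chosen, pooled = [], np.zeros_like(counts[0])
    for _ in range(k):
        best = max(remaining, key=lambda i: label_entropy(pooled + counts[i]))
        chosen.append(best)
        pooled = pooled + counts[best]
        remaining.remove(best)
    return chosen

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    counts = rng.integers(0, 50, size=(5, 10))   # 5 clients, 10 label classes
    models = rng.normal(size=(5, 8))             # toy flattened parameter vectors
    print("selected clients:", select_diverse_clients(counts, k=3))
    print("aggregated params:", entropy_weighted_average(models, counts))
```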

    Structure and Color Gradients of Ultra-diffuse Galaxies in Distant Massive Galaxy Clusters

We have measured structural parameters and radial color profiles of 108 ultra-diffuse galaxies (UDGs), carefully selected from six distant massive galaxy clusters in the Hubble Frontier Fields (HFF) in the redshift range 0.308 to 0.545. Our best-fitting GALFIT models show that the HFF UDGs have a median Sérsic index of 1.09, close to the value of 0.86 for local UDGs in the Coma cluster. The median axis ratio is 0.68 for HFF UDGs and 0.74 for Coma UDGs. The structural similarity between HFF and Coma UDGs suggests that they are the same kind of galaxies seen at different times and that the structures of UDGs do not change for at least several billion years. By checking the distribution of HFF UDGs in the rest-frame UVJ and UVI diagrams, we find that a large fraction of them are star-forming. Furthermore, a majority of HFF UDGs show small U−V color gradients within the 1 R_e,SMA region; the fluctuation of the median radial color profile of HFF UDGs is smaller than 0.1 mag, comparable to that of Coma UDGs. Our results indicate that cluster UDGs may fade or quench in a self-similar way, irrespective of the radial distance, in less than ~4 Gyr. Comment: 17 pages, 8 figures, accepted for publication in ApJ
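
    As a rough illustration of the quantities discussed above, the Python sketch below evaluates a 1D Sérsic surface-brightness profile and fits a linear U−V color gradient inside one effective radius. The real measurements use 2D GALFIT fits to HST imaging, so the 1D form, the b_n approximation, and the toy fluxes here are assumptions for demonstration only.

```python
import numpy as np

def sersic_profile(r, i_e, r_e, n):
    """Sersic surface-brightness profile I(r) = I_e * exp(-b_n * ((r/r_e)**(1/n) - 1)),
    using the common approximation b_n ~= 2n - 1/3 (adequate near n ~ 1)."""
    b_n = 2.0 * n - 1.0 / 3.0
    return i_e * np.exp(-b_n * ((r / r_e) ** (1.0 / n) - 1.0))

def color_gradient(r, flux_u, flux_v):
    """Linear slope of the U-V color profile (mag per unit r/r_e),
    a crude 1D stand-in for the radial color profiles in the paper."""
    color = -2.5 * np.log10(flux_u / flux_v)
    slope, _ = np.polyfit(r, color, deg=1)
    return slope

if __name__ == "__main__":
    r = np.linspace(0.1, 1.0, 50)                          # radius in units of r_e
    flux_u = sersic_profile(r, i_e=1.0, r_e=1.0, n=1.09)   # median HFF Sersic index
    flux_v = sersic_profile(r, i_e=1.2, r_e=0.9, n=1.09)   # toy V-band profile
    print("U-V gradient within 1 r_e: %.3f mag" % color_gradient(r, flux_u, flux_v))
```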

    Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World

Scene Graph Generation (SGG) aims to extract relationships in images for vision understanding. Although recent works have made steady progress on SGG, they still suffer from long-tail distribution issues: tail predicates are more costly to train and harder to distinguish because they have far fewer annotations than frequent predicates. Existing re-balancing strategies try to handle this via prior rules but remain confined to pre-defined conditions, which do not scale across models and datasets. In this paper, we propose a Cross-modal prediCate boosting (CaCao) framework, in which a visually-prompted language model is learned to generate diverse fine-grained predicates in a low-resource way. The proposed CaCao can be applied in a plug-and-play fashion and automatically strengthens existing SGG models to tackle the long-tailed problem. Based on that, we further introduce a novel Entangled cross-modal prompt approach for open-world predicate scene graph generation (Epic), in which models can generalize to unseen predicates in a zero-shot manner. Comprehensive experiments on three benchmark datasets show that CaCao consistently boosts the performance of multiple scene graph generation models in a model-agnostic way. Moreover, our Epic achieves competitive performance on open-world predicate prediction. The data and code for this paper are publicly available. Comment: Accepted by ICCV 2023
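
    The open-world setting can be illustrated with a generic zero-shot scheme: score candidate predicates, including unseen ones, by the similarity between a subject-object pair embedding and predicate embeddings. The Python sketch below is not the Epic method; the random vectors stand in for embeddings that a visually-prompted language model would produce, and all names are hypothetical.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def predict_predicate(pair_embedding, predicate_embeddings):
    """Rank candidate predicates (seen or unseen) by similarity to a
    subject-object pair embedding -- a generic zero-shot scheme, not Epic itself."""
    scores = {name: cosine(pair_embedding, emb) for name, emb in predicate_embeddings.items()}
    best = max(scores, key=scores.get)
    return best, scores

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Random stand-ins; in practice both sides would come from a (visually-prompted) language model.
    predicates = {p: rng.normal(size=16) for p in ["riding", "holding", "parked on", "chasing"]}
    pair = predicates["riding"] + 0.1 * rng.normal(size=16)   # a pair that should match "riding"
    best, scores = predict_predicate(pair, predicates)
    print("predicted predicate:", best)
```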

    MiniDisc: Minimal Distillation Schedule for Language Model Compression

Recent studies have uncovered that language model distillation is less effective when there is a large capacity gap between the teacher and the student, and have introduced teacher assistant-based distillation to bridge the gap. As the connection between the two, the scale and performance of the teacher assistant are of vital importance for bringing the knowledge from the teacher to the student. However, existing teacher assistant-based methods require maximally many trials before an optimal teacher assistant can be scheduled. To this end, we propose a minimal distillation schedule (MiniDisc) for scheduling the optimal teacher assistant in minimally one trial. In particular, motivated by the finding that the performance of the student is positively correlated with the scale-performance tradeoff of the teacher assistant, MiniDisc is designed with a λ-tradeoff to measure the optimality of the teacher assistant without trial distillation to the student. MiniDisc can then schedule the optimal teacher assistant with the best λ-tradeoff in a sandwich framework. MiniDisc is evaluated with an extensive set of experiments on GLUE. Experimental results demonstrate the improved efficiency of MiniDisc compared to several state-of-the-art baselines. We further apply MiniDisc to a language model with billions of parameters and show its scalability. Comment: Accepted to EACL 2024. Code is available at https://github.com/GeneZC/MiniDis
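
    The core scheduling idea, picking a teacher assistant by a scale-performance tradeoff rather than by trial distillation to the student, can be sketched in Python as below. The convex-combination form of the λ-tradeoff and the candidate pool are assumptions for illustration; MiniDisc's exact criterion and sandwich framework are described in the paper.

```python
from dataclasses import dataclass

@dataclass
class TeacherAssistant:
    name: str
    scale: float         # size relative to the teacher, in (0, 1]
    performance: float   # dev-set score of the candidate assistant

def lambda_tradeoff(candidate: TeacherAssistant, lam: float = 0.5) -> float:
    """Scalarized scale-performance tradeoff; this convex combination is an
    illustrative assumption, not MiniDisc's exact functional form."""
    return lam * candidate.performance - (1.0 - lam) * candidate.scale

def schedule_assistant(candidates, lam: float = 0.5) -> TeacherAssistant:
    """Pick the assistant with the best lambda-tradeoff, i.e. without running
    trial distillations all the way to the student."""
    return max(candidates, key=lambda c: lambda_tradeoff(c, lam))

if __name__ == "__main__":
    pool = [
        TeacherAssistant("ta-large", scale=0.75, performance=0.90),
        TeacherAssistant("ta-base", scale=0.50, performance=0.88),
        TeacherAssistant("ta-small", scale=0.25, performance=0.83),
    ]
    print("scheduled assistant:", schedule_assistant(pool).name)
```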

    Curcumin inhibits gastric cancer growth via downregulation of zinc finger protein, ZNF139

Purpose: To investigate the effect of curcumin on gastric cancer cell proliferation and the mechanism of action involved. Methods: The viability of gastric cancer cells following curcumin treatment was determined by the 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyl-2H-tetrazolium bromide (MTT) assay. Flow cytometry was used to assess apoptosis induction in SGC-7901 cells. Reverse transcriptase polymerase chain reaction (RT-PCR) and western blotting were used to analyze ZNF139, survivin, and Bcl-2 protein expression. Results: Curcumin treatment at 30 µM reduced the viability of the gastric cancer cell line SGC-7901 to 29.67 % after 48 h, compared with 99.78 % for the control culture. The apoptotic cell population increased significantly (p < 0.05) following treatment with curcumin. Zinc finger protein 139 (ZNF139) mRNA and protein expression decreased significantly (p < 0.05) on treatment with curcumin. Furthermore, curcumin suppressed the levels of B-cell lymphoma 2 (Bcl-2) and survivin protein. In the mouse model of gastric cancer, treatment with a 50 mg/kg dose of curcumin inhibited tumor growth and development significantly compared with the untreated group (p < 0.05). Conclusion: The results demonstrate that curcumin inhibits gastric cancer cell proliferation via down-regulation of zinc finger protein 139 and suppresses tumor growth in mice. Therefore, curcumin is a promising gastric cancer inhibitor and should be further investigated for the management of gastric cancer.

    Kick Bad Guys Out! Zero-Knowledge-Proof-Based Anomaly Detection in Federated Learning

Federated learning (FL) systems are vulnerable to malicious clients that submit poisoned local models to achieve their adversarial goals, such as preventing the convergence of the global model or inducing the global model to misclassify some data. Many existing defense mechanisms are impractical in real-world FL systems, as they require prior knowledge of the number of malicious clients or rely on re-weighting or modifying submissions: adversaries typically do not announce their intentions before attacking, and re-weighting might change aggregation results even in the absence of attacks. To address these challenges in real FL systems, this paper introduces a cutting-edge anomaly detection approach with the following features: i) detecting the occurrence of attacks and performing defense operations only when attacks happen; ii) upon the occurrence of an attack, further detecting the malicious client models and eliminating them without harming the benign ones; iii) ensuring honest execution of the defense mechanisms at the server by leveraging a zero-knowledge proof mechanism. We validate the superior performance of the proposed approach with extensive experiments.
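
    Features i) and ii) amount to a two-stage check: first decide whether the round is under attack at all, then, only if it is, single out the outlying client models. The Python sketch below illustrates such a two-stage filter using cosine-similarity statistics with illustrative thresholds; it is not the paper's detector, and the zero-knowledge proof of honest server execution (feature iii) is not modeled.

```python
import numpy as np

def cosine_matrix(updates):
    """Pairwise cosine similarity between flattened client model updates."""
    normed = updates / np.linalg.norm(updates, axis=1, keepdims=True)
    return normed @ normed.T

def detect_and_filter(updates, attack_threshold=0.5, z_cut=-1.0):
    """Two-stage sketch: (1) flag the round as attacked if client updates are
    unusually dissimilar on average; (2) only then drop clients whose mean
    similarity to the others is a low outlier. Thresholds are illustrative."""
    sims = cosine_matrix(updates)
    n = len(updates)
    off_diag = sims[~np.eye(n, dtype=bool)]
    attacked = off_diag.mean() < attack_threshold
    if not attacked:
        return attacked, list(range(n))
    mean_sim = (sims.sum(axis=1) - 1.0) / (n - 1)     # exclude self-similarity
    z = (mean_sim - mean_sim.mean()) / (mean_sim.std() + 1e-12)
    benign = [i for i in range(n) if z[i] > z_cut]
    return attacked, benign

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    benign_updates = rng.normal(loc=1.0, scale=0.1, size=(8, 32))
    poisoned = -5.0 * np.ones((2, 32))                # crude model-poisoning stand-in
    updates = np.vstack([benign_updates, poisoned])
    attacked, kept = detect_and_filter(updates)
    print("attack detected:", attacked, "| clients kept:", kept)
```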

    XPrompt: Exploring the Extreme of Prompt Tuning

Prompt tuning learns soft prompts to condition frozen Pre-trained Language Models (PLMs) for performing downstream tasks in a parameter-efficient manner. While prompt tuning has gradually reached the performance level of fine-tuning as the model scale increases, there is still a large performance gap between prompt tuning and fine-tuning for models of moderate and small scales (typically less than 11B parameters). In this paper, we empirically show that the trained prompt tokens can have a negative impact on a downstream task and thus degrade its performance. To bridge the gap, we propose a novel Prompt tuning model with an eXtremely small scale (XPrompt) under the regime of the lottery ticket hypothesis. Specifically, XPrompt eliminates the negative prompt tokens at different granularity levels through hierarchical structured pruning, yielding a more parameter-efficient prompt with competitive performance. Comprehensive experiments are carried out on SuperGLUE tasks, and the extensive results indicate that XPrompt is able to close the performance gap at smaller model scales. Comment: 15 pages, accepted to EMNLP 2022 main conference
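
    Token-level pruning of a soft prompt can be sketched in Python as follows: score each prompt token and keep only the top fraction. The L2-norm saliency and keep ratio here are illustrative assumptions; XPrompt's hierarchical scheme additionally prunes at a finer piece-level granularity and retrains under the lottery ticket regime.

```python
import numpy as np

def token_importance(prompt):
    """Per-token importance; the L2 norm is an illustrative stand-in for the
    saliency criterion used in lottery-ticket-style pruning."""
    return np.linalg.norm(prompt, axis=1)

def prune_prompt(prompt, keep_ratio=0.25):
    """Keep only the highest-scoring soft-prompt tokens (token-level pruning)."""
    scores = token_importance(prompt)
    k = max(1, int(round(keep_ratio * len(prompt))))
    keep = np.argsort(scores)[::-1][:k]
    mask = np.zeros(len(prompt), dtype=bool)
    mask[keep] = True
    return prompt[mask], mask

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    soft_prompt = rng.normal(size=(20, 768))   # 20 soft-prompt tokens, hidden size 768
    pruned, mask = prune_prompt(soft_prompt, keep_ratio=0.25)
    print("kept tokens:", int(mask.sum()), "of", len(mask), "| pruned shape:", pruned.shape)
```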