
    FedPop: Federated Population-based Hyperparameter Tuning

    Federated Learning (FL) is a distributed machine learning (ML) paradigm in which multiple clients collaboratively train ML models without centralizing their local data. As in conventional ML pipelines, the client local optimization and server aggregation procedures in FL are sensitive to hyperparameter (HP) selection. Despite extensive research on tuning HPs for centralized ML, these methods yield suboptimal results when applied to FL, mainly because their "training-after-tuning" framework is unsuitable for FL clients with limited computation power. While some approaches have been proposed for HP tuning in FL, they are limited to the HPs of the client local updates. In this work, we propose a novel HP-tuning algorithm, Federated Population-based Hyperparameter Tuning (FedPop), to address this vital yet challenging problem. FedPop employs population-based evolutionary algorithms to optimize the HPs, accommodating various HP types on both the client and server sides. Compared with prior tuning methods, FedPop uses an online "tuning-while-training" framework, offering computational efficiency and enabling the exploration of a broader HP search space. Our empirical validation on common FL benchmarks and complex real-world FL datasets demonstrates the effectiveness of the proposed method, which substantially outperforms concurrent state-of-the-art HP-tuning methods for FL.
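To make the population-based "tuning-while-training" idea above concrete, here is a minimal sketch of one exploit-and-explore step over a population of HP configurations. Everything here is illustrative, not FedPop's actual procedure: the function name, the quartile cutoff, and the multiplicative perturbation rule are all assumptions for the sketch.

```python
import random

def fedpop_step(population, scores, bounds=(1e-4, 1.0), rng=None):
    """One exploit-and-explore step over a population of HP configurations.

    population: list of HP dicts, e.g. [{"client_lr": 0.1}, ...]
    scores: a validation score per member (higher is better).
    The quartile cutoff and the x0.8 / x1.25 perturbations are illustrative
    choices, not FedPop's actual hyperparameters.
    """
    rng = rng or random.Random(0)
    ranked = sorted(range(len(population)), key=lambda i: scores[i], reverse=True)
    cutoff = max(1, len(population) // 4)
    top, bottom = ranked[:cutoff], ranked[-cutoff:]
    new_pop = [dict(hp) for hp in population]
    for loser in bottom:
        winner = rng.choice(top)
        hp = dict(population[winner])            # exploit: copy a strong config
        for k in hp:                             # explore: perturb every HP
            hp[k] = min(max(hp[k] * rng.choice([0.8, 1.25]), bounds[0]), bounds[1])
        new_pop[loser] = hp
    return new_pop
```

Because the population evolves between federated rounds, no separate tuning phase is needed before training, which is the computational argument the abstract makes.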

    FedDAT: An Approach for Foundation Model Finetuning in Multi-Modal Heterogeneous Federated Learning

    Recently, foundation models have exhibited remarkable advancements in multi-modal learning. These models, equipped with millions (or billions) of parameters, typically require a substantial amount of data for finetuning. However, collecting and centralizing training data from diverse sectors is challenging due to distinct privacy regulations. Federated Learning (FL) emerges as a promising solution, enabling multiple clients to collaboratively train neural networks without centralizing their local data. To alleviate client computation burdens and communication overheads, previous works have adapted Parameter-efficient Finetuning (PEFT) methods for FL, so that only a small fraction of the model parameters is optimized and communicated during federated communication. Nevertheless, most previous works focus on a single modality and neglect one common phenomenon, i.e., data heterogeneity across clients. Therefore, in this work, we propose a finetuning framework tailored to heterogeneous multi-modal FL, called Federated Dual-Adapter Teacher (FedDAT). Specifically, our approach leverages a Dual-Adapter Teacher (DAT) to address data heterogeneity by regularizing the client local updates and applying Mutual Knowledge Distillation (MKD) for efficient knowledge transfer. FedDAT is the first approach enabling efficient distributed finetuning of foundation models for a variety of heterogeneous Vision-Language tasks. To demonstrate its effectiveness, we conduct extensive experiments on four multi-modality FL benchmarks with different types of data heterogeneity, where FedDAT substantially outperforms existing centralized PEFT methods adapted for FL.
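The Mutual Knowledge Distillation component mentioned above can be illustrated with a toy symmetric-KL loss between the class distributions produced by two adapter branches. This is a hand-rolled stand-in for intuition only; FedDAT's actual objective combines such a distillation term with task losses and the dual-adapter regularization.

```python
import math

def mutual_kd_loss(p_local, p_shared):
    """Symmetric KL divergence between two class distributions, here playing
    the role of the outputs of a client-specific and a shared adapter branch.
    A toy stand-in for a mutual-distillation term, not FedDAT's exact loss."""
    def kl(p, q):
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return 0.5 * (kl(p_local, p_shared) + kl(p_shared, p_local))
```

Distilling in both directions lets each branch act as a teacher for the other, which is the sense in which knowledge transfer is "mutual".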

    Exploring Diverse In-Context Configurations for Image Captioning

    After the discovery that Language Models (LMs) can be good in-context few-shot learners, numerous strategies have been proposed to optimize in-context sequence configurations. Recently, researchers in Vision-Language (VL) domains have also developed few-shot learners, but they use only the simplest approach, i.e., random sampling, to configure in-context image-text pairs. To explore the effects of varying configurations on VL in-context learning, we devised four strategies for image selection and four for caption assignment to configure in-context image-text pairs for image captioning. Image captioning serves as the case study since it can be seen as a visually-conditioned LM. Our comprehensive experiments yield two counter-intuitive but valuable insights, highlighting the distinct characteristics of VL in-context learning, due to multi-modal synergy, as compared to the NLP case. Furthermore, in our exploration of optimal combination strategies, we observed an average improvement of 20.9 CIDEr points over the baseline. The code is available at https://github.com/yongliang-wu/ExploreCfg.
    Comment: Accepted by NeurIPS202
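One natural alternative to the random-sampling baseline discussed above is similarity-based image selection. The sketch below ranks candidate in-context images by cosine similarity of their feature vectors to the query image; the data layout and function name are assumptions for illustration, not the paper's implementation.

```python
def select_in_context(query_feat, candidates, k=4):
    """Pick the k candidate images whose features are most similar to the
    query image, one plausible selection strategy (random sampling being
    the baseline). Features are plain lists of floats; candidates are
    dicts with "id" and "feat" keys (an illustrative layout)."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(x * x for x in b) ** 0.5
        return dot / (na * nb) if na and nb else 0.0
    ranked = sorted(candidates, key=lambda c: cos(query_feat, c["feat"]), reverse=True)
    return [c["id"] for c in ranked[:k]]
```

The selected ids would then be paired with captions (by one of the caption-assignment strategies) to form the in-context sequence.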

    FRAug: Tackling Federated Learning with Non-IID Features via Representation Augmentation

    Federated Learning (FL) is a decentralized learning paradigm in which multiple clients collaboratively train deep learning models without centralizing their local data, and hence preserve data privacy. Real-world applications usually involve a distribution shift across the datasets of the different clients, which hurts the clients' ability to generalize to unseen samples from their respective data distributions. In this work, we address the recently proposed feature shift problem, where the clients have different feature distributions while the label distribution is the same. We propose Federated Representation Augmentation (FRAug) to tackle this practical and challenging problem. Our approach generates synthetic client-specific samples in the embedding space to augment the usually small client datasets. To this end, we train a shared generative model to fuse the clients' knowledge learned from their different feature distributions. This generator synthesizes client-agnostic embeddings, which are then locally transformed into client-specific embeddings by Representation Transformation Networks (RTNets). By transferring knowledge across the clients, the generated embeddings act as a regularizer for the client models and reduce overfitting to the local original datasets, hence improving generalization. Our empirical evaluation on public benchmarks and a real-world medical dataset demonstrates the effectiveness of the proposed method, which substantially outperforms the current state-of-the-art FL methods for non-IID features, including PartialFed and FedBN.
    Comment: ICCV 202
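The generate-then-transform pipeline above can be sketched in miniature: a shared "generator" samples client-agnostic vectors, and a per-client transformation maps them into that client's feature space. The toy scale-and-shift transform below stands in for the learned RTNet; all names, shapes, and parameters are assumptions for illustration.

```python
import random

def fraug_augment(client_id, n, dim=4, rng=None):
    """Produce n synthetic client-specific embeddings of size dim.
    Step 1 samples client-agnostic vectors (the shared generator's role);
    step 2 applies a toy per-client scale-and-shift, standing in for the
    learned Representation Transformation Network (RTNet)."""
    rng = rng or random.Random(client_id)
    scale = [1.0 + 0.1 * client_id] * dim    # illustrative client-specific params
    shift = [0.01 * client_id] * dim
    out = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]   # client-agnostic sample
        out.append([s * zi + b for zi, s, b in zip(z, scale, shift)])
    return out
```

In the actual method both the generator and the RTNets are trained networks; the point of the sketch is only the division of labor between the shared and the client-local stage.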

    AutoTrans: A Complete Planning and Control Framework for Autonomous UAV Payload Transportation

    The robotics community is increasingly interested in autonomous aerial transportation. Unmanned aerial vehicles with suspended payloads have advantages over other systems, including mechanical simplicity and agility, but pose great challenges in planning and control. To realize fully autonomous aerial transportation, this paper presents a systematic solution to these difficulties. First, we present a real-time planning method that generates smooth trajectories considering the time-varying shape and non-linear dynamics of the system, ensuring whole-body safety and dynamic feasibility. Additionally, an adaptive NMPC with a hierarchical disturbance compensation strategy is designed to overcome unknown external perturbations and inaccurate model parameters. Extensive experiments show that our method is capable of generating high-quality trajectories online, even in highly constrained environments, and of tracking aggressive flight trajectories accurately, even under significant uncertainty. We plan to release our code to benefit the community.
    Comment: Accepted by IEEE Robotics and Automation Letters
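The disturbance-compensation idea can be illustrated in one dimension: the residual between measured and commanded acceleration, beyond what the current estimate explains, nudges a disturbance estimate that the controller can then feed forward. This is a generic toy observer, not the paper's hierarchical NMPC scheme; the function name and gain are assumptions.

```python
def adapt_disturbance(d_hat, accel_cmd, accel_meas, gain=0.1):
    """One step of a toy 1-D disturbance-observer update: move the estimate
    d_hat toward the unexplained acceleration residual. A controller would
    subtract d_hat from its model to compensate. Illustrative only."""
    return d_hat + gain * (accel_meas - accel_cmd - d_hat)
```

For a constant disturbance the estimate converges geometrically (factor 1 - gain per step), which is the basic mechanism any such adaptive compensation relies on.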

    Large-scale Interactive Recommendation with Tree-structured Policy Gradient

    Reinforcement learning (RL) has recently been introduced to interactive recommender systems (IRS) because of its nature of learning from dynamic interactions and planning for long-run performance. Since an IRS typically has thousands of items to recommend (i.e., thousands of actions), most existing RL-based methods fail to handle such a large discrete action space and thus become inefficient. Existing work that tries to handle the large discrete action space with the deep deterministic policy gradient framework suffers from the inconsistency between the continuous action representation (the output of the actor network) and the real discrete action. To avoid this inconsistency and achieve both high efficiency and recommendation effectiveness, we propose a Tree-structured Policy Gradient Recommendation (TPGR) framework, in which a balanced hierarchical clustering tree is built over the items and picking an item is formulated as seeking a path from the root to a leaf of the tree. Extensive experiments on carefully-designed environments based on two real-world datasets demonstrate that our model provides superior recommendation performance and significant efficiency improvements over state-of-the-art methods.
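The root-to-leaf formulation above can be sketched with nested lists: build a balanced hierarchy over the items, then let a per-node policy choose a child at each level. The `policy` callable below is a stand-in for TPGR's per-node policy networks, and the grouping rule is an illustrative simplification of its balanced hierarchical clustering.

```python
def build_tree(items, branching=2):
    """Group items into a balanced nested-list hierarchy (leaves = items)."""
    nodes = list(items)
    while len(nodes) > 1:
        nodes = [nodes[i:i + branching] for i in range(0, len(nodes), branching)]
    return nodes[0]

def pick_item(tree, policy):
    """Walk from the root to a leaf; `policy` maps a node's children to the
    index of the chosen child (standing in for TPGR's per-node policy
    networks). Items must not themselves be lists. Selecting among N items
    costs O(branching * log N) policy calls instead of scoring all N."""
    node = tree
    while isinstance(node, list):
        node = node[policy(node)]
    return node
```

This logarithmic path length is exactly why the tree formulation sidesteps the large-discrete-action-space problem that flat policies face.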

    CL-CrossVQA: A Continual Learning Benchmark for Cross-Domain Visual Question Answering

    Visual Question Answering (VQA) is a multi-discipline research task. Producing the right answer requires an understanding of the visual content of images and the natural language questions, as well as commonsense reasoning over the information contained in the image and world knowledge. Recently, large-scale Vision-and-Language Pre-trained Models (VLPMs) have become the mainstream approach to VQA tasks due to their superior performance. The standard practice is to fine-tune large-scale VLPMs, pre-trained on huge general-domain datasets, on domain-specific VQA datasets. In reality, however, the application domain can change over time, requiring VLPMs to continually learn and adapt to new domains without forgetting previously acquired knowledge. Most existing continual learning (CL) research concentrates on unimodal tasks, whereas a more practical application scenario, i.e., CL on cross-domain VQA, has not been studied. Motivated by this, we introduce CL-CrossVQA, a rigorous Continual Learning benchmark for Cross-domain Visual Question Answering, with which we conduct extensive experiments on 4 VLPMs, 4 CL approaches, and 5 VQA datasets from different domains. In addition, by probing the forgetting phenomenon in the intermediate layers, we provide insights into how model architecture affects CL performance, why CL approaches can help mitigate forgetting in VLPMs to some extent, and how to design CL approaches suitable for VLPMs in this challenging continual learning setting. To facilitate future work on CL for cross-domain VQA, we will release our datasets and code.
    Comment: 10 pages, 6 figures
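The forgetting the benchmark probes is commonly quantified as the gap between a task's best accuracy at any earlier point in the domain sequence and its final accuracy. The sketch below implements this standard average-forgetting metric; it is not necessarily the exact metric used in the paper.

```python
def average_forgetting(acc):
    """acc[t][i]: accuracy on domain i measured after training on domain t.
    Forgetting of domain i is its best earlier accuracy minus its final
    accuracy, averaged over all domains except the last. A standard CL
    metric, used here for illustration."""
    T = len(acc)
    if T < 2:
        return 0.0
    gaps = [max(acc[t][i] for t in range(i, T - 1)) - acc[T - 1][i]
            for i in range(T - 1)]
    return sum(gaps) / len(gaps)
```

Computing the same quantity from intermediate-layer probes, rather than final task accuracy, is how the abstract's layer-wise forgetting analysis can be read.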

    Effects of food restriction on growth, body composition and gene expression related to the regulation of lipid metabolism and food intake in grass carp

    It is well known that most fish preferentially draw on body lipid stores for energy expenditure during long-term food restriction, but the underlying mechanism is still unclear. In the present study, a growth experiment was carried out to investigate the effects of food restriction on growth performance and on the expression of genes involved in the regulation of lipid metabolism and food intake in grass carp (Ctenopharyngodon idellus). Four rations were adopted: satiation (S), 80% S, 60% S and 40% S. Each treatment was randomly assigned to triplicate net cages of 15 fish (177.3 ± 3.3 g) per cage, and the experiment lasted 49 days at 30.0 ± 3.0 °C. Feeding rate and weight gain increased significantly with ration level, while body lipid and energy content decreased significantly under food restriction. The transcriptional levels of the genes involved in lipogenesis (srebp-1c, fas, ppar gamma) were down-regulated under restricted rations: the relative expression of hepatic fas (fatty acid synthetase) and srebp-1c (sterol regulatory element-binding protein 1c) in fish fed to satiation was significantly higher than in the restricted-fed groups, and the expression of hepatic ppar gamma (peroxisome proliferator-activated receptor gamma) in fish at the satiation and 80% S rations was significantly higher than in the 40% S group. In contrast, the expression of hepatic cpt-1a (carnitine palmitoyl transferase I), involved in fatty acid beta-oxidation, was significantly up-regulated under food restriction, while the other hepatic lipolysis genes, ppar alpha (peroxisome proliferator-activated receptor alpha) and hl (hepatic lipase), did not change significantly in restricted-fed fish.
The transcriptional levels of hepatic leptin and hypothalamic pomc (proopiomelanocortin) were significantly down-regulated in fish fed restricted rations, whereas hypothalamic npy (neuropeptide Y) and lepr (leptin receptor) were unchanged. These results indicate that long-term food restriction reduces lipid accumulation, likely by down-regulating lipogenic genes and up-regulating lipolytic genes, and that it also stimulates the appetite of grass carp by down-regulating some anorexigenic genes. Statement of relevance: A period of food restriction could help cultured grass carp maintain suitable lipid storage and avoid excessive lipid accumulation.