59 research outputs found

    Consistency-Guided Temperature Scaling Using Style and Content Information for Out-of-Domain Calibration

    Research interest in the robustness of deep neural networks against domain shifts has been increasing rapidly in recent years. Most existing works, however, focus on improving the accuracy of the model, not the calibration performance, which is another important requirement for trustworthy AI systems. Temperature scaling (TS), an accuracy-preserving post-hoc calibration method, has been proven effective in in-domain settings, but not in out-of-domain (OOD) settings, due to the difficulty of obtaining a validation set for the unseen domain beforehand. In this paper, we propose consistency-guided temperature scaling (CTS), a new temperature scaling strategy that can significantly enhance OOD calibration performance by providing mutual supervision among data samples in the source domains. Motivated by our observation that over-confidence stemming from inconsistent sample predictions is the main obstacle to OOD calibration, we propose to guide the scaling process by taking consistencies into account in terms of two different aspects, style and content, which are the key components that represent data samples well in multi-domain settings. Experimental results demonstrate that our proposed strategy outperforms existing works, achieving superior OOD calibration performance on various datasets. This can be accomplished by employing only the source domains without compromising accuracy, making our scheme directly applicable to various trustworthy AI systems. Comment: Accepted at AAAI-24 (The 38th AAAI Conference on Artificial Intelligence, February 2024)
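    The abstract above builds on standard temperature scaling. As a rough illustration of that baseline only (not the CTS method, whose consistency terms the abstract does not specify), the sketch below divides logits by a single scalar T chosen by grid search on validation negative log-likelihood. The function names, the toy logits, and the grid are all assumptions made for this example.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax of logits divided by a positive temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def nll(logit_rows, labels, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    losses = [-math.log(softmax(row, temperature)[y])
              for row, y in zip(logit_rows, labels)]
    return sum(losses) / len(losses)

def fit_temperature(logit_rows, labels, candidates):
    """Grid search: return the candidate temperature with the lowest NLL."""
    return min(candidates, key=lambda t: nll(logit_rows, labels, t))

# Hypothetical validation logits from an over-confident classifier.
# The third sample is confidently wrong (predicted class 0, true class 2).
val_logits = [[4.0, 0.0, 0.0], [0.0, 5.0, 0.0], [3.0, 0.0, 0.5], [0.0, 0.2, 4.0]]
val_labels = [0, 1, 2, 2]

grid = [0.5 + 0.1 * i for i in range(46)]  # candidate temperatures 0.5 .. 5.0
T = fit_temperature(val_logits, val_labels, grid)

for row in val_logits:
    before, after = softmax(row), softmax(row, T)
    # The predicted class is unchanged; only the confidence is softened.
    print(before.index(max(before)), round(max(before), 3), round(max(after), 3))
```

    Because every logit is divided by the same positive scalar, the arg-max class never changes, which is why the method is described as accuracy-preserving.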

    Hydra: Multi-head Low-rank Adaptation for Parameter Efficient Fine-tuning

    The recent surge in large-scale foundation models has spurred the development of efficient methods for adapting these models to various downstream tasks. Low-rank adaptation methods, such as LoRA, have gained significant attention due to their outstanding parameter efficiency and lack of additional inference latency. This paper investigates a more general form of adapter module based on the analysis that parallel and sequential adaptation branches learn novel and general features during fine-tuning, respectively. The proposed method, named Hydra for its multi-head computational branches, combines parallel and sequential branches to integrate their capabilities, which is more expressive than existing single-branch methods and enables the exploration of a broader range of optimal points in the fine-tuning process. In addition, the proposed adaptation method explicitly leverages the pre-trained weights by performing a linear combination of the pre-trained features. This allows the learned features to generalize better across diverse downstream tasks. Furthermore, we perform a comprehensive analysis of the characteristics of each adaptation branch with empirical evidence. Through an extensive range of experiments, encompassing comparisons and ablation studies, we substantiate the efficiency and demonstrate the superior performance of Hydra. This comprehensive evaluation underscores the potential impact and effectiveness of Hydra in a variety of applications. Our code is available at https://github.com/extremebird/Hydra
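    To make the parallel and sequential branches concrete, here is a minimal sketch under one reading of the abstract: a parallel low-rank branch acts on the layer input, while a sequential low-rank branch acts on the frozen pre-trained output (a linear combination of pre-trained features). The exact formulation, the function names, and the toy matrices are assumptions for illustration and are not taken from the Hydra code base.

```python
def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def low_rank(B, A, v):
    """Apply a rank-r update B @ (A @ v); B is d_out x r, A is r x d."""
    return matvec(B, matvec(A, v))

def adapted_forward(W0, x, par=None, seq=None):
    """Frozen layer W0 plus optional low-rank branches (assumed form).

    par = (B, A) with A of shape r x d_in: parallel branch on the input x.
    seq = (B, A) with A of shape r x d_out: sequential branch on W0 @ x.
    """
    h0 = matvec(W0, x)  # pre-trained features, never modified
    out = list(h0)
    if par is not None:
        out = [o + d for o, d in zip(out, low_rank(par[0], par[1], x))]
    if seq is not None:
        out = [o + d for o, d in zip(out, low_rank(seq[0], seq[1], h0))]
    return out

def low_rank_param_count(d_in, d_out, r):
    """Trainable parameters of one rank-r branch on a d_out x d_in layer."""
    return r * (d_in + d_out)

W0 = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]  # toy frozen weights, d_out=3, d_in=2
x = [1.0, 2.0]

# With B initialised to zero, as is common for low-rank adapters,
# the adapted layer starts out identical to the pre-trained one.
zero_par = ([[0.0], [0.0], [0.0]], [[0.5, -0.5]])
zero_seq = ([[0.0], [0.0], [0.0]], [[0.1, 0.2, 0.3]])
print(adapted_forward(W0, x, zero_par, zero_seq))
print(low_rank_param_count(1024, 1024, 8), "vs", 1024 * 1024)
```

    The parameter count shows where the efficiency comes from: a rank-8 branch on a 1024 x 1024 layer trains 16,384 values instead of 1,048,576.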

    EvoFed: Leveraging Evolutionary Strategies for Communication-Efficient Federated Learning

    Federated Learning (FL) is a decentralized machine learning paradigm that enables collaborative model training across dispersed nodes without forcing individual nodes to share data. However, its broad adoption is hindered by the high communication costs of transmitting a large number of model parameters. This paper presents EvoFed, a novel approach that integrates Evolutionary Strategies (ES) with FL to address these challenges. EvoFed employs a concept of 'fitness-based information sharing', deviating significantly from conventional model-based FL. Rather than exchanging the actual updated model parameters, each node transmits a distance-based similarity measure between the locally updated model and each member of the noise-perturbed model population. Each node, as well as the server, generates an identical population set of perturbed models in a completely synchronized fashion using the same random seeds. With properly chosen noise variance and population size, perturbed models can be combined to closely reflect the actual model updated using the local dataset, allowing the transmitted similarity measures (or fitness values) to carry nearly the complete information about the model parameters. As the population size is typically much smaller than the number of model parameters, the savings in communication load are large. The server aggregates these fitness values and is able to update the global model. This global fitness vector is then disseminated back to the nodes, each of which applies the same update to be synchronized to the global model. Our analysis shows that EvoFed converges, and our experimental results validate that, at the cost of increased local processing loads, EvoFed achieves performance comparable to FedAvg while drastically reducing overall communication requirements in various practical settings.
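    The fitness-based exchange described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses mirrored (antithetic) perturbation pairs, a negative squared distance as the similarity measure, plain averaging at the server, and toy vectors in place of real models. All names and constants are hypothetical.

```python
import random

def make_population(seed, dim, n_pairs):
    """Noise directions that every party regenerates from the shared seed."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_pairs)]

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def client_fitness(theta, theta_local, noise, sigma):
    """One scalar per pair: f(theta + sigma*eps) - f(theta - sigma*eps),
    where f(m) = -||theta_local - m||^2 measures similarity to the local model."""
    out = []
    for eps in noise:
        plus = [t + sigma * e for t, e in zip(theta, eps)]
        minus = [t - sigma * e for t, e in zip(theta, eps)]
        out.append(sq_dist(theta_local, minus) - sq_dist(theta_local, plus))
    return out

def apply_update(theta, fitness, noise, sigma, lr):
    """ES-style step: move along each noise direction in proportion to its fitness."""
    n = len(noise)
    new_theta = list(theta)
    for f, eps in zip(fitness, noise):
        for j, e in enumerate(eps):
            new_theta[j] += lr * f * e / (2.0 * sigma * n)
    return new_theta

dim, n_pairs, sigma, lr, seed = 50, 8, 0.1, 0.05, 0
theta = [0.0] * dim                      # current global model (toy vector)
local_models = [[0.1] * dim, [0.3] * dim]  # stand-ins for locally trained models

# Each client regenerates the same noise and uploads n_pairs scalars, not dim weights.
noise = make_population(seed, dim, n_pairs)
uploads = [client_fitness(theta, m, noise, sigma) for m in local_models]

# The server averages the fitness vectors and broadcasts the result;
# every party then applies the identical update.
global_fitness = [sum(col) / len(col) for col in zip(*uploads)]
new_theta = apply_update(theta, global_fitness, noise, sigma, lr)
print(len(global_fitness), "values sent per client instead of", dim)
```

    With mirrored pairs, each uploaded scalar equals 4 * sigma times the projection of the local displacement onto a noise direction, so the reconstructed step always has a non-negative component toward the average local model. How close it gets depends on the population size and step size.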

    Julia Cloud Matrix Machine: Dynamic Matrix Language Acceleration on Multicore Clusters in the Cloud

    In emerging scientific computing environments, matrix computations of increasing size and complexity are becoming prevalent. However, contemporary matrix language implementations provide insufficient support for efficient utilization of cloud computing resources, particularly on the user side. We thus developed an extension of the Julia high-performance computation language such that matrix computations are automatically parallelized in the cloud, shielding users from direct interaction with complex, explicitly parallel computations. We implement lazy evaluation semantics combined with directed graphs to optimize matrix operations on the fly, while dynamic simulation finds the optimal tile size and schedule for a given cluster of cloud nodes. A time model prediction of the cluster's performance capacity is constructed to enable simulations. Automatic configuration of communication and worker processes on the cloud networks allows the framework to scale up automatically for clusters of heterogeneous nodes. Our framework's experimental evaluation comprises eleven benchmarks on a fourteen-node (564 CPUs) cluster in the AWS public cloud, revealing speedups of up to a factor of 5.1, with an average of 74.39% of the upper bound for speedups.

    MobiCon: A mobile context-monitoring platform

    User context is defined by data generated through everyday physical activity in sensor-rich, resource-limited mobile environments.

    PowerForecaster: Predicting Smartphone Power Impact of Continuous Sensing Applications at Pre-installation Time

    Today's smartphone application (hereinafter 'app') markets miss a key piece of information: the power consumption of apps. This causes a severe problem for continuous sensing apps, as they consume significant power without users' awareness. Users have no choice but to repeatedly install one app after another and experience their power use. To break such an exhaustive cycle, we propose PowerForecaster, a system that provides users with the power use of sensing apps at pre-installation time. Such advance power estimation is extremely challenging since the power cost of a sensing app varies widely with users' physical activities and phone use patterns. We observe that the time for active sensing and processing of an app can vary up to three times across 27 people's sensor traces collected over three weeks. PowerForecaster adopts a novel power emulator that emulates the power use of a sensing app while reproducing users' physical activities and phone use patterns, achieving accurate, personalized power estimation. Our experiments with three commercial apps and two research prototypes show that PowerForecaster achieves 93.4% accuracy under 20 use cases. We also optimize the system to accelerate emulation speed and reduce overheads, and show the effectiveness of such optimization techniques.

    Fast Pareto Front Exploration for Design of Reconfigurable Energy Storage

    No full text

    Power Generation and Microbial Community Shift According to Applied Anodic Potential in Electroactive Biofilm Reactors Treating Synthetic and Domestic Wastewater

    No full text
    This study investigated the effect of initially set anodic potentials (−0.3, −0.2, −0.1 and +0.1 V) on voltage production and microbial community in electroactive biofilm reactors (EBRs) treating synthetic and domestic wastewater (WW). In phase 1, EBRs were acclimated with different anodic potentials for synthetic and domestic WW. EBR (SE4) poised with +0.1 V showed the highest maximum power density (420 mW/m²) for synthetic WW, while EBR (DE3) poised with −0.1 V showed the highest maximum power density (235 mW/m²) for domestic WW. In phase 2, the EBRs were operated with a fixed external resistance (100 Ω for synthetic WW and 500 Ω for domestic WW) after the applied potentials were stopped. The EBRs showed slightly different voltage productions depending on the WW type and the initial anodic potential, but both EBRs applied with +0.1 V for synthetic (SE4) and domestic (DE4) WW showed the highest voltage production. Principal component analysis results based on denaturing gel gradient electrophoresis band profiles showed that the microbial community was completely different depending on the WW type. Nevertheless, it was found that the microbial community of EBRs applied with a negative potential (−0.3, −0.2, and −0.1 V) seemed to shift to those of EBRs applied with a positive potential (+0.1 V) regardless of WW type. Therefore, positive anodic potential is an important operating factor in electroactive biofilm development and voltage generation for rapid start-up.
