8 research outputs found

    Sensi-BERT: Towards Sensitivity Driven Fine-Tuning for Parameter-Efficient BERT

    Large pre-trained language models have recently gained significant traction due to their improved performance on various downstream tasks like text classification and question answering, requiring only a few epochs of fine-tuning. However, their large model sizes often prohibit their application on resource-constrained edge devices. Existing solutions for yielding parameter-efficient BERT models largely rely on compute-exhaustive training and fine-tuning. Moreover, they often rely on additional compute-heavy models to mitigate the performance gap. In this paper, we present Sensi-BERT, a sensitivity-driven efficient fine-tuning of BERT models that can take an off-the-shelf pre-trained BERT model and yield highly parameter-efficient models for downstream tasks. In particular, we perform sensitivity analysis to rank each individual parameter tensor; the ranking is then used to trim the tensors accordingly during fine-tuning for a given parameter or FLOPs budget. Our experiments show the efficacy of Sensi-BERT across different downstream tasks including MNLI, QQP, QNLI, and SST-2, demonstrating better performance at a similar or smaller parameter budget compared to various existing alternatives.
    Comment: 6 pages, 4 figures, 2 tables

    SuNeRF: Validation of a 3D Global Reconstruction of the Solar Corona Using Simulated EUV Images

    Extreme Ultraviolet (EUV) light emitted by the Sun impacts satellite operations and communications and affects the habitability of planets. Currently, EUV-observing instruments are constrained to viewing the Sun from its equator (i.e., ecliptic), limiting our ability to forecast EUV emission for other viewpoints (e.g., solar poles), and to generalize our knowledge of the Sun-Earth system to other host stars. In this work, we adapt Neural Radiance Fields (NeRFs) to the physical properties of the Sun and demonstrate that non-ecliptic viewpoints could be reconstructed from observations limited to the solar ecliptic. To validate our approach, we train on simulations of solar EUV emission that provide a ground truth for all viewpoints. Our model accurately reconstructs the simulated 3D structure of the Sun, achieving a peak signal-to-noise ratio of 43.3 dB and a mean absolute relative error of 0.3% for non-ecliptic viewpoints. Our method provides a consistent 3D reconstruction of the Sun from a limited number of viewpoints, thus highlighting the potential to create a virtual instrument for satellite observations of the Sun. Its extension to real observations will provide the missing link to compare the Sun to other stars and to improve space-weather forecasting.
    Comment: Accepted at Machine Learning and the Physical Sciences workshop, NeurIPS 202

    A Hardware-Aware Framework for Accelerating Neural Architecture Search Across Modalities

    Recent advances in Neural Architecture Search (NAS) such as one-shot NAS offer the ability to extract specialized hardware-aware sub-network configurations from a task-specific super-network. While considerable effort has been devoted to improving the first stage, namely, the training of the super-network, the search for derivative high-performing sub-networks is still under-explored. Popular methods decouple the super-network training from the sub-network search and use performance predictors to reduce the computational burden of searching on different hardware platforms. We propose a flexible search framework that automatically and efficiently finds sub-networks that are optimized for different performance metrics and hardware configurations. Specifically, we show how evolutionary algorithms can be paired with lightly trained objective predictors in an iterative cycle to accelerate architecture search in a multi-objective setting for various modalities including machine translation and image classification.

    Sun Neural Radiance Fields (SuNeRFs): From Images to 4D Models of the Solar Atmosphere

    EUV-observing instruments are limited in number and have mainly been constrained to viewing the Sun from the ecliptic. For example, the Solar Dynamics Observatory (SDO; 2010-present) provides images of the Sun in EUV from the perspective of the Earth-Sun line. Two additional viewpoints are provided by the STEREO twin satellites pulling Ahead (STEREO-A; 2006-present) and falling Behind (STEREO-B; 2006-2014) of Earth's orbit. No satellites observe the solar poles directly. However, a complete image of the 3D Sun is required to fully understand the dynamics of the Sun (from eruptive events to space weather in the solar system), to forecast EUV radiation to protect our assets in space, to relate the Sun to other stars in the universe, and to generalize our knowledge of the Sun-Earth system to other host stars. To maximize the science return of multiple viewpoints, we propose a novel approach that unifies and smoothly integrates data from multiple perspectives into a consistent 3D representation of the solar corona. More specifically, we leverage Neural Radiance Fields (NeRFs), which are neural networks that achieve state-of-the-art 3D scene representation and generate novel views from a limited number of input images. We adapted a Sun NeRF (SuNeRF) to generate a physically consistent representation of the 3D Sun, with the inclusion of radiative transfer and geometric ray sampling that matches the physical reality of optically thin plasma in the solar atmosphere. SuNeRFs leverage existing multi-viewpoint observations and act as virtual instruments that can fly out of the ecliptic, view the poles, and be placed anywhere in the solar system to generate novel views. Our pipeline is an example of how novel deep learning techniques can be used to significantly enhance observational capabilities through the creation of virtual instruments.