Sensi-BERT: Towards Sensitivity Driven Fine-Tuning for Parameter-Efficient BERT
Large pre-trained language models have recently gained significant traction
due to their improved performance on various downstream tasks such as text
classification and question answering, requiring only a few epochs of
fine-tuning. However, their large model sizes often prohibit their deployment
on resource-constrained edge devices. Existing approaches to producing
parameter-efficient BERT models largely rely on compute-intensive training and
fine-tuning, and often require additional compute-heavy models to mitigate the
performance gap. In this paper, we present Sensi-BERT, a sensitivity-driven,
efficient fine-tuning scheme for BERT models that can take an off-the-shelf
pre-trained BERT model and yield highly parameter-efficient models for
downstream tasks. In particular, we perform a sensitivity analysis to rank each
individual parameter tensor; this ranking is then used to trim the tensors
during fine-tuning for a given parameter or FLOPs budget. Our experiments show
the efficacy of Sensi-BERT across different downstream tasks including MNLI,
QQP, QNLI, and SST-2, demonstrating better performance at a similar or smaller
parameter budget than various existing alternatives.
Comment: 6 pages, 4 figures, 2 tables
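The trim-by-sensitivity idea in the abstract can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's actual procedure: the sensitivity metric here is an assumed first-order proxy (mean |w · ∂L/∂w| per tensor), and `select_tensors` simply keeps the highest-ranked tensors within a parameter budget.

```python
import numpy as np

def tensor_sensitivity(weights: np.ndarray, grads: np.ndarray) -> float:
    """Assumed first-order sensitivity proxy: mean |w * dL/dw| over the tensor."""
    return float(np.mean(np.abs(weights * grads)))

def select_tensors(tensors: dict, budget: int) -> list:
    """tensors: name -> (weights, grads) arrays.
    Rank tensors by sensitivity and greedily keep the highest-ranked ones
    until the parameter budget is exhausted; the rest would be trimmed."""
    ranked = sorted(tensors.items(),
                    key=lambda kv: tensor_sensitivity(*kv[1]), reverse=True)
    kept, used = [], 0
    for name, (w, _) in ranked:
        if used + w.size <= budget:
            kept.append(name)
            used += w.size
    return kept
```

In a real setting the gradients would come from a few fine-tuning steps on the downstream task, and the trimmed tensors would be shrunk or frozen rather than merely skipped.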
SuNeRF: Validation of a 3D Global Reconstruction of the Solar Corona Using Simulated EUV Images
Extreme Ultraviolet (EUV) light emitted by the Sun impacts satellite
operations and communications and affects the habitability of planets.
Currently, EUV-observing instruments are constrained to viewing the Sun from
its equator (i.e., the ecliptic), limiting our ability to forecast EUV emission for
other viewpoints (e.g., the solar poles), and to generalize our knowledge of the
Sun-Earth system to other host stars. In this work, we adapt Neural Radiance
Fields (NeRFs) to the physical properties of the Sun and demonstrate that
non-ecliptic viewpoints could be reconstructed from observations limited to the
solar ecliptic. To validate our approach, we train on simulations of solar EUV
emission that provide a ground truth for all viewpoints. Our model accurately
reconstructs the simulated 3D structure of the Sun, achieving a peak
signal-to-noise ratio of 43.3 dB and a mean absolute relative error of 0.3%
for non-ecliptic viewpoints. Our method provides a consistent 3D reconstruction
of the Sun from a limited number of viewpoints, thus highlighting the potential
to create a virtual instrument for satellite observations of the Sun. Its
extension to real observations will provide the missing link to compare the Sun
to other stars and to improve space-weather forecasting.
Comment: Accepted at Machine Learning and the Physical Sciences workshop, NeurIPS 202
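The two validation metrics reported above, peak signal-to-noise ratio and mean absolute relative error, follow standard definitions (this is not code from the paper, just the conventional formulas):

```python
import numpy as np

def psnr(pred: np.ndarray, truth: np.ndarray, data_range: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB: 20*log10(MAX) - 10*log10(MSE)."""
    mse = np.mean((pred - truth) ** 2)
    return float(20 * np.log10(data_range) - 10 * np.log10(mse))

def mean_abs_rel_error(pred: np.ndarray, truth: np.ndarray,
                       eps: float = 1e-8) -> float:
    """Mean absolute error of the prediction relative to the ground truth."""
    return float(np.mean(np.abs(pred - truth) / (np.abs(truth) + eps)))
```

Higher PSNR and lower relative error both indicate a closer match between rendered and ground-truth EUV images.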
A Hardware-Aware Framework for Accelerating Neural Architecture Search Across Modalities
Recent advances in Neural Architecture Search (NAS) such as one-shot NAS
offer the ability to extract specialized hardware-aware sub-network
configurations from a task-specific super-network. While considerable effort
has been devoted to improving the first stage, namely the training of the
super-network, the search for derivative high-performing sub-networks is
still under-explored. Popular methods decouple the super-network training from
the sub-network search and use performance predictors to reduce the
computational burden of searching on different hardware platforms. We propose a
flexible search framework that automatically and efficiently finds optimal
sub-networks that are optimized for different performance metrics and hardware
configurations. Specifically, we show how evolutionary algorithms can be paired
with lightly trained objective predictors in an iterative cycle to accelerate
architecture search in a multi-objective setting for various modalities
including machine translation and image classification.
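The iterative cycle described above, an evolutionary search pre-screened by a cheap objective predictor, can be sketched as follows. This is a hypothetical illustration, not the framework's actual API: `evaluate`, `mutate`, and `predictor_fit` are stand-in callables for the true (expensive) objective, the architecture mutation operator, and the lightly trained surrogate.

```python
import random

def evolve_with_predictor(evaluate, mutate, init_pop, predictor_fit,
                          iterations=3, survivors=4, offspring=16):
    """Alternate between (1) fitting a cheap predictor on all configs
    measured so far and (2) using it to pre-screen mutated offspring,
    so only the most promising candidates are truly evaluated."""
    measured = {tuple(c): evaluate(c) for c in init_pop}
    for _ in range(iterations):
        predict = predictor_fit(list(measured.items()))  # surrogate objective
        parents = sorted(measured, key=measured.get, reverse=True)[:survivors]
        children = [mutate(list(random.choice(parents)))
                    for _ in range(offspring)]
        # pre-screen with the predictor; evaluate only the predicted-best few
        children.sort(key=predict, reverse=True)
        for c in children[:survivors]:
            measured.setdefault(tuple(c), evaluate(c))
    return max(measured, key=measured.get)
```

In a multi-objective setting the scalar fitness would be replaced by Pareto ranking over, e.g., accuracy and hardware latency, but the predictor-screened loop is the same.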
Sun Neural Radiance Fields (SuNeRFs): From Images to 4D Models of the Solar Atmosphere
EUV-observing instruments are limited in their numbers and have mainly been constrained to viewing the Sun from the ecliptic. For example, the Solar Dynamics Observatory (SDO; 2010-present) provides images of the Sun in EUV from the perspective of the Earth-Sun line. Two additional viewpoints are provided by the STEREO twin satellites pulling Ahead (STEREO-A; 2006-present) and falling Behind (STEREO-B; 2006-2014) of Earth's orbit. No satellites observe the solar poles directly. However, a complete image of the 3D Sun is required to fully understand the dynamics of the Sun (from eruptive events to space weather in the solar system), to forecast EUV radiation to protect our assets in space, to relate the Sun to other stars in the universe, and to generalize our knowledge of the Sun-Earth system to other host stars. To maximize the science return of multiple viewpoints, we propose a novel approach that unifies and smoothly integrates data from multiple perspectives into a consistent 3D representation of the solar corona. More specifically, we leverage Neural Radiance Fields (NeRFs), neural networks that achieve state-of-the-art 3D scene representation and generate novel views from a limited number of input images. We adapt a Sun NeRF (SuNeRF) to generate a physically consistent representation of the 3D Sun, with the inclusion of radiative transfer and geometric ray sampling that matches the physical reality of optically thin plasma in the solar atmosphere. SuNeRFs leverage existing multi-viewpoint observations and act as virtual instruments that can fly out of the ecliptic, view the poles, and be placed anywhere in the solar system to generate novel views. Our pipeline is an example of how novel deep learning techniques can significantly enhance observational capabilities through the creation of virtual instruments.
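For optically thin plasma, the radiative-transfer step mentioned above reduces, in its simplest form, to integrating emissivity along each camera ray with no absorption term. A simplified sketch of that line integral (in the actual SuNeRF the emissivity is queried from a learned neural field; the `emissivity` callable here is a hypothetical stand-in):

```python
import numpy as np

def render_ray(emissivity, origin, direction, near, far, n_samples=64):
    """Emission-only rendering for optically thin plasma: a pixel's
    intensity is the emissivity integrated along the ray (no absorption)."""
    t = np.linspace(near, far, n_samples)                 # sample distances
    pts = origin[None, :] + t[:, None] * direction[None, :]  # (n_samples, 3)
    dt = (far - near) / (n_samples - 1)
    return float(np.sum(emissivity(pts)) * dt)            # Riemann line integral
```

Rendering an image then amounts to casting one such ray per pixel from the virtual instrument's position, which is what lets the trained model synthesize views from outside the ecliptic.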