LSPT: Long-term Spatial Prompt Tuning for Visual Representation Learning
Visual Prompt Tuning (VPT) techniques have gained prominence for their
capacity to adapt pre-trained Vision Transformers (ViTs) to downstream visual
tasks using specialized learnable tokens termed prompts. Contemporary VPT
methodologies, especially when employed with self-supervised vision
transformers, often default to the introduction of new learnable prompts or
gated prompt tokens predominantly sourced from the model's previous block. A
pivotal oversight in such approaches is their failure to harness the potential
of long-range previous blocks as sources of prompts within each self-supervised
ViT. To bridge this crucial gap, we introduce Long-term Spatial Prompt Tuning
(LSPT) - a revolutionary approach to visual representation learning. Drawing
inspiration from the intricacies of the human brain, LSPT ingeniously
incorporates long-term gated prompts. This feature serves as temporal coding,
curbing the risk of forgetting parameters acquired from earlier blocks. Further
enhancing its prowess, LSPT brings into play patch tokens, serving as spatial
coding. This is strategically designed to perpetually amass class-conscious
features, thereby fortifying the model's prowess in distinguishing and
identifying visual categories. To validate the efficacy of our proposed method,
we engaged in rigorous experimentation across 5 FGVC and 19 VTAB-1K benchmarks.
Our empirical findings underscore the superiority of LSPT, showcasing its
ability to set new benchmarks in visual prompt tuning performance.
Adaptive Policy Learning for Offline-to-Online Reinforcement Learning
Conventional reinforcement learning (RL) needs an environment to collect
fresh data, which is impractical when online interactions are costly. Offline
RL provides an alternative solution by directly learning from the previously
collected dataset. However, it will yield unsatisfactory performance if the
quality of the offline datasets is poor. In this paper, we consider an
offline-to-online setting where the agent is first trained on the offline
dataset and then trained online, and propose a framework called Adaptive Policy
Learning for effectively taking advantage of offline and online data.
Specifically, we explicitly consider the difference between the online and
offline data and apply an adaptive update scheme accordingly, that is, a
pessimistic update strategy for the offline dataset and an optimistic/greedy
update scheme for the online dataset. Such a simple and effective method
provides a way to mix the offline and online RL and achieve the best of both
worlds. We further provide two detailed algorithms for implementing the
framework through embedding value or policy-based RL algorithms into it.
Finally, we conduct extensive experiments on popular continuous control tasks,
and results show that our algorithm can learn the expert policy with high
sample efficiency even when the quality of the offline dataset is poor, e.g.,
a random dataset.
Comment: AAAI202
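The adaptive update scheme described in this abstract can be illustrated with a minimal sketch. The abstract does not say how pessimism is implemented, so the fixed `penalty` subtracted from the bootstrapped value is an assumption made purely for illustration, as are the function name `adaptive_td_target` and its parameters. This is not the authors' algorithm.

```python
from typing import Sequence


def adaptive_td_target(
    reward: float,
    next_q_values: Sequence[float],
    from_offline: bool,
    gamma: float = 0.99,
    penalty: float = 1.0,
) -> float:
    """Return a one-step TD target that depends on where the sample came from.

    Offline samples get a pessimistic target; online samples get a greedy one.
    The fixed `penalty` is a hypothetical stand-in for whatever pessimism
    mechanism a real method would use.
    """
    greedy_value = max(next_q_values)
    if from_offline:
        # Pessimistic update (assumed form): lower the bootstrapped value.
        return reward + gamma * (greedy_value - penalty)
    # Optimistic/greedy update: standard max-based bootstrap.
    return reward + gamma * greedy_value


# The same transition, treated as offline and then as online data.
offline_target = adaptive_td_target(1.0, [0.5, 2.0], from_offline=True, gamma=0.5)
online_target = adaptive_td_target(1.0, [0.5, 2.0], from_offline=False, gamma=0.5)
```

With identical inputs, the offline target is never larger than the online one, which is the asymmetry the abstract describes.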
LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression
In long context scenarios, large language models (LLMs) face three main
challenges: higher computational/financial cost, longer latency, and inferior
performance. Some studies reveal that the performance of LLMs depends on both
the density and the position of the key information (question relevant) in the
input prompt. Inspired by these findings, we propose LongLLMLingua for prompt
compression towards improving LLMs' perception of the key information to
simultaneously address the three challenges. We conduct evaluation on a wide
range of long context scenarios including single-/multi-document QA, few-shot
learning, summarization, synthetic tasks, and code completion. The experimental
results show that prompts compressed by LongLLMLingua achieve higher performance
at much lower cost. The latency of the end-to-end system is also reduced. For
example, on the NaturalQuestions benchmark, LongLLMLingua gains a performance boost
of up to 17.1% over the original prompt with ~4x fewer tokens as input to
GPT-3.5-Turbo. It yields cost savings of $28.5 and $27.4 per 1,000
samples on the LongBench and ZeroScrolls benchmarks, respectively.
Additionally, when compressing prompts of ~10k tokens at a compression rate of
2x-10x, LongLLMLingua can speed up the end-to-end latency by 1.4x-3.8x. Our
code is available at https://aka.ms/LLMLingua
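As a rough worked example of the compression figures quoted in this abstract, the sketch below computes the prompt length after compression. It assumes that a compression rate of Nx means the original token count divided by N; the helper name `compressed_length` is hypothetical and is not part of the LLMLingua code base.

```python
def compressed_length(original_tokens: int, compression_rate: float) -> int:
    """Approximate token count after compressing at the given rate.

    Assumes rate = original length / compressed length.
    """
    if compression_rate <= 0:
        raise ValueError("compression_rate must be positive")
    return round(original_tokens / compression_rate)


# A ~10k-token prompt at the two ends of the 2x-10x range from the abstract.
lengths = {rate: compressed_length(10_000, rate) for rate in (2, 10)}
```

Under that assumption, a 10,000-token prompt shrinks to between 5,000 and 1,000 tokens across the stated range.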
Unified Medical Image Pre-training in Language-Guided Common Semantic Space
Vision-Language Pre-training (VLP) has shown its merits in analysing medical
images, by leveraging the semantic congruence between medical images and their
corresponding reports. It efficiently learns visual representations, which in
turn facilitates enhanced analysis and interpretation of intricate imaging
data. However, this observation has been justified predominantly on single-modality
data (mostly 2D images like X-rays); adapting VLP to learn unified
representations for medical images in real scenarios remains an open challenge.
This arises because medical images often encompass a variety of modalities,
especially modalities with different numbers of dimensions (e.g., 3D
images like Computed Tomography). To overcome these challenges, we
propose a Unified Medical Image Pre-training framework, namely UniMedI, which
utilizes diagnostic reports as a common semantic space to create unified
representations for diverse modalities of medical images (especially for 2D and
3D images). Under the text's guidance, we effectively uncover visual modality
information, identifying the affected areas in 2D X-rays and slices containing
lesions in sophisticated 3D CT scans, ultimately enhancing the consistency
across various medical imaging modalities. To demonstrate the effectiveness and
versatility of UniMedI, we evaluate its performance on both 2D and 3D images
across 10 different datasets, covering a wide range of medical image tasks such
as classification, segmentation, and retrieval. UniMedI has demonstrated
superior performance in downstream tasks, showcasing its effectiveness in
establishing a universal medical visual representation.
Protecting the Future: Neonatal Seizure Detection with Spatial-Temporal Modeling
Timely detection of seizures in newborn infants with electroencephalography
(EEG) is a common yet life-saving practice in the Neonatal Intensive Care
Unit (NICU). However, real-time monitoring requires great human effort,
which calls for automated solutions to neonatal seizure detection. Moreover,
the current automated methods focusing on adult epilepsy monitoring often fail
due to (i) dynamic seizure onset location in human brains; (ii) different
montages on neonates and (iii) huge distribution shift among different
subjects. In this paper, we propose a deep learning framework, namely STATENet,
to address these challenges with dedicated designs at the temporal,
spatial and model levels. Experiments on a real-world large-scale
neonatal EEG dataset show that our framework achieves significantly
better seizure detection performance.
Comment: Accepted in IEEE International Conference on Systems, Man, and
Cybernetics (SMC) 202
Insight-HXMT on-orbit thermal control status and thermal deformation impact analysis
Purpose: The Hard X-ray Modulation Telescope, dubbed Insight-HXMT, is China's
first X-ray astronomy satellite, launched on June 15th, 2017. Active and passive
thermal control measures are employed to keep devices at suitable temperatures.
In this paper, we analyzed the on-orbit thermal monitoring data of the first 5
years and investigated the effect of thermal deformation on the point spread
function (PSF) of the telescopes.
Methods: We examined the data of the on-orbit temperatures measured using 157
thermistors placed on the collimators, detectors and their support structures
and compared the results with the thermal control requirements. The thermal
deformation was evaluated by the relative orientation of the two star sensors
installed on the main support structure. Its effect was estimated from the
evolution of the PSF obtained with calibration scanning observations of the
Crab nebula.
Conclusion: The on-orbit temperatures have met the thermal control requirements
thus far, and the effect of thermal deformation on the PSF was negligible after
the on-orbit pointing calibration.
Comment: 25 pages, 35 figures, submitte
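The Methods paragraph of this abstract evaluates thermal deformation through the relative orientation of two star sensors. One simple way to quantify such a relative orientation is the angle between the two sensors' boresight directions, tracked over time. The sketch below shows that calculation only. The function name `relative_angle_deg` and the example vectors are hypothetical, and the abstract does not state that the authors compute it this way.

```python
import math
from typing import Sequence


def relative_angle_deg(a: Sequence[float], b: Sequence[float]) -> float:
    """Angle in degrees between two 3D direction vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cos_theta))


# Hypothetical boresight directions at a reference epoch (not mission data).
reference_angle = relative_angle_deg((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
# Identical directions give zero separation.
zero_angle = relative_angle_deg((0.0, 0.0, 2.0), (0.0, 0.0, 5.0))
```

A drift in this angle between epochs would indicate a change in the relative alignment of the two sensors, which is the kind of signal that could reflect deformation of the support structure.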
Overview to the Hard X-ray Modulation Telescope (Insight-HXMT) Satellite
As China's first X-ray astronomical satellite, the Hard X-ray Modulation
Telescope (HXMT), dubbed Insight-HXMT after its launch on June 15,
2017, is a wide-band (1-250 keV) slat-collimator-based X-ray astronomy
satellite with the capability of all-sky monitoring in 0.2-3 MeV. It was
designed to perform pointing, scanning and gamma-ray burst (GRB) observations
and, based on the Direct Demodulation Method (DDM), the image of the scanned
sky region can be reconstructed. Here we give an overview of the mission and
its progress, including payload, core sciences, ground calibration/facility,
ground segment, data archive, software, in-orbit performance, calibration,
background model, observations and some preliminary results.
Comment: 29 pages, 40 figures, 6 tables, to appear in Sci. China-Phys. Mech.
Astron. arXiv admin note: text overlap with arXiv:1910.0443
Insight-HXMT observations of Swift J0243.6+6124 during its 2017-2018 outburst
The recently discovered neutron star transient Swift J0243.6+6124 has been
monitored by the Hard X-ray Modulation Telescope (Insight-HXMT).
Based on the obtained data, we investigate the broadband spectrum of the source
throughout the outburst. We estimate the broadband flux of the source and
search for a possible cyclotron line in the broadband spectrum. No evidence of
line-like features is, however, found up to . In the absence of
any cyclotron line in its energy spectrum, we estimate the magnetic field of
the source based on the observed spin evolution of the neutron star by applying
two accretion torque models. In both cases, we get consistent results with
, and peak luminosity of which makes the source the first Galactic ultraluminous
X-ray source hosting a neutron star.
Comment: publishe