Solar Ultraviolet Bursts in a Coordinated Observation of IRIS, Hinode and SDO
Solar ultraviolet (UV) bursts are small-scale compact brightenings in
transition region images. The spectral profiles of transition region lines in
these bursts are significantly enhanced and broadened, often with chromospheric
absorption lines such as Ni~{\sc{ii}} 1335.203 and 1393.330 {\AA} superimposed.
We investigate the properties of several UV bursts using a coordinated
observation of the Interface Region Imaging Spectrograph (IRIS), Solar Dynamics
Observatory (SDO), and \textit{Hinode} on 2015 February 7. We have identified
12 UV bursts, and 11 of them reveal small blueshifts of the Ni~{\sc{ii}}
absorption lines. However, the Ni~{\sc{ii}} lines in one UV burst exhibit
obvious redshifts of 20 km s$^{-1}$, which appear to be related to the
cold plasma downflows observed in the IRIS slit-jaw images. We also examine the
three-dimensional magnetic field topology using a magnetohydrostatic model, and
find that some UV bursts are associated with magnetic null points or bald
patches. In addition, we find that these UV bursts reveal no obvious coronal
signatures from the observations of the Atmospheric Imaging Assembly (AIA) on
board SDO and the EUV Imaging Spectrometer (EIS) on board \textit{Hinode}.
Comment: will appear in Science China Technological Science
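The reported 20 km s$^{-1}$ redshift corresponds to a small but measurable wavelength displacement of the Ni II lines. A minimal sketch of the non-relativistic Doppler arithmetic (an illustration, not taken from the paper):

```python
# Doppler shift of the Ni II 1393.330 A line for a 20 km/s downflow.
# Illustrative only: it shows the wavelength displacement implied by
# the redshift quoted in the abstract.

C_KM_S = 299792.458  # speed of light in km/s

def doppler_shift(rest_wavelength_angstrom, velocity_km_s):
    """Non-relativistic Doppler shift: positive velocity -> redshift."""
    return rest_wavelength_angstrom * velocity_km_s / C_KM_S

shift = doppler_shift(1393.330, 20.0)
print(f"{shift:.4f} A")  # ~0.093 A redward displacement
```

A shift of roughly 0.09 Å is well within the spectral resolution of IRIS, which is why such small line-of-sight flows are detectable.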
Decoupling Pseudo Label Disambiguation and Representation Learning for Generalized Intent Discovery
Generalized intent discovery aims to extend a closed-set in-domain intent
classifier to an open-world intent set including in-domain and out-of-domain
intents. The key challenges lie in pseudo label disambiguation and
representation learning. Previous methods suffer from a coupling of pseudo
label disambiguation and representation learning, that is, the reliability of
pseudo labels relies on representation learning, and representation learning is
restricted by pseudo labels in turn. In this paper, we propose a decoupled
prototype learning framework (DPL) to decouple pseudo label disambiguation and
representation learning. Specifically, we first introduce prototypical
contrastive representation learning (PCL) to obtain discriminative
representations, and then adopt a prototype-based label disambiguation
method (PLD) to obtain pseudo labels. We theoretically prove that PCL and PLD
work in a collaborative fashion and facilitate pseudo label disambiguation.
Experiments and analysis on three benchmark datasets show the effectiveness of
our method.
Comment: Accepted at ACL 2023 main conference
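The interplay between prototypes and pseudo labels described above can be sketched as follows. The nearest-prototype assignment and the EMA prototype update are our assumptions of a typical prototype-learning formulation, not the paper's exact method:

```python
import numpy as np

# Hypothetical sketch in the spirit of PLD: each sample takes the label
# of its most similar class prototype in embedding space, and prototypes
# drift slowly toward their assigned samples (EMA). Function names and
# the momentum value are illustrative assumptions.

def assign_pseudo_labels(embeddings, prototypes):
    """Assign each embedding to its most cosine-similar prototype."""
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    pro = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    sims = emb @ pro.T          # cosine similarity matrix, shape (N, K)
    return sims.argmax(axis=1)  # one pseudo label per sample

def ema_update(prototypes, embeddings, labels, momentum=0.99):
    """Move each prototype slightly toward the mean of its members."""
    new = prototypes.copy()
    for k in range(len(prototypes)):
        members = embeddings[labels == k]
        if len(members):
            new[k] = momentum * prototypes[k] + (1 - momentum) * members.mean(axis=0)
    return new
```

Decoupling in this sketch means the label assignment depends only on the current prototypes, while representation learning (PCL) shapes the embeddings separately, rather than each directly constraining the other.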
Semi-Supervised Panoptic Narrative Grounding
Despite considerable progress, the advancement of Panoptic Narrative
Grounding (PNG) remains hindered by costly annotations. In this paper, we
introduce a novel Semi-Supervised Panoptic Narrative Grounding (SS-PNG)
learning scheme, capitalizing on a smaller set of labeled image-text pairs and
a larger set of unlabeled pairs to achieve competitive performance. Unlike
visual segmentation tasks, PNG involves one pixel belonging to multiple
open-ended nouns. As a result, existing multi-class based semi-supervised
segmentation frameworks cannot be directly applied to this task. To address
this challenge, we first develop a novel SS-PNG Network (SS-PNG-NW) tailored to
the SS-PNG setting. We thoroughly investigate strategies such as Burn-In and
data augmentation to determine the optimal generic configuration for the
SS-PNG-NW. Additionally, to tackle the issue of imbalanced pseudo-label
quality, we propose a Quality-Based Loss Adjustment (QLA) approach to adjust
the semi-supervised objective, resulting in an enhanced SS-PNG-NW+. Employing
our proposed QLA, we improve the BCE loss and Dice loss at the pixel and mask levels,
respectively. We conduct extensive experiments on PNG datasets, with our
SS-PNG-NW+ demonstrating promising results comparable to fully-supervised
models across all data ratios. Remarkably, our SS-PNG-NW+ outperforms
fully-supervised models with only 30% and 50% supervision data, exceeding their
performance by 0.8% and 1.1% respectively. This highlights the effectiveness of
our proposed SS-PNG-NW+ in overcoming the challenges posed by limited
annotations and enhancing the applicability of PNG tasks. The source code is
available at https://github.com/nini0919/SSPNG.
Comment: ACM MM 2023
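The general idea behind a quality-based loss adjustment can be sketched as a per-pixel weighting of the unsupervised loss term. Using prediction confidence as a stand-in quality score is our assumption; the paper's QLA operates on pseudo-label quality at both pixel and mask levels:

```python
import numpy as np

# Illustrative sketch of quality-weighted supervision: down-weight the
# BCE term wherever the pseudo label is unreliable. Here the quality
# score is derived from prediction confidence, a simplifying assumption.

def quality_weighted_bce(pred, pseudo, eps=1e-7):
    """Per-pixel binary cross-entropy scaled by a [0, 1] quality score."""
    pred = np.clip(pred, eps, 1 - eps)
    bce = -(pseudo * np.log(pred) + (1 - pseudo) * np.log(1 - pred))
    quality = np.abs(pred - 0.5) * 2.0   # 0 (uncertain) .. 1 (confident)
    return (quality * bce).mean()
```

The effect is that low-quality pseudo labels contribute less gradient, which mirrors the imbalanced-pseudo-label problem the abstract describes.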
Towards Efficient Visual Adaption via Structural Re-parameterization
Parameter-efficient transfer learning (PETL) is an emerging research topic
aimed at inexpensively adapting large-scale pre-trained models to downstream
tasks. Recent advances have achieved great success in saving storage costs for
various vision tasks by updating or injecting a small number of parameters
instead of full fine-tuning. However, we notice that most existing PETL methods
still incur non-negligible latency during inference. In this paper, we propose
a parameter-efficient and computationally friendly adapter for giant vision
models, called RepAdapter. Specifically, we prove that the adaptation modules,
even with a complex structure, can be seamlessly integrated into most giant
vision models via structural re-parameterization. This property makes
RepAdapter zero-cost during inference. In addition to computational efficiency,
RepAdapter is more effective and lightweight than existing PETL methods due to
its sparse structure and our careful deployment. To validate RepAdapter, we
conduct extensive experiments on 27 benchmark datasets of three vision tasks,
i.e., image and video classification and semantic segmentation. Experimental
results show the superior performance and efficiency of RepAdapter over
state-of-the-art PETL methods. For instance, by updating only 0.6% of the
parameters, we can improve the performance of ViT from 38.8 to 55.1 on Sun397.
Its generalizability is also well validated by a range of vision models, i.e., ViT,
CLIP, Swin-Transformer and ConvNeXt. Our source code is released at
https://github.com/luogen1996/RepAdapter
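The core principle of structural re-parameterization, two stacked linear maps collapsing into one, is what makes an adapter free at inference time. A minimal sketch of that principle (RepAdapter's actual module design is more involved):

```python
import numpy as np

# Structural re-parameterization sketch: a linear adapter followed by a
# linear layer is mathematically one linear layer, so after training the
# adapter can be folded in and adds zero inference cost. Shapes here are
# arbitrary illustrative choices.

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((8, 16)), rng.standard_normal(8)   # adapter
W2, b2 = rng.standard_normal((4, 8)), rng.standard_normal(4)    # original layer

# y = W2 (W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2)
W_merged = W2 @ W1
b_merged = W2 @ b1 + b2

x = rng.standard_normal(16)
y_two_step = W2 @ (W1 @ x + b1) + b2   # adapter + layer, two matmuls
y_merged = W_merged @ x + b_merged     # folded form, one matmul
assert np.allclose(y_two_step, y_merged)
```

Because the folded weights have the same shape as the original layer, the deployed model's architecture and latency are unchanged.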
NICE: Improving Panoptic Narrative Detection and Segmentation with Cascading Collaborative Learning
Panoptic Narrative Detection (PND) and Segmentation (PNS) are two challenging
tasks that involve identifying and locating multiple targets in an image
according to a long narrative description. In this paper, we propose a unified
and effective framework called NICE that can jointly learn these two panoptic
narrative recognition tasks. Existing visual grounding tasks use a two-branch
paradigm, but applying this directly to PND and PNS can result in prediction
conflict due to their intrinsic many-to-many alignment property. To address
this, we introduce two cascading modules based on the barycenter of the mask,
which are Coordinate Guided Aggregation (CGA) and Barycenter Driven
Localization (BDL), responsible for segmentation and detection, respectively.
By linking PNS and PND in series with the barycenter of segmentation as the
anchor, our approach naturally aligns the two tasks and allows them to
complement each other for improved performance. Specifically, CGA provides the
barycenter as a reference for detection, reducing BDL's reliance on a large
number of candidate boxes. BDL leverages its excellent properties to
distinguish different instances, which improves the performance of CGA for
segmentation. Extensive experiments demonstrate that NICE surpasses all
existing methods by a large margin, achieving gains of 4.1% on PND and 2.9% on PNS
over the state-of-the-art. These results validate the effectiveness of our
proposed collaborative learning strategy. The project is publicly available
at https://github.com/Mr-Neko/NICE.
Comment: 18 pages, 9 figures, 9 tables
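The barycenter that links the two tasks is simply the mean pixel coordinate of a mask, which can then seed box localization. A hypothetical illustration of that anchor, not NICE's exact implementation:

```python
import numpy as np

# Sketch of the barycenter anchor: the centroid of a (binary or soft)
# 2-D mask gives segmentation-derived coordinates that detection can
# use as a reference, reducing reliance on many candidate boxes.

def mask_barycenter(mask):
    """Return the (row, col) barycenter of a 2-D mask."""
    ys, xs = np.indices(mask.shape)
    total = mask.sum()
    return float((ys * mask).sum() / total), float((xs * mask).sum() / total)

mask = np.zeros((5, 5))
mask[1:4, 2:5] = 1.0              # a 3x3 block of foreground pixels
print(mask_barycenter(mask))      # (2.0, 3.0)
```

Using a soft mask instead of a binary one weights the barycenter by per-pixel confidence, which fits the many-to-many alignment the abstract describes.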