Gradual Network for Single Image De-raining
Most advances in single image de-raining face a key challenge: removing rain streaks of different scales and shapes while preserving image details. Existing single image de-raining approaches treat rain-streak removal directly as pixel-wise regression. However, they fail to balance over-de-raining (e.g. removing texture details in rain-free regions) against under-de-raining (e.g. leaving rain streaks behind). In this paper, we first propose a coarse-to-fine network called Gradual Network (GraNet), consisting of a coarse stage and a fine stage, to tackle single image de-raining at different granularities. Specifically, to capture coarse-grained rain-streak characteristics (e.g. long and thick rain streaks/raindrops), the coarse stage exploits local-global spatial dependencies via a local-global subnetwork composed of region-aware blocks. Taking as input the residual (the coarse de-rained result) between the rainy image (i.e. the input data) and the output of the coarse stage (i.e. the learnt rain mask), the fine stage continues to de-rain by removing fine-grained rain streaks (e.g. light rain streaks and water mist), producing a rain-free and well-reconstructed output image via a unified contextual merging sub-network with dense blocks and a merging block. Comprehensive experiments on synthetic and real data demonstrate that GraNet significantly outperforms state-of-the-art methods, removing rain streaks of various densities, scales and shapes while keeping the image details of rain-free regions well preserved.
Comment: In Proceedings of the 27th ACM International Conference on Multimedia (MM 2019).
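To make the data flow concrete, here is a minimal PyTorch sketch of the coarse-to-fine pipeline described in the abstract; CoarseStage and FineStage are hypothetical placeholders for the local-global subnetwork and the contextual merging subnetwork, whose internal designs are not reproduced here.

import torch
import torch.nn as nn

class GradualDerainSketch(nn.Module):
    def __init__(self, coarse_stage: nn.Module, fine_stage: nn.Module):
        super().__init__()
        self.coarse_stage = coarse_stage   # placeholder: local-global subnetwork (region-aware blocks)
        self.fine_stage = fine_stage       # placeholder: contextual merging subnetwork (dense + merging blocks)

    def forward(self, rainy: torch.Tensor) -> torch.Tensor:
        rain_mask = self.coarse_stage(rainy)       # learnt coarse-grained rain mask
        coarse_derained = rainy - rain_mask        # residual = coarse de-rained result
        return self.fine_stage(coarse_derained)    # final rain-free reconstruction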
Diverse Cotraining Makes Strong Semi-Supervised Segmentor
Deep co-training has been introduced to semi-supervised segmentation and
achieves impressive results, yet few studies have explored the working
mechanism behind it. In this work, we revisit the core assumption that supports
co-training: multiple compatible and conditionally independent views. By
theoretically deriving the generalization upper bound, we prove that the prediction similarity between the two models negatively impacts their generalization ability. However, most current co-training models are tightly coupled together
and violate this assumption. Such coupling leads to the homogenization of
networks and confirmation bias which consequently limits the performance. To
this end, we explore different dimensions of co-training and systematically
increase the diversity from the aspects of input domains, different
augmentations and model architectures to counteract homogenization. Our Diverse
Co-training outperforms the state-of-the-art (SOTA) methods by a large margin
across different evaluation protocols on Pascal and Cityscapes. For example, we achieve the best mIoU of 76.2%, 77.7% and 80.2% on Pascal with only 92, 183 and 366 labeled images, surpassing the previous best results by more than 5%.
Comment: ICCV 2023, Camera Ready Version. Code: https://github.com/williamium3000/diverse-cotraining
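As a rough illustration of the diversity idea, the following Python sketch shows one common way to couple two differently augmented views and two distinct architectures through cross pseudo supervision; the function and variable names are hypothetical, and this is not claimed to be the paper's exact training objective.

import torch
import torch.nn.functional as F

def diverse_cotraining_step(model_a, model_b, view_a, view_b):
    """view_a and view_b are differently augmented (or different-domain) versions
    of the same unlabeled batch; the two models may use different architectures."""
    logits_a = model_a(view_a)                     # (B, C, H, W) segmentation logits
    logits_b = model_b(view_b)
    pseudo_a = logits_a.argmax(dim=1).detach()     # hard pseudo labels from model A
    pseudo_b = logits_b.argmax(dim=1).detach()     # hard pseudo labels from model B
    # Each model is supervised by the other's pseudo labels (cross supervision),
    # so the two diverse views regularize each other on unlabeled data.
    return F.cross_entropy(logits_a, pseudo_b) + F.cross_entropy(logits_b, pseudo_a)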
SmooSeg: Smoothness Prior for Unsupervised Semantic Segmentation
Unsupervised semantic segmentation is a challenging task that segments images
into semantic groups without manual annotation. Prior works have primarily
focused on leveraging prior knowledge of semantic consistency or a priori concepts from self-supervised learning methods, which often overlook the
coherence property of image segments. In this paper, we demonstrate that the
smoothness prior, asserting that close features in a metric space share the
same semantics, can significantly simplify segmentation by casting unsupervised
semantic segmentation as an energy minimization problem. Under this paradigm,
we propose a novel approach called SmooSeg that harnesses self-supervised
learning methods to model the closeness relationships among observations as
smoothness signals. To effectively discover coherent semantic segments, we
introduce a novel smoothness loss that promotes piecewise smoothness within
segments while preserving discontinuities across different segments.
Additionally, to further enhance segmentation quality, we design an asymmetric
teacher-student style predictor that generates smoothly updated pseudo labels,
facilitating an optimal fit between observations and labeling outputs. Thanks
to the rich supervision cues of the smoothness prior, our SmooSeg significantly
outperforms STEGO in terms of pixel accuracy on three datasets: COCOStuff
(+14.9%), Cityscapes (+13.0%), and Potsdam-3 (+5.7%).
Comment: Accepted by NeurIPS 2023. Code available: https://github.com/mc-lan/SmooSe
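The following is a speculative Python sketch of a pairwise smoothness loss in the spirit described above: feature pairs that are close in a frozen self-supervised embedding space are pushed toward the same cluster assignment, while dissimilar pairs are left free so discontinuities across segments are preserved. The cosine-similarity affinity and the margin value are assumptions, not the paper's exact formulation.

import torch
import torch.nn.functional as F

def smoothness_loss(features: torch.Tensor, logits: torch.Tensor, margin: float = 0.5):
    """features: (N, D) frozen self-supervised embeddings; logits: (N, K) cluster scores."""
    feats = F.normalize(features, dim=1)
    affinity = feats @ feats.t()                 # (N, N) cosine similarity between pixels
    probs = logits.softmax(dim=1)
    agreement = probs @ probs.t()                # probability that two pixels receive the same label
    # Only pairs above the similarity margin are treated as "close" and smoothed,
    # so dissimilar pairs (segment boundaries) remain free to disagree.
    weight = (affinity - margin).clamp(min=0.0)
    return -(weight * torch.log(agreement + 1e-8)).mean()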
Consistent Targets Provide Better Supervision in Semi-supervised Object Detection
In this study, we dive deep into the inconsistency of pseudo targets in
semi-supervised object detection (SSOD). Our core observation is that the
oscillating pseudo targets undermine the training of an accurate
semi-supervised detector. It not only injects noise into student training but also leads to severe overfitting on the classification task. Therefore, we
propose a systematic solution, termed Consistent-Teacher, to reduce the
inconsistency. First, adaptive anchor assignment (ASA) replaces the static
IoU-based strategy, which enables the student network to be resistant to noisy
pseudo bounding boxes. Then, we calibrate the subtask predictions by designing a 3D feature alignment module (FAM-3D), which allows each classification feature to
adaptively query the optimal feature vector for the regression task at
arbitrary scales and locations. Lastly, a Gaussian Mixture Model (GMM)
dynamically revises the score threshold of the pseudo-bboxes, which stabilizes
the number of ground-truths at an early stage and remedies the unreliable
supervision signal during training. Consistent-Teacher provides strong results
on a large range of SSOD evaluations. It achieves 40.0 mAP with ResNet-50
backbone given only 10% of annotated MS-COCO data, which surpasses previous
baselines using pseudo labels by around 3 mAP. When trained on fully annotated
MS-COCO with additional unlabeled data, the performance further increases to
47.2 mAP. Our code will be open-sourced soon.
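For the thresholding step, the following Python sketch shows one way a two-component Gaussian Mixture Model can turn a batch of pseudo-box confidence scores into a dynamic threshold; the 0.5 assignment cut-off and the function name are assumptions rather than the exact criterion used by Consistent-Teacher.

import numpy as np
from sklearn.mixture import GaussianMixture

def dynamic_threshold(scores: np.ndarray) -> float:
    """scores: 1-D array of pseudo-box confidences for a single class."""
    gmm = GaussianMixture(n_components=2, random_state=0).fit(scores.reshape(-1, 1))
    positive = int(np.argmax(gmm.means_.ravel()))          # component with the higher mean
    assignment = gmm.predict_proba(scores.reshape(-1, 1))[:, positive]
    kept = scores[assignment > 0.5]                         # boxes deemed "reliable"
    # The threshold is the lowest score still assigned to the reliable component.
    return float(kept.min()) if kept.size else 1.0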
H2RSVLM: Towards Helpful and Honest Remote Sensing Large Vision Language Model
Generic large Vision-Language Models (VLMs) are developing rapidly, but they still perform poorly in the Remote Sensing (RS) domain, owing to the unique and specialized nature of RS imagery and the comparatively limited spatial perception of current VLMs. Existing Remote Sensing specific Vision Language Models (RSVLMs) still have considerable room for improvement, primarily because of the lack of large-scale, high-quality RS vision-language datasets. We constructed HqDC-1.4M, a large-scale dataset of High-quality and Detailed Captions for RS images containing 1.4 million image-caption pairs, which not only enhances the RSVLM's understanding of RS images but also significantly improves the
model's spatial perception abilities, such as localization and counting,
thereby increasing the helpfulness of the RSVLM. Moreover, to address the
inevitable "hallucination" problem in RSVLM, we developed RSSA, the first
dataset aimed at enhancing the Self-Awareness capability of RSVLMs. By
incorporating a variety of unanswerable questions into typical RS visual
question-answering tasks, RSSA effectively improves the truthfulness and
reduces the hallucinations of the model's outputs, thereby enhancing the
honesty of the RSVLM. Based on these datasets, we propose H2RSVLM, the
Helpful and Honest Remote Sensing Vision Language Model. H2RSVLM has achieved
outstanding performance on multiple RS public datasets and is capable of
recognizing and refusing to answer the unanswerable questions, effectively
mitigating incorrect generations. We will release the code, data and model weights at https://github.com/opendatalab/H2RSVLM.
Comment: Equal contribution: Chao Pang, Jiang Wu; Corresponding author: Gui-Song Xia, Conghui H
DePARylation Is Critical for S Phase Progression and Cell Survival
Poly(ADP-ribose)ylation or PARylation by PAR polymerase 1 (PARP1) and dePARylation by poly(ADP-ribose) glycohydrolase (PARG) are equally important for the dynamic regulation of the DNA damage response. PARG, the most active dePARylation enzyme, is recruited to sites of DNA damage via pADPr-dependent and PCNA-dependent mechanisms. Targeting dePARylation is considered an alternative strategy to overcome PARP inhibitor resistance. However, precisely how dePARylation functions in normal unperturbed cells remains elusive. To address this challenge, we conducted multiple CRISPR screens and revealed that dePARylation of S phase pADPr by PARG is essential for cell viability. Loss of dePARylation activity initially induced S-phase-specific pADPr signaling, which resulted from unligated Okazaki fragments and eventually led to uncontrolled pADPr accumulation and PARP1/2-dependent cytotoxicity. Moreover, we demonstrated that proteins involved in Okazaki fragment ligation and/or base excision repair regulate pADPr signaling and cell death induced by PARG inhibition. In addition, we determined that PARG expression is critical for cellular sensitivity to PARG inhibition. Furthermore, we revealed that PARG is essential for cell survival by suppressing pADPr. Collectively, our data not only identify an essential role for PARG in normal proliferating cells but also provide a potential biomarker for the further development of PARG inhibitors in cancer therapy.