Softmax Dissection: Towards Understanding Intra- and Inter-class Objective for Embedding Learning
The softmax loss and its variants are widely used as objectives for embedding
learning, especially in applications like face recognition. However, the intra-
and inter-class objectives in the softmax loss are entangled: a
well-optimized inter-class objective relaxes the intra-class objective, and
vice versa. In this paper, we propose to dissect the softmax loss into
independent intra- and inter-class objectives (D-Softmax). With D-Softmax as
the objective, we gain a clear understanding of both the intra- and
inter-class objectives, so it is straightforward to tune each part to its
best state.
the best state. Furthermore, we find the computation of the inter-class
objective is redundant and propose two sampling-based variants of D-Softmax to
reduce the computation cost. Trained on regular-scale data, D-Softmax performs
comparably in face-verification experiments to existing losses such as
SphereFace and ArcFace. Trained on massive-scale data, the fast variants of
D-Softmax significantly accelerate training (by up to 64x) with only a minor
sacrifice in performance, outperforming existing softmax acceleration methods
in both performance and efficiency.
Comment: Accepted to AAAI-2020, oral presentation
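As a loose illustration (not the paper's exact formulation), the dissection idea can be sketched in NumPy: a fixed constant `d` stands in for the entangled denominator, so the intra-class term depends only on the target logit and the inter-class term only on the non-target logits. `d`, `w_intra`, and `w_inter` are assumed hyperparameters.

```python
import numpy as np

def d_softmax_loss(logits, target, d=0.9, w_intra=1.0, w_inter=1.0):
    """Hedged sketch of a dissected softmax loss (not the paper's exact form).

    The intra-class term pushes the target logit up against the fixed
    constant d; the inter-class term pushes the non-target logits down
    against the same constant, so the two terms decouple and can be
    weighted independently.
    """
    s_t = logits[target]                 # target (intra-class) logit
    s_neg = np.delete(logits, target)    # non-target (inter-class) logits
    l_intra = np.log1p(np.exp(d - s_t))
    l_inter = np.log1p(np.exp(s_neg - d).sum())
    return w_intra * l_intra + w_inter * l_inter
```

In this sketch, a sampling-based variant would simply subsample `s_neg` before computing `l_inter`, which is where a cost reduction on massive-scale (many-class) data would come from.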
ShadowDiffusion: When Degradation Prior Meets Diffusion Model for Shadow Removal
Recent deep learning methods have achieved promising results in image shadow
removal. However, their restored images still suffer from unsatisfactory
boundary artifacts, due to the lack of degradation prior embedding and the
deficiency in modeling capacity. Our work addresses these issues by proposing a
unified diffusion framework that integrates both the image and degradation
priors for highly effective shadow removal. In detail, we first propose a
shadow degradation model, which inspires us to build a novel unrolling
diffusion model, dubbed ShadowDiffusion. It remarkably improves the model's
capacity in shadow removal via progressively refining the desired output with
both degradation prior and diffusive generative prior, which by nature can
serve as a new strong baseline for image restoration. Furthermore,
ShadowDiffusion progressively refines the estimated shadow mask as an auxiliary
task of the diffusion generator, which leads to more accurate and robust
shadow-free image generation. We conduct extensive experiments on three popular
public datasets, including ISTD, ISTD+, and SRD, to validate our method's
effectiveness. Compared to the state-of-the-art methods, our model achieves a
significant improvement in terms of PSNR, increasing it from 31.69 dB to
34.73 dB on the SRD dataset.
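A minimal sketch of one unrolled iteration, assuming a simple multiplicative degradation y ≈ mask · x (the paper's actual degradation model, sampler, and mask-refinement branch are more involved; `denoise` and `rho` are placeholders):

```python
import numpy as np

def unrolled_step(x, y, mask, denoise, rho=0.5):
    """One sketched unrolling iteration: a data-fidelity gradient step on
    ||mask * x - y||^2 toward the shadowed observation y (degradation
    prior), followed by a generative denoising step (diffusive prior)."""
    x = x - rho * mask * (mask * x - y)  # pull x toward consistency with y
    return denoise(x)                    # refine with the generative prior
```

With an identity `denoise` and a trivial mask, repeated iterations converge toward the observation, which illustrates the data-fidelity half of the update.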
ContraNeRF: Generalizable Neural Radiance Fields for Synthetic-to-real Novel View Synthesis via Contrastive Learning
Although many recent works have investigated generalizable NeRF-based novel
view synthesis for unseen scenes, they seldom consider the synthetic-to-real
generalization, which is desired in many practical applications. In this work,
we first investigate the effects of synthetic data in synthetic-to-real novel
view synthesis and surprisingly observe that models trained with synthetic data
tend to produce sharper but less accurate volume densities. For pixels where
the volume densities are correct, fine-grained details will be obtained.
Otherwise, severe artifacts will be produced. To maintain the advantages of
using synthetic data while avoiding its negative effects, we propose to
introduce geometry-aware contrastive learning to learn multi-view consistent
features with geometric constraints. Meanwhile, we adopt cross-view attention
to further enhance the geometry perception of features by querying features
across input views. Experiments demonstrate that under the synthetic-to-real
setting, our method can render images with higher quality and better
fine-grained details, outperforming existing generalizable novel view synthesis
methods in terms of PSNR, SSIM, and LPIPS. When trained on real data, our
method also achieves state-of-the-art results.
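The geometry-aware contrastive term can be loosely illustrated with a generic InfoNCE loss over unit-normalized cross-view features; this is a stand-in for the paper's objective, not its exact form, and `tau` is an assumed temperature:

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.1):
    """Generic InfoNCE sketch: pull the geometrically corresponding
    feature from another view (positive) toward the anchor feature,
    push non-corresponding features (negatives) away.
    All inputs are assumed to be unit-normalized vectors / rows."""
    pos = np.exp(anchor @ positive / tau)
    neg = np.exp(negatives @ anchor / tau).sum()
    return -np.log(pos / (pos + neg))
```

The loss is small when the positive pair is aligned and large when a negative is aligned instead, which is the pressure toward multi-view consistent features.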
ExposureDiffusion: Learning to Expose for Low-light Image Enhancement
Previous raw image-based low-light image enhancement methods predominantly
relied on feed-forward neural networks to learn deterministic mappings from
low-light to normally-exposed images. However, they failed to capture critical
distribution information, leading to visually undesirable results. This work
addresses the issue by seamlessly integrating a diffusion model with a
physics-based exposure model. Unlike a vanilla diffusion model, which must
denoise starting from pure Gaussian noise, our restoration process, thanks to
the injected physics-based exposure model, can start directly from a noisy
image. As such, our method obtains significantly improved performance and
reduced inference time compared with vanilla diffusion models. To make full use
of the advantages of different intermediate steps, we further propose an
adaptive residual layer that effectively screens out side-effects of the
iterative refinement when the intermediate results are already
well-exposed. The proposed framework is compatible with real-paired datasets,
real/synthetic noise models, and different backbone networks. We evaluate the
proposed method on various
public benchmarks, achieving promising results with consistent improvements
using different exposure models and backbones. Besides, the proposed method
achieves better generalization capacity for unseen amplifying ratios and better
performance than a larger feedforward neural model when few parameters are
adopted.
Comment: Accepted by ICCV 2023
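The key shortcut, starting the reverse process from the noisy image rather than from pure noise, can be sketched as follows; `denoise_step` and `t_start` are placeholders, not the paper's API:

```python
import numpy as np

def restore_from_noisy(x_noisy, denoise_step, t_start=5):
    """Sketch: treat the noisy low-light image as an intermediate
    diffusion state and run only the remaining t_start refinement
    steps, instead of the full chain from pure Gaussian noise."""
    x = x_noisy
    for t in range(t_start, 0, -1):
        x = denoise_step(x, t)  # one learned refinement step (placeholder)
    return x
```

Because the loop covers only the tail of the chain, inference cost scales with `t_start` instead of the full diffusion length, which is the claimed source of the speedup.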
MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation
Perception systems in modern autonomous driving vehicles typically take
inputs from complementary multi-modal sensors, e.g., LiDAR and cameras.
However, in real-world applications, sensor corruptions and failures lead to
inferior performances, thus compromising autonomous safety. In this paper, we
propose a robust framework, called MetaBEV, to address extreme real-world
environments involving six kinds of sensor corruption and two extreme
sensor-missing situations. In MetaBEV, signals from multiple sensors are first
processed by modal-specific encoders. Subsequently, a set of dense BEV queries
are initialized, termed meta-BEV. These queries are then processed iteratively
by a BEV-Evolving decoder, which selectively aggregates deep features from
either LiDAR, cameras, or both modalities. The updated BEV representations are
further leveraged for multiple 3D prediction tasks. Additionally, we introduce
a new M2oE structure to alleviate the performance drop on distinct tasks in
multi-task joint learning. Finally, MetaBEV is evaluated on the nuScenes
dataset with 3D object detection and BEV map segmentation tasks. Experiments
show MetaBEV outperforms prior arts by a large margin on both full and
corrupted modalities. For instance, when the LiDAR signal is missing, MetaBEV
improves detection NDS by 35.5% and segmentation mIoU by 17.7% over the
vanilla BEVFusion model; and when the camera signal is absent, MetaBEV still
achieves 69.2% NDS and 53.7% mIoU, which is even higher than previous works
evaluated on full modalities. Moreover, MetaBEV remains competitive with
previous methods in both canonical perception and multi-task learning
settings, refreshing the state of the art for nuScenes BEV map segmentation
with 70.4% mIoU.
Comment: Project page: https://chongjiange.github.io/metabev.html
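A toy sketch of the modality-selective idea (an assumed simplification, not the actual BEV-Evolving decoder): the dense meta-BEV queries cross-attend only over whichever modality features are present, so a missing sensor simply drops out of the key/value pool.

```python
import numpy as np

def bev_evolve(queries, feats, available):
    """Sketch: cross-attend meta-BEV queries over available modalities.

    queries:   (Q, C) dense BEV queries
    feats:     dict mapping modality name -> (N_m, C) feature rows
    available: list of modality names that survived corruption
    """
    keys = np.concatenate([feats[m] for m in available], axis=0)
    scores = queries @ keys.T / np.sqrt(queries.shape[1])
    scores = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn = scores / scores.sum(axis=1, keepdims=True)   # row-wise softmax
    return queries + attn @ keys                        # residual update
```

Dropping a modality only shrinks the attention pool; the update remains well-defined, which mirrors the robustness claim for sensor-missing situations.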
SinSR: Diffusion-Based Image Super-Resolution in a Single Step
While super-resolution (SR) methods based on diffusion models exhibit
promising results, their practical application is hindered by the substantial
number of required inference steps. Recent methods utilize degraded images in
the initial state, thereby shortening the Markov chain. Nevertheless, these
solutions either rely on a precise formulation of the degradation process or
still necessitate a relatively lengthy generation path (e.g., 15 iterations).
To enhance inference speed, we propose a simple yet effective method for
achieving single-step SR generation, named SinSR. Specifically, we first derive
a deterministic sampling process from the most recent state-of-the-art (SOTA)
method for accelerating diffusion-based SR. This allows the mapping between the
input random noise and the generated high-resolution image to be obtained in a
reduced and acceptable number of inference steps during training. We show that
this deterministic mapping can be distilled into a student model that performs
SR within only one inference step. Additionally, we propose a novel
consistency-preserving loss to simultaneously leverage the ground-truth image
during the distillation process, ensuring that the performance of the student
model is not solely bound by the feature manifold of the teacher model,
resulting in further performance improvement. Extensive experiments conducted
on synthetic and real-world datasets demonstrate that the proposed method can
achieve comparable or even superior performance compared to both previous SOTA
methods and the teacher model in just one sampling step, resulting in a
remarkable speedup of up to 10x at inference. Our code will be released at
https://github.com/wyf0912/SinSR
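The two-term distillation objective can be illustrated schematically; the weights and names below are assumptions for illustration, not the paper's exact loss:

```python
import numpy as np

def distill_loss(student_out, teacher_out, gt, w_consist=1.0):
    """Sketch: the student's single-step output is matched to the
    teacher's deterministic multi-step output, plus a consistency term
    against the ground-truth image so the student's performance is not
    bounded by the teacher's feature manifold."""
    l_distill = np.mean((student_out - teacher_out) ** 2)
    l_consist = np.mean((student_out - gt) ** 2)
    return l_distill + w_consist * l_consist
```

The ground-truth term is what lets the student match or exceed the teacher: even a perfect imitation of the teacher incurs loss wherever the teacher itself deviates from the ground truth.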
Nomenclatural and taxonomic notes on Rubus davidianus Kuntze and R. viburnifolius Franch
Critical examination of specimens, together with a literature review, has shown that Rubus davidianus is conspecific with R. lambertianus; we therefore treat R. davidianus as a new synonym within Rubus. We propose a new name, Rubus loirensis Ti R. Huang nom. nov., to replace the later homonym R. pycnanthus Genev. Additionally, lectotypes for three names, R. davidianus Kuntze, R. malifolius Focke and R. viburnifolius Franch., are designated here after examination of previous works.
CODA: A Real-World Road Corner Case Dataset for Object Detection in Autonomous Driving
Contemporary deep-learning object detection methods for autonomous driving
usually assume prefixed categories of common traffic participants, such as
pedestrians and cars. Most existing detectors are unable to detect uncommon
objects and corner cases (e.g., a dog crossing a street), which may lead to
severe accidents in some situations, making the timeline for the real-world
application of reliable autonomous driving uncertain. One main reason that
impedes the development of truly reliable self-driving systems is the lack of
public datasets for evaluating the performance of object detectors on corner
cases. Hence, we introduce a challenging dataset named CODA that exposes this
critical problem of vision-based detectors. The dataset consists of 1500
carefully selected real-world driving scenes, each containing four object-level
corner cases (on average), spanning more than 30 object categories. On CODA,
the performance of standard object detectors trained on large-scale autonomous
driving datasets drops significantly, to no more than 12.8% mAR. Moreover, we
experiment with the state-of-the-art open-world object detector and find that
it also fails to reliably identify the novel objects in CODA, suggesting that a
robust perception system for autonomous driving is probably still far from
reach. We expect our CODA dataset to facilitate further research in reliable
detection for real-world autonomous driving. Our dataset will be released at
https://coda-dataset.github.io.
Comment: ECCV 2022
Streptococcus sputorum, a novel member of Streptococcus with multidrug resistance, exhibits cytotoxicity
We describe the genomic and phenotypic characteristics of a novel member of Streptococcus with multidrug resistance (MDR) isolated from hospital samples. Strains SP218 and SP219 were identified as a novel Streptococcus, S. sputorum, using whole-genome sequencing and biochemical tests. Average nucleotide identity values of strains SP218 and SP219 with S. pseudopneumoniae IS7493 and S. pneumoniae ST556 were 94.3% and 93.3%, respectively. Genome-to-genome distance values of strains SP218 and SP219 with S. pseudopneumoniae IS7493 and S. pneumoniae ST556 were 56.70% (54–59.5%) and 56.40% (52.8–59.9%), respectively. The biochemical test results distinguished these strains from S. pseudopneumoniae and S. pneumoniae, particularly hydrolysis of equine urate and utilization of ribose to produce acid. These isolates were resistant to six major classes of antibiotics, which correlated with horizontal gene transfer and mutation. Notably, strain SP219 exhibited cytotoxicity against the human lung epithelial cell line A549. Our results indicate the pathogenic potential of S. sputorum and provide valuable insights into the mitis group of streptococci.
National Natural Science Foundation of China