188 research outputs found
Under-Display Camera Image Restoration with Scattering Effect
The under-display camera (UDC) provides consumers with a full-screen visual
experience without any obstruction due to notches or punched holes. However,
the semi-transparent nature of the display inevitably introduces severe
degradation into UDC images. In this work, we address the UDC image restoration
problem with the specific consideration of the scattering effect caused by the
display. We explicitly model the scattering effect by treating the display as a
piece of homogeneous scattering medium. With the physical model of the
scattering effect, we improve the image formation pipeline for the image
synthesis to construct a realistic UDC dataset with ground truths. To suppress
the scattering effect for the eventual UDC image recovery, a two-branch
restoration network is designed. More specifically, the scattering branch
leverages global modeling capabilities of the channel-wise self-attention to
estimate parameters of the scattering effect from degraded images, while the
image branch exploits the local representation advantage of CNNs to recover
clear scenes, implicitly guided by the scattering branch. Extensive experiments
are conducted on both real-world and synthesized data, demonstrating the
superiority of the proposed method over the state-of-the-art UDC restoration
techniques. The source code and dataset are available at
https://github.com/NamecantbeNULL/SRUDC.
Comment: Accepted to ICCV202
CNN Injected Transformer for Image Exposure Correction
Capturing images with incorrect exposure settings fails to deliver a
satisfactory visual experience. Only when the exposure is properly set can the
color and details of the images be appropriately preserved. Previous exposure
correction methods based on convolutions often produce exposure deviation in
images as a consequence of the restricted receptive field of convolutional
kernels. This issue arises because convolutions are not capable of capturing
long-range dependencies in images accurately. To overcome this challenge, we
can apply the Transformer to address the exposure correction problem,
leveraging its capability in modeling long-range dependencies to capture global
representation. However, solely relying on the window-based Transformer leads
to visually disturbing blocking artifacts due to the application of
self-attention in small patches. In this paper, we propose a CNN Injected
Transformer (CIT) to harness the individual strengths of CNN and Transformer
simultaneously. Specifically, we construct the CIT by utilizing a window-based
Transformer to exploit the long-range interactions among different regions in
the entire image. Within each CIT block, we incorporate a channel attention
block (CAB) and a half-instance normalization block (HINB) to assist the
window-based self-attention to acquire the global statistics and refine local
features. In addition to the hybrid architecture design for exposure
correction, we apply a set of carefully formulated loss functions to improve
the spatial coherence and rectify potential color deviations. Extensive
experiments demonstrate that our image exposure correction method outperforms
state-of-the-art approaches in terms of both quantitative and qualitative
metrics.
LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching
The recent advancements in text-to-3D generation mark a significant milestone
in generative models, unlocking new possibilities for creating imaginative 3D
assets across various real-world scenarios. While recent advancements in
text-to-3D generation have shown promise, they often fall short in rendering
detailed and high-quality 3D models. This problem is especially prevalent as
many methods base themselves on Score Distillation Sampling (SDS). This paper
identifies a notable deficiency in SDS: it yields inconsistent and
low-quality update directions for the 3D model, causing an over-smoothing
effect. To address this, we propose a novel approach called Interval Score
Matching (ISM). ISM employs deterministic diffusing trajectories and utilizes
interval-based score matching to counteract over-smoothing. Furthermore, we
incorporate 3D Gaussian Splatting into our text-to-3D generation pipeline.
Extensive experiments show that our model largely outperforms the
state-of-the-art in quality and training efficiency.
Comment: The first two authors contributed equally to this work. Our code will
be available at: https://github.com/EnVision-Research/LucidDreame
Let Images Give You More: Point Cloud Cross-Modal Training for Shape Analysis
Although recent point cloud analysis achieves impressive progress, the
paradigm of representation learning from a single modality gradually meets its
bottleneck. In this work, we take a step towards more discriminative 3D point
cloud representation by taking full advantage of images, which inherently
contain richer appearance information, e.g., texture, color, and shade.
Specifically, this paper introduces a simple but effective point cloud
cross-modality training (PointCMT) strategy, which utilizes view-images, i.e.,
rendered or projected 2D images of the 3D object, to boost point cloud
analysis. In practice, to effectively acquire auxiliary knowledge from view
images, we develop a teacher-student framework and formulate the cross-modal
learning as a knowledge distillation problem. PointCMT eliminates the
distribution discrepancy between different modalities through novel feature and
classifier enhancement criteria and avoids potential negative transfer
effectively. Note that PointCMT effectively improves the point-only
representation without architecture modification. Extensive experiments verify
significant gains on various datasets with popular backbones: equipped
with PointCMT, PointNet++ and PointMLP achieve state-of-the-art performance on
two benchmarks, i.e., 94.4% and 86.7% accuracy on ModelNet40 and ScanObjectNN,
respectively. Code will be made available at
https://github.com/ZhanHeshen/PointCMT.
Comment: To appear in NIPS202
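The PointCMT abstract above frames cross-modal learning as knowledge distillation in a teacher-student setup. As a rough illustration of that general idea only, the sketch below computes a classic temperature-softened distillation loss between teacher and student logits. It is not PointCMT's feature or classifier enhancement criteria, which the abstract does not specify; the function names and the default temperature are assumptions made for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Generic distillation loss (hypothetical helper, not the paper's criterion):
    the teacher would be the image-based network, the student the point-cloud one.
    """
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's soft predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge, which is what lets the student absorb the teacher's knowledge without changing its own architecture.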
Sparse Single Sweep LiDAR Point Cloud Segmentation via Learning Contextual Shape Priors from Scene Completion
LiDAR point cloud analysis is a core task for 3D computer vision, especially
for autonomous driving. However, due to the severe sparsity and noise
interference in the single sweep LiDAR point cloud, accurate semantic
segmentation is non-trivial to achieve. In this paper, we propose a novel
sparse LiDAR point cloud semantic segmentation framework assisted by learned
contextual shape priors. In practice, an initial semantic segmentation (SS) of
a single sweep point cloud can be achieved by any appealing network and then
flows into the semantic scene completion (SSC) module as the input. By merging
multiple frames in the LiDAR sequence as supervision, the optimized SSC module
has learned the contextual shape priors from sequential LiDAR data, completing
the sparse single sweep point cloud to the dense one. Thus, it inherently
improves SS optimization through fully end-to-end training. Besides, a
Point-Voxel Interaction (PVI) module is proposed to further enhance the
knowledge fusion between SS and SSC tasks, i.e., promoting the interaction of
incomplete local geometry of point cloud and complete voxel-wise global
structure. Furthermore, the auxiliary SSC and PVI modules can be discarded
during inference without extra burden for SS. Extensive experiments confirm
that our JS3C-Net achieves superior performance on both SemanticKITTI and
SemanticPOSS benchmarks, i.e., 4% and 3% improvement, respectively.
Comment: To appear in AAAI 2021. Codes are available at
https://github.com/yanx27/JS3C-Ne
High-Performance Direct Methanol Fuel Cells with Precious-Metal-Free Cathode
Direct methanol fuel cells (DMFCs) hold great promise for applications ranging from portable power for electronics to transportation. However, apart from the high costs, current Pt-based cathodes in DMFCs suffer significantly from performance loss due to severe methanol crossover from anode to cathode. The migrated methanol in cathodes tends to contaminate Pt active sites by yielding a mixed potential region resulting from the oxygen reduction reaction and the methanol oxidation reaction. Therefore, highly methanol-tolerant cathodes must be developed before DMFC technologies become viable. The newly developed reduced graphene oxide (rGO)-based Fe-N-C cathode exhibits high methanol tolerance and exceeds the performance of current Pt cathodes, as evidenced by both rotating disk electrode and DMFC tests. While the morphology of 2D rGO is largely preserved, the resulting Fe-N-rGO catalyst provides a unique porous structure. DMFC tests with various methanol concentrations are systematically studied using the best performing Fe-N-rGO catalyst. At feed concentrations greater than 2.0 M, the DMFC performance obtained from the Fe-N-rGO cathode starts to exceed that of a Pt/C cathode. This work will open a new avenue to use nonprecious metal cathodes for advanced DMFC technologies with increased performance and at significantly reduced cost.