188 research outputs found

    Under-Display Camera Image Restoration with Scattering Effect

    The under-display camera (UDC) provides consumers with a full-screen visual experience without any obstruction due to notches or punched holes. However, the semi-transparent nature of the display inevitably introduces severe degradation into UDC images. In this work, we address the UDC image restoration problem with specific consideration of the scattering effect caused by the display. We explicitly model the scattering effect by treating the display as a homogeneous scattering medium. With this physical model, we improve the image formation pipeline used for image synthesis and construct a realistic UDC dataset with ground truths. To suppress the scattering effect in the final UDC image recovery, we design a two-branch restoration network. More specifically, the scattering branch leverages the global modeling capability of channel-wise self-attention to estimate the parameters of the scattering effect from degraded images, while the image branch exploits the local representation advantage of CNNs to recover clear scenes, implicitly guided by the scattering branch. Extensive experiments on both real-world and synthesized data demonstrate the superiority of the proposed method over state-of-the-art UDC restoration techniques. The source code and dataset are available at https://github.com/NamecantbeNULL/SRUDC. Comment: Accepted to ICCV202
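
    The channel-wise self-attention mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' scattering branch: the abstract gives no architectural details, so the function name, the 1x1 projection matrices `w_q`, `w_k`, `w_v`, and the `temperature` parameter are all assumptions made for illustration. The point it shows is that attention is computed between channels, giving a C x C map that summarizes the whole image, rather than between spatial positions.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def channel_self_attention(feat, w_q, w_k, w_v, temperature=1.0):
    """Illustrative channel-wise self-attention (hypothetical helper).

    feat: feature map of shape (C, H, W).
    w_q, w_k, w_v: (C, C) matrices standing in for 1x1 convolutions.
    Returns the attended features (C, H, W) and the (C, C) attention map.
    """
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)            # flatten spatial dims: (C, N)
    q, k, v = w_q @ x, w_k @ x, w_v @ x   # each (C, N)

    # L2-normalize each channel over the spatial axis so that the
    # logits below are cosine similarities between channels.
    q = q / (np.linalg.norm(q, axis=1, keepdims=True) + 1e-8)
    k = k / (np.linalg.norm(k, axis=1, keepdims=True) + 1e-8)

    attn = softmax((q @ k.T) * temperature, axis=-1)  # (C, C)
    out = attn @ v                                    # (C, N)
    return out.reshape(c, h, w), attn


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((8, 16, 16))
    w = [rng.standard_normal((8, 8)) for _ in range(3)]
    out, attn = channel_self_attention(feat, *w)
    print(out.shape, attn.shape)
```

    Because the attention map is C x C, its cost grows linearly with the number of pixels, which is one common reason to prefer channel-wise over spatial attention for global statistics at high resolution.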

    CNN Injected Transformer for Image Exposure Correction

    Capturing images with incorrect exposure settings fails to deliver a satisfactory visual experience. Only when the exposure is properly set can the color and details of the images be appropriately preserved. Previous exposure correction methods based on convolutions often produce exposure deviation in images as a consequence of the restricted receptive field of convolutional kernels, which cannot accurately capture long-range dependencies. To overcome this challenge, we apply the Transformer to the exposure correction problem, leveraging its capability in modeling long-range dependencies to capture global representations. However, relying solely on a window-based Transformer leads to visually disturbing blocking artifacts because self-attention is applied within small patches. In this paper, we propose a CNN Injected Transformer (CIT) to harness the individual strengths of CNNs and Transformers simultaneously. Specifically, we construct the CIT by utilizing a window-based Transformer to exploit the long-range interactions among different regions of the entire image. Within each CIT block, we incorporate a channel attention block (CAB) and a half-instance normalization block (HINB) to help the window-based self-attention acquire global statistics and refine local features. In addition to the hybrid architecture design for exposure correction, we apply a set of carefully formulated loss functions to improve spatial coherence and rectify potential color deviations. Extensive experiments demonstrate that our image exposure correction method outperforms state-of-the-art approaches in terms of both quantitative and qualitative metrics.

    LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching

    The recent advancements in text-to-3D generation mark a significant milestone in generative models, unlocking new possibilities for creating imaginative 3D assets across various real-world scenarios. While these advancements have shown promise, they often fall short in rendering detailed and high-quality 3D models. This problem is especially prevalent because many methods are based on Score Distillation Sampling (SDS). This paper identifies a notable deficiency in SDS: it yields inconsistent and low-quality update directions for the 3D model, causing an over-smoothing effect. To address this, we propose a novel approach called Interval Score Matching (ISM). ISM employs deterministic diffusing trajectories and utilizes interval-based score matching to counteract over-smoothing. Furthermore, we incorporate 3D Gaussian Splatting into our text-to-3D generation pipeline. Extensive experiments show that our model largely outperforms the state-of-the-art in quality and training efficiency. Comment: The first two authors contributed equally to this work. Our code will be available at: https://github.com/EnVision-Research/LucidDreame

    Let Images Give You More: Point Cloud Cross-Modal Training for Shape Analysis

    Although recent point cloud analysis has achieved impressive progress, the paradigm of representation learning from a single modality is gradually reaching its bottleneck. In this work, we take a step towards more discriminative 3D point cloud representations by taking full advantage of images, which inherently contain richer appearance information, e.g., texture, color, and shade. Specifically, this paper introduces a simple but effective point cloud cross-modality training (PointCMT) strategy, which utilizes view-images, i.e., rendered or projected 2D images of the 3D object, to boost point cloud analysis. In practice, to effectively acquire auxiliary knowledge from view images, we develop a teacher-student framework and formulate the cross-modal learning as a knowledge distillation problem. PointCMT eliminates the distribution discrepancy between different modalities through novel feature and classifier enhancement criteria and effectively avoids potential negative transfer. Note that PointCMT improves the point-only representation without architecture modification. Extensive experiments verify significant gains on various datasets using appealing backbones: equipped with PointCMT, PointNet++ and PointMLP achieve state-of-the-art performance on two benchmarks, i.e., 94.4% and 86.7% accuracy on ModelNet40 and ScanObjectNN, respectively. Code will be made available at https://github.com/ZhanHeshen/PointCMT. Comment: To appear in NIPS202
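
    To make the teacher-student formulation above concrete, here is a minimal sketch of a generic temperature-softened distillation loss. This is the standard soft-label formulation, swapped in for illustration; it is not PointCMT's feature and classifier enhancement criteria, which the abstract does not specify. The function names and the default `temperature` are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over a list of logits, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions (hypothetical helper).

    In a cross-modal setting the teacher logits would come from an image
    network and the student logits from a point cloud network; that
    pairing is an assumption of this sketch.
    """
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = sum(t * (math.log(t) - math.log(s)) for t, s in zip(p_t, p_s) if t > 0)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]
    student = [1.0, 1.0, 0.0]
    print(distillation_loss(student, teacher))
```

    The loss is zero when the student reproduces the teacher's softened distribution and positive otherwise, so minimizing it pulls the point-only model towards the image-informed predictions.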

    Sparse Single Sweep LiDAR Point Cloud Segmentation via Learning Contextual Shape Priors from Scene Completion

    LiDAR point cloud analysis is a core task for 3D computer vision, especially for autonomous driving. However, due to the severe sparsity and noise interference in a single sweep LiDAR point cloud, accurate semantic segmentation is non-trivial to achieve. In this paper, we propose a novel sparse LiDAR point cloud semantic segmentation framework assisted by learned contextual shape priors. In practice, an initial semantic segmentation (SS) of a single sweep point cloud can be produced by any appealing network and then fed into the semantic scene completion (SSC) module as input. By merging multiple frames in the LiDAR sequence as supervision, the optimized SSC module learns contextual shape priors from sequential LiDAR data, completing the sparse single sweep point cloud into a dense one. Thus, it inherently improves SS optimization through fully end-to-end training. Besides, a Point-Voxel Interaction (PVI) module is proposed to further enhance the knowledge fusion between the SS and SSC tasks, i.e., promoting the interaction between the incomplete local geometry of the point cloud and the complete voxel-wise global structure. Furthermore, the auxiliary SSC and PVI modules can be discarded during inference, adding no extra burden for SS. Extensive experiments confirm that our JS3C-Net achieves superior performance on both the SemanticKITTI and SemanticPOSS benchmarks, i.e., 4% and 3% improvement, respectively. Comment: To appear in AAAI 2021. Codes are available at https://github.com/yanx27/JS3C-Ne

    High-Performance Direct Methanol Fuel Cells with Precious-Metal-Free Cathode

    Direct methanol fuel cells (DMFCs) hold great promise for applications ranging from portable power for electronics to transportation. However, apart from their high cost, current Pt-based cathodes in DMFCs suffer significantly from performance loss due to severe methanol crossover from anode to cathode. The migrated methanol in cathodes tends to contaminate Pt active sites by yielding a mixed potential region resulting from the oxygen reduction reaction and the methanol oxidation reaction. Therefore, highly methanol-tolerant cathodes must be developed before DMFC technologies become viable. The newly developed reduced graphene oxide (rGO)-based Fe-N-C cathode exhibits high methanol tolerance and exceeds the performance of current Pt cathodes, as evidenced by both rotating disk electrode and DMFC tests. While the morphology of 2D rGO is largely preserved, the resulting Fe-N-rGO catalyst provides a unique porous structure. DMFC tests with various methanol concentrations are systematically studied using the best performing Fe-N-rGO catalyst. At feed concentrations greater than 2.0 M, the DMFC performance obtained from the Fe-N-rGO cathode starts to exceed that of a Pt/C cathode. This work opens a new avenue for using nonprecious metal cathodes in advanced DMFC technologies with increased performance and at significantly reduced cost.
