
    DiffuStereo: High Quality Human Reconstruction via Diffusion-based Stereo Using Sparse Cameras

    We propose DiffuStereo, a novel system that uses only sparse cameras (eight in this work) for high-quality 3D human reconstruction. At its core is a novel diffusion-based stereo module, which introduces diffusion models, a powerful class of generative models, into the iterative stereo matching network. To this end, we design a new diffusion kernel and additional stereo constraints to facilitate stereo matching and depth estimation in the network. We further present a multi-level stereo network architecture that handles high-resolution (up to 4K) inputs without an unaffordable memory footprint. Given a set of sparse-view color images of a human, the proposed multi-level diffusion-based stereo network produces highly accurate depth maps, which are then converted into a high-quality 3D human model through an efficient multi-view fusion strategy. Overall, our method enables automatic reconstruction of human models with quality on par with high-end dense-view camera rigs, achieved with a much more lightweight hardware setup. Experiments show that our method outperforms state-of-the-art methods by a large margin both qualitatively and quantitatively.
    Comment: Accepted by ECCV202
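The abstract describes refining stereo depth by running a conditional reverse-diffusion loop inside the matching network. The sketch below shows only that control flow; the paper's module is a learned network conditioned on stereo features, whereas here a toy closed-form "denoiser" (`toy_denoiser`, an assumption for illustration) stands in for it.

```python
def reverse_diffusion_refine(disparity_init, denoiser, steps=10):
    """Treat a coarse disparity map as a noisy sample and refine it by
    iteratively subtracting a predicted residual, largest steps first
    (a sketch of the loop structure, not the paper's network)."""
    d = list(disparity_init)
    for t in range(steps, 0, -1):
        noise_scale = t / steps              # decreasing step size
        residual = denoiser(d, noise_scale)  # network would predict this
        d = [x - noise_scale * r for x, r in zip(d, residual)]
    return d

# toy denoiser: pulls each value toward a known target disparity,
# standing in for the learned, stereo-conditioned predictor
target = [1.0, 2.0, 3.0]
toy_denoiser = lambda d, s: [x - t for x, t in zip(d, target)]
refined = reverse_diffusion_refine([0.0, 0.0, 0.0], toy_denoiser, steps=8)
```

With the toy denoiser the loop converges onto the target disparities, illustrating how repeated residual prediction sharpens an initial coarse estimate.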

    Coarse-to-Fine: Learning Compact Discriminative Representation for Single-Stage Image Retrieval

    Image retrieval aims to find images in a database that are visually similar to a query image. Two-stage methods following the retrieve-and-rerank paradigm have achieved excellent performance, but their separate local and global modules are inefficient for real-world applications. To better trade off retrieval efficiency and accuracy, some approaches fuse global and local features into a joint representation to perform single-stage image retrieval. However, these remain challenging due to the variety of situations to handle, e.g., background clutter, occlusion, and viewpoint changes. In this work, we design a Coarse-to-Fine framework to learn a Compact Discriminative representation (CFCD) for end-to-end single-stage image retrieval, requiring only image-level labels. Specifically, we first design a novel adaptive softmax-based loss that dynamically tunes its scale and margin within each mini-batch and increases them progressively to strengthen supervision during training and improve intra-class compactness. Furthermore, we propose a mechanism that attentively selects prominent local descriptors and infuses fine-grained semantic relations into the global representation via a hard negative sampling strategy to optimize inter-class distinctiveness at a global scale. Extensive experimental results demonstrate the effectiveness of our method, which achieves state-of-the-art single-stage image retrieval performance on benchmarks such as Revisited Oxford and Revisited Paris. Code is available at https://github.com/bassyess/CFCD.
    Comment: Accepted to ICCV 202
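The "adaptive softmax-based loss" above is a margin-softmax loss whose scale and margin grow as training progresses. A minimal sketch follows, assuming a simple linear schedule (the 16-to-64 scale and 0.1-to-0.5 margin ranges are illustrative assumptions, not the paper's exact per-mini-batch tuning rule):

```python
import math

def adaptive_margin_softmax_loss(cos_sims, label, progress):
    """ArcFace-style margin softmax whose scale and margin increase with
    training progress in [0, 1], strengthening supervision over time
    (sketch; the paper tunes both within each mini-batch)."""
    scale = 16.0 + progress * 48.0   # assumed schedule: 16 -> 64
    margin = 0.1 + progress * 0.4    # assumed schedule: 0.1 -> 0.5
    logits = []
    for i, c in enumerate(cos_sims):
        theta = math.acos(max(-1.0, min(1.0, c)))
        ang = theta + margin if i == label else theta  # penalize target angle
        logits.append(scale * math.cos(ang))
    mx = max(logits)                                   # stable softmax
    exps = [math.exp(z - mx) for z in logits]
    return -math.log(exps[label] / sum(exps))
```

A higher cosine similarity to the correct class yields a lower loss, and as `progress` grows the larger margin demands tighter intra-class compactness.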

    Phosphorus-doped porous carbons as efficient electrocatalysts for oxygen reduction

    This publication is freely accessible with the permission of the rights owner, due to an Alliance licence and a national licence (funded by the DFG, German Research Foundation).
    Efficient electrocatalysts for the oxygen reduction reaction (ORR) play a critical role in the performance of fuel cells and metal–air batteries. In this study, we report a facile synthesis of phosphorus (P)-doped porous carbon as a highly active electrocatalyst for the ORR. The phosphorus-doped porous carbon was prepared by simultaneous doping and activation of carbon with phosphoric acid (H3PO4) in the presence of Co. Both phosphorus and cobalt were found to play significant roles in improving the catalytic activity of the carbon for the ORR. The as-prepared phosphorus-doped porous carbon exhibited considerable catalytic activity for the ORR, as evidenced by rotating ring-disk electrode studies. At the same mass loading, the Tafel slope of the phosphorus-doped porous carbon electrocatalyst is comparable to that of commercial Pt/C catalysts (20 wt% Pt on Vulcan XC-72, Johnson Matthey), with stability superior to Pt/C in alkaline solutions.
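The Tafel slope mentioned above comes from fitting the Tafel relation eta = a + b*log10(j), where eta is the overpotential and j the current density. A minimal least-squares fit, with synthetic data assumed purely for illustration:

```python
import math

def tafel_slope(current_densities, overpotentials):
    """Least-squares fit of eta = a + b*log10(j); returns the slope b
    in overpotential units per decade of current density."""
    xs = [math.log10(j) for j in current_densities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(overpotentials) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, overpotentials))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# synthetic data with an assumed 60 mV/decade slope (values in V and A/cm^2)
js = [1e-4, 1e-3, 1e-2, 1e-1]
etas = [0.060 * math.log10(j / 1e-4) for j in js]
slope = tafel_slope(js, etas)  # ~0.060 V per decade
```

A slope close to that of Pt/C, as the abstract reports, indicates comparable ORR reaction kinetics at the same mass loading.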

    DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior

    We present DreamCraft3D, a hierarchical 3D content generation method that produces high-fidelity and coherent 3D objects. We tackle the problem by leveraging a 2D reference image to guide the stages of geometry sculpting and texture boosting. A central focus of this work is to address the consistency issue that existing works encounter. To sculpt geometries that render coherently, we perform score distillation sampling via a view-dependent diffusion model. This 3D prior, alongside several training strategies, prioritizes geometric consistency but compromises texture fidelity. We further propose Bootstrapped Score Distillation to specifically boost the texture. We train a personalized diffusion model, DreamBooth, on augmented renderings of the scene, imbuing it with 3D knowledge of the scene being optimized. The score distillation from this 3D-aware diffusion prior provides view-consistent guidance for the scene. Notably, through alternating optimization of the diffusion prior and the 3D scene representation, we achieve mutually reinforcing improvements: the optimized 3D scene aids in training the scene-specific diffusion model, which in turn offers increasingly view-consistent guidance for 3D optimization. The optimization is thus bootstrapped and leads to substantial texture boosting. With tailored 3D priors throughout the hierarchical generation, DreamCraft3D generates coherent 3D objects with photorealistic renderings, advancing the state of the art in 3D content generation. Code is available at https://github.com/deepseek-ai/DreamCraft3D.
    Comment: Project Page: https://mrtornado24.github.io/DreamCraft3D
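Score distillation sampling (SDS), used above for geometry sculpting, noises a rendered image, asks the diffusion prior to predict that noise, and feeds the weighted prediction error back to the 3D scene. A minimal scalar sketch, assuming a simplified additive noising scheme rather than the full DDPM schedule:

```python
import random

def sds_gradient(render, noise_pred_fn, t, weight):
    """SDS gradient (sketch): perturb the render with Gaussian noise,
    let the diffusion prior predict the noise, and return the weighted
    error (noise_pred - noise) as the update signal for the scene."""
    noise = [random.gauss(0.0, 1.0) for _ in render]
    noisy = [x + t * n for x, n in zip(render, noise)]  # simplified noising
    pred = noise_pred_fn(noisy, t)
    return [weight * (p - n) for p, n in zip(pred, noise)]

# an oracle predictor that recovers the injected noise exactly: at this
# fixed point the SDS gradient vanishes, i.e. the prior is "satisfied"
render = [0.1, 0.2, 0.3]
oracle = lambda noisy, t: [(y - x) / t for y, x in zip(noisy, render)]
grad = sds_gradient(render, oracle, 0.5, 2.0)
```

The bootstrapping in the abstract alternates this update with fine-tuning the prior itself on the improved renders, so each side sharpens the other.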

    Control4D: Dynamic Portrait Editing by Learning 4D GAN from 2D Diffusion-based Editor

    Recent years have witnessed considerable achievements in editing images with text instructions. When these editors are applied to dynamic scene editing, the new-style scene tends to be temporally inconsistent due to the frame-by-frame nature of 2D editors. To tackle this issue, we propose Control4D, a novel approach for high-fidelity and temporally consistent 4D portrait editing. Control4D is built upon an efficient 4D representation paired with a 2D diffusion-based editor. Instead of using direct supervision from the editor, our method learns a 4D GAN from it, avoiding the inconsistent supervision signals. Specifically, we employ a discriminator to learn the generation distribution of the edited images and then update the generator with its discrimination signals. For more stable training, multi-level information is extracted from the edited images and used to facilitate learning of the generator. Experimental results show that Control4D surpasses previous approaches and achieves more photo-realistic and consistent 4D editing performance. The link to our project website is https://control4darxiv.github.io.
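The key idea above, learning from a discriminator rather than regressing directly onto the editor's (frame-inconsistent) outputs, can be caricatured with a one-parameter GAN. Everything here is a deliberately tiny stand-in: the "generator" and linear "discriminator" are scalars, not the paper's 4D representation or network:

```python
def train_4d_gan_sketch(edited_value=1.0, steps=500, lr=0.05, decay=0.1):
    """Toy scalar GAN: a critic d(x) = w*x scores edited (real) frames
    above rendered (fake) ones; the generator climbs the critic's score
    instead of copying the editor's output directly (illustrative only)."""
    g = 0.0  # "generator" output, standing in for the rendered scene
    w = 0.0  # linear "discriminator" weight
    for _ in range(steps):
        # critic step: raise d(real) - d(fake); weight decay keeps w bounded
        w = (1.0 - decay) * w + lr * (edited_value - g)
        # generator step: follow the critic's gradient d'(x) = w
        g += lr * w
    return g

final = train_4d_gan_sketch()  # converges near the edited distribution's mode
```

Because the generator only ever sees the critic's aggregate signal, per-frame noise in the editor's outputs is averaged into a stable target, which is the mechanism the abstract credits for temporal consistency.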

    Coherent interface strengthening of ultrahigh pressure heat-treated Mg-Li-Y alloys.

    Achieving a good strength–ductility balance in Mg alloys has always been a crucial issue for the widespread application of Mg-based structural materials. Herein, an unexpected double-stage strengthening phenomenon was discovered in Mg-8Li-1Y (wt.%) alloys through high-pressure (6 GPa) heat treatments over a range of 700-1300°C. Attractively, the yield strength values are improved remarkably without loss of ductility. The low-temperature strengthening is mainly driven by the formation of a large volume fraction of nanoscale contraction twins. In contrast, the high-temperature strengthening is ascribed to the presence of densely distributed nanosized stacking faults. Both coherent interfaces contribute effectively to high mechanical strength without any tradeoff in ductility.

    D3G: Exploring Gaussian Prior for Temporal Sentence Grounding with Glance Annotation

    Temporal sentence grounding (TSG) aims to locate a specific moment in an untrimmed video given a natural language query. Weakly supervised methods still show a large performance gap compared to fully supervised ones, while the latter require laborious timestamp annotations. In this study, we aim to reduce the annotation cost while keeping performance competitive with fully supervised methods for the TSG task. To achieve this goal, we investigate the recently proposed glance-supervised temporal sentence grounding task, which requires only a single-frame annotation (referred to as a glance annotation) for each query. Under this setup, we propose a Dynamic Gaussian prior based Grounding framework with Glance annotation (D3G), which consists of a Semantic Alignment Group Contrastive Learning module (SA-GCL) and a Dynamic Gaussian prior Adjustment module (DGA). Specifically, SA-GCL samples reliable positive moments from a 2D temporal map by jointly leveraging the Gaussian prior and semantic consistency, which contributes to aligning positive sentence-moment pairs in the joint embedding space. Moreover, to alleviate the annotation bias resulting from glance annotation and to model complex queries consisting of multiple events, we propose the DGA module, which dynamically adjusts the distribution to approximate the ground truth of target moments. Extensive experiments on three challenging benchmarks verify the effectiveness of the proposed D3G. It outperforms state-of-the-art weakly supervised methods by a large margin and narrows the performance gap to fully supervised methods. Code is available at https://github.com/solicucu/D3G.
    Comment: ICCV202
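The Gaussian prior above weights candidate moments by their distance from the single annotated glance frame. A one-dimensional sketch of that prior (the paper samples from a 2D temporal map and adjusts sigma dynamically; here sigma is a fixed assumed parameter):

```python
import math

def gaussian_prior_weights(num_clips, glance_index, sigma):
    """Normalized Gaussian prior over clip indices, centred on the
    annotated glance frame; nearby moments get larger sampling weight."""
    w = [math.exp(-((i - glance_index) ** 2) / (2 * sigma ** 2))
         for i in range(num_clips)]
    s = sum(w)
    return [x / s for x in w]

weights = gaussian_prior_weights(10, 3, 1.5)
```

SA-GCL would then draw positive moments according to these weights, filtered further by semantic consistency with the query, while DGA shifts and rescales the distribution when the glance is off-centre within the true moment.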

    Unified and Dynamic Graph for Temporal Character Grouping in Long Videos

    Video temporal character grouping locates the moments when major characters appear in a video, according to their identities. To this end, recent works have evolved from unsupervised clustering to graph-based supervised clustering. However, graph methods are built on the premise of a fixed affinity graph, which brings in many inexact connections. Besides, they extract multi-modal features with several different models, which is unfriendly to deployment. In this paper, we present a unified and dynamic graph (UniDG) framework for temporal character grouping. This is accomplished, firstly, by a unified representation network that learns representations of multiple modalities within the same space while still preserving each modality's uniqueness. Secondly, we present dynamic graph clustering, in which a different number of neighbors is dynamically constructed for each node via a cyclic matching strategy, leading to a more reliable affinity graph. Thirdly, a progressive association method is introduced to exploit spatial and temporal contexts among the different modalities, allowing the multi-modal clustering results to be well fused. As current datasets only provide pre-extracted features, we evaluate our UniDG method on a collected dataset named MTCG, which contains each character's face and body clips as well as speaking voice tracks. We also evaluate our key components on existing clustering and retrieval datasets to verify their generalization ability. Experimental results demonstrate that our method achieves promising results and outperforms several state-of-the-art approaches.
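The cyclic matching idea above can be sketched as reciprocal top-k filtering: node j survives as a neighbor of node i only if i is also among j's top-k, so each node ends up with its own (dynamic) neighbor count. This is an interpretation of the strategy from the abstract, not the paper's exact algorithm:

```python
def cyclic_neighbors(affinity, k=2):
    """Build per-node neighbor lists by keeping only reciprocal top-k
    matches in a dense affinity matrix (sketch of cyclic matching)."""
    n = len(affinity)
    # top-(k+1) indices per row by descending affinity, then drop self
    topk = [sorted(range(n), key=lambda j: -affinity[i][j])[:k + 1]
            for i in range(n)]
    topk = [[j for j in row if j != i][:k] for i, row in enumerate(topk)]
    # keep j for i only when the match is mutual
    return [[j for j in topk[i] if i in topk[j]] for i in range(n)]

# two clear clusters {0, 1} and {2, 3}; reciprocal filtering recovers them
affinity = [[1.0, 0.9, 0.1, 0.2],
            [0.9, 1.0, 0.15, 0.05],
            [0.1, 0.15, 1.0, 0.8],
            [0.2, 0.05, 0.8, 1.0]]
neighbors = cyclic_neighbors(affinity, k=1)
```

One-directional high affinities (a common source of the "inexact connections" the abstract mentions) are pruned because the match is not returned.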

    Dual-Functional PLGA Nanoparticles Co-Loaded with Indocyanine Green and Resiquimod for Prostate Cancer Treatment

    Purpose: With advances in screening techniques, there is a growing number of low-risk and intermediate-risk prostate cancer (PCa) cases, which remain a serious threat to men's health. To improve efficacy, growing interest has been directed toward emerging treatments such as immunotherapy and focal therapy. However, few studies offer guidance on whether and how to combine these modalities against PCa. This study was designed to develop dual-functional nanoparticles (NPs) that combine photothermal therapy (PTT) with immunotherapy and to determine their anti-tumor efficacy for PCa treatment. Methods: Using a double emulsion technique, the drug nanocarrier poly(lactic-co-glycolic acid) (PLGA) was co-loaded with a fluorescent dye, indocyanine green (ICG), and a toll-like receptor 7/8 (TLR7/8) agonist, resiquimod (R848), to synthesize PLGA-ICG-R848 NPs. Next, we determined their characteristic features and evaluated whether they inhibited cell viability in multiple PCa cell lines. After treatment with PLGA-ICG-R848, the maturation markers of bone marrow-derived dendritic cells (BMDCs) were detected by flow cytometry. By establishing a subcutaneous xenograft model of mouse PCa, we explored both the anti-tumor effect and the immune response following NP-based laser ablation. Results: With a mean diameter of 157.7 nm, PLGA-ICG-R848 exhibited no cytotoxic effect in PCa cells, but significantly decreased RM9 cell viability to (3.9 ± 1.0)% after laser irradiation. Moreover, PLGA-ICG-R848 promoted BMDC maturation, with significantly elevated proportions of CD11c+CD86+ and CD11c+CD80+ cells. Following PLGA-ICG-R848-based laser ablation in vivo, the decreased bioluminescent signals indicated a significant inhibition of PCa growth, while the ratio of splenic natural killer (NK) cells in the PLGA-ICG-R848 group was (3.96 ± 1.88)% compared with (0.99 ± 0.10)% in the PBS group, revealing an enhanced immune response against PCa.
    Conclusion: The dual-functional PLGA-ICG-R848 NPs under laser irradiation exhibit anti-tumor efficacy for PCa treatment by combining PTT with immunotherapy.