210 research outputs found

    Data augmentation using background replacement for automated sorting of littered waste

    The introduction of sophisticated waste treatment plants is making trash sorting and recycling increasingly effective and eco-friendly. Studies on Automated Waste Sorting (AWS) are contributing greatly to making the whole recycling process more efficient. However, an important unsolved issue is how to deal with the large amount of waste that is littered in the environment instead of being collected properly. In this paper, we introduce BackRep: a method for building waste recognizers that can be used for identifying and sorting littered waste directly where it is found. BackRep consists of a data-augmentation procedure that expands existing datasets by cropping solid waste from images taken on a uniform (white) background and superimposing it on more realistic backgrounds. For our purpose, realistic backgrounds are those representing places where solid waste is usually littered. To experiment with our data-augmentation procedure, we produced a new dataset in realistic settings. We observed that waste recognizers trained on augmented data outperform those trained on existing datasets. Hence, our data-augmentation procedure appears to be a viable approach for developing waste recognizers for urban and wild environments.
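    As a rough illustration of the kind of background-replacement step this abstract describes, here is a minimal Python sketch using OpenCV and NumPy; the white-background threshold, the morphological cleanup, and the file names are illustrative assumptions, not details from the paper.

```python
# A minimal sketch of background-replacement augmentation in the spirit of
# BackRep. The threshold heuristic and paths are assumptions, not the paper's.
import cv2
import numpy as np

def replace_background(object_img, background_img, white_thresh=240):
    """Crop a waste item shot on a white background and paste it
    onto a realistic background of the same size."""
    # Pixels close to white are treated as background (assumed heuristic).
    gray = cv2.cvtColor(object_img, cv2.COLOR_BGR2GRAY)
    mask = (gray < white_thresh).astype(np.uint8)  # 1 where the object is
    # Clean up the mask with a small morphological closing.
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    # Resize the background to the object image size (width, height order).
    background = cv2.resize(background_img,
                            (object_img.shape[1], object_img.shape[0]))
    mask3 = mask[:, :, None]  # broadcast the mask over the color channels
    return np.where(mask3 == 1, object_img, background)

# Hypothetical usage:
# obj = cv2.imread("bottle_on_white.jpg")
# bg = cv2.imread("roadside.jpg")
# cv2.imwrite("augmented.jpg", replace_background(obj, bg))
```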

    Large-Mass Ultra-Low Noise Germanium Detectors: Performance and Applications in Neutrino and Astroparticle Physics

    A new type of radiation detector, a p-type modified-electrode germanium diode, is presented. The prototype displays, for the first time, the combination of features (mass, energy threshold, and background expectation) required for a measurement of coherent neutrino-nucleus scattering in a nuclear reactor experiment. The device hybridizes the mass and energy resolution of a conventional HPGe coaxial gamma spectrometer with the low electronic noise and threshold of a small X-ray semiconductor detector, and also displays an intrinsic ability to distinguish multiple-site from single-site particle interactions. The present performance of the prototype and possible further improvements are discussed, as well as other applications for this new type of device in neutrino and astroparticle physics (double-beta decay, neutrino magnetic moment, and WIMP searches).

    PortraitNet: Real-time portrait segmentation network for mobile device

    Real-time portrait segmentation plays a significant role in many applications on mobile devices, such as background replacement in video chat or teleconferencing. In this paper, we propose a real-time portrait segmentation model, called PortraitNet, that can run effectively and efficiently on mobile devices. PortraitNet is based on a lightweight U-shape architecture with two auxiliary losses at the training stage, while no additional cost is required at the testing stage for portrait inference. The two auxiliary losses are a boundary loss and a consistency constraint loss. The former improves the accuracy of boundary pixels, and the latter enhances robustness in complex lighting environments. We evaluate PortraitNet on the EG1800 and Supervise-Portrait portrait segmentation datasets. Compared with state-of-the-art methods, our approach achieves remarkable performance in terms of both accuracy and efficiency, especially in generating results with sharper boundaries and under severe illumination conditions. Meanwhile, PortraitNet is capable of processing 224 × 224 RGB images at 30 FPS on an iPhone 7.
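    For intuition, here is a hedged PyTorch sketch of how a training objective with the two auxiliary losses named in the abstract might be assembled; the exact loss formulations, weights, and the model's output heads are assumptions for illustration, not the authors' definitions.

```python
# A PortraitNet-style training objective sketch: main segmentation loss plus
# a boundary loss and a consistency constraint loss (forms are assumed).
import torch
import torch.nn.functional as F

def portraitnet_style_loss(model, image, image_relit, mask, boundary,
                           w_boundary=0.5, w_consistency=2.0):
    """mask: (N, H, W) long labels; boundary: (N, 1, H, W) float edge map.
    Assumed model interface: returns (segmentation logits, boundary logits)."""
    seg_logits, bnd_logits = model(image)
    seg_logits_aug, _ = model(image_relit)  # same image under altered lighting

    # Main segmentation loss on the original image.
    seg_loss = F.cross_entropy(seg_logits, mask)

    # Boundary loss: supervise an auxiliary head with a boundary map
    # (e.g. extracted from the ground-truth mask with an edge filter).
    bnd_loss = F.binary_cross_entropy_with_logits(bnd_logits, boundary)

    # Consistency constraint: predictions for the relit image should match
    # the (detached) predictions for the original image.
    p_orig = F.softmax(seg_logits.detach(), dim=1)
    log_p_aug = F.log_softmax(seg_logits_aug, dim=1)
    cons_loss = F.kl_div(log_p_aug, p_orig, reduction="batchmean")

    return seg_loss + w_boundary * bnd_loss + w_consistency * cons_loss
```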

    LayerDiffusion: Layered Controlled Image Editing with Diffusion Models

    Text-guided image editing has recently experienced rapid development. However, simultaneously performing multiple editing actions on a single image, such as background replacement and specific subject attribute changes, while maintaining consistency between the subject and the background remains challenging. In this paper, we propose LayerDiffusion, a semantic-based layered controlled image editing method. Our method enables non-rigid editing and attribute modification of specific subjects while preserving their unique characteristics and seamlessly integrating them into new backgrounds. We leverage a large-scale text-to-image model and employ a layered controlled optimization strategy combined with layered diffusion training. During the diffusion process, an iterative guidance strategy is used to generate a final image that aligns with the textual description. Experimental results demonstrate the effectiveness of our method in generating highly coherent images that closely align with the given textual description. The edited images maintain a high similarity to the features of the input image and surpass the performance of current leading image editing methods. LayerDiffusion opens up new possibilities for controllable image editing.
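    To make the idea of layered, region-wise guidance concrete, here is a deliberately simplified sketch of a reverse-diffusion loop that steers the subject and background regions with separate prompts; the denoiser interface and the simple mask-based latent blend are assumptions for illustration, not LayerDiffusion's actual algorithm.

```python
# A hedged sketch of layered guidance during reverse diffusion: each text
# condition ("layer") only steers its own masked region of the latent.
import torch

@torch.no_grad()
def layered_denoise(denoise_step, z_T, subject_prompt, background_prompt,
                    layer_mask, timesteps):
    """denoise_step(z, t, prompt) -> less-noisy latent (assumed interface).
    layer_mask: tensor that is 1 on the subject region, 0 on the background,
    with the same spatial size as the latent (assumed)."""
    z = z_T
    for t in timesteps:
        # Run the same denoising step under the two text conditions.
        z_subject = denoise_step(z, t, subject_prompt)
        z_background = denoise_step(z, t, background_prompt)
        # Compose the layers so each prompt edits only its own region.
        z = layer_mask * z_subject + (1 - layer_mask) * z_background
    return z
```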

    PFB-Diff: Progressive Feature Blending Diffusion for Text-driven Image Editing

    Diffusion models have showcased their remarkable capability to synthesize diverse and high-quality images, sparking interest in their application for real image editing. However, existing diffusion-based approaches for local image editing often suffer from undesired artifacts due to the pixel-level blending of the noised target images and diffusion latent variables, which lack the necessary semantics for maintaining image consistency. To address these issues, we propose PFB-Diff, a Progressive Feature Blending method for Diffusion-based image editing. Unlike previous methods, PFB-Diff seamlessly integrates text-guided generated content into the target image through multi-level feature blending. The rich semantics encoded in deep features and the progressive blending scheme from high to low levels ensure semantic coherence and high quality in edited images. Additionally, we introduce an attention masking mechanism in the cross-attention layers to confine the impact of specific words to desired regions, further improving the performance of background editing. PFB-Diff can effectively address various editing tasks, including object/background replacement and object attribute editing. Our method demonstrates its superior performance in terms of image fidelity, editing accuracy, efficiency, and faithfulness to the original image, without the need for fine-tuning or training.
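    The following hedged sketch illustrates the general idea of multi-level feature blending with a downsampled edit mask; the choice of blend levels and the mask interpolation are assumptions for illustration, not PFB-Diff's exact scheme.

```python
# Multi-level feature blending sketch: at selected feature levels of a
# denoising network, keep generated features inside the edit region and
# original-image features outside it.
import torch
import torch.nn.functional as F

def blend_features(gen_feats, orig_feats, edit_mask, blend_levels):
    """gen_feats / orig_feats: lists of (N, C, H, W) feature maps, ordered
    from deep (low resolution) to shallow; edit_mask: (N, 1, H0, W0) float,
    1 where new content should appear."""
    blended = []
    for level, (f_gen, f_orig) in enumerate(zip(gen_feats, orig_feats)):
        if level in blend_levels:  # blend only at the selected (deeper) levels
            # Downsample the mask to this level's spatial resolution.
            m = F.interpolate(edit_mask, size=f_gen.shape[-2:], mode="nearest")
            f = m * f_gen + (1 - m) * f_orig
        else:
            f = f_gen
        blended.append(f)
    return blended
```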

    HeadOn: Real-time Reenactment of Human Portrait Videos

    We propose HeadOn, the first real-time source-to-target reenactment approach for complete human portrait videos that enables transfer of torso and head motion, facial expression, and eye gaze. Given a short RGB-D video of the target actor, we automatically construct a personalized geometry proxy that embeds a parametric head, eye, and kinematic torso model. A novel real-time reenactment algorithm employs this proxy to photo-realistically map the captured motion from the source actor to the target actor. On top of the coarse geometric proxy, we propose a video-based rendering technique that composites the modified target portrait video via view- and pose-dependent texturing, and creates photo-realistic imagery of the target actor under novel torso and head poses, facial expressions, and gaze directions. To this end, we propose robust tracking of the face and torso of the source actor. We extensively evaluate our approach and show significant improvements in enabling much greater flexibility in creating realistic reenacted output videos.
    Video: https://www.youtube.com/watch?v=7Dg49wv2c_g
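    As a generic illustration of the view- and pose-dependent texturing idea mentioned above (not HeadOn's actual renderer), here is a small NumPy sketch that blends textures from captured views according to how closely each view's direction matches the novel viewpoint; the exponential weighting and its sharpness parameter are assumptions.

```python
# View-dependent texturing sketch: weight each captured view by its angular
# proximity to the novel view direction, then blend the sampled textures.
import numpy as np

def view_dependent_blend(textures, view_dirs, novel_dir, sharpness=8.0):
    """textures: (K, H, W, 3) texture samples from K captured views;
    view_dirs: (K, 3) unit view directions; novel_dir: (3,) unit direction."""
    cosines = view_dirs @ novel_dir                 # similarity per view
    weights = np.exp(sharpness * (cosines - 1.0))   # favor nearby views
    weights /= weights.sum()                        # normalize to sum to 1
    return np.tensordot(weights, textures, axes=1)  # weighted texture blend
```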