1,949 research outputs found

    PCGAN: Partition-Controlled Human Image Generation

    Human image generation is a very challenging task since it is affected by many factors. Many human image generation methods focus on generating human images conditioned on a given pose, while the generated backgrounds are often blurred. In this paper, we propose a novel Partition-Controlled GAN to generate human images according to a target pose and background. First, human poses in the given images are extracted, and the foreground/background are partitioned for further use. Second, we extract and fuse appearance features, pose features and background features to generate the desired images. Experiments on the Market-1501 and DeepFashion datasets show that our model not only generates realistic human images but also produces the desired human pose and background. Extensive experiments on the COCO and LIP datasets indicate the potential of our method. Comment: AAAI 2019 version

    Medical image synthesis using generative adversarial networks: towards photo-realistic image synthesis

    This work addresses photo-realism for synthetic images. We introduce a modified generative adversarial network, StencilGAN: a perceptually-aware generative adversarial network that synthesizes images based on overlaid labelled masks. This technique can be a prominent solution to the scarcity of resources in the healthcare sector.

    Amodal Instance Segmentation and Multi-Object Tracking with Deep Pixel Embedding

    This thesis extends the representational output of semantic instance segmentation by explicitly including both visible and occluded parts. A fully convolutional network is trained to produce consistent pixel-level embeddings across two layers such that, when clustered, the results convey the full spatial extent and depth ordering of each instance. Results demonstrate that the network can accurately estimate complete masks in the presence of occlusion and outperforms leading top-down bounding-box approaches. The model is further extended to produce consistent pixel-level embeddings across two consecutive video frames, performing amodal instance segmentation and multi-object tracking simultaneously. No post-processing tracker or Hungarian algorithm is needed for multi-object tracking. The advantages and disadvantages of such a bounding-box-free approach are studied thoroughly. Experiments show that the proposed method outperforms the state-of-the-art bounding-box-based approach on tracking animated moving objects. Advisors: Eric T. Psota and Lance C. Pére
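    The clustering step described above, turning per-pixel embeddings into instance masks, can be sketched as follows. This is a minimal illustrative stand-in, assuming a simple greedy Euclidean-distance clustering; the thesis does not specify this exact algorithm, and the `threshold` value is a hypothetical parameter.

    ```python
    import numpy as np

    def cluster_embeddings(emb, threshold=0.5):
        """Greedily cluster per-pixel embeddings into instance ids.

        emb: (H, W, D) array of learned embeddings. A pixel joins the first
        cluster whose centre lies within `threshold` (Euclidean distance);
        otherwise it starts a new cluster. Returns an (H, W) label map.
        """
        H, W, D = emb.shape
        flat = emb.reshape(-1, D)
        labels = -np.ones(len(flat), dtype=int)
        centres = []
        for i, e in enumerate(flat):
            for cid, c in enumerate(centres):
                if np.linalg.norm(e - c) < threshold:
                    labels[i] = cid
                    break
            else:
                centres.append(e)
                labels[i] = len(centres) - 1
        return labels.reshape(H, W)

    # toy example: two well-separated embedding values -> two instances
    emb = np.zeros((2, 4, 2))
    emb[:, 2:, :] = 5.0          # right half belongs to a second "instance"
    masks = cluster_embeddings(emb)
    ```

    Because occluded parts of an instance carry the same embedding as its visible parts, clustering recovers full (amodal) masks rather than only the visible fragments.
    
    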

    Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics

    This paper proposes a method to enhance video object detection for indoor environments in robotics. Concretely, it exploits knowledge about the camera motion between frames to propagate previously detected objects to successive frames. The proposal is rooted in the concepts of planar homography, to propose regions of interest where to find objects, and recursive Bayesian filtering, to integrate observations over time. The proposal is evaluated on six virtual indoor environments, accounting for the detection of nine object classes over a total of ∼ 7k frames. Results show that our proposal improves recall and F1-score by factors of 1.41 and 1.27, respectively, and achieves a significant reduction of the object categorization entropy (58.8%) when compared to a two-stage video object detection method used as a baseline, at the cost of a small time overhead (120 ms) and a slight precision loss (0.92).
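    The two ingredients named in the abstract can be sketched in a few lines. This is a hedged illustration, not the paper's implementation: `warp_box_by_homography` shows how a planar homography propagates a detection box between frames, and `bayes_update` shows a discrete recursive Bayesian update of a per-object class belief; both function names and the 2-class example are assumptions.

    ```python
    import numpy as np

    def warp_box_by_homography(box, H):
        """Propagate a detection box from frame t to frame t+1 using the
        inter-frame planar homography H (a 3x3 matrix): warp the four
        corners in homogeneous coordinates, then re-box them."""
        x1, y1, x2, y2 = box
        corners = np.array([[x1, y1, 1], [x2, y1, 1],
                            [x2, y2, 1], [x1, y2, 1]], dtype=float).T
        warped = H @ corners
        warped /= warped[2]                    # back to inhomogeneous coords
        xs, ys = warped[0], warped[1]
        return (xs.min(), ys.min(), xs.max(), ys.max())

    def bayes_update(belief, likelihood):
        """Recursive Bayesian filtering over class labels: fuse the prior
        belief with a new detector observation (both probability vectors)."""
        posterior = belief * likelihood
        return posterior / posterior.sum()

    # identity homography leaves the box unchanged
    box = warp_box_by_homography((10, 20, 30, 40), np.eye(3))
    # a confident observation sharpens an initially uniform class belief
    belief = bayes_update(np.array([0.5, 0.5]), np.array([0.9, 0.1]))
    ```

    Repeating the update as detections arrive is what drives down the categorization entropy the paper reports.
    
    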

    Joint Supervised and Self-Supervised Learning for 3D Real-World Challenges

    Point cloud processing and 3D shape understanding are very challenging tasks for which deep learning techniques have demonstrated great potential. Still, further progress is essential to allow artificial intelligent agents to interact with the real world, where the amount of annotated data may be limited and integrating new sources of knowledge becomes crucial to support autonomous learning. Here we consider several possible scenarios involving synthetic and real-world point clouds where supervised learning fails due to data scarcity and large domain gaps. We propose to enrich standard feature representations by leveraging self-supervision through a multi-task model that solves a 3D puzzle while learning the main task of shape classification or part segmentation. An extensive analysis investigating few-shot, transfer-learning and cross-domain settings shows the effectiveness of our approach, with state-of-the-art results for 3D shape classification and part segmentation.
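    The 3D-puzzle pretext task mentioned above can be sketched as follows: partition a point cloud into grid cells, permute the cells, and train the network to recover the permutation. This is an illustrative sketch of the general idea under assumed details (a k x k x k grid, uniform cell permutation); it is not the paper's exact recipe.

    ```python
    import numpy as np

    def make_puzzle(points, k=2, seed=0):
        """Build one self-supervised 3D-puzzle sample.

        points: (N, 3) point cloud. Each point is assigned to one of k**3
        axis-aligned grid cells; the cells are permuted, and every point is
        translated from its own cell to the permuted cell. Returns the
        displaced cloud and the permutation the network must predict.
        """
        rng = np.random.default_rng(seed)
        mins, maxs = points.min(0), points.max(0)
        # grid index of each point along x, y, z
        cell = np.minimum(((points - mins) / (maxs - mins + 1e-9) * k).astype(int),
                          k - 1)
        cell_id = cell[:, 0] * k * k + cell[:, 1] * k + cell[:, 2]
        perm = rng.permutation(k ** 3)         # the prediction target
        size = (maxs - mins) / k               # physical size of one cell
        dst_id = perm[cell_id]
        dst = np.stack([dst_id // (k * k), (dst_id // k) % k, dst_id % k], axis=1)
        # translate each point from its own cell to the permuted cell
        shuffled = points + (dst - cell) * size
        return shuffled, perm

    pts = np.random.default_rng(1).uniform(size=(256, 3))
    shuffled, perm = make_puzzle(pts)
    ```

    Solving this puzzle requires no labels, so its loss can be added to the supervised classification or part-segmentation loss in the multi-task model.
    
    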

    Volumetric Segmentation of Dental CT Data

    The main goal of this work was to use neural networks for volumetric segmentation of dental CBCT data. As byproducts, a new dataset including both sparse and dense annotations and an automatic data-preprocessing pipeline were produced. Additionally, the possibility of applying transfer learning and multi-phase training to improve segmentation results was tested. From the various tests that were carried out, it can be concluded that both multi-phase training and transfer learning yielded a substantial improvement in Dice score over the baseline method, for both sparse and dense annotations.