1,949 research outputs found

    PCGAN: Partition-Controlled Human Image Generation

    Human image generation is a very challenging task since it is affected by many factors. Many human image generation methods focus on generating human images conditioned on a given pose, while the generated backgrounds are often blurred. In this paper, we propose a novel Partition-Controlled GAN to generate human images according to a target pose and background. First, human poses in the given images are extracted, and the foreground/background are partitioned for further use. Second, we extract and fuse appearance features, pose features and background features to generate the desired images. Experiments on the Market-1501 and DeepFashion datasets show that our model not only generates realistic human images but also produces the desired human pose and background. Extensive experiments on the COCO and LIP datasets indicate the potential of our method. Comment: AAAI 2019 version

    Medical image synthesis using generative adversarial networks: towards photo-realistic image synthesis

    This work addresses photo-realism for synthetic images. We introduce a modified generative adversarial network, StencilGAN: a perceptually-aware generative adversarial network that synthesizes images based on overlaid labelled masks. This technique can be a prominent solution to the scarcity of resources in the healthcare sector.

    Amodal Instance Segmentation and Multi-Object Tracking with Deep Pixel Embedding

    This thesis extends the representational output of semantic instance segmentation by explicitly including both visible and occluded parts. A fully convolutional network is trained to produce consistent pixel-level embeddings across two layers such that, when clustered, the results convey the full spatial extent and depth ordering of each instance. Results demonstrate that the network can accurately estimate complete masks in the presence of occlusion and outperforms leading top-down bounding-box approaches. The model is further extended to produce consistent pixel-level embeddings across two consecutive video frames, performing amodal instance segmentation and multi-object tracking simultaneously. No post-processing tracker or Hungarian algorithm is needed for multi-object tracking. The advantages and disadvantages of such a bounding-box-free approach are studied thoroughly. Experiments show that the proposed method outperforms the state-of-the-art bounding-box-based approach on tracking animated moving objects. Advisors: Eric T. Psota and Lance C. Pére
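    The clustering step described above, turning per-pixel embeddings into instance masks, can be sketched as follows. This is a minimal illustrative stand-in, assuming a simple greedy Euclidean-distance clustering; the thesis does not specify this exact algorithm, and the `threshold` value is a hypothetical parameter.

    ```python
    import numpy as np

    def cluster_embeddings(emb, threshold=0.5):
        """Greedily cluster per-pixel embeddings into instance ids.

        emb: (H, W, D) array of learned embeddings. A pixel joins the first
        cluster whose centre lies within `threshold` (Euclidean distance);
        otherwise it starts a new cluster. Returns an (H, W) label map.
        """
        H, W, D = emb.shape
        flat = emb.reshape(-1, D)
        labels = -np.ones(len(flat), dtype=int)
        centres = []
        for i, e in enumerate(flat):
            for cid, c in enumerate(centres):
                if np.linalg.norm(e - c) < threshold:
                    labels[i] = cid
                    break
            else:
                centres.append(e)
                labels[i] = len(centres) - 1
        return labels.reshape(H, W)

    # toy example: two well-separated embedding values -> two instances
    emb = np.zeros((2, 4, 2))
    emb[:, 2:, :] = 5.0          # right half belongs to a second "instance"
    masks = cluster_embeddings(emb)
    ```

    Because occluded parts of an instance carry the same embedding as its visible parts, clustering recovers full (amodal) masks rather than only the visible fragments.
    
    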

    Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics

    This paper proposes a method to enhance video object detection for indoor environments in robotics. Concretely, it exploits knowledge about the camera motion between frames to propagate previously detected objects to successive frames. The proposal is rooted in the concepts of planar homography, to propose regions of interest where to find objects, and recursive Bayesian filtering, to integrate observations over time. The proposal is evaluated on six virtual indoor environments, accounting for the detection of nine object classes over a total of ∼ 7k frames. Results show that our proposal improves recall and F1-score by factors of 1.41 and 1.27, respectively, and achieves a significant reduction of the object categorization entropy (58.8%) when compared to a two-stage video object detection method used as a baseline, at the cost of a small time overhead (120 ms) and a slight precision loss (0.92).
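    The two ingredients named in the abstract can be sketched in a few lines. This is a hedged illustration, not the paper's implementation: `warp_box_by_homography` shows how a planar homography propagates a detection box between frames, and `bayes_update` shows a discrete recursive Bayesian update of a per-object class belief; both function names and the 2-class example are assumptions.

    ```python
    import numpy as np

    def warp_box_by_homography(box, H):
        """Propagate a detection box from frame t to frame t+1 using the
        inter-frame planar homography H (a 3x3 matrix): warp the four
        corners in homogeneous coordinates, then re-box them."""
        x1, y1, x2, y2 = box
        corners = np.array([[x1, y1, 1], [x2, y1, 1],
                            [x2, y2, 1], [x1, y2, 1]], dtype=float).T
        warped = H @ corners
        warped /= warped[2]                    # back to inhomogeneous coords
        xs, ys = warped[0], warped[1]
        return (xs.min(), ys.min(), xs.max(), ys.max())

    def bayes_update(belief, likelihood):
        """Recursive Bayesian filtering over class labels: fuse the prior
        belief with a new detector observation (both probability vectors)."""
        posterior = belief * likelihood
        return posterior / posterior.sum()

    # identity homography leaves the box unchanged
    box = warp_box_by_homography((10, 20, 30, 40), np.eye(3))
    # a confident observation sharpens an initially uniform class belief
    belief = bayes_update(np.array([0.5, 0.5]), np.array([0.9, 0.1]))
    ```

    Repeating the update as detections arrive is what drives down the categorization entropy the paper reports.
    
    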

    Joint Supervised and Self-Supervised Learning for 3D Real-World Challenges

    Point cloud processing and 3D shape understanding are very challenging tasks for which deep learning techniques have demonstrated great potential. Still, further progress is essential to allow artificial intelligent agents to interact with the real world, where the amount of annotated data may be limited and integrating new sources of knowledge becomes crucial to support autonomous learning. Here we consider several possible scenarios involving synthetic and real-world point clouds where supervised learning fails due to data scarcity and large domain gaps. We propose to enrich standard feature representations by leveraging self-supervision through a multi-task model that solves a 3D puzzle while learning the main task of shape classification or part segmentation. An extensive analysis investigating few-shot, transfer-learning and cross-domain settings shows the effectiveness of our approach, with state-of-the-art results for 3D shape classification and part segmentation.
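    The 3D-puzzle pretext task mentioned above can be sketched as follows: partition a point cloud into grid cells, permute the cells, and train the network to recover the permutation. This is an illustrative sketch of the general idea under assumed details (a k x k x k grid, uniform cell permutation); it is not the paper's exact recipe.

    ```python
    import numpy as np

    def make_puzzle(points, k=2, seed=0):
        """Build one self-supervised 3D-puzzle sample.

        points: (N, 3) point cloud. Each point is assigned to one of k**3
        axis-aligned grid cells; the cells are permuted, and every point is
        translated from its own cell to the permuted cell. Returns the
        displaced cloud and the permutation the network must predict.
        """
        rng = np.random.default_rng(seed)
        mins, maxs = points.min(0), points.max(0)
        # grid index of each point along x, y, z
        cell = np.minimum(((points - mins) / (maxs - mins + 1e-9) * k).astype(int),
                          k - 1)
        cell_id = cell[:, 0] * k * k + cell[:, 1] * k + cell[:, 2]
        perm = rng.permutation(k ** 3)         # the prediction target
        size = (maxs - mins) / k               # physical size of one cell
        dst_id = perm[cell_id]
        dst = np.stack([dst_id // (k * k), (dst_id // k) % k, dst_id % k], axis=1)
        # translate each point from its own cell to the permuted cell
        shuffled = points + (dst - cell) * size
        return shuffled, perm

    pts = np.random.default_rng(1).uniform(size=(256, 3))
    shuffled, perm = make_puzzle(pts)
    ```

    Solving this puzzle requires no labels, so its loss can be added to the supervised classification or part-segmentation loss in the multi-task model.
    
    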

    Volumetric Segmentation of Dental CT Data

    The main goal of this work was to use neural networks for volumetric segmentation of dental CBCT data. As byproducts, a new dataset including both sparse and dense annotations and an automatic data-preprocessing pipeline were produced. Additionally, the possibility of applying transfer learning and multi-phase training to improve segmentation results was tested. From the various tests that were carried out, it can be concluded that both multi-phase training and transfer learning yielded a substantial improvement in Dice score over the baseline method, for both sparse and dense annotations.