
    Fast Deep Matting for Portrait Animation on Mobile Phone

    Image matting plays an important role in image and video editing. However, the formulation of image matting is inherently ill-posed. Traditional methods usually rely on user interaction, such as trimaps and strokes, to constrain the problem, and cannot run on mobile phones in real time. In this paper, we propose a real-time automatic deep matting approach for mobile devices. By leveraging densely connected blocks and dilated convolutions, a lightweight fully convolutional network is designed to predict a coarse binary mask for portrait images. A feathering block, which is edge-preserving and matting-adaptive, is further developed to learn a guided filter and transform the binary mask into an alpha matte. Finally, an automatic portrait animation system based on fast deep matting is built on mobile devices; it requires no interaction and achieves real-time matting at 15 fps. Experiments show that the proposed approach achieves results comparable to state-of-the-art matting solvers. Comment: ACM Multimedia Conference (MM) 2017 camera-ready.
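
    The feathering-block idea can be sketched in code: a small convolutional head predicts per-pixel linear coefficients (a, b) from the image and the coarse mask, and the refined alpha is a * mask + b, in the spirit of a learned guided filter. The PyTorch sketch below is an illustration under assumed layer sizes and channel layouts, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class FeatheringBlock(nn.Module):
    """Sketch of a learned, guided-filter-style feathering block.

    Predicts per-pixel linear coefficients (a, b) from the input image and
    the coarse mask, then refines the mask as alpha = a * mask + b, which
    preserves edges from the colour image. Layer sizes are illustrative
    assumptions, not the paper's exact configuration.
    """

    def __init__(self, in_channels: int = 3 + 2):  # RGB + 2-channel coarse mask
        super().__init__()
        self.coeffs = nn.Sequential(
            nn.Conv2d(in_channels, 8, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(8, 2, kernel_size=3, padding=1),  # -> (a, b) per pixel
        )

    def forward(self, image: torch.Tensor, coarse_mask: torch.Tensor) -> torch.Tensor:
        # image: (N, 3, H, W); coarse_mask: (N, 2, H, W) fg/bg scores from the light FCN
        x = torch.cat([image, coarse_mask], dim=1)
        a, b = self.coeffs(x).chunk(2, dim=1)
        fg = coarse_mask[:, 1:2]           # foreground score channel
        alpha = a * fg + b                 # edge-preserving linear refinement
        return alpha.clamp(0.0, 1.0)
```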

    PortraitNet: Real-time portrait segmentation network for mobile device

    Real-time portrait segmentation plays a significant role in many applications on mobile devices, such as background replacement in video chat or teleconferencing. In this paper, we propose a real-time portrait segmentation model, called PortraitNet, that runs effectively and efficiently on mobile devices. PortraitNet is based on a lightweight U-shaped architecture with two auxiliary losses at the training stage, while no additional cost is incurred at the testing stage for portrait inference. The two auxiliary losses are a boundary loss and a consistency constraint loss: the former improves the accuracy of boundary pixels, and the latter enhances robustness in complex lighting environments. We evaluate PortraitNet on the portrait segmentation datasets EG1800 and Supervise-Portrait. Compared with state-of-the-art methods, our approach achieves remarkable performance in both accuracy and efficiency, especially in generating results with sharper boundaries and under severe illumination conditions. Meanwhile, PortraitNet is capable of processing 224 × 224 RGB images at 30 FPS on an iPhone 7.
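
    As an illustration of the two auxiliary losses, the sketch below shows one plausible PyTorch formulation: a boundary loss that supervises an extra boundary head with edges derived from the ground-truth mask, and a consistency loss that penalizes divergence between predictions on an image and a texture-augmented copy. The exact loss forms (the paper uses a focal-style boundary term) and shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def boundary_loss(boundary_logits: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Sketch of the boundary auxiliary loss. mask: (N, 1, H, W) float in {0, 1}.
    Plain BCE is used here for simplicity; the paper uses a focal-style variant."""
    # Derive a boundary target from the mask with a morphological gradient.
    kernel = torch.ones(1, 1, 3, 3, device=mask.device)
    dilated = F.conv2d(mask, kernel, padding=1).clamp(0, 1)
    eroded = 1 - F.conv2d(1 - mask, kernel, padding=1).clamp(0, 1)
    edges = (dilated - eroded).clamp(0, 1)
    return F.binary_cross_entropy_with_logits(boundary_logits, edges)

def consistency_loss(logits_orig: torch.Tensor, logits_aug: torch.Tensor,
                     T: float = 1.0) -> torch.Tensor:
    """Consistency constraint between predictions on the original image and a
    texture-augmented copy (e.g. colour jitter), as described in the abstract."""
    p = F.softmax(logits_orig.detach() / T, dim=1)   # original branch as teacher
    log_q = F.log_softmax(logits_aug / T, dim=1)
    return F.kl_div(log_q, p, reduction="batchmean") * T * T
```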

    BGGAN: Bokeh-Glass Generative Adversarial Network for Rendering Realistic Bokeh

    A photo captured with a bokeh effect has objects in focus rendered sharply while out-of-focus areas are blurred. DSLR cameras can easily render this kind of effect naturally. However, due to sensor limitations, smartphones cannot capture images with depth-of-field effects directly. In this paper, we propose a novel generator called Glass-Net, which generates bokeh images without relying on complex hardware. Meanwhile, the GAN-based method and a perceptual loss are combined to render a realistic bokeh effect in the fine-tuning stage. Moreover, Instance Normalization (IN) is reimplemented in our network, which ensures that our TFLite model with IN can be accelerated on smartphone GPUs. Experiments show that our method renders a high-quality bokeh effect and processes one 1024 × 1536 pixel image in 1.9 seconds on all smartphone chipsets. This approach ranked first in the AIM 2020 Rendering Realistic Bokeh Challenge, Track 1 & Track 2. Comment: accepted by ECCV workshops 2020.
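
    The point of reimplementing Instance Normalization is that IN composed from primitive reduction and elementwise ops converts into TFLite operations the GPU delegate can execute, rather than an unsupported fused op. A minimal TensorFlow sketch, with assumed NHWC layout and per-channel gamma/beta parameters (not the paper's exact code):

```python
import tensorflow as tf

def instance_norm(x: tf.Tensor, gamma: tf.Tensor, beta: tf.Tensor,
                  eps: float = 1e-5) -> tf.Tensor:
    """Instance normalization built only from reduce_mean and elementwise ops,
    which map onto TFLite GPU-delegate-friendly kernels.

    x: (N, H, W, C); gamma/beta: per-channel scale and shift of shape (C,)."""
    mean = tf.reduce_mean(x, axis=[1, 2], keepdims=True)   # per-instance, per-channel mean
    var = tf.reduce_mean(tf.square(x - mean), axis=[1, 2], keepdims=True)
    x_hat = (x - mean) * tf.math.rsqrt(var + eps)          # normalize each instance
    return gamma * x_hat + beta
```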

    Workflow for reducing semantic segmentation annotation time

    Abstract. Semantic segmentation is a challenging task within the field of pattern recognition from digital images. Current semantic segmentation methods based on neural networks show great promise in accurate pixel-level classification, but they appear to be limited, at least in part, by the availability of accurate training data. Semantic segmentation training data is typically curated by humans, yet the task is slow and tedious even for them: while humans are fast at checking whether a segmentation is accurate, creating segmentations is slow because the human visual system is constrained by physical interfaces such as hand-eye coordination when drawing segmentations by hand. This thesis evaluates a workflow that aims to reduce the need for hand-drawn segmentations when creating an accurate set of training data. A publicly available dataset is used as the starting point for the annotation process, and four different evaluation sets are used to assess the proposed annotation workflow in terms of labour efficiency and annotation accuracy. Evaluation of the results indicates that the workflow can produce annotations comparable in accuracy to manually corrected annotations while requiring significantly less manual labour.
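
    The workflow described can be summarized as a review-then-correct loop: a model proposes segmentations, a human performs only the fast accept/reject check, and slow manual drawing happens only for rejected proposals. A minimal Python sketch, where model, is_acceptable, and correct_by_hand are hypothetical placeholder callables, not names from the thesis:

```python
def build_training_set(images, model, is_acceptable, correct_by_hand):
    """Sketch of a review-then-correct annotation workflow.

    All four parameters are hypothetical placeholders: `model` proposes a
    mask, `is_acceptable` is the fast human yes/no check, and
    `correct_by_hand` is the slow manual drawing step, invoked only when
    the proposal is rejected."""
    annotations = []
    for image in images:
        proposal = model(image)                # cheap machine proposal
        if is_acceptable(image, proposal):     # fast human accept/reject check
            annotations.append((image, proposal))
        else:                                  # slow manual correction, only when needed
            annotations.append((image, correct_by_hand(image, proposal)))
    return annotations
```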