2,922 research outputs found
Workflow for reducing semantic segmentation annotation time
Abstract. Semantic segmentation is a challenging task in pattern recognition from digital images. Current neural-network-based semantic segmentation methods show great promise for accurate pixel-level classification, but they appear to be limited, at least to some extent, by the availability of accurate training data. Semantic segmentation training data is typically curated by humans, which is a slow and tedious task. Humans are fast at checking whether a segmentation is accurate, but creating segmentations is slow because the human visual system is bottlenecked by physical interfaces, such as the hand-eye coordination needed to draw segmentations by hand. This thesis evaluates a workflow that aims to reduce the need for hand-drawn segmentations when creating an accurate set of training data.
A publicly available dataset is used as the starting point for the annotation process, and four different evaluation sets are used to evaluate the introduced annotation workflow in terms of labour efficiency and annotation accuracy.
Evaluation of the results indicates that the workflow can produce annotations comparable in accuracy to manually corrected annotations while requiring significantly less manual labour.
Finnish title: Työnkulku semanttisen segmentoinnin annotointiajan vähentämiseen.
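The abstract above compares workflow-produced annotations with manually corrected ones but does not name the accuracy metric. As a minimal sketch, assuming mean intersection-over-union (IoU) is an acceptable stand-in, the comparison between two label maps could look like this. The function name `mean_iou` and its parameters are hypothetical and not taken from the thesis.

```python
import numpy as np

def mean_iou(pred, ref, num_classes):
    """Mean IoU between two integer label maps of the same shape.

    Assumption: IoU is used here only as an illustrative accuracy
    metric; the thesis abstract does not state which metric it uses.
    """
    pred = np.asarray(pred)
    ref = np.asarray(ref)
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        ref_c = ref == c
        union = np.logical_or(pred_c, ref_c).sum()
        if union == 0:
            # Class absent from both maps: skip it rather than count it.
            continue
        inter = np.logical_and(pred_c, ref_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious)) if ious else float("nan")

# Hypothetical 2x2 example: workflow output vs. manually corrected map.
workflow = [[0, 1], [1, 1]]
manual = [[0, 1], [0, 1]]
score = mean_iou(workflow, manual, num_classes=2)
```

Here `workflow` would be the annotation produced by the workflow and `manual` the manually corrected annotation for the same image.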
Deep Bilateral Learning for Real-Time Image Enhancement
Performance is a critical challenge in mobile image processing. Given a
reference imaging pipeline, or even human-adjusted pairs of images, we seek to
reproduce the enhancements and enable real-time evaluation. For this, we
introduce a new neural network architecture inspired by bilateral grid
processing and local affine color transforms. Using pairs of input/output
images, we train a convolutional neural network to predict the coefficients of
a locally-affine model in bilateral space. Our architecture learns to make
local, global, and content-dependent decisions to approximate the desired image
transformation. At runtime, the neural network consumes a low-resolution
version of the input image, produces a set of affine transformations in
bilateral space, upsamples those transformations in an edge-preserving fashion
using a new slicing node, and then applies those upsampled transformations to
the full-resolution image. Our algorithm processes high-resolution images on a
smartphone in milliseconds, provides a real-time viewfinder at 1080p
resolution, and matches the quality of state-of-the-art approximation
techniques on a large class of image operators. Unlike previous work, our model
is trained off-line from data and therefore does not require access to the
original operator at runtime. This allows our model to learn complex,
scene-dependent transformations for which no reference implementation is
available, such as the photographic edits of a human retoucher.
Comment: 12 pages, 14 figures, Siggraph 201
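The runtime steps described in this abstract (a low-resolution grid of affine transforms, edge-preserving upsampling by slicing, then applying the transforms at full resolution) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the grid layout `(gh, gw, gd, 3, 4)`, the trilinear interpolation, the use of mean intensity as the guide, and the function name `slice_and_apply` are all assumptions made for illustration, and the network that would predict the grid is omitted.

```python
import numpy as np

def slice_and_apply(grid, image, guide):
    """Slice a bilateral grid of affine transforms and apply them.

    grid:  (gh, gw, gd, 3, 4) per-cell 3x4 affine color transforms
           (assumed layout; in the paper a network predicts these).
    image: (H, W, 3) full-resolution RGB image with values in [0, 1].
    guide: (H, W) guidance map in [0, 1] selecting the grid depth.
    """
    gh, gw, gd = grid.shape[:3]
    H, W = guide.shape
    # Continuous grid coordinates for every full-resolution pixel.
    ys = np.broadcast_to(np.linspace(0, gh - 1, H)[:, None], (H, W))
    xs = np.broadcast_to(np.linspace(0, gw - 1, W)[None, :], (H, W))
    zs = np.clip(guide, 0.0, 1.0) * (gd - 1)

    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, gh - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, gw - 1)
    z0 = np.floor(zs).astype(int); z1 = np.minimum(z0 + 1, gd - 1)
    fy, fx, fz = ys - y0, xs - x0, zs - z0

    # Trilinear interpolation of the coefficients. Because the depth
    # coordinate follows the guide, pixels on opposite sides of an
    # intensity edge read different transforms.
    coeffs = np.zeros((H, W, 3, 4))
    for yi, wy in ((y0, 1 - fy), (y1, fy)):
        for xi, wx in ((x0, 1 - fx), (x1, fx)):
            for zi, wz in ((z0, 1 - fz), (z1, fz)):
                coeffs += (wy * wx * wz)[..., None, None] * grid[yi, xi, zi]

    # Apply the per-pixel affine transform: out = A @ rgb + b.
    homog = np.concatenate([image, np.ones((H, W, 1))], axis=-1)
    return np.einsum("hwij,hwj->hwi", coeffs, homog)

# Sanity check with an identity grid: the output should equal the input.
rng = np.random.default_rng(0)
image = rng.random((8, 10, 3))
guide = image.mean(axis=-1)  # assumption: mean intensity as the guide
grid = np.zeros((4, 4, 8, 3, 4))
grid[..., :3, :3] = np.eye(3)
out = slice_and_apply(grid, image, guide)
```

The point of the design is that the expensive prediction happens on a small grid, while the per-pixel work at full resolution is only an interpolation and a 3x4 matrix multiply.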
Deep Learning of Unified Region, Edge, and Contour Models for Automated Image Segmentation
Image segmentation is a fundamental and challenging problem in computer
vision with applications spanning multiple areas, such as medical imaging,
remote sensing, and autonomous vehicles. Recently, convolutional neural
networks (CNNs) have gained traction in the design of automated segmentation
pipelines. Although CNN-based models are adept at learning abstract features
from raw image data, their performance is dependent on the availability and
size of suitable training datasets. Additionally, these models are often unable
to capture the details of object boundaries and generalize poorly to unseen
classes. In this thesis, we devise novel methodologies that address these
issues and establish robust representation learning frameworks for
fully-automatic semantic segmentation in medical imaging and mainstream
computer vision. In particular, our contributions include (1) state-of-the-art
2D and 3D image segmentation networks for computer vision and medical image
analysis, (2) an end-to-end trainable image segmentation framework that unifies
CNNs and active contour models with learnable parameters for fast and robust
object delineation, (3) a novel approach for disentangling edge and texture
processing in segmentation networks, and (4) a novel few-shot learning model in
both supervised settings and semi-supervised settings where synergies between
latent and image spaces are leveraged to learn to segment images given limited
training data.
Comment: PhD dissertation, UCLA, 202
Clustering Optimized Portrait Matting Algorithm Based on Improved Sparrow Algorithm
Individual appearance and lighting conditions introduce aberrant noise spots that cause significant mis-segmentation of frontal portraits. This paper presents a portrait segmentation approach that combines wavelet proportional shrinkage with an improved sparrow search algorithm (SSA) for clustering to address this accuracy problem. The brightness component of the portrait in HSV space is first denoised by wavelet scaling. The elite inverse learning approach and an adaptive weighting factor are then applied to optimize the initial center locations of the K-Means algorithm, improving the initial distribution and accelerating the convergence of the SSA population members. The pixel segmentation accuracy of the proposed method is approximately 70% and 15% higher than that of two comparable traditional methods, respectively, while the similarity of color image features is approximately 10% higher. Experiments show that the proposed method achieves high accuracy under variable lighting conditions.
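To make the role of the initial centers concrete, here is a minimal sketch of K-Means clustering on the HSV brightness (V) channel, where the starting centers are passed in explicitly. The sparrow search optimization and the wavelet denoising from the paper are not implemented here; `init_centers` is a hypothetical stand-in for whatever the SSA step would return, and the function names are assumptions for illustration.

```python
import numpy as np

def hsv_value(rgb):
    """Brightness (V) channel of HSV: the per-pixel maximum of R, G, B."""
    return np.asarray(rgb, dtype=float).max(axis=-1)

def kmeans_1d(values, init_centers, n_iter=20):
    """Plain K-Means on scalar values, started from the given centers.

    init_centers is where an optimizer such as the paper's improved SSA
    would plug in; here it is simply supplied by the caller.
    """
    values = np.asarray(values, dtype=float)
    shape = values.shape
    flat = values.ravel()
    centers = np.array(init_centers, dtype=float)
    for _ in range(n_iter):
        # Assign each value to its nearest center.
        labels = np.argmin(np.abs(flat[:, None] - centers[None, :]), axis=1)
        # Move each center to the mean of its members (if it has any).
        for k in range(len(centers)):
            members = flat[labels == k]
            if members.size:
                centers[k] = members.mean()
    labels = np.argmin(np.abs(flat[:, None] - centers[None, :]), axis=1)
    return labels.reshape(shape), centers

# Hypothetical brightness samples forming a dark and a bright group.
samples = [0.10, 0.12, 0.11, 0.90, 0.88, 0.92]
labels, centers = kmeans_1d(samples, init_centers=[0.0, 1.0])
```

K-Means converges to a local optimum that depends on where the centers start, which is why the paper spends effort on choosing them.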
- …