Deep Bilateral Learning for Real-Time Image Enhancement
Performance is a critical challenge in mobile image processing. Given a
reference imaging pipeline, or even human-adjusted pairs of images, we seek to
reproduce the enhancements and enable real-time evaluation. For this, we
introduce a new neural network architecture inspired by bilateral grid
processing and local affine color transforms. Using pairs of input/output
images, we train a convolutional neural network to predict the coefficients of
a locally-affine model in bilateral space. Our architecture learns to make
local, global, and content-dependent decisions to approximate the desired image
transformation. At runtime, the neural network consumes a low-resolution
version of the input image, produces a set of affine transformations in
bilateral space, upsamples those transformations in an edge-preserving fashion
using a new slicing node, and then applies those upsampled transformations to
the full-resolution image. Our algorithm processes high-resolution images on a
smartphone in milliseconds, provides a real-time viewfinder at 1080p
resolution, and matches the quality of state-of-the-art approximation
techniques on a large class of image operators. Unlike previous work, our model
is trained off-line from data and therefore does not require access to the
original operator at runtime. This allows our model to learn complex,
scene-dependent transformations for which no reference implementation is
available, such as the photographic edits of a human retoucher.
Comment: 12 pages, 14 figures, SIGGRAPH 2017
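As a rough illustration of the slicing idea (not the paper's implementation), the sketch below applies per-pixel affine color transforms looked up from a low-resolution grid indexed by pixel position and luminance. The grid shape, the mean-luminance guide, and nearest-cell lookup (in place of the paper's trilinear, edge-preserving slicing) are simplifying assumptions:

```python
import numpy as np

def slice_and_apply(image, grid):
    """Apply per-pixel affine color transforms sliced from a low-res grid.

    image : (H, W, 3) float array in [0, 1]
    grid  : (GH, GW, GD, 12) affine coefficients (a 3x4 matrix per cell),
            where the third axis is binned by pixel luminance (the guide).
    """
    H, W, _ = image.shape
    GH, GW, GD, _ = grid.shape

    guide = image.mean(axis=2)                      # simple luminance guide
    # Continuous grid coordinates for every full-resolution pixel.
    gy = np.linspace(0, GH - 1, H)[:, None].repeat(W, axis=1)
    gx = np.linspace(0, GW - 1, W)[None, :].repeat(H, axis=0)
    gz = guide * (GD - 1)

    # Nearest-cell lookup stands in for the trilinear slicing of the paper.
    iy = np.clip(np.round(gy).astype(int), 0, GH - 1)
    ix = np.clip(np.round(gx).astype(int), 0, GW - 1)
    iz = np.clip(np.round(gz).astype(int), 0, GD - 1)
    coeffs = grid[iy, ix, iz].reshape(H, W, 3, 4)   # per-pixel 3x4 affine

    homog = np.concatenate([image, np.ones((H, W, 1))], axis=2)
    return np.einsum('hwij,hwj->hwi', coeffs, homog)
```

In the actual architecture the grid coefficients are predicted by the network from a low-resolution input; here they would simply be supplied as an array.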
Guided interactive image segmentation using machine learning and color based data set clustering
We present a novel approach that combines machine-learning-based interactive
image segmentation using supervoxels with a clustering method that
automatically identifies similarly colored images in large data sets, enabling
a guided reuse of classifiers. Our approach addresses the significant color
variability that is prevalent and often unavoidable in biological and medical
images, which typically degrades segmentation and quantification accuracy, and
thereby greatly reduces the necessary training effort. This increase in
efficiency facilitates the quantification of much larger numbers of images,
enabling interactive image analysis to keep pace with recent technological
advances in high-throughput imaging. The presented methods are applicable to
almost any image type and represent a useful tool for image analysis tasks in
general.
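The classifier-reuse idea can be sketched with a toy color-signature clustering: each image is reduced to a coarse RGB histogram, and images are grouped by k-means so that a classifier trained interactively on one image could be reused across its cluster. The histogram size, the farthest-point initialisation, and plain k-means are illustrative choices, not the authors' method:

```python
import numpy as np

def color_signature(image, bins=4):
    """Coarse RGB histogram as a color signature; image is (H, W, 3) in [0, 1]."""
    hist, _ = np.histogramdd(image.reshape(-1, 3),
                             bins=(bins, bins, bins), range=[(0, 1)] * 3)
    return (hist / hist.sum()).ravel()

def cluster_by_color(images, k=2, iters=20):
    """Group similarly colored images with k-means on their signatures."""
    sigs = np.stack([color_signature(im) for im in images])
    # Deterministic farthest-point initialisation of the k centres.
    centers = [sigs[0]]
    for _ in range(1, k):
        d = np.linalg.norm(sigs[:, None] - np.stack(centers)[None], axis=2)
        centers.append(sigs[d.min(axis=1).argmax()])
    centers = np.stack(centers)
    for _ in range(iters):
        labels = np.linalg.norm(sigs[:, None] - centers[None], axis=2).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = sigs[labels == j].mean(axis=0)
    return labels
```

A classifier would then be trained on one representative image per cluster and applied to the remaining members.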
3D Human Face Reconstruction and 2D Appearance Synthesis
3D human face reconstruction has been an active research topic for decades owing to its wide range of applications, such as animation, recognition, and 3D-driven appearance synthesis. Although commodity depth sensors have become widely available in recent years, image-based face reconstruction remains highly valuable because images are much easier to acquire and store.
In this dissertation, we first propose three image-based face reconstruction approaches, each making different assumptions about the input.
In the first approach, face geometry is extracted from multiple key frames of a video sequence captured under different head poses. This approach assumes a calibrated camera.
Since the first approach is limited to videos, our second approach focuses on a single image. It also refines the geometry by recovering fine-grained detail from shading cues, for which we propose a novel albedo estimation and linear optimization algorithm.
In the third approach, we further relax the constraints on the input to arbitrary in-the-wild images. The proposed approach robustly reconstructs high-quality models even under extreme expressions and large poses.
We then explore the applicability of our face reconstructions in four applications: video face beautification, generating personalized facial blendshapes from image sequences, face video stylization, and video face replacement, and we demonstrate the great potential of our reconstruction approaches in these real-world settings. In particular, with the recent surge of interest in VR/AR, it is increasingly common to see people wearing head-mounted displays (HMDs). However, the large occlusion of the face is a major obstacle to face-to-face communication. In a further application, we therefore explore hardware/software solutions for synthesizing face images in the presence of HMDs. We design two setups (experimental and mobile) that integrate two near-IR cameras and one color camera to solve this problem. With our algorithm and prototype, we achieve photo-realistic results.
We further propose a deep neural network that treats HMD removal as a face inpainting problem. This approach does not require special hardware and runs in real time with satisfying results.
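The inpainting formulation can be illustrated with a classical, non-learned baseline: filling the occluded region by solving the discrete Laplace equation (harmonic inpainting). This is only a stand-in for the deep network described above; the Jacobi-style iterative solver and grayscale input are simplifying assumptions:

```python
import numpy as np

def harmonic_inpaint(img, mask, iters=500):
    """Fill masked pixels (mask == True) by repeatedly averaging their
    4-neighbours, i.e. solving the discrete Laplace equation inside the
    hole with the unmasked pixels as boundary conditions.

    img: (H, W) grayscale float array; mask should not touch the border.
    """
    out = img.copy()
    out[mask] = out[~mask].mean()          # rough initialisation of the hole
    for _ in range(iters):
        avg = (np.roll(out, 1, 0) + np.roll(out, -1, 0) +
               np.roll(out, 1, 1) + np.roll(out, -1, 1)) / 4.0
        out[mask] = avg[mask]              # update only the hole pixels
    return out
```

Because discrete harmonic functions are uniquely determined by their boundary values, the fill is smooth and seamless, but unlike a learned model it cannot hallucinate facial detail inside the hole.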
Graph Spectral Image Processing
The recent advent of graph signal processing (GSP) has spurred intensive studies
of signals that live naturally on irregular data kernels described by graphs
(e.g., social networks, wireless sensor networks). Though a digital image
contains pixels that reside on a regularly sampled 2D grid, if one can design
an appropriate underlying graph connecting pixels with weights that reflect the
image structure, then one can interpret the image (or image patch) as a signal
on a graph, and apply GSP tools for processing and analysis of the signal in
graph spectral domain. In this article, we overview recent graph spectral
techniques in GSP specifically for image/video processing. The topics covered
include image compression, image restoration, image filtering and image
segmentation.
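To make the pixel-graph construction concrete, here is a minimal sketch (dense matrices, tiny images only) that builds an intensity-weighted 4-connected graph Laplacian and low-pass filters the image in the graph spectral domain. The Gaussian edge weights and the hard frequency cut-off are common textbook choices rather than any specific method from the article:

```python
import numpy as np

def grid_laplacian(img, sigma=0.1):
    """Combinatorial Laplacian of a 4-connected pixel graph with edge
    weights w_ij = exp(-(I_i - I_j)^2 / (2 sigma^2)) reflecting image
    structure. img: (H, W) grayscale array."""
    H, W = img.shape
    n = H * W
    A = np.zeros((n, n))
    for y in range(H):
        for x in range(W):
            i = y * W + x
            for dy, dx in ((0, 1), (1, 0)):        # right and down neighbours
                yy, xx = y + dy, x + dx
                if yy < H and xx < W:
                    j = yy * W + xx
                    w = np.exp(-(img[y, x] - img[yy, xx]) ** 2 / (2 * sigma ** 2))
                    A[i, j] = A[j, i] = w
    return np.diag(A.sum(axis=1)) - A

def graph_lowpass(img, keep=10):
    """Project the image onto its `keep` lowest graph frequencies, i.e.
    an edge-aware smoothing in the graph spectral domain."""
    L = grid_laplacian(img)
    _, vecs = np.linalg.eigh(L)            # eigenvalues ascend: low freq first
    U = vecs[:, :keep]                     # low-frequency basis
    coeffs = U.T @ img.ravel()             # graph Fourier coefficients
    return (U @ coeffs).reshape(img.shape)
```

Because the edge weights shrink across strong intensity differences, smoothing is attenuated across image edges, which is the main appeal over a fixed 2D grid filter.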
Structured sampling and fast reconstruction of smooth graph signals
This work concerns sampling of smooth signals on arbitrary graphs. We first
study a structured sampling strategy for such smooth graph signals that
consists of a random selection of a few pre-defined groups of nodes. The number
of groups to sample to stably embed the set of $k$-bandlimited signals is
driven by a quantity called the \emph{group} graph cumulative coherence. For
some optimised sampling distributions, we show that sampling $O(k \log k)$
groups is always sufficient to stably embed the set of $k$-bandlimited signals
but that this number can be smaller -- down to $O(\log k)$ -- depending on the
structure of the groups of nodes. Fast methods to approximate these sampling
distributions are detailed. Second, we consider $k$-bandlimited signals that
are nearly piecewise constant over pre-defined groups of nodes. We show that it
is possible to speed up the reconstruction of such signals by reducing
drastically the dimension of the vectors to reconstruct. When combined with the
proposed structured sampling procedure, we prove that the method provides
stable and accurate reconstruction of the original signal. Finally, we present
numerical experiments that illustrate our theoretical results and, as an
example, show how to combine these methods for interactive object segmentation
in an image using superpixels.
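A minimal sketch of the reconstruction side, under simplifying assumptions: the graph is a small path graph, the signal is exactly $k$-bandlimited, and recovery is plain least squares on the $k$ lowest Laplacian eigenvectors (ignoring the structured group sampling and the fast approximations discussed above):

```python
import numpy as np

def path_laplacian(n):
    """Laplacian of a path graph with n nodes (a simple stand-in graph)."""
    A = np.zeros((n, n))
    for i in range(n - 1):
        A[i, i + 1] = A[i + 1, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def reconstruct_bandlimited(samples, sampled_nodes, L, k):
    """Least-squares recovery of a k-bandlimited signal from its values on
    a subset of nodes: solve min_a ||U_k[nodes] a - y||^2, then x = U_k a."""
    _, U = np.linalg.eigh(L)
    Uk = U[:, :k]                          # k lowest graph frequencies
    a, *_ = np.linalg.lstsq(Uk[sampled_nodes], samples, rcond=None)
    return Uk @ a
```

Stable embedding in the abstract's sense means precisely that the sampled rows `Uk[sampled_nodes]` are well conditioned, so this least-squares step is accurate and noise-robust.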
Continuous Depth Map Reconstruction from Light Fields
Light field analysis has recently received growing interest, since its rich structural information benefits many computer vision tasks. This paper presents a novel method to reconstruct continuous depth maps from light field data. Conventional approaches usually treat depth map reconstruction as an optimization problem with discrete labels. On the contrary, our proposed method obtains continuous depth maps by solving a linear system, which preserves richer detail than conventional discrete approaches. A structure tensor is employed to extract raw depth information and corresponding confidence levels from the light field data. We introduce a method to reduce the adverse effect of unreliable local estimations, which helps eliminate errors in specular areas and at edges where depth values are discontinuous. Experiments on both synthetic and real light field data demonstrate the effectiveness of the proposed method. Index Terms — Depth map reconstruction, light field, linear system
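The continuous-depth idea (a linear system instead of discrete labels) can be sketched in one dimension: given raw depth estimates z with per-pixel confidences, a confidence-weighted least-squares problem with a smoothness term yields continuous depth values, and unreliable estimates are simply interpolated from their neighbours. The path-graph Laplacian and the single smoothness weight `lam` are simplifying assumptions, not the paper's exact formulation:

```python
import numpy as np

def continuous_depth(z, conf, lam=1.0):
    """Solve (C + lam * L) d = C z for a 1-D row of pixels.

    C is a diagonal matrix of per-pixel confidences for the raw depth
    estimates z, and L is the path-graph Laplacian enforcing smoothness.
    The solution d is continuous-valued, unlike discrete-label optimisation.
    """
    n = len(z)
    C = np.diag(conf)
    L = np.zeros((n, n))
    for i in range(n - 1):                 # accumulate the path Laplacian
        L[i, i] += 1.0; L[i + 1, i + 1] += 1.0
        L[i, i + 1] -= 1.0; L[i + 1, i] -= 1.0
    return np.linalg.solve(C + lam * L, C @ z)
```

Setting a pixel's confidence to zero removes its data term entirely, so its depth is filled in purely by the smoothness prior, which is how unreliable estimates in specular regions would be handled.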