109 research outputs found
Super-Resolution Approaches for Depth Video Enhancement
Sensing using 3D technologies has seen a revolution in recent years, with cost-effective depth sensors now part of accessible consumer electronics. Their ability to directly capture depth videos in real time has opened tremendous possibilities for multiple applications in computer vision. These sensors, however, have major shortcomings due to their high noise contamination, including missing and jagged measurements, and their low spatial resolutions. In order to extract detailed 3D features from this type of data, dedicated data enhancement is required. We propose a generic depth multi-frame super-resolution framework that addresses the limitations of state-of-the-art depth enhancement approaches. The proposed framework does not need any additional hardware or coupling with different modalities. It is based on a new data model that uses densely upsampled low-resolution observations. This results in a robust median initial estimate, further refined by a deblurring operation using bilateral total variation as the regularization term. The upsampling operation ensures a systematic improvement in registration accuracy. This is explored in different scenarios based on the motions involved in the depth video. For the general and most challenging case of objects deforming non-rigidly in full 3D, we propose a recursive dynamic multi-frame super-resolution algorithm where the relative local 3D motions between consecutive frames are directly accounted for. We rely on the assumption that these 3D motions can be decoupled into lateral motions and radial displacements. This makes it possible to perform a simple local per-pixel tracking where both depth measurements and deformations are optimized. Compared to alternative approaches, the results show a clear improvement in reconstruction accuracy and in robustness to noise, to relatively large non-rigid deformations, and to topological changes.
Moreover, the proposed approach, implemented on a CPU, is shown to be computationally efficient and to run in real time.
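The bilateral total variation (BTV) regularizer used in the deblurring step can be illustrated with a small sketch. This is the classic BTV penalty, a weighted L1 norm over shifted copies of the image, written in NumPy; the window size `p` and decay factor `alpha` are illustrative defaults, not the values used in the framework.

```python
import numpy as np

def btv(x, p=2, alpha=0.7):
    """Bilateral total variation of a 2D depth map x:
    sum over shifts (l, m) in [-p, p]^2, (l, m) != (0, 0), of
    alpha^(|l| + |m|) * ||x - shift(x, l, m)||_1."""
    cost = 0.0
    for l in range(-p, p + 1):
        for m in range(-p, p + 1):
            if l == 0 and m == 0:
                continue
            shifted = np.roll(np.roll(x, l, axis=0), m, axis=1)
            cost += alpha ** (abs(l) + abs(m)) * np.abs(x - shifted).sum()
    return cost

depth = np.random.rand(32, 32)
assert btv(np.ones((32, 32))) == 0.0   # a constant surface has zero BTV
assert btv(depth) > 0.0                # noisy data is penalized
```

In a super-resolution pipeline of this kind, such a term is typically added to the data-fidelity cost and minimized by gradient descent, starting from the median initial estimate.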
FULL 3D RECONSTRUCTION OF DYNAMIC NON-RIGID SCENES: ACQUISITION AND ENHANCEMENT
Recent advances in commodity depth or 3D sensing technologies have brought us closer to the goal of accurately sensing and modeling 3D representations of complex dynamic scenes. Indeed, in domains such as virtual reality, security, surveillance and e-health, there is now a greater demand for affordable and flexible vision systems capable of acquiring high-quality 3D reconstructions. Available commodity RGB-D cameras, though easily accessible, have a limited field-of-view and acquire noisy, low-resolution measurements, which restricts their direct use in building such vision systems.
This thesis targets these limitations and builds approaches around commodity 3D sensing technologies to acquire noise-free, feature-preserving full 3D reconstructions of dynamic scenes containing static or moving, rigid or non-rigid objects. A mono-view system based on a single RGB-D camera cannot instantaneously acquire a full 360-degree 3D reconstruction of a dynamic scene. For this purpose, a multi-view system composed of several RGB-D cameras covering the whole scene is used. The first part of this thesis explores how to correctly align the information acquired from the RGB-D cameras in a multi-view system so as to provide full, textured 3D reconstructions of dynamic scenes instantaneously. This is achieved by solving the extrinsic calibration problem. The thesis proposes an extrinsic calibration framework which combines the 2D photometric and 3D geometric information acquired with RGB-D cameras, weighted according to their relative (in)accuracies under noise, in a single weighted bi-objective optimization. An iterative scheme is also proposed which estimates the parameters of the noise model affecting both the 2D and 3D measurements while simultaneously solving the extrinsic calibration problem. Results show improved calibration accuracy compared to state-of-the-art methods. The second part of this thesis explores the enhancement of noisy, low-resolution 3D data acquired with commodity RGB-D cameras in both mono-view and multi-view systems. The thesis extends the state of the art in mono-view, template-free, recursive 3D data enhancement, which targets dynamic scenes containing rigid objects and thus requires tracking only the global motions of those objects for view-dependent surface representation and filtering. This thesis instead targets dynamic scenes containing non-rigid objects, which introduces the complex requirements of tracking relatively large local motions and maintaining data organization for view-dependent surface representation. The proposed method is shown to be effective in handling non-rigid objects with changing topologies. Building upon this work, the thesis removes the requirement of data organization by proposing an approach based on a view-independent surface representation. View-independence decreases the complexity of the proposed algorithm and gives it the flexibility to simultaneously process and enhance noisy data acquired with multiple cameras in a multi-view system. Moreover, qualitative and quantitative experimental analysis shows this method to be more accurate in removing noise to produce enhanced 3D reconstructions of non-rigid objects. Although extending this method to a multi-view system would allow instantaneous, enhanced, full 360-degree 3D reconstructions of non-rigid objects, it still lacks the ability to explicitly handle low-resolution data. Therefore, this thesis proposes a novel recursive dynamic multi-frame 3D super-resolution algorithm, together with a novel 3D bilateral total variation regularization, to filter out noise, recover details and enhance the resolution of data acquired from commodity cameras in a multi-view system. Results show that this method builds accurate, smooth and feature-preserving full 360-degree 3D reconstructions of dynamic scenes containing non-rigid objects.
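The single weighted bi-objective calibration cost described above might be sketched as follows. This is a hypothetical formulation, not the thesis's exact objective: the weights `w2d` and `w3d` stand in for the noise-derived weights, and the 2D-3D point correspondences are assumed to be given.

```python
import numpy as np

def bi_objective_cost(R, t, pts3d_src, pts3d_dst, pts2d, K, w2d=1.0, w3d=1.0):
    """Hypothetical weighted bi-objective calibration cost:
    w2d * mean squared 2D reprojection error + w3d * mean squared 3D alignment error.
    R, t : extrinsics mapping source-camera 3D points into the destination camera frame.
    K    : 3x3 intrinsic matrix of the destination camera."""
    mapped = pts3d_src @ R.T + t                 # transform 3D points (geometric term)
    e3d = np.mean(np.sum((mapped - pts3d_dst) ** 2, axis=1))
    proj = mapped @ K.T                          # project into the destination image
    proj = proj[:, :2] / proj[:, 2:3]            # (photometric/2D term)
    e2d = np.mean(np.sum((proj - pts2d) ** 2, axis=1))
    return w2d * e2d + w3d * e3d

# perfectly calibrated correspondences give zero cost
pts = np.array([[0.0, 0.0, 1.0], [1.0, 1.0, 2.0]])
pts2d = pts[:, :2] / pts[:, 2:3]
assert bi_objective_cost(np.eye(3), np.zeros(3), pts, pts, pts2d, np.eye(3)) == 0.0
```

In the thesis, the relative weights come from the estimated 2D and 3D noise parameters; here they are left as free parameters to be minimized over (R, t) by a standard nonlinear least-squares solver.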
Computational Imaging Approach to Recovery of Target Coordinates Using Orbital Sensor Data
This dissertation addresses the components necessary to simulate image-based recovery of a target's position using orbital image sensors. Each component is considered in detail, focusing on the effect that design choices and system parameters have on the accuracy of the position estimate. Changes in sensor resolution, varying amounts of blur, differences in image noise level, the algorithms selected for each component, and lag introduced by excessive processing time all contribute to the accuracy of the recovered target coordinates.
Using physical targets and sensors in this scenario would be cost-prohibitive in the exploratory setting posed; therefore, a simulated target path is generated using Bezier curves, which approximate representative paths followed by the targets of interest. Orbital trajectories for the sensors are designed on an elliptical model representative of the motion of physical orbital sensors. Images from each sensor are simulated based on the position and orientation of the sensor, the position of the target, and the imaging parameters selected for the experiment (resolution, noise level, blur level, etc.). Post-processing of the simulated imagery seeks to reduce noise and blur and to increase resolution. The only information available to a fully implemented system for calculating the target position is the sensor position and orientation vectors and the images from each sensor. From these data we develop a reliable method of recovering the target position and analyze the impact on near-real-time processing. We also discuss the influence of adjustments to system components on overall capabilities, and address the potential system size, weight, and power requirements of realistic implementation approaches.
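The Bezier-curve path generation used for the simulated targets can be sketched as follows; the control points are illustrative, and the curve is sampled uniformly in its parameter.

```python
import numpy as np
from math import comb

def bezier_path(control_points, n_samples=100):
    """Sample a Bezier curve defined by k+1 control points (rows of a
    (k+1) x 3 array) at n_samples evenly spaced parameter values, as a
    simple stand-in for a simulated target path."""
    P = np.asarray(control_points, dtype=float)
    k = len(P) - 1                        # curve degree
    t = np.linspace(0.0, 1.0, n_samples)
    # Bernstein basis: B_i(t) = C(k, i) * t^i * (1 - t)^(k - i)
    basis = np.stack(
        [comb(k, i) * t**i * (1 - t)**(k - i) for i in range(k + 1)], axis=1
    )
    return basis @ P                      # (n_samples, 3) target positions

path = bezier_path([[0, 0, 0], [1, 2, 0], [3, 2, 1], [4, 0, 1]], n_samples=50)
assert np.allclose(path[0], [0, 0, 0])    # curve starts at the first control point
assert np.allclose(path[-1], [4, 0, 1])   # and ends at the last
```

Sampling uniformly in the parameter does not give uniform spacing along the curve; a real simulation would typically reparameterize by arc length to model a target moving at constant speed.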
Data-Driven Image Restoration
Every day, many images are taken by digital cameras, and people
demand visually accurate and pleasing results. Noise and
blur degrade images captured by modern cameras, and high-level
vision tasks (such as segmentation, recognition, and tracking)
require high-quality images. Therefore, image restoration
(specifically, image deblurring and image denoising) is a critical
preprocessing step.
A fundamental problem in image deblurring is to reliably recover
distinct spatial frequencies that have been suppressed by the
blur kernel. Existing image deblurring techniques often rely on
generic image priors that only help recover part of the frequency
spectrum, such as the frequencies near the high end. To this end,
we pose the following specific questions: (i) Does class-specific
information offer an advantage over existing generic priors for
image quality restoration? (ii) If a class-specific prior exists,
how should it be encoded into a deblurring framework to recover
attenuated image frequencies? Throughout this work, we devise a
class-specific prior based on the band-pass filter responses and
incorporate it into a deblurring strategy. Specifically, we show
that the subspace of band-pass filtered images and their
intensity distributions serve as useful priors for recovering
image frequencies.
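A band-pass filter response of the kind the prior is built on can be sketched with a difference-of-Gaussians filter bank; the scales below are illustrative, and this is a generic construction rather than the work's exact filter bank.

```python
import numpy as np

def gaussian_kernel1d(sigma, radius=None):
    """Normalized 1D Gaussian kernel."""
    radius = radius or int(3 * sigma)
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def blur(img, sigma):
    """Separable Gaussian blur via two 1D 'same'-mode convolutions."""
    k = gaussian_kernel1d(sigma)
    out = np.apply_along_axis(lambda r: np.convolve(r, k, mode='same'), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode='same'), 0, out)

def band_pass_responses(img, sigmas=(1.0, 2.0, 4.0)):
    """Difference-of-Gaussians subbands: each band isolates the
    frequencies lying between two adjacent blur scales."""
    blurred = [img] + [blur(img, s) for s in sigmas]
    return [blurred[i] - blurred[i + 1] for i in range(len(sigmas))]

img = np.random.rand(64, 64)
bands = band_pass_responses(img)
assert len(bands) == 3
# the bands plus the coarsest low-pass residual reconstruct the image
recon = sum(bands) + blur(img, 4.0)
assert np.allclose(recon, img)
```

Statistics of such subband responses (for example, their intensity distributions over a class-specific image set) are the kind of quantity the class-specific prior described above would be estimated from.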
Next, we present a novel image denoising algorithm that uses an
external, category-specific image database. In contrast to
existing noisy-image restoration algorithms, our method selects
clean image “support patches” similar to the noisy patch from
an external database. We employ a content-adaptive distribution
model for each patch, deriving the parameters of the
distribution from the support patches. Our objective function is
composed of a Gaussian fidelity term that imposes category-specific
information, and a low-rank term that encourages the
similarity between the noisy and the support patches in a robust
manner.
Finally, we propose to learn a fully-convolutional network model
that consists of a Chain of Identity Mapping Modules (CIMM) for
image denoising. The CIMM structure possesses two distinctive
features that are important for the noise removal task. Firstly,
each residual unit employs identity mappings as the skip
connections and receives pre-activated input to preserve the
gradient magnitude propagated in both the forward and backward
directions. Secondly, by utilizing dilated kernels for the
convolution layers in the residual branch, each neuron in the
last convolution layer of each module can observe the full
receptive field of the first layer.
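The effect of dilated kernels on the receptive field can be illustrated with a short calculation; the layer configuration below is illustrative, not the exact CIMM module.

```python
def receptive_field(layers):
    """Receptive field of a stack of convolution layers, each given as
    (kernel_size, dilation), stride 1 throughout.
    Each layer enlarges the field by (kernel_size - 1) * dilation."""
    rf = 1
    for k, d in layers:
        rf += (k - 1) * d
    return rf

# four plain 3x3 convolutions vs. the same depth with growing dilation
plain = [(3, 1)] * 4
dilated = [(3, 1), (3, 2), (3, 4), (3, 8)]
assert receptive_field(plain) == 9      # 1 + 4 * 2
assert receptive_field(dilated) == 31   # 1 + 2 * (1 + 2 + 4 + 8)
```

With dilation the field grows exponentially with depth rather than linearly, which is how a neuron in a module's last layer can cover the full extent seen by its first layer without extra depth or larger kernels.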
Artificial Intelligence in the Creative Industries: A Review
This paper reviews the current state of the art in Artificial Intelligence
(AI) technologies and applications in the context of the creative industries. A
brief background of AI, and specifically Machine Learning (ML) algorithms, is
provided, including Convolutional Neural Networks (CNNs), Generative Adversarial
Networks (GANs), Recurrent Neural Networks (RNNs) and Deep Reinforcement
Learning (DRL). We categorise creative applications into five groups related to
how AI technologies are used: i) content creation, ii) information analysis,
iii) content enhancement and post production workflows, iv) information
extraction and enhancement, and v) data compression. We critically examine the
successes and limitations of this rapidly advancing technology in each of these
areas. We further differentiate between the use of AI as a creative tool and
its potential as a creator in its own right. We foresee that, in the near
future, machine learning-based AI will be adopted widely as a tool or
collaborative assistant for creativity. In contrast, we observe that the
successes of machine learning in domains with fewer constraints, where AI is
the `creator', remain modest. The potential of AI (or its developers) to win
awards for its original creations in competition with human creatives is also
limited, based on contemporary technologies. We therefore conclude that, in the
context of creative industries, maximum benefit from AI will be derived where
its focus is human-centric: where it is designed to augment, rather than
replace, human creativity.
Evolution-Operator-Based Single-Step Method for Image Processing
This work proposes an evolution-operator-based single-time-step
method for image and signal processing. The key component of the
proposed method is a local spectral evolution kernel (LSEK) that
analytically integrates a class of evolution partial differential
equations (PDEs). From the PDE point of view, the LSEK provides
the analytical solution in a single time step, is of spectral
accuracy, and is free of stability constraints. From the
image/signal-processing point of view, the LSEK gives rise to a family of
lowpass filters. These filters contain controllable time delay and
amplitude scaling. The new evolution-operator-based method is
constructed by pointwise adaptation of anisotropy to the
coefficients of the LSEK. Perona-Malik-type anisotropic
diffusion schemes are incorporated in the LSEK for image denoising.
A forward-backward diffusion process is adopted in the LSEK for
image deblurring or sharpening. A coupled PDE system is modified
for image edge detection, and the resulting image edges are utilized for
image enhancement. Extensive computer experiments are carried out
to demonstrate the performance of the proposed method. The major
advantages of the proposed method are its single-step solution and
its readiness for multidimensional data analysis.
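The Perona-Malik-type diffusion incorporated in the LSEK can be illustrated with a conventional explicit diffusion step. Note that the LSEK itself replaces such iterative stepping with an analytic single-step kernel; this sketch only shows the underlying diffusion model, and the parameters `dt` and `kappa` are illustrative.

```python
import numpy as np

def perona_malik_step(u, dt=0.2, kappa=0.1):
    """One explicit time step of Perona-Malik anisotropic diffusion.
    The conductivity g(s) = 1 / (1 + (s / kappa)^2) suppresses smoothing
    across strong edges (large gradients)."""
    zrow = np.zeros((1, u.shape[1]))
    zcol = np.zeros((u.shape[0], 1))
    # one-sided differences to the four neighbours (zero-flux borders)
    dn = np.vstack([zrow, np.diff(u, axis=0)])   # u[i] - u[i-1]
    ds = np.vstack([np.diff(u, axis=0), zrow])   # u[i+1] - u[i]
    dw = np.hstack([zcol, np.diff(u, axis=1)])
    de = np.hstack([np.diff(u, axis=1), zcol])
    g = lambda d: 1.0 / (1.0 + (d / kappa) ** 2)
    # discrete divergence of g(grad u) * grad u
    return u + dt * (g(ds) * ds - g(dn) * dn + g(de) * de - g(dw) * dw)

u = np.random.rand(32, 32)
u1 = perona_malik_step(u)
assert np.allclose(perona_malik_step(np.ones((8, 8))), np.ones((8, 8)))  # constant image is a fixed point
assert np.var(u1) < np.var(u)  # diffusion smooths noisy input
```

An explicit scheme like this is stable only for small `dt`; the single-step, spectrally accurate solution free of such constraints is precisely what the LSEK contributes.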