Quantum Annealing for Single Image Super-Resolution
This paper proposes a quantum computing-based algorithm to solve the single
image super-resolution (SISR) problem. One of the well-known classical
approaches for SISR relies on the well-established patch-wise sparse modeling
of the problem. Yet the current state of the field is that deep neural
networks (DNNs) have demonstrated results far superior to those of traditional
approaches. Nevertheless, quantum computing is expected to become increasingly
prominent for machine learning problems soon. In this work, we therefore
perform an early exploration of applying a quantum
computing algorithm to this important image enhancement problem, i.e., SISR.
Among the two paradigms of quantum computing, namely universal gate quantum
computing and adiabatic quantum computing (AQC), the latter has been
successfully applied to practical computer vision problems, in which quantum
parallelism has been exploited to solve combinatorial optimization efficiently.
This work demonstrates formulating quantum SISR as a sparse coding optimization
problem, which is solved using quantum annealers accessed via the D-Wave Leap
platform. The proposed AQC-based algorithm is demonstrated to achieve a
speed-up over a classical analog while maintaining comparable SISR accuracy.
Comment: Accepted to IEEE/CVF CVPR 2023, NTIRE Challenge and Workshop. Draft
info: 10 pages, 6 figures, 2 tables
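The abstract above does not give the exact formulation, so the following is only a minimal sketch of how a sparse coding problem can be cast as a QUBO (the input form quantum annealers accept), assuming binary coefficients and a simple sparsity penalty. The names `build_qubo`, `qubo_energy`, and `brute_force_solve` are hypothetical, and the exhaustive search merely stands in for an annealer on a toy-sized problem.

```python
from itertools import product

def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def build_qubo(atoms, x, lam):
    """Upper-triangular QUBO for
    min_a ||x - sum_i a_i d_i||^2 + lam * sum_i a_i,  a_i in {0, 1}.
    Assumption: binary coefficients; the constant term x.x is dropped."""
    n = len(atoms)
    Q = {}
    for i in range(n):
        # a_i^2 == a_i for binary variables, so linear terms sit on the diagonal.
        Q[(i, i)] = dot(atoms[i], atoms[i]) - 2 * dot(atoms[i], x) + lam
        for j in range(i + 1, n):
            Q[(i, j)] = 2 * dot(atoms[i], atoms[j])
    return Q

def qubo_energy(Q, a):
    """Energy of a binary assignment a under the QUBO Q."""
    return sum(w * a[i] * a[j] for (i, j), w in Q.items())

def brute_force_solve(Q, n):
    """Exhaustive minimiser; a stand-in for an annealer, feasible only for small n."""
    return min(product((0, 1), repeat=n), key=lambda a: qubo_energy(Q, a))

# Toy dictionary of four atoms and a patch that equals the last atom.
atoms = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]]
x = [1, 1, 0]
Q = build_qubo(atoms, x, lam=0.1)
best = brute_force_solve(Q, len(atoms))
print(best)
```

In this toy case the sparsity penalty favors the single atom `[1, 1, 0]` over the equally accurate pair of unit atoms, which is the behavior a sparse code is meant to have.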
Face Centered Image Analysis Using Saliency and Deep Learning Based Techniques
Image analysis starts with the purpose of configuring vision machines that can perceive like humans, intelligently inferring general principles and sensing the surrounding situation from imagery. This dissertation studies face-centered image analysis as the core problem in high-level computer vision research and addresses the problem by tackling three challenging subjects: Is there anything interesting in the image? If there is, what is it? If a person is present, who is he/she? What kind of expression is he/she showing? Can we know his/her age? Answering these questions leads to saliency-based object detection, deep-learning-structured object categorization and recognition, human facial landmark detection, and multitask biometrics.
To implement object detection, a three-level saliency detection method based on the self-similarity technique (SMAP) is first proposed in this work. The first level of SMAP uses statistical methods to generate proto-background patches, followed by the second level, which implements local contrast computation based on image self-similarity characteristics. Finally, the spatial color distribution constraint is considered to realize the saliency detection. The outcome of the algorithm is a full-resolution image with highlighted salient objects and well-defined edges.
In object recognition, the Adaptive Deconvolution Network (ADN) is implemented to categorize the objects extracted by saliency detection. To improve the system performance, an L1/2-norm-regularized ADN is proposed and tested in different applications. The results demonstrate the efficiency and significance of the new structure.
To fully understand the facial-biometrics-related activity contained in the image, low-rank matrix decomposition is introduced to help locate the landmark points on face images. The natural extension of this work is beneficial to human facial expression recognition and facial feature parsing research.
To facilitate the understanding of the detected facial image, automatic facial image analysis becomes essential. We present a novel deeply learnt tree-structured face representation to uniformly model the human face with different semantic meanings. We show that the proposed feature yields a unified representation in multi-task facial biometrics and that the multi-task learning framework is applicable to many other computer vision tasks.
Super-resolving Compressed Images via Parallel and Series Integration of Artifact Reduction and Resolution Enhancement
In this paper, we propose a novel compressed image super resolution (CISR)
framework based on parallel and series integration of artifact removal and
resolution enhancement. Based on maximum a posteriori inference for estimating a
clean low-resolution (LR) input image and a clean high resolution (HR) output
image from down-sampled and compressed observations, we have designed a CISR
architecture consisting of two deep neural network modules: the artifact
reduction module (ARM) and resolution enhancement module (REM). ARM and REM
work in parallel with both taking the compressed LR image as their inputs,
while they also work in series with REM taking the output of ARM as one of its
inputs and ARM taking the output of REM as its other input. A unique property
of our CISR system is that a single trained model is able to super-resolve LR
images compressed by different methods to various qualities. This is achieved
by exploiting deep neural networks' capacity for handling image degradations,
and the parallel and series connections between ARM and REM to reduce the
dependency on specific degradations. ARM and REM are trained simultaneously by
the deep unfolding technique. Experiments are conducted on a mixture of JPEG
and WebP compressed images without a priori knowledge of the compression type
and compression factor. Visual and quantitative comparisons demonstrate the
superiority of our method over state-of-the-art super-resolution methods.
Code link: https://github.com/luohongming/CISR_PS
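The parallel-and-series wiring described in this abstract can be illustrated with a small data-flow sketch. This is not the authors' implementation: `arm`, `rem`, `downsample`, `upsample`, and `unfolded_cisr` are hypothetical stand-ins operating on 1-D lists, and the fixed averaging rules replace the trained networks. Only the connection pattern follows the abstract: both modules see the compressed LR input, REM also takes ARM's output, ARM also takes REM's output, and the loop is unrolled over a fixed number of stages.

```python
def downsample(hr):
    """Toy 2x downsampling: average neighbouring pairs."""
    return [(hr[i] + hr[i + 1]) / 2 for i in range(0, len(hr), 2)]

def upsample(lr):
    """Toy 2x upsampling: repeat each sample."""
    return [v for v in lr for _ in range(2)]

def arm(y, hr_est):
    """Stand-in artifact reduction module: blends the compressed LR
    input with the downsampled HR estimate coming from REM."""
    return [(a + b) / 2 for a, b in zip(y, downsample(hr_est))]

def rem(y, lr_est):
    """Stand-in resolution enhancement module: upsamples a blend of the
    compressed LR input and the cleaned LR estimate coming from ARM."""
    return upsample([(a + b) / 2 for a, b in zip(y, lr_est)])

def unfolded_cisr(y, stages=3):
    """Unrolled alternation of ARM and REM (assumed stage count)."""
    hr_est = upsample(y)         # initial HR guess
    lr_est = y
    for _ in range(stages):
        lr_est = arm(y, hr_est)  # parallel input y, series input from REM
        hr_est = rem(y, lr_est)  # parallel input y, series input from ARM
    return lr_est, hr_est

lr, hr = unfolded_cisr([1.0, 2.0, 3.0])
print(lr, hr)
```

In a trained system each stage would hold learned parameters and the whole unrolled chain would be optimized jointly, which is what training by deep unfolding refers to.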
High-Resolution Shape Completion Using Deep Neural Networks for Global Structure and Local Geometry Inference
We propose a data-driven method for recovering missing parts of 3D shapes.
Our method is based on a new deep learning architecture consisting of two
sub-networks: a global structure inference network and a local geometry
refinement network. The global structure inference network incorporates a long
short-term memorized context fusion module (LSTM-CF) that infers the global
structure of the shape based on multi-view depth information provided as part
of the input. It also includes a 3D fully convolutional (3DFCN) module that
further enriches the global structure representation according to volumetric
information in the input. Under the guidance of the global structure network,
the local geometry refinement network takes as input local 3D patches around
missing regions, and progressively produces a high-resolution, complete surface
through a volumetric encoder-decoder architecture. Our method jointly trains
the global structure inference and local geometry refinement networks in an
end-to-end manner. We perform qualitative and quantitative evaluations on six
object categories, demonstrating that our method outperforms existing
state-of-the-art work on shape completion.
Comment: 8 pages paper, 11 pages supplementary material, ICCV spotlight paper