12 research outputs found
Efficient Learning-based Image Enhancement : Application to Compression Artifact Removal and Super-resolution
Many computer vision and computational photography applications essentially solve an image enhancement problem. The image has been degraded by a specific noise process, such as aberrations from camera optics or compression artifacts, that we would like to remove. We describe a framework for learning-based image enhancement. At the core of our algorithm lies a generic regularization framework that comprises a prior on natural images, as well as an application-specific conditional model based on Gaussian processes. In contrast to prior learning-based approaches, our algorithm can instantly learn task-specific degradation models from sample images, which enables users to easily adapt the algorithm to a specific problem and data set of interest. This is facilitated by our efficient approximation scheme for large-scale Gaussian processes. We demonstrate the efficiency and effectiveness of our approach by applying it to example enhancement applications including single-image super-resolution, as well as artifact removal in JPEG- and JPEG 2000-encoded images.
A Comparison of Image Denoising Methods
The advancement of imaging devices and the countless images generated every day
pose an increasingly high demand on image denoising, which still remains a
challenging task in terms of both effectiveness and efficiency. To improve
denoising quality, numerous denoising techniques and approaches have been
proposed in the past decades, including different transforms, regularization
terms, algebraic representations and especially advanced deep neural network
(DNN) architectures. Despite their sophistication, many methods may fail to
achieve desirable results for simultaneous noise removal and fine detail
preservation. In this paper, to investigate the applicability of existing
denoising techniques, we compare a variety of denoising methods on both
synthetic and real-world datasets for different applications. We also introduce
a new dataset for benchmarking, and the evaluations are performed from four
different perspectives including quantitative metrics, visual effects, human
ratings and computational cost. Our experiments demonstrate: (i) the
effectiveness and efficiency of representative traditional denoisers for
various denoising tasks, (ii) a simple matrix-based algorithm may be able to
produce similar results compared with its tensor counterparts, and (iii) the
notable achievements of DNN models, which exhibit impressive generalization
ability and show state-of-the-art performance on various datasets. In spite of
the progress in recent years, we discuss shortcomings and possible extensions
of existing techniques. Datasets, code and results are made publicly available
and will be continuously updated at
https://github.com/ZhaomingKong/Denoising-Comparison.
Comment: In this paper, we intend to collect and compare various denoising
methods to investigate their effectiveness, efficiency, applicability and
generalization ability with both synthetic and real-world experiment
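The abstract above lists quantitative metrics as one of its four evaluation perspectives without naming them. As a minimal sketch, assuming peak signal-to-noise ratio (PSNR) is one such metric (an assumption, not something the abstract states), the comparison of a noisy and a denoised image against a clean reference could look like the following. The function names and the tiny 2x2 images are hypothetical and chosen only for illustration.

```python
import math


def mse(reference, estimate):
    """Mean squared error between two equally sized images (lists of rows)."""
    total = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for ref_px, est_px in zip(ref_row, est_row):
            total += (ref_px - est_px) ** 2
            count += 1
    return total / count


def psnr(reference, estimate, peak=255.0):
    """PSNR in decibels; `peak` is the maximum pixel value (255 for 8-bit data)."""
    err = mse(reference, estimate)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / err)


# Hypothetical 2x2 grayscale images, for illustration only.
clean = [[10, 20], [30, 40]]
noisy = [[12, 18], [33, 40]]
denoised = [[11, 20], [30, 39]]

print("noisy PSNR (dB):", psnr(clean, noisy))
print("denoised PSNR (dB):", psnr(clean, denoised))
```

A higher PSNR means the estimate is closer to the reference in the mean-squared sense. It does not capture the visual effects or human ratings that the abstract evaluates separately.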
Inverse tone mapping
The introduction of High Dynamic Range Imaging in computer graphics has been an innovation
in imaging comparable to, or even greater than, the introduction of colour photography.
Light can now be captured, stored, processed, and finally visualised without losing information.
Moreover, new applications that can exploit physical values of the light have been introduced
such as re-lighting of synthetic/real objects, or enhanced visualisation of scenes. However,
these new processing and visualisation techniques cannot be applied to movies and pictures
that have been produced by photography and cinematography over more than one hundred years.
This thesis introduces a general framework for expanding legacy content into High Dynamic
Range content. The expansion avoids artefacts and produces images suitable for
visualisation and re-lighting of synthetic/real objects. Moreover, a methodology based on
psychophysical experiments and computational metrics is presented to measure the performance of
expansion algorithms. Finally, a compression scheme for High Dynamic Range Textures, inspired
by the framework, is proposed and evaluated.
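To make the idea of expanding legacy content concrete, here is a minimal sketch of one very simple expansion: undo an assumed display gamma, then scale to a target peak luminance. This is not the framework proposed in the thesis, whose details the abstract does not give. The function name, the gamma of 2.2, and the 1000 cd/m^2 peak are all assumptions made for illustration.

```python
def expand_pixel(value_8bit, gamma=2.2, peak_luminance=1000.0):
    """Map an 8-bit legacy pixel value to an approximate linear luminance.

    Assumptions (hypothetical, not taken from the thesis):
    - the legacy content was encoded with a simple power-law gamma;
    - the target display peaks at `peak_luminance` cd/m^2.
    """
    normalized = value_8bit / 255.0   # bring the code value into [0, 1]
    linear = normalized ** gamma      # undo the assumed gamma encoding
    return linear * peak_luminance    # scale to the target luminance range


# Expand one row of a hypothetical legacy image.
legacy_row = [0, 64, 128, 255]
hdr_row = [expand_pixel(v) for v in legacy_row]
print(hdr_row)
```

A global curve like this treats every pixel identically, so clipped highlights in the legacy image stay flat after expansion. Handling such regions without artefacts is the kind of problem the abstract says the thesis framework addresses.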
Compression and Subjective Quality Assessment of 3D Video
In recent years, three-dimensional television (3D TV) has been broadly considered the successor to existing two-dimensional television (2D TV) sets. With its capability of offering a dynamic and immersive experience, 3D video (3DV) is expected to expand conventional video in several applications in the near future. However, 3D content requires more than a single view to deliver the depth sensation to viewers, and this inevitably increases the bitrate compared to the corresponding 2D content. This need drives the research trend in the video compression field towards more advanced and more efficient algorithms.
Currently, Advanced Video Coding (H.264/AVC) is the state-of-the-art video coding standard, developed by the Joint Video Team of ISO/IEC MPEG and ITU-T VCEG. This codec has been widely adopted in various applications and products such as TV broadcasting, video conferencing, mobile TV, and Blu-ray Disc. One important extension of H.264/AVC, namely Multiview Video Coding (MVC), was an attempt at multiple-view compression that takes into consideration the inter-view dependency between different views of the same scene. H.264/AVC with its MVC extension (H.264/MVC) can be used for encoding either conventional stereoscopic video, including only two views, or multiview video, including more than two views.
In spite of the high performance of H.264/MVC, a typical multiview video sequence requires a huge amount of storage space, which is proportional to the number of offered views. The available views are still limited, and research has been devoted to synthesizing an arbitrary number of views using the multiview video and depth map (MVD). This process is mandatory for auto-stereoscopic displays (ASDs), where many views are required at the viewer side and there is no way to transmit such a large number of views with currently available broadcasting technology. Therefore, to satisfy the growing hunger for 3D-related applications, it is mandatory to further reduce the size of the bitstream by introducing new and more efficient algorithms for compressing multiview video and depth maps.
This thesis tackles 3D content compression targeting different formats, i.e., stereoscopic video and depth-enhanced multiview video. The stereoscopic video compression algorithms introduced in this thesis mostly focus on different types of asymmetry between the left and right views. This means reducing the quality of one view relative to the other, aiming to achieve better subjective quality than the symmetric case (the reference) under the same bitrate constraint. The proposed algorithms for optimizing depth-enhanced multiview video compression include both texture compression schemes and depth map coding tools. Some of the coding schemes proposed for this format include asymmetric quality between the views.
Since objective metrics are not able to accurately estimate the subjective quality of stereoscopic content, it is suggested to perform subjective quality assessment to evaluate different codecs. Moreover, when the concept of asymmetry is introduced, the Human Visual System (HVS) performs a fusion process which is not completely understood. Therefore, another important aspect of this thesis is conducting several subjective tests and reporting the subjective ratings to evaluate the perceived quality of the proposed coded content against the references. Statistical analysis is carried out in the thesis to assess the validity of the subjective ratings and determine the best performing test cases.
Inverse tone mapping
The introduction of High Dynamic Range Imaging in computer graphics has been an innovation in imaging comparable to, or even greater than, the introduction of colour photography. Light can now be captured, stored, processed, and finally visualised without losing information. Moreover, new applications that can exploit physical values of the light have been introduced such as re-lighting of synthetic/real objects, or enhanced visualisation of scenes. However, these new processing and visualisation techniques cannot be applied to movies and pictures that have been produced by photography and cinematography over more than one hundred years. This thesis introduces a general framework for expanding legacy content into High Dynamic Range content. The expansion avoids artefacts and produces images suitable for visualisation and re-lighting of synthetic/real objects. Moreover, a methodology based on psychophysical experiments and computational metrics is presented to measure the performance of expansion algorithms. Finally, a compression scheme for High Dynamic Range Textures, inspired by the framework, is proposed and evaluated.
EThOS - Electronic Theses Online Service; Engineering and Physical Sciences Research Council (EPSRC) (EP/D032148); GB; United Kingdo