Camera positioning for 3D panoramic image rendering
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University London.
Virtual camera realisation and the proposition of a trapezoidal camera architecture are the two broad contributions of this thesis. Firstly, multiple cameras and their arrangement constitute a critical component affecting the integrity of visual content acquisition for multi-view video. Currently, linear, convergent, and divergent arrays are the prominent camera topologies adopted. However, the large number of cameras required and their synchronisation are two of the prominent challenges usually encountered. The use of virtual cameras can significantly reduce the number of physical cameras used with respect to any of the known camera structures, hence alleviating some of the other implementation issues. This thesis explores the use of image-based rendering, both with and without geometry, to realise virtual cameras. The virtual camera implementation was carried out from the perspective of a depth map (geometry) and the use of multiple image samples (no geometry). Prior to the virtual camera realisation, depth-map generation was investigated using region match measures widely known for solving the image point correspondence problem. The constructed depth maps were compared with ones generated using the dynamic programming approach. In both the geometry and no-geometry approaches, the virtual cameras lead to the rendering of views from a textured depth map, the construction of a 3D panoramic image of a scene by stitching multiple image samples and performing superposition on them, and the computation of a virtual scene from a stereo pair of panoramic images. The quality of these rendered images was assessed through objective or subjective analysis in the Imatest software. Furthermore, metric reconstruction of a scene was performed by re-projecting the pixel points from multiple image samples with a single centre of projection, using a sparse bundle adjustment algorithm. The statistical summary obtained after applying this algorithm provides a gauge of the efficiency of the optimisation step. The optimised data was then visualised in the Meshlab software environment, providing the reconstructed scene. Secondly, with any of the well-established camera arrangements, all cameras are usually constrained to the same horizontal plane. Occlusion therefore becomes an extremely challenging problem, and a robust camera set-up is required to resolve the hidden parts of scene objects. To adequately meet the visibility condition for scene objects, given that occlusion of the same scene objects can occur, a multi-plane camera structure is highly desirable. This thesis therefore also explores a trapezoidal camera structure for image acquisition. The approach is to assess the feasibility and potential of several physical cameras of the same model being sparsely arranged on the edges of an efficient trapezoid graph. This is implemented in both Matlab and Maya; the depth maps rendered in Matlab are of better quality.
Encoding high dynamic range and wide color gamut imagery
This dissertation introduces a scenic moving-image dataset with extended dynamic range (High Dynamic Range, HDR) and wide colour gamut (Wide Color Gamut, WCG), and presents models for encoding HDR and WCG images.
The objective and visual evaluation of new HDR and WCG image processing algorithms, compression methods, and display devices requires a high-quality reference dataset. A new HDR and WCG video dataset with a dynamic range of up to 18 photographic stops is therefore introduced. It contains staged and documentary scenes. The individual scenes are designed to challenge tone mapping operators, gamut mapping algorithms, compression codecs, and HDR and WCG display devices. The scenes were shot with professional lighting, make-up, and film equipment. To obtain a cinematic look, digital film cameras with "Super-35 mm" sensor size were used.
The additional information content of HDR and WCG video signals requires a new and more efficient signal encoding compared to signals with conventional dynamic range. A colour space for HDR and WCG video should not only quantize efficiently but, because of the varying monitor characteristics on the receiver side, should also be suitable for dynamic range and gamut adaptation. Methods for quantizing HDR luminance signals have been proposed, but a corresponding model for colour-difference signals is still missing. Two new colour spaces are therefore introduced that are suitable both for efficient encoding of HDR and WCG signals and for dynamic range and gamut adaptation. These colour spaces are compared with existing state-of-the-art HDR and WCG colour signal encodings. The presented encoding schemes allow HDR and WCG video to be quantized with three colour channels at 12 bits of tonal resolution without visible quantization artifacts.
While storage and transmission of HDR and WCG video at a bit depth of 12 bits per channel is the goal, currently widespread file formats, video interfaces, and compression codecs often support only lower bit depths. To use this existing infrastructure for HDR video transmission and storage, a new content-adaptive quantization scheme is introduced. This quantization method uses image properties such as noise and texture to estimate the tonal resolution required for visually lossless quantization. The presented method allows HDR video to be quantized at a bit depth of 10 bits without visible differences from the original, and requires less computational power than current HDR image-difference metrics.
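Perceptual quantizers of this kind map absolute luminance to code values so that quantization steps track visual sensitivity. As a reference point only (not one of the dissertation's proposed colour spaces), the widely used SMPTE ST 2084 (PQ) inverse EOTF can be sketched as:

```python
def pq_encode(y_nits):
    """SMPTE ST 2084 (PQ) inverse EOTF: absolute luminance in cd/m^2 -> [0,1] signal."""
    m1 = 2610 / 16384
    m2 = 2523 / 4096 * 128
    c1 = 3424 / 4096
    c2 = 2413 / 4096 * 32
    c3 = 2392 / 4096 * 32
    y = max(y_nits, 0.0) / 10000.0          # normalize to the 10,000 cd/m^2 peak
    yp = y ** m1
    return ((c1 + c2 * yp) / (1 + c3 * yp)) ** m2

def quantize(signal, bits=12):
    """Map a [0,1] signal to an integer code value at the given bit depth."""
    levels = (1 << bits) - 1
    return round(signal * levels)
```

A content-adaptive scheme such as the one described above would choose the bit depth passed to `quantize` per image region rather than fixing it globally.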
Discrete Wavelet Transforms
Discrete wavelet transform (DWT) algorithms have a firm position in signal processing across several areas of research and industry. As the DWT provides both octave-scale frequency and spatial timing of the analyzed signal, it is constantly used to solve increasingly advanced problems. The present book, Discrete Wavelet Transforms: Algorithms and Applications, reviews recent progress in discrete wavelet transform algorithms and applications. The book covers a wide range of methods (e.g. lifting, shift invariance, multi-scale analysis) for constructing DWTs. The book chapters are organized into four major parts. Part I describes progress in hardware implementations of DWT algorithms. Applications include multitone modulation for ADSL and equalization techniques, a scalable architecture for FPGA implementation, a lifting-based algorithm for VLSI implementation, a comparison between DWT- and FFT-based OFDM, and a modified SPIHT codec. Part II addresses image processing algorithms such as a multiresolution approach for edge detection, low bit rate image compression, low-complexity implementation of CQF wavelets, and compression of multi-component images. Part III focuses on watermarking DWT algorithms. Finally, Part IV describes shift-invariant DWTs, the DC lossless property, DWT-based analysis and estimation of colored noise, and an application of the wavelet Galerkin method. The chapters of the present book consist of both tutorial and highly advanced material. Therefore, the book is intended as a reference text for graduate students and researchers to obtain state-of-the-art knowledge on specific applications.
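The lifting scheme mentioned above factors a DWT into simple predict and update steps. A minimal sketch of one level of the Haar transform via lifting (a textbook construction, not taken from any specific chapter of the book):

```python
def haar_lifting_fwd(x):
    """One level of the Haar DWT via lifting: predict, then update.
    Returns (approximation, detail) coefficient lists; len(x) must be even."""
    evens, odds = x[0::2], x[1::2]
    detail = [o - e for e, o in zip(evens, odds)]        # predict: odd from even
    approx = [e + d / 2 for e, d in zip(evens, detail)]  # update: preserve the mean
    return approx, detail

def haar_lifting_inv(approx, detail):
    """Invert the lifting steps in reverse order for perfect reconstruction."""
    evens = [a - d / 2 for a, d in zip(approx, detail)]
    odds = [e + d for e, d in zip(evens, detail)]
    out = []
    for e, o in zip(evens, odds):
        out.extend([e, o])
    return out
```

A key appeal of lifting, relevant to the hardware chapters in Part I, is that each step is trivially invertible and can be computed in place.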
Camera based Display Image Quality Assessment
This thesis presents the outcomes of research carried out by the PhD candidate Ping Zhao from 2012 to 2015 at Gjøvik University College. The underlying research was a part of the HyPerCept project, in the program of Strategic Projects for University Colleges, funded by The Research Council of Norway. The research was conducted under the supervision of Professor Jon Yngve Hardeberg and the co-supervision of Associate Professor Marius Pedersen, from The Norwegian Colour and Visual Computing Laboratory in the Faculty of Computer Science and Media Technology of Gjøvik University College, as well as the co-supervision of Associate Professor Jean-Baptiste Thomas, from the Laboratoire Electronique, Informatique et Image in the Faculty of Computer Science of Université de Bourgogne. The main goal of this research was to develop a fast and inexpensive camera based display image quality assessment framework. Due to the limited time frame, we decided to focus only on projection displays with static images displayed on them. However, the proposed methods are not limited to projection displays, and they are expected to work with other types of displays, such as desktop monitors, laptop screens, smart phone screens, etc., with limited modifications. The primary contributions from this research can be summarized as follows:
1. We proposed a camera based display image quality assessment framework, which was originally designed for projection displays but can be used for other types of displays with limited modifications.
2. We proposed a method to calibrate the camera in order to eliminate unwanted vignetting artifact, which is mainly introduced by the camera lens.
3. We proposed a method to optimize the camera's exposure with respect to the measured luminance of incident light, so that after the calibration all camera sensors share a common linear response region.
4. We proposed a marker-less and view-independent method to register one captured image with its original at a sub-pixel level, so that we can incorporate existing full reference image quality metrics without modifying them.
5. We identified spatial uniformity, contrast and sharpness as the most important image quality attributes for projection displays, and we used the proposed framework to evaluate the prediction performance of the state-of-the-art image quality metrics regarding these attributes.
The proposed image quality assessment framework is the core contribution of this research. Compared to conventional image quality assessment approaches, which are largely based on colorimeter or spectroradiometer measurements, using a camera as the acquisition device has the advantages of quickly recording all displayed pixels in one shot and requiring a relatively inexpensive instrument. Therefore, the consumption of time and resources for image quality assessment can be largely reduced. We proposed a method to calibrate the camera in order to eliminate the unwanted vignetting artifact primarily introduced by the camera lens. We used a hazy sky as a closely uniform light source, and the vignetting mask was generated with respect to the median sensor responses over only a few rotated shots of the same spot on the sky. We also proposed a method to quickly determine whether all camera sensors share a common linear response region. In order to incorporate existing full reference image quality metrics without modifying them, an accurate registration of pairs of pixels between one captured image and its original is required. We proposed a marker-less and view-independent image registration method to solve this problem. The experimental results proved that the proposed method works well in viewing conditions with low ambient light. We further identified spatial uniformity, contrast and sharpness as the most important image quality attributes for projection displays. Subsequently, we used the developed framework to objectively evaluate the prediction performance of the state-of-the-art image quality metrics regarding these attributes in a robust manner. In this process, the metrics were benchmarked with respect to the correlations between the prediction results and the perceptual ratings collected from subjective experiments. The analysis of the experimental results indicated that our proposed methods were effective and efficient.
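The flat-field idea behind the vignetting calibration can be sketched as follows: the per-pixel median over repeated shots of a near-uniform source (such as the hazy sky mentioned above) yields a gain mask that is divided out of subsequent captures. This is a simplified sketch with illustrative function names, not the thesis implementation:

```python
import statistics

def vignetting_mask(flat_frames):
    """Build a per-pixel gain mask from several shots of a near-uniform source:
    take the per-pixel median across frames, then normalize so the brightest
    pixel has gain 1."""
    h, w = len(flat_frames[0]), len(flat_frames[0][0])
    med = [[statistics.median(f[r][c] for f in flat_frames) for c in range(w)]
           for r in range(h)]
    peak = max(max(row) for row in med)
    return [[v / peak for v in row] for row in med]

def correct(image, mask):
    """Divide out the vignetting falloff using the gain mask."""
    return [[p / g for p, g in zip(irow, mrow)]
            for irow, mrow in zip(image, mask)]
```

The median across frames suppresses transient content (clouds, sensor noise) that a single flat-field shot would bake into the mask.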
Subjective experiments are an essential component of image quality assessment; however, they can be time and resource consuming, especially in cases where additional image distortion levels are required to extend existing subjective experimental results. For this reason, we investigated the possibility of extending subjective experiments with the baseline adjustment method, and we found that the method can work well if appropriate strategies are applied. The underlying strategies concern which distortion levels are best included in the baseline, as well as how many of them.
Depth Estimation from a Single Holoscopic 3D Image and Image Up-sampling with Deep-learning
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London.
3D depth information is widely utilized in industries such as security, autonomous vehicles, robotics, 3D printing, AR/VR entertainment, cinematography and medical science. However, state-of-the-art imaging and 3D depth-sensing technologies are rather complicated or expensive, and still lack scalability and interoperability. The research entails the development of an innovative technique for reliable and efficient 3D depth estimation that delivers better accuracy. The proposed (1) multilayer Holoscopic 3D encoding technique reduces the computational cost of extracting viewpoint images from complex structured Holoscopic 3D data by 95% by using labelled multilayer elemental images. It also addresses the misplacement of elemental image pixels due to lens distortion error. The computing efficiency of multilayer Holoscopic 3D encoding enables the implementation of real-time 3D depth-dependent applications. Also, (2) an innovative deep learning-based single image super-resolution framework is developed and evaluated. It was identified that learning-based image up-sampling techniques can be used despite inadequate 3D training data, as 2D training data can yield the same results.
(3) The research is extended further by the implementation of an H3D depth disparity-based framework, in which a Holoscopic content adaptation technique for extracting semi-segmented stereo viewpoint images is introduced, and the design of a smart 3D depth mapping technique is proposed. In particular, it provides reasonably accurate 3D depth estimation from H3D images in near real-time. A Holoscopic 3D image comprises thousands of perspective elemental images from omnidirectional viewpoint images, and (4) a novel 3D depth estimation technique is developed to estimate 3D depth information directly from a single Holoscopic 3D image without the loss of any angular information or the introduction of unwanted artefacts. The proposed 3D depth measurement techniques are computationally efficient and robust with high accuracy; they can be incorporated in real-time applications in autonomous vehicles, security, and AR/VR for real-time interaction.
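Disparity-based frameworks of this kind ultimately rest on the standard pinhole stereo relation between disparity and depth. A minimal sketch of that relation (the thesis's actual mapping from Holoscopic elemental images is considerably more involved):

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Pinhole stereo relation Z = f * B / d: depth in metres from disparity
    in pixels, focal length in pixels, and camera baseline in metres.
    Depth is inversely proportional to disparity."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px
```

For Holoscopic imaging, the effective baseline between extracted stereo viewpoint images is small, which is one reason accurate sub-pixel disparity estimation matters so much there.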
Colour depth-from-defocus incorporating experimental point spread function measurements
Depth-From-Defocus (DFD) is a monocular computer vision technique for creating depth maps from two images taken on the same optical axis with different intrinsic camera parameters. A pre-processing stage for optimally converting colour images to monochrome using a linear combination of the colour planes has been shown to improve the accuracy of the depth map. It was found that the first component formed using Principal Component Analysis (PCA) and a technique to maximise the signal-to-noise ratio (SNR) performed better than using an equal weighting of the colour planes with an additive noise model. When the noise is non-isotropic, the Mean Square Error (MSE) of the depth map obtained by maximising the SNR was improved by 7.8 times compared to an equal weighting and 1.9 times compared to PCA. The fractal dimension (FD) of a monochrome image gives a measure of its roughness, and an algorithm was devised to maximise its FD through colour mixing. The formulation using a fractional Brownian motion (fBm) model reduced the SNR and thus produced depth maps that were less accurate than using PCA or an equal weighting. An active DFD algorithm to reduce the image overlap problem has been developed, called Localisation through Colour Mixing (LCM), that uses a projected colour pattern. Simulation results showed that LCM produces an MSE 9.4 times lower than equal weighting and 2.2 times lower than PCA.
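The colour-to-monochrome pre-processing described above projects each RGB pixel onto a single mixing direction. A sketch of the PCA variant using power iteration on the 3x3 colour covariance (illustrative only, not the thesis code):

```python
def pca_first_component(pixels):
    """First principal component of a list of RGB triples, found by power
    iteration on the 3x3 covariance matrix; used as colour-mixing weights."""
    n = len(pixels)
    mean = [sum(p[i] for p in pixels) / n for i in range(3)]
    cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in pixels) / n
            for j in range(3)] for i in range(3)]
    v = [1.0, 1.0, 1.0]
    for _ in range(200):  # power iteration converges to the dominant eigenvector
        w = [sum(cov[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

def to_mono(pixels, weights):
    """Linear colour mixing: project each RGB pixel onto the weight vector."""
    return [sum(w * c for w, c in zip(weights, p)) for p in pixels]
```

The SNR-maximising variant reported above would replace the covariance eigenvector with weights derived from a noise model, which is what pays off when the noise is non-isotropic.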
The Point Spread Function (PSF) of a camera system models how a point source of light is imaged. For depth maps to be accurately created using DFD, a high-precision PSF must be known. Improvements to a sub-sampled, knife-edge based technique are presented that account for non-uniform illumination of the light box; this reduced the MSE by 25%. The Generalised Gaussian is presented as a model of the PSF and shown to be up to 16 times better than the conventional Gaussian and pillbox models.
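The Generalised Gaussian subsumes both conventional PSF models in a single family. A sketch using the usual parameterisation, with scale α and shape β (normalisation constant omitted; the symbols are assumptions, not taken from the thesis):

```python
import math

def generalized_gaussian_psf(r, alpha, beta):
    """Radially symmetric generalized Gaussian profile exp(-(|r|/alpha)**beta).
    beta = 2 recovers the Gaussian; as beta grows large the profile approaches
    a pillbox: ~1 inside radius alpha, ~0 outside."""
    return math.exp(-(abs(r) / alpha) ** beta)
```

Fitting β as a free parameter is what lets one model interpolate between the diffraction-softened Gaussian and the geometric-optics pillbox, which plausibly explains the large accuracy gain reported above.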
Gaze-Contingent Computer Graphics
Contemporary digital displays feature multi-million pixels at ever-increasing refresh rates. Reality, on the other hand, provides us with a view of the world that is continuous in space and time. The discrepancy between viewing the physical world and its sampled depiction on digital displays gives rise to perceptual quality degradations. By measuring or estimating where we look, gaze-contingent algorithms aim at exploiting the way we visually perceive to remedy visible artifacts. This dissertation presents a variety of novel gaze-contingent algorithms and respective perceptual studies. Chapters 4 and 5 present methods to boost the perceived visual quality of conventional video footage when viewed on commodity monitors or projectors. In Chapter 6 a novel head-mounted display with real-time gaze tracking is described. The device enables a large variety of applications in the context of Virtual Reality and Augmented Reality. Using the gaze-tracking VR headset, a novel gaze-contingent rendering method is described in Chapter 7. The gaze-aware approach greatly reduces the computational effort for shading virtual worlds. The described methods and studies show that gaze-contingent algorithms are able to improve the quality of displayed images and videos, or to reduce the computational effort for image generation, while the display quality perceived by the user does not change.
The German summary adds: the quality of the shading in Chapter 7 is analysed and adapted in real time for every image pixel on the basis of a perceptual model, giving the method the potential to reduce the computational cost of shading a virtual scene to a fraction.
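The core idea of gaze-contingent shading is to spend full effort only near the measured gaze point and let effort fall off with eccentricity. A toy sketch of such a budget function; the falloff shape and the `fovea_radius` default are hypothetical, not the dissertation's perceptual model:

```python
import math

def shading_rate(px, py, gaze_x, gaze_y, fovea_radius=64.0):
    """Toy gaze-contingent shading budget in [0.1, 1.0]: full rate inside the
    foveal region around the gaze point, then decaying with eccentricity."""
    ecc = math.hypot(px - gaze_x, py - gaze_y)  # pixel distance from gaze
    if ecc <= fovea_radius:
        return 1.0
    return max(0.1, fovea_radius / ecc)         # floor keeps the periphery sampled
```

A real renderer would map this rate to coarser shading (e.g. fewer samples or lower shading frequency) for peripheral pixels, which is where the computational savings come from.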
A Novel Inpainting Framework for Virtual View Synthesis
Multi-view imaging has stimulated significant research to enhance the user experience of free viewpoint video, allowing interactive navigation between views and the freedom to select a desired view to watch. This usually involves transmitting both textural and depth information captured from different viewpoints to the receiver, to enable the synthesis of an arbitrary view. In rendering these virtual views, perceptual holes can appear when regions hidden in the original view by a closer object become visible in the virtual view. To provide a high quality experience these holes must be filled in a visually plausible way, in a process known as inpainting. This is challenging because the missing information is generally unknown and the hole regions can be large. Recently, depth-based inpainting techniques have been proposed to address this challenge; while these generally perform better than non-depth-assisted methods, they are not very robust and can produce perceptual artefacts.
This thesis presents a new inpainting framework that innovatively exploits depth and textural self-similarity characteristics to construct subjectively enhanced virtual viewpoints. The framework makes three significant contributions to the field: i) the exploitation of view information to jointly inpaint textural and depth hole regions; ii) the introduction of the novel concept of self-similarity characterisation which is combined with relevant depth information; and iii) an advanced self-similarity characterising scheme that automatically determines key spatial transform parameters for effective and flexible inpainting.
The presented inpainting framework has been critically analysed and shown to provide superior performance both perceptually and numerically compared to existing techniques, especially in terms of lower visual artefacts. It provides a flexible, robust framework for developing new inpainting strategies for the next generation of interactive multi-view technologies.
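For contrast with the self-similarity framework described above, the simplest non-depth-assisted baseline is isotropic diffusion: propagate surrounding values into the hole. A toy sketch (deliberately naive, and not the proposed method):

```python
def diffuse_inpaint(image, hole, iters=200):
    """Minimal diffusion-based hole filling on a 2D grid of floats: repeatedly
    replace each hole pixel with the mean of its in-bounds 4-neighbours.
    `hole` is a list of (row, col) coordinates to fill."""
    h, w = len(image), len(image[0])
    img = [row[:] for row in image]
    for _ in range(iters):
        nxt = [row[:] for row in img]
        for r, c in hole:
            vals = [img[r + dr][c + dc]
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= r + dr < h and 0 <= c + dc < w]
            nxt[r][c] = sum(vals) / len(vals)
        img = nxt
    return img
```

Diffusion blurs across the hole, which is exactly the failure mode that exemplar- and self-similarity-based inpainting (as in this thesis) is designed to avoid for large, textured hole regions.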
Interdisciplinarity in the Age of the Triple Helix: a Film Practitioner's Perspective
This integrative chapter contextualises my research including articles I have published as well as one of the creative artefacts developed from it, the feature film The Knife That Killed Me. I review my work considering the ways in which technology, industry methods and academic practice have evolved as well as how attitudes to interdisciplinarity have changed, linking these to Etzkowitz and Leydesdorff's "Triple Helix" model (1995). I explore my own experiences and observations of opportunities and challenges that have been posed by the intersection of different stakeholder needs and expectations, both from industry and academic perspectives, and argue that my work provides novel examples of the applicability of the "Triple Helix" to the creative industries. The chapter concludes with a reflection on the evolution and direction of my work, the relevance of the "Triple Helix" to creative practice, and ways in which this relationship could be investigated further.
The Art of Movies
The movie is considered an important art form; films entertain, educate, enlighten and inspire audiences.
Film is a term that encompasses motion pictures as individual projects, as well as, in metonymy, the field in general. The origin of the name comes from the fact that photographic film (also called filmstock) has historically been the primary medium for recording and displaying motion pictures. Many other terms exist: motion pictures (or just pictures or "picture"), the silver screen, photoplays, the cinema, picture shows, flicks, and, commonly, movies.