Camera positioning for 3D panoramic image rendering
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University London.
Virtual camera realisation and the proposition of a trapezoidal camera architecture are the two broad contributions of this thesis. Firstly, multiple cameras and their arrangement constitute a critical component which affects the integrity of visual content acquisition for multi-view video. Currently, linear, convergent, and divergent arrays are the prominent camera topologies adopted. However, the large number of cameras required and their synchronisation are two of the prominent challenges usually encountered. The use of virtual cameras can significantly reduce the number of physical cameras used with respect to any of the known
camera structures, hence adequately reducing some of the other implementation issues. This thesis explores the use of image-based rendering, with and without geometry, in implementations leading to the realisation of virtual cameras. The virtual camera implementation was carried out from the perspective of a depth map (geometry) and the use of multiple image samples (no geometry). Prior to the virtual camera realisation, the generation of depth maps was investigated using region match measures widely known for solving the image point correspondence problem. The constructed depth maps have been compared with the ones generated
using the dynamic programming approach. In both the geometry and no-geometry approaches, the virtual cameras lead to the rendering of views from a textured depth map, the construction of a 3D panoramic image of a scene by stitching multiple image samples and performing superposition on them, and the computation
of a virtual scene from a stereo pair of panoramic images. The quality of these rendered images was assessed through the use of either objective or subjective analysis in the Imatest software. Furthermore, metric reconstruction of a scene was performed by re-projection of the pixel points from multiple image samples with
a single centre of projection. This was done using the sparse bundle adjustment algorithm. The statistical summary obtained after the application of this algorithm provides a gauge for the efficiency of the optimisation step. The optimised data was then visualised in the Meshlab software environment, hence providing the reconstructed scene. Secondly, with any of the well-established camera arrangements, all cameras are usually constrained to the same horizontal plane. Therefore, occlusion becomes an extremely challenging problem, and a robust camera set-up is required in order to robustly resolve the hidden parts of any scene objects.
To adequately meet the visibility condition for scene objects, and given that occlusion of the same scene objects can occur, a multi-plane camera structure is highly desirable. Therefore, this thesis also explores a trapezoidal camera structure for image acquisition. The approach here is to assess the feasibility and potential
of several physical cameras of the same model being sparsely arranged on the edges of an efficient trapezoid graph. This is implemented in both Matlab and Maya. The depth maps rendered in Matlab are of better quality.
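The depth maps discussed above encode scene geometry through the standard disparity-to-depth relation for a rectified stereo pair. As an illustrative sketch (not code from the thesis; the focal length and baseline values below are assumed), the conversion can be written as:

```python
import numpy as np

def disparity_to_depth(disparity, focal_px=700.0, baseline_m=0.1):
    """Convert a disparity map (pixels) to metric depth (metres).

    depth = focal_length * baseline / disparity; zero disparities
    (no match found) are mapped to infinity.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.full_like(disparity, np.inf)
    valid = disparity > 0
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth

d = np.array([0.0, 7.0, 14.0, 70.0])
print(disparity_to_depth(d))  # 7 px -> 10 m, 14 px -> 5 m, 70 px -> 1 m
```

The inverse relation is why near objects (large disparity) are localised far more precisely than distant ones.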
Data augmentation for NeRF: a geometric consistent solution based on view morphing
NeRF aims to learn a continuous neural scene representation by using a finite
set of input images taken from different viewpoints. The fewer the number of
viewpoints, the higher the likelihood of overfitting on them. This paper
mitigates this limitation by presenting a novel data augmentation approach to
generate geometrically consistent image transitions between viewpoints using
view morphing. View morphing is a highly versatile technique that does not
require any prior knowledge about the 3D scene because it is based on general
principles of projective geometry. A key novelty of our method is to use the
very same depths predicted by NeRF to generate the image transitions that are
then added to NeRF training. We experimentally show that this procedure enables
NeRF to improve the quality of its synthesised novel views in the case of
datasets with few training viewpoints. We improve PSNR by up to 1.8 dB and 10.5 dB
when eight and four views are used for training, respectively. To the best of
our knowledge, this is the first data augmentation strategy for NeRF that
explicitly synthesises additional new input images to improve the model
generalisation.
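The reported gains are measured with the standard PSNR definition. A minimal sketch of that metric (not the authors' code; images are assumed to be normalised to [0, 1]):

```python
import numpy as np

def psnr(reference, rendered, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two images in [0, max_val]."""
    diff = np.asarray(reference, dtype=np.float64) - np.asarray(rendered, dtype=np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

For example, a uniform error of 0.1 on unit-range images yields an MSE of 0.01 and hence a PSNR of 20 dB.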
Status and future of extraterrestrial mapping programs
Extensive mapping programs have been completed for the Earth's Moon and for the planet Mercury. Mars, Venus, and the Galilean satellites of Jupiter (Io, Europa, Ganymede, and Callisto) are currently being mapped. The two Voyager spacecraft are expected to return data from which maps can be made of as many as six of the satellites of Saturn and two or more of the satellites of Uranus. The standard reconnaissance mapping scales used for the planets are 1:25,000,000 and 1:5,000,000; where the resolution of the data warrants, maps are compiled at the larger scales of 1:2,000,000, 1:1,000,000 and 1:250,000. Planimetric maps of a particular planet are compiled first. The first spacecraft to visit a planet is not designed to return data from which elevations can be determined. As exploration becomes more intensive, more sophisticated missions return photogrammetric and other data to permit compilation of contour maps.
Image matching algorithms in stereo vision using address-event-representation: a theoretical study and evaluation of the different algorithms
Image processing in digital computer systems usually considers the visual information as a sequence of frames. These frames come from cameras that capture reality for a short period of time; they are renewed and transmitted at a rate of 25-30 fps (a typical real-time scenario). Digital video processing has to process each frame in order to obtain a filtered result or detect a feature in the input. In stereo vision, existing algorithms use frames from two digital cameras and process them pixel by pixel until a pattern match is found in a section of both stereo frames. Spike-based processing is a relatively new approach that implements the processing by manipulating spikes one by one at the time they are transmitted, like a human brain. The mammalian nervous system is able to solve much more complex problems, such as visual recognition, by manipulating neurons' spikes. The spike-based philosophy for visual information processing based on the neuro-inspired Address-Event-Representation (AER) is nowadays achieving very high performance. In this work we study the existing digital stereo matching algorithms and how they work. After that, we propose an AER stereo matching algorithm using some of the principles shown in digital stereo methods.
Ministerio de Ciencia e Innovación TEC2009-10639-C04-02 (VULCANO); European Union (EU) FP7-248582 (CARDIAC)
Stereoscopic Cinema
Stereoscopic cinema has seen a surge of activity in recent years, and for the first time all of the major Hollywood studios released 3-D movies in 2009. This is happening alongside the adoption of 3-D technology for sports broadcasting, and the arrival of 3-D TVs for the home. Two previous attempts to introduce 3-D cinema, in the 1950s and the 1980s, failed because the contemporary technology was immature and resulted in viewer discomfort. But current technologies (such as accurately adjustable 3-D camera rigs with onboard computers to automatically inform a camera operator of inappropriate stereoscopic shots, digital processing for post-shooting rectification of the 3-D imagery, digital projectors for accurate positioning of the two stereo projections on the cinema screen, and polarized silver screens to reduce cross-talk between the viewer's left and right eyes) mean that the viewer experience is at a much higher level of quality than in the past. Even so, the creation of stereoscopic cinema is an open, active research area, and there are many challenges from acquisition to post-production to automatic adaptation for different-sized displays. This chapter describes the current state of the art in stereoscopic cinema, and directions for future work.
An Architecture for High-throughput and Improved-quality Stereo Vision Processor
This paper presents a VLSI architecture to achieve high-throughput and
improved-quality stereo vision for real applications. The stereo vision processor
generates gray-scale output images with depth information from input images taken by
two CMOS Image Sensors (CIS). The depth estimator, using the sum of absolute
differences (SAD) algorithm as the stereo matching technique, is implemented in hardware
by exploiting pipelining and parallelism. To produce depth maps with improved quality
in real time, pre- and post-processing units are adopted, and to enhance the adaptability
of the system to real environments, special function registers (SFRs) are assigned to
vision parameters. The design, using 0.18 um standard CMOS technology, can operate at a
120 MHz clock, achieving over 140 frames/sec depth maps with a 320 by 240 image size
and 64 disparity levels. Experimental results based on images taken in the real world and
on the Middlebury data set are presented, along with comparison data against existing hardware
systems and hardware specifications of the proposed processor.
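The SAD matching cost at the core of the processor can be sketched in software. The following is an illustrative brute-force version (the hardware pipelines and parallelises this computation; the window size and disparity range below are arbitrary choices, not the paper's 64-level configuration):

```python
import numpy as np

def sad_disparity(left, right, max_disp=8, window=3):
    """Naive SAD block matching: for each left-image pixel, test candidate
    disparities d and keep the one minimising the sum of absolute
    differences over a square window in the right image."""
    h, w = left.shape
    r = window // 2
    disp = np.zeros((h, w), dtype=np.int32)
    L = left.astype(np.float64)
    R = right.astype(np.float64)
    for y in range(r, h - r):
        for x in range(r, w - r):
            patch_l = L[y - r:y + r + 1, x - r:x + r + 1]
            best_cost, best_d = np.inf, 0
            for d in range(min(max_disp, x - r + 1)):
                patch_r = R[y - r:y + r + 1, x - d - r:x - d + r + 1]
                cost = np.abs(patch_l - patch_r).sum()
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp

# synthetic pair: the right view is the left view shifted by 2 px
rng = np.random.RandomState(0)
left = rng.rand(10, 20)
right = np.empty_like(left)
right[:, :-2] = left[:, 2:]
right[:, -2:] = left[:, -2:]
disp = sad_disparity(left, right)
# interior pixels recover the true disparity of 2
```

SAD is popular in hardware precisely because it needs only subtractions, absolute values, and accumulations, which map directly onto pipelined adder trees.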
The Psyche Topography and Geomorphology Investigation
Detailed mapping of topography is crucial for understanding the processes shaping the surfaces of planetary bodies. In particular, stereoscopic imagery makes a major contribution to topographic mapping and especially supports the geologic characterization of planetary surfaces. Image data provide the basis for extensive studies of surface structure and morphology on local, regional and global scales, using photogeologic information from images, topographic information from stereo-derived digital terrain models, and co-registered spectral terrain information from color images. The objective of the Psyche topography and geomorphology investigation is to derive the detailed shape of (16) Psyche and to generate orthorectified image mosaics, which are needed to study the asteroid's landforms, interior structure, and the processes that have modified the surface over geologic time. In this paper we describe our approaches for producing shape models, and our plans for acquiring the requested image data to quantify the expected accuracy of the results. Multi-angle images obtained by Psyche's camera will be used to create topographic models with about 15 m/pixel horizontal resolution and better than 10 m height accuracy on a global scale. This is slightly better than the global imaging obtained during the Dawn mission; however, both missions yield resolutions of a few m/pixel locally. Two different techniques, stereophotogrammetry and stereophotoclinometry, are used to model the shape; these models will be merged with the gravity fields obtained by the Psyche spacecraft to produce geodetically controlled topographic models. The resulting digital topography models, together with the gravity data, will reveal the tectonic, volcanic, impact, and gradational history of Psyche, and enable co-registration of data sets to determine Psyche's geologic history.
A Practical Stereo Depth System for Smart Glasses
We present the design of a productionized end-to-end stereo depth sensing
system that does pre-processing, online stereo rectification, and stereo depth
estimation with a fallback to monocular depth estimation when rectification is
unreliable. The output of our depth sensing system is then used in a novel view
generation pipeline to create 3D computational photography effects using
point-of-view images captured by smart glasses. All these steps are executed
on-device within the stringent compute budget of a mobile phone, and because we
expect users to use a wide range of smartphones, our design needs to be
general and cannot depend on particular hardware or an ML accelerator such
as a smartphone GPU. Although each of these steps is well studied, a
description of a practical system is still lacking. For such a system, all
these steps need to work in tandem with one another and fall back gracefully on
failures within the system or less than ideal input data. We show how we handle
unforeseen changes to calibration, e.g., due to heat, robustly support depth
estimation in the wild, and still abide by the memory and latency constraints
required for a smooth user experience. We show that our trained models are
fast, and run in less than 1s on a six-year-old Samsung Galaxy S8 phone's CPU.
Our models generalize well to unseen data and achieve good results on
Middlebury and in-the-wild images captured from the smart glasses.
Comment: Accepted at CVPR202
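The described fallback from stereo to monocular depth estimation can be sketched as a simple dispatch on rectification confidence. All function names and the inlier-ratio threshold below are hypothetical illustrations, not the paper's API:

```python
def estimate_depth(left, right, rectify, stereo_depth, mono_depth,
                   min_inlier_ratio=0.8):
    """Sketch of graceful fallback (hypothetical interfaces): attempt online
    rectification; if it fails outright or too few feature matches support
    the estimated epipolar geometry, fall back to monocular depth on the
    left image instead of producing unreliable stereo output."""
    rectified, inlier_ratio = rectify(left, right)
    if rectified is not None and inlier_ratio >= min_inlier_ratio:
        return stereo_depth(*rectified), "stereo"
    return mono_depth(left), "monocular"
```

The point of this structure is that every input produces *some* depth map, so downstream view-generation effects never see a hard failure.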
FROM DAGUERREOTYPES TO DIGITAL AUTOMATIC PHOTOGRAMMETRY. APPLICATIONS AND LIMITS FOR THE BUILT HERITAGE PROJECT
This paper will describe the evolutionary stages that shaped and built, over time, a robust and solid relationship between "indirect survey methods" and knowledge of the "architectural matter", aiming at producing a conservation project for the built heritage.
Collecting architectural data by simply drawing it was already considered inadequate by John Ruskin in 1845. He strongly felt the need to fix the data through that "blessed" invention that was the "daguerreotype". Today, taking simple photographs is not enough: it is crucial to develop systems able to provide the best graphic supports (possibly in the third dimension) for the development and editing of the architectural project.
This paper will focus not only on the re-examination of historical data and on the research and representation of the "sign", but also on the evolution of technologies and "reading methods", in order to highlight their strengths and weaknesses in the real practice of the conservation project and in the use of the architectures of the past.