214 research outputs found

An object-based approach to plenoptic videos

Authors: Chan SC, Gan ZF, Ng KT, Shum HY
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2005

This paper proposes an object-based approach to plenoptic videos, in which the plenoptic video sequences are segmented into image-based rendering (IBR) objects, each with its own image sequence, depth map, and other relevant information such as shape. This supports desirable functionalities such as content scalability, error resilience, and interactivity with individual IBR objects. A portable capturing system consisting of two linear camera arrays, each hosting 6 JVC video cameras, was developed to verify the proposed approach. Rendering and compression results for real-world scenes demonstrate the usefulness and good quality of the proposed approach. © 2005 IEEE.

HKU Scholars Hub

Real-time refocusing using an FPGA-based standard plenoptic camera

Authors: Aggoun Amar, Hahne Christopher, Lumsdaine Andrew, Velisavljević Vladan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/03/2018

Plenoptic cameras are receiving increased attention in scientific and commercial applications because they capture the entire structure of light in a scene, enabling optical transforms (such as focusing) to be applied computationally after the fact, rather than once and for all at the time a picture is taken. In many settings, real-time interactive performance is also desired, which in turn requires significant computational power because of the large amount of data needed to represent a plenoptic image. Although GPUs have been shown to provide acceptable performance for real-time plenoptic rendering, their cost and power requirements make them prohibitive for embedded uses (such as in-camera). On the other hand, the computation needed for plenoptic rendering is well structured, suggesting the use of specialized hardware. Accordingly, this paper presents an array of switch-driven finite impulse response filters, implemented on an FPGA, to accomplish high-throughput spatial-domain rendering. The proposed architecture provides a power-efficient rendering hardware design suitable for full-video applications as required in broadcasting or cinematography. A benchmark assessment of the proposed hardware implementation shows that real-time performance can readily be achieved, with a performance improvement of one order of magnitude over a GPU implementation and three orders of magnitude over a general-purpose CPU implementation.

Wolverhampton Intellectual Repository and E-theses

University of Bedfordshire Repository

Coherent multi-dimensional segmentation of multiview images using a variational framework and applications to image based rendering

Author: Berent Jesse
Publication venue: Electrical & Electronic Engineering, Imperial College London
Publication date: 01/10/2008

Image-based rendering (IBR), and light field rendering in particular, has attracted a lot of attention for interpolating new viewpoints from a set of multiview images. New images of a scene are interpolated directly from nearby available ones, enabling photorealistic rendering. Sampling theory for light fields has shown that exact geometric information about the scene is often unnecessary for rendering new views. Indeed, the band of the function is approximately limited, and new views can be rendered using classical interpolation methods. However, IBR using undersampled light fields suffers from aliasing effects and is particularly difficult when the scene has large depth variations and occlusions. To deal with these cases, we study two approaches. First, new sampling schemes have recently emerged that can perfectly reconstruct certain classes of parametric signals that are not bandlimited but are characterized by a finite number of parameters. In this context, we derive novel sampling schemes for piecewise sinusoidal and polynomial signals. In particular, we show that a piecewise sinusoidal signal with arbitrarily high frequencies can be exactly recovered given certain conditions. These results are applied to parametric multiview data that are not bandlimited. Second, we focus on the problem of extracting regions (or layers) in multiview images that can be individually rendered free of aliasing. The problem is posed in a multidimensional variational framework using region competition. Extending previous methods, layers are treated as multidimensional hypervolumes, so the segmentation is done jointly over all the images and coherence is imposed throughout the data. However, instead of propagating active hypersurfaces, we derive a semi-parametric methodology that takes into account the constraints imposed by the camera setup and the occlusion ordering. The resulting framework is a global multidimensional region competition that is consistent across all the images and handles occlusions efficiently. We show the validity of the approach with captured light fields. Other special effects, such as augmented reality and disocclusion of hidden objects, are also demonstrated.

Spiral - Imperial College Digital Repository

Steered mixture-of-experts for light field images and video : representation and coding

Authors: Lambert Peter, Sikora Thomas, Van Wallendael Glenn, Verhack Ruben
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020

Research in light field (LF) processing has increased markedly over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, 2-D regular grids are less suited to high-dimensional data such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about light rays at any angle arriving at a certain region. The global model thus consists of a set of kernels that define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application to 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparably to the state of the art at low-to-mid-range bitrates with respect to the subjective visual quality of 4-D LF images. For 5-D LF video, we observe superior decorrelation and coding performance, with coding gains of a factor of 4 in bitrate for the same quality. At least equally important, our method inherently offers desired functionality for LF rendering that is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) lightweight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution.
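As a rough illustration of the kernel idea in this abstract, the following one-dimensional sketch blends a few Gaussian-gated linear experts into a continuous approximation of a signal. The kernel centers, widths, and expert coefficients are hypothetical values chosen for illustration, and the gate is a plain normalized Gaussian. It is a one-dimensional simplification, so no kernel steering is shown, and it is not the authors' implementation or their parameter-estimation procedure.

```python
import math


def smoe_predict(x, kernels):
    """Evaluate a 1-D gated mixture of linear experts at position x.

    Each kernel is a dict with hypothetical fields:
      mu    - center of the Gaussian gate
      sigma - width of the Gaussian gate
      a, b  - the expert output is a * (x - mu) + b
    """
    # Unnormalized Gaussian gate response of every kernel at x.
    gates = [math.exp(-0.5 * ((x - k["mu"]) / k["sigma"]) ** 2) for k in kernels]
    total = sum(gates)
    if total == 0.0:
        # All gates underflowed: fall back to the nearest expert.
        nearest = min(kernels, key=lambda k: abs(x - k["mu"]))
        return nearest["a"] * (x - nearest["mu"]) + nearest["b"]
    # Soft-gated sum of the local linear experts.
    return sum(
        (g / total) * (k["a"] * (x - k["mu"]) + k["b"])
        for g, k in zip(gates, kernels)
    )


# Two constant experts (assumed values): one near x = 0, one near x = 10.
kernels = [
    {"mu": 0.0, "sigma": 1.0, "a": 0.0, "b": 1.0},
    {"mu": 10.0, "sigma": 1.0, "a": 0.0, "b": 5.0},
]
samples = [smoe_predict(x, kernels) for x in (0.0, 5.0, 10.0)]
```

Because the model is a closed-form function of position, any sample can be evaluated independently of the others, which is one way to read the abstract's points about random access and pixel-parallel reconstruction.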

Ghent University Academic Bibliography

Light Field compression and manipulation via residual convolutional neural network

Author: Hedayati Eisa
Publication venue: Digital Commons @ Michigan Tech
Publication date: 01/01/2021

Light field (LF) imaging has gained significant attention due to its recent success in microscopy, three-dimensional (3D) display and rendering, and augmented and virtual reality. Postprocessing of an LF enables us to extract more information from a scene than traditional cameras allow. However, the use of LFs is still a research novelty because of current limitations in capturing a high-resolution LF in all four of its dimensions. While researchers are actively improving methods for capturing high-resolution LFs, simulation makes it possible to explore the properties of a high-quality captured LF. The immediate concerns following LF capture are storage and processing time. A rich LF occupies a large amount of memory (on the order of multiple gigabytes per LF). In addition, most feature extraction techniques associated with LF postprocessing involve multi-dimensional integration that requires access to the whole LF and is usually time-consuming. Recent advancements in computer processing units have made it possible to simulate realistic images using physically based rendering software. In this work, a transformation function is first proposed for building a camera array (CA) that captures the same portion of the LF from a scene that a standard plenoptic camera (SPC) can acquire. Using this transformation, LF simulation with properties similar to those of a plenoptic camera becomes trivial in any rendering software. Artificial intelligence (AI) and machine learning (ML) algorithms, when deployed on the new generation of GPUs, are faster than ever. It is possible to generate and train large networks with millions of trainable parameters to learn very complex features. Here, residual convolutional neural network (RCNN) structures are employed to build complex networks for compression and feature extraction from an LF. By combining state-of-the-art image compression and RCNN, I have created a compression pipeline. The proposed pipeline's bits-per-pixel (bpp) rate is 0.0047 on average. I show that, with a 1% compression time cost and an 18x speedup for decompression, the LFs reconstructed by this method have a better structural similarity index metric (SSIM) and a comparable peak signal-to-noise ratio (PSNR) compared with the state-of-the-art video compression techniques used to compress LFs. Finally, using RCNN, I created a network called RefNet for extracting a group of 16 refocused images from a raw LF. For training, the parameters of the 16 LFs are set to α = 0.125, 0.250, 0.375, ..., 2.0. I show that RefNet is 134x faster than the state-of-the-art refocusing technique. RefNet is also superior in color prediction to the state-of-the-art methods (Fourier slice and shift-and-sum).
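The refocusing sweep described in this abstract can be illustrated with a small shift-and-sum sketch. It builds the 16 α values listed above and refocuses a toy two-dimensional light field (one angular axis u, one spatial axis x) by shifting each view in proportion to its angular offset and averaging. The integer shift rule `round((u - center) * (1 - 1/alpha))`, the wrap-around border handling, the omission of the 1/α spatial rescaling, and the toy data are all simplifying assumptions for illustration. This is the classical shift-and-sum baseline named in the abstract, not RefNet.

```python
def alpha_sweep(step=0.125, count=16):
    """Return the refocus parameters 0.125, 0.250, ..., 2.0."""
    return [step * k for k in range(1, count + 1)]


def shift_and_sum(light_field, alpha):
    """Refocus a toy light field given as a list of 1-D views.

    light_field[u][x] is the intensity seen from angular position u at
    spatial position x. Each view is shifted by an integer number of
    pixels proportional to its offset from the central view, then all
    views are averaged. Borders wrap around (an assumption of this sketch).
    """
    n_views = len(light_field)
    width = len(light_field[0])
    center = (n_views - 1) / 2.0
    out = [0.0] * width
    for u, view in enumerate(light_field):
        shift = round((u - center) * (1.0 - 1.0 / alpha))
        for x in range(width):
            out[x] += view[(x + shift) % width]
    return [v / n_views for v in out]


alphas = alpha_sweep()

# Toy light field: one bright point that moves one pixel per view.
views = [[1.0 if x == 5 - u else 0.0 for x in range(8)] for u in range(3)]

unshifted = shift_and_sum(views, alpha=1.0)  # zero shift: point smeared over 3 pixels
refocused = shift_and_sum(views, alpha=0.5)  # shifts align the point at x = 4
```

Sweeping `alpha` over `alphas` and calling `shift_and_sum` for each value would produce a 16-image focal stack of the kind the abstract describes as the network's target output.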

Michigan Technological University

Constructing interactive multi-view videos based on image-based rendering

Author: Shih-Ming Chang
Publication venue: 'Inderscience Publishers'
Publication date: 01/01/2015

In this paper, we use image-based rendering (IBR) to develop a scene rotation mechanism. We shot several images of the same scene and computed the angles between the images. A video is then composed that allows users to select viewing angles while the video is playing. We made three kinds of assumptions that may affect the resulting video and verified them through a series of experiments. Finally, we apply the proposed method to video of a realistic scenario to produce an interactive video. The contribution also includes techniques to compute geometric parameters of the scene from one or more images.

Crossref

Tamkang University Institutional Repository

Image-Based Rendering Of Real Environments For Virtual Reality

Author: Bertel Tobias
Publication date: 14/02/2022

OPUS

Non-disruptive use of light fields in image and video processing

Author: Hariharan Harini Priyadarshini
Publication venue: Saarländische Universitäts- und Landesbibliothek
Publication date: 01/01/2022

In the age of computational imaging, cameras capture not only an image but also data. This additional captured data is best used for photorealistic renderings, enabling numerous post-processing possibilities such as perspective shift, depth scaling, digital refocus, 3D reconstruction, and much more. In computational photography, light field imaging technology captures the complete volumetric information of a scene. This technology has the highest potential to bring immersive experiences close to reality. It has gained significance in both commercial and research domains. However, because of the lack of coding and storage formats and the incompatibility of the tools needed to process and enable the data, light fields are not exploited to their full potential. This dissertation approaches the integration of light field data into image and video processing. Towards this goal, it addresses the representation of light fields using advanced file formats designed for 2D image assemblies, to facilitate asset reusability and interoperability between applications and devices. The novel 5D light field acquisition and the ongoing research on coding frameworks are presented. Multiple techniques for optimised sequencing of light field data are also proposed. Because light fields contain the complete 3D information of a scene, a large amount of highly redundant data is captured. Hence, by pre-processing the data using the proposed approaches, excellent coding performance can be achieved.

Universaar

Post-production of holoscopic 3D image

Author: Abdul Fatah Obaidullah
Publication venue: Brunel University London
Publication date: 01/01/2015

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University London. Holoscopic 3D imaging, also known as "integral imaging", was first proposed by Lippmann in 1908. It is a promising technique for creating a full-colour spatial image that exists in space. It uses a single lens aperture for recording spatial images of a real scene, and thus offers omnidirectional motion parallax and true 3D depth, which is the fundamental feature for digital refocusing. While stereoscopic and multiview 3D imaging systems mimic the human eye, a holoscopic 3D imaging system mimics the fly's eye, in which viewpoints are orthographic projections. This system enables true 3D representation of a real scene in space, and thus offers richer spatial cues than stereoscopic 3D and multiview 3D systems. Focus has been the greatest challenge since the beginning of photography. It is becoming even more critical in film production, where focus pullers find it difficult to get the right focus as camera resolution becomes increasingly higher. Holoscopic 3D imaging enables the user to carry out focusing and refocusing in post-production. There have been three main types of digital refocusing methods, namely shift and integration, full resolution, and full resolution with blind. However, these methods suffer from artifacts and unsatisfactory resolution in the final image. For instance, the artifacts take the form of blocky and blurry pictures caused by unmatched boundaries. An upsampling method is proposed that improves the resolution of the image produced by the shift and integration approach. Sub-pixel adjustment of elemental images, including an "upsampling technique" with smart filters, is proposed to reduce the artifacts introduced by the full resolution with blind method and to improve both the image quality and the resolution of the final rendered image. A novel 3D object extraction method is proposed that takes advantage of disparity; it is also applied to generate stereoscopic 3D images from a holoscopic 3D image. A cross-correlation matching algorithm is used to obtain the disparity map from the disparity information, and the desired object is then extracted. In addition, a 3D image conversion algorithm is proposed for generating stereoscopic and multiview 3D images from both unidirectional and omnidirectional holoscopic 3D images, which facilitates 3D content reformation.
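The disparity-matching step mentioned in this abstract can be sketched in one dimension. The function below slides a window taken from one viewpoint row over a second row and keeps the offset with the highest normalized cross-correlation. The window size, search range, function names, and sample rows are hypothetical, and the thesis's actual matching algorithm may differ in detail.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0.0:
        return 0.0  # flat window: correlation undefined, treat as no match
    return sum(x * y for x, y in zip(da, db)) / denom


def best_disparity(left, right, start, window, max_disp):
    """Return (d, score) for the shift d in [0, max_disp] that maximizes
    the NCC between left[start:start + window] and the window of the
    same size starting at start - d in right."""
    ref = left[start:start + window]
    best_d, best_score = 0, -2.0  # NCC is always >= -1
    for d in range(max_disp + 1):
        lo = start - d
        if lo < 0:
            break  # candidate window would fall off the row
        score = ncc(ref, right[lo:lo + window])
        if score > best_score:
            best_d, best_score = d, score
    return best_d, best_score


# Toy rows (assumed data): the same intensity bump, displaced by 2 pixels.
left = [0, 0, 0, 0, 1, 5, 9, 5, 1, 0, 0, 0]
right = [0, 0, 1, 5, 9, 5, 1, 0, 0, 0, 0, 0]
disparity, score = best_disparity(left, right, start=4, window=5, max_disp=4)
```

Repeating the search for every window position would give a per-pixel disparity map, from which pixels in a chosen disparity range could be selected to isolate an object at a given depth.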

Brunel University Research Archive