FVV Live: A real-time free-viewpoint video system with consumer electronics hardware
FVV Live is a novel end-to-end free-viewpoint video system, designed for low
cost and real-time operation, based on off-the-shelf components. The system has
been designed to yield high-quality free-viewpoint video using consumer-grade
cameras and hardware, which enables low deployment costs and easy installation
for immersive event-broadcasting or videoconferencing.
The paper describes the architecture of the system, including acquisition and
encoding of multiview plus depth data in several capture servers and virtual
view synthesis on an edge server. All the blocks of the system have been
designed to overcome the limitations imposed by hardware and network, which
impact directly on the accuracy of depth data and thus on the quality of
virtual view synthesis. The design of FVV Live allows for an arbitrary number
of cameras and capture servers, and the results presented in this paper
correspond to an implementation with nine stereo-based depth cameras.
FVV Live presents low motion-to-photon and end-to-end delays, which enables
seamless free-viewpoint navigation and bilateral immersive communications.
Moreover, the visual quality of FVV Live has been evaluated through subjective
assessment with satisfactory results, and additional comparative tests show
that it is preferred over state-of-the-art DIBR alternatives.
Compression and Subjective Quality Assessment of 3D Video
In recent years, three-dimensional television (3D TV) has been broadly considered the successor to existing two-dimensional television (2D TV) sets. With its capability of offering a dynamic and immersive experience, 3D video (3DV) is expected to expand conventional video in several applications in the near future. However, 3D content requires more than a single view to deliver the depth sensation to viewers, and this inevitably increases the bitrate compared to the corresponding 2D content. This need drives research in the video compression field towards more advanced and more efficient algorithms.
Currently, Advanced Video Coding (H.264/AVC) is the state-of-the-art video coding standard, developed by the Joint Video Team of ISO/IEC MPEG and ITU-T VCEG. This codec has been widely adopted in various applications and products such as TV broadcasting, video conferencing, mobile TV, and Blu-ray Disc. One important extension of H.264/AVC, namely Multiview Video Coding (MVC), was an attempt at multiview compression that takes into consideration the inter-view dependency between different views of the same scene. H.264/AVC with its MVC extension (H.264/MVC) can be used for encoding either conventional stereoscopic video, which includes only two views, or multiview video, which includes more than two views.
In spite of the high performance of H.264/MVC, a typical multiview video sequence requires a huge amount of storage space, which is proportional to the number of offered views. The available views are still limited, and research has been devoted to synthesizing an arbitrary number of views from multiview video and depth maps (MVD). This process is mandatory for auto-stereoscopic displays (ASDs), where many views are required at the viewer side and such a large number of views cannot be transmitted with currently available broadcasting technology. Therefore, to satisfy the growing demand for 3D-related applications, the bitrate must be reduced further by introducing new and more efficient algorithms for compressing multiview video and depth maps.
This thesis tackles 3D content compression targeting different formats, i.e., stereoscopic video and depth-enhanced multiview video. The stereoscopic video compression algorithms introduced in this thesis mostly focus on proposing different types of asymmetry between the left and right views. This means reducing the quality of one view relative to the other, aiming to achieve better subjective quality than the symmetric case (the reference) under the same bitrate constraint. The proposed algorithms for optimizing depth-enhanced multiview video compression include both texture compression schemes and depth map coding tools. Some of the coding schemes proposed for this format include asymmetric quality between the views.
Because objective metrics are not able to accurately estimate the subjective quality of stereoscopic content, subjective quality assessment is suggested for evaluating different codecs. Moreover, when the concept of asymmetry is introduced, the Human Visual System (HVS) performs a fusion process which is not completely understood. Therefore, another important aspect of this thesis is conducting several subjective tests and reporting the subjective ratings to evaluate the perceived quality of the proposed coded content against the references. Statistical analysis is carried out in the thesis to assess the validity of the subjective ratings and determine the best performing test cases.
Stereoscopic high dynamic range imaging
Two modern technologies show promise to dramatically increase immersion in
virtual environments. Stereoscopic imaging captures two images representing
the views of both eyes and allows for better depth perception. High dynamic
range (HDR) imaging accurately represents real world lighting as opposed to
traditional low dynamic range (LDR) imaging. HDR provides a better contrast
and more natural looking scenes. The combination of the two technologies in
order to gain advantages of both has been, until now, mostly unexplored due to
the current limitations in the imaging pipeline. This thesis reviews both fields,
proposes a stereoscopic high dynamic range (SHDR) imaging pipeline, outlines the
challenges that need to be resolved to enable SHDR, and focuses on the capture and
compression aspects of that pipeline.
The problems of capturing SHDR images, which would potentially require two
HDR cameras and introduce ghosting, are mitigated by capturing an HDR and
LDR pair and using it to generate SHDR images. A detailed user study compared
four different methods of generating SHDR images. Results demonstrated that
one of the methods may produce images perceptually indistinguishable from the
ground truth.
Insights obtained while developing static image operators guided the design
of SHDR video techniques. Three methods for generating SHDR video from an
HDR-LDR video pair are proposed and compared to the ground truth SHDR
videos. Results showed little overall error and identified a method with the least
error.
Once captured, SHDR content needs to be efficiently compressed. Five SHDR
compression methods that are backward compatible are presented. The proposed
methods can encode SHDR content at a size little larger than that of a traditional
single LDR image (18% larger for one method), and the backward compatibility
property encourages early adoption of the format.
The work presented in this thesis has introduced and advanced capture and
compression methods for the adoption of SHDR imaging. In general, this research
paves the way for a novel field of SHDR imaging which should lead to improved
and more realistic representation of captured scenes.
Perceptual modelling for 2D and 3D
Deliverable D1.1 of the ANR PERSEE project. This report was produced as part of the ANR PERSEE project (no. ANR-09-BLAN-0170). Specifically, it corresponds to deliverable D1.1 of the project.
Recent Advances in Image Restoration with Applications to Real World Problems
In the past few decades, imaging hardware has improved tremendously in terms of resolution, enabling widespread use of images in many diverse applications on Earth and in planetary missions. However, practical issues associated with image acquisition still affect image quality. Some of these issues, such as blurring, measurement noise, mosaicing artifacts, and low spatial or spectral resolution, can seriously affect the accuracy of the aforementioned applications. This book intends to provide the reader with a glimpse of the latest developments and recent advances in image restoration, which include image super-resolution, image fusion to enhance spatial, spectral, and temporal resolution, and the generation of synthetic images using deep learning techniques. Some practical applications are also included.
In Vivo Vascular Imaging with Photoacoustic Microscopy
Photoacoustic (PA) tomography (PAT) has received extensive attention in the last decade for its capability to provide label-free structural and functional imaging in biological tissue with highly scalable spatial resolution and penetration depth. Compared to modern optical modalities, PAT offers speckle-free images and is more sensitive to optical absorption contrast (with 100% relative sensitivity). By implementing different regimes of optical wavelength, PAT can be used to image diverse light-absorbing biomolecules. For example, hemoglobin is of particular interest in the visible wavelength regime owing to its dominant absorption, and lipids and water are more commonly studied in the near-infrared regime.
In this dissertation, one challenge was to quantitatively investigate red-blood-cell dynamics in nailfold capillaries with single-cell resolution PA microscopy (PAM). We recruited healthy volunteers and measured multiple hemodynamic parameters based on individual red blood cells (RBCs). Statistical analysis revealed the process of oxygen release and changes in flow speed for RBCs in a capillary. For the first time on record, oxygen release from individual RBCs in human capillaries was imaged with nearly real-time speed, and the work paved the way for our following study of a specific blood disorder.
We next conducted a pilot study on sickle cell disease (SCD), measuring and comparing the parameters related to RBC dynamics between healthy subjects and patients with SCD. In the patient group, we found that capillaries tended to be more tortuous and dilated and to have a higher number density. In addition, abnormal RBCs tended to have lower oxygenation at the inlet of a capillary, from where they flowed more slowly and released a larger fraction of oxygen than normal RBCs. As the only imaging modality able to observe the real-time dynamics of the oxygen release of individual RBCs, PAM provides medically valuable information for diagnostic purposes.
As the last focus of this dissertation, we tackled the limited-view problem in PAM by introducing an off-axis illumination technique to complement the original detection view. We demonstrated this technique numerically and then experimentally on phantoms and animals. This simple but very effective method revealed abundant vertical vasculature in a mouse brain that had long been missed by conventional top-illumination PAM. This technique greatly advances future studies on neurovascular responses in mouse brains.
Neural Radiance Fields: Past, Present, and Future
Modeling and interpreting 3D environments and surroundings has long driven
research in 3D Computer Vision, Computer Graphics, and Machine Learning. The
paper by Mildenhall et al. on NeRFs (Neural Radiance Fields) led to a boom in
Computer Graphics, Robotics, and Computer Vision, and the prospect of
high-resolution, low-storage Augmented Reality and Virtual Reality-based 3D
models has gained traction among researchers, with more than 1000 preprints
related to NeRFs published. This paper serves as a bridge for people starting to study
these fields by building on the basics of Mathematics, Geometry, Computer
Vision, and Computer Graphics to the difficulties encountered in Implicit
Representations at the intersection of all these disciplines. This survey
provides the history of rendering, Implicit Learning, and NeRFs, the
progression of research on NeRFs, and the potential applications and
implications of NeRFs in today's world. In doing so, this survey categorizes
all the NeRF-related research in terms of the datasets used, objective
functions, applications solved, and evaluation criteria for these applications.
Comment: 413 pages, 9 figures, 277 citations