
H2-Stereo: High-Speed, High-Resolution Stereoscopic Video System

Authors: Asif M. Salman; Cheng Ming; Ma Chao; Ma Zhan; Shen Wang; Sun Jun; Xu Yiling
Publication date: 04/08/2022

High-speed, high-resolution stereoscopic (H2-Stereo) video allows us to perceive dynamic 3D content at fine granularity. The acquisition of H2-Stereo video, however, remains challenging with commodity cameras. Existing spatial super-resolution or temporal frame interpolation methods provide compromised solutions that lack temporal or spatial details, respectively. To alleviate this problem, we propose a dual camera system, in which one camera captures high-spatial-resolution low-frame-rate (HSR-LFR) videos with rich spatial details, and the other captures low-spatial-resolution high-frame-rate (LSR-HFR) videos with smooth temporal details. We then devise a Learned Information Fusion network (LIFnet) that exploits the cross-camera redundancies to enhance both camera views to high spatiotemporal resolution (HSTR) for reconstructing the H2-Stereo video effectively. We utilize a disparity network to transfer spatiotemporal information across views even in large disparity scenes, based on which we propose disparity-guided flow-based warping for the LSR-HFR view and complementary warping for the HSR-LFR view. A multi-scale fusion method in the feature domain is proposed to minimize occlusion-induced warping ghosts and holes in the HSR-LFR view. The LIFnet is trained in an end-to-end manner using our collected high-quality Stereo Video dataset from YouTube. Extensive experiments demonstrate that our model outperforms existing state-of-the-art methods for both views on synthetic data and camera-captured real data with large disparity. Ablation studies explore various aspects of our system, including spatiotemporal resolution, camera baseline, camera desynchronization, long/short exposures and applications, to fully understand its capability for potential applications.
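The cross-view transfer described in this abstract rests on warping one camera's image into the other view using a disparity map. The following is a minimal sketch of plain horizontal disparity warping only, not the LIFnet method: it assumes rectified views, a disparity map that is already given (the paper estimates it with a learned network), and a sign convention in which reference pixel `x` corresponds to source pixel `x - d`. The function name `warp_with_disparity` is hypothetical.

```python
import numpy as np

def warp_with_disparity(src, disparity):
    """Backward-warp a grayscale image `src` (H x W) into a reference view.

    Assumption: the views are rectified, so reference pixel (y, x) maps to
    source position (y, x - disparity[y, x]). Sampling uses linear
    interpolation along the row. Returns the warped image and a boolean mask
    of pixels whose source position fell inside the image.
    """
    h, w = src.shape
    # Source x-coordinate for every reference pixel.
    xs = np.arange(w, dtype=np.float64)[None, :] - disparity
    valid = (xs >= 0) & (xs <= w - 1)
    xs_clipped = np.clip(xs, 0, w - 1)
    x0 = np.floor(xs_clipped).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    frac = xs_clipped - x0
    rows = np.arange(h)[:, None]
    warped = (1.0 - frac) * src[rows, x0] + frac * src[rows, x1]
    # Pixels without a valid source sample are left as holes (zero).
    return np.where(valid, warped, 0.0), valid
```

The returned mask marks the holes that a fusion stage would have to fill; in this sketch they are simply set to zero.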

arXiv.org e-Print Archive

Single-shot compressed ultrafast photography: a review

Authors: Cao Fengyan; Ding Pengpeng; Gao Liang; He Yilin; Jia Tianqing; Liang Jinyang; Qi Dalong; Sun Zhenrong; Wang Lihong V.; Yang Chengshuai; Yao Jiali; Zhang Shian
Publication venue: Society of Photo-Optical Instrumentation Engineers (SPIE)
Publication date: 01/02/2020

Compressed ultrafast photography (CUP) is a burgeoning single-shot computational imaging technique that provides an imaging speed as high as 10 trillion frames per second and a sequence depth of up to a few hundred frames. This technique synergizes compressed sensing and the streak camera technique to capture nonrepeatable ultrafast transient events with a single shot. With recent unprecedented technical developments and extensions of this methodology, it has been widely used in ultrafast optical imaging and metrology, ultrafast electron diffraction and microscopy, and information security protection. We review the basic principles of CUP, its recent advances in data acquisition and image reconstruction, its fusions with other modalities, and its unique applications in multiple research fields.
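The combination of compressed sensing with a streak camera can be illustrated with a toy forward model. The sketch below assumes the commonly described three-step acquisition (spatial encoding by a binary mask, temporal shearing by one pixel row per frame, and integration on the sensor); these specifics are not stated in the abstract above, and the function name `cup_forward` is hypothetical. Reconstruction, which inverts this model, is not shown.

```python
import numpy as np

def cup_forward(scene, mask):
    """Toy single-shot measurement of a dynamic scene.

    scene: array of shape (T, H, W), one frame per time step.
    mask:  array of shape (H, W), a binary encoding pattern (assumed).
    Returns a 2D measurement of shape (H + T - 1, W).
    """
    t, h, w = scene.shape
    measurement = np.zeros((h + t - 1, w))
    for k in range(t):
        # Encode frame k with the mask, shear it down by k rows,
        # and let the sensor integrate it with all other frames.
        measurement[k:k + h, :] += mask * scene[k]
    return measurement
```

Because every frame lands on the same sensor area with a different vertical shift, the measurement mixes time into space, and the mask is what makes the frames separable again during reconstruction.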

Caltech Authors

Colour videos with depth : acquisition, processing and evaluation

Author: Richardt Christian
Publication venue: University of Cambridge
Publication date: 01/01/2012

The human visual system lets us perceive the world around us in three dimensions by integrating evidence from depth cues into a coherent visual model of the world. The equivalent in computer vision and computer graphics are geometric models, which provide a wealth of information about represented objects, such as depth and surface normals. Videos do not contain this information, but only provide per-pixel colour information. In this dissertation, I hence investigate a combination of videos and geometric models: videos with per-pixel depth (also known as RGBZ videos). I consider the full life cycle of these videos: from their acquisition, via filtering and processing, to stereoscopic display. I propose two approaches to capture videos with depth. The first is a spatiotemporal stereo matching approach based on the dual-cross-bilateral grid – a novel real-time technique derived by accelerating a reformulation of an existing stereo matching approach. This is the basis for an extension which incorporates temporal evidence in real time, resulting in increased temporal coherence of disparity maps – particularly in the presence of image noise. The second acquisition approach is a sensor fusion system which combines data from a noisy, low-resolution time-of-flight camera and a high-resolution colour video camera into a coherent, noise-free video with depth. The system consists of a three-step pipeline that aligns the video streams, efficiently removes and fills invalid and noisy geometry, and finally uses a spatiotemporal filter to increase the spatial resolution of the depth data and strongly reduce depth measurement noise. I show that these videos with depth empower a range of video processing effects that are not achievable using colour video alone. These effects critically rely on the geometric information, like a proposed video relighting technique which requires high-quality surface normals to produce plausible results. 
In addition, I demonstrate enhanced non-photorealistic rendering techniques and the ability to synthesise stereoscopic videos, which allows these effects to be applied stereoscopically. These stereoscopic renderings inspired me to study stereoscopic viewing discomfort. The result of this is a surprisingly simple computational model that predicts the visual comfort of stereoscopic images. I validated this model using a perceptual study, which showed that it correlates strongly with human comfort ratings. This makes it ideal for automatic comfort assessment, without the need for costly and lengthy perceptual studies.

CiteSeerX

Apollo (Cambridge)

Cockpit Ocular Recording System (CORS)

Authors: Arnold William; Dick A. O.; Lagrossa Charles; Rothenheber Edward; Stokes James

The overall goal was the development of a Cockpit Ocular Recording System (CORS). Four tasks were undertaken: (1) the development of the system; (2) the experimentation and improvement of the system; (3) demonstrations of the working system; and (4) system documentation. Overall, the prototype represents a workable and flexibly designed CORS system. For the most part, the hardware used for the prototype system is off-the-shelf. All of the following software was developed specifically: (1) setup software with which the user specifies the cockpit configuration and identifies possible areas in which the pilot will look; (2) sensing software which integrates the 60 Hz data from the oculometer and head orientation sensing unit; (3) processing software which applies a spatiotemporal filter to the lookpoint data to determine fixation/dwell positions; (4) data recording output routines; and (5) playback software which allows the user to retrieve and analyze the data. Several experiments were performed to verify the system accuracy and quantify system deficiencies. These tests resulted in recommendations for any future system that might be constructed.
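The abstract does not say which spatiotemporal filter the processing software applies. As an illustration of how fixations can be extracted from a stream of lookpoints, here is a sketch of a generic dispersion-threshold detector, a swapped-in technique and not the CORS filter. The thresholds are assumptions: `min_samples=6` corresponds to 100 ms at the 60 Hz rate mentioned above, and `max_dispersion` is in whatever units the lookpoints use.

```python
def detect_fixations(points, max_dispersion=1.0, min_samples=6):
    """Group consecutive lookpoints into fixations.

    points: list of (x, y) samples taken at a fixed rate.
    A window counts as a fixation if it holds at least `min_samples` points
    and its dispersion (x range plus y range) stays within `max_dispersion`.
    Returns a list of (start, end, center_x, center_y), with `end` exclusive.
    """
    def dispersion(window):
        xs = [p[0] for p in window]
        ys = [p[1] for p in window]
        return (max(xs) - min(xs)) + (max(ys) - min(ys))

    fixations = []
    i, n = 0, len(points)
    while i + min_samples <= n:
        j = i + min_samples
        if dispersion(points[i:j]) > max_dispersion:
            # Too spread out to be a fixation: slide the window by one sample.
            i += 1
            continue
        # Grow the window while the points stay close together.
        while j < n and dispersion(points[i:j + 1]) <= max_dispersion:
            j += 1
        window = points[i:j]
        cx = sum(p[0] for p in window) / len(window)
        cy = sum(p[1] for p in window) / len(window)
        fixations.append((i, j, cx, cy))
        i = j
    return fixations
```

Dwell time for each fixation follows from the sample count, for example `(end - start) / 60.0` seconds at 60 Hz.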

NASA Technical Reports Server

Pix2HDR -- A pixel-wise acquisition and deep learning-based synthesis approach for high-speed HDR videos

Authors: Etienne-Cummings Ralph; Wang Caixin; Wilson Matthew A.; Zhang Jie
Publication date: 24/10/2023

Accurately capturing dynamic scenes with wide-ranging motion and light intensity is crucial for many vision applications. However, acquiring high-speed high dynamic range (HDR) video is challenging because the camera's frame rate restricts its dynamic range. Existing methods sacrifice speed to acquire multi-exposure frames. Yet, misaligned motion in these frames can still pose complications for HDR fusion algorithms, resulting in artifacts. Instead of frame-based exposures, we sample the videos using individual pixels at varying exposures and phase offsets. Implemented on a pixel-wise programmable image sensor, our sampling pattern simultaneously captures fast motion at a high dynamic range. We then transform pixel-wise outputs into an HDR video using end-to-end learned weights from deep neural networks, achieving high spatiotemporal resolution with minimized motion blurring. We demonstrate aliasing-free HDR video acquisition at 1000 FPS, resolving fast motion under low-light conditions and against bright backgrounds - both challenging conditions for conventional cameras. By combining the versatility of pixel-wise sampling patterns with the strength of deep neural networks at decoding complex scenes, our method greatly enhances the vision system's adaptability and performance in dynamic conditions. Comment: 14 pages, 14 figures
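To make the idea of per-pixel exposures and phase offsets concrete, the following toy simulation integrates incoming light separately for each pixel. It is a sketch under stated assumptions, not the sensor's actual readout scheme or the paper's network: exposures and offsets are measured in whole frames, a pixel saturates at `full_well`, and the function name `pixelwise_sample` is hypothetical.

```python
import numpy as np

def pixelwise_sample(radiance, exposure, offset, full_well=1.0):
    """Simulate pixels that each integrate with their own exposure and phase.

    radiance: (T, H, W) light arriving at each pixel per frame.
    exposure: (H, W) integer exposure length in frames (>= 1) per pixel.
    offset:   (H, W) integer frame index at which each pixel starts.
    Returns a (T, H, W) array holding a value wherever a pixel read out
    at that frame, and NaN elsewhere.
    """
    t, h, w = radiance.shape
    out = np.full((t, h, w), np.nan)
    acc = np.zeros((h, w))
    for k in range(t):
        active = k >= offset
        acc += np.where(active, radiance[k], 0.0)
        # A pixel reads out once it has integrated a full exposure period.
        readout = active & ((k - offset + 1) % exposure == 0)
        out[k][readout] = np.minimum(acc, full_well)[readout]
        acc[readout] = 0.0
    return out
```

Short-exposure pixels produce frequent, unsaturated samples of bright regions, while long-exposure pixels collect more light in dark regions at a lower rate; a decoder then has to merge the two into one video.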

arXiv.org e-Print Archive