
    Wireless Software Synchronization of Multiple Distributed Cameras

    We present a method for precisely time-synchronizing the capture of image sequences from a collection of smartphone cameras connected over WiFi. Our method is entirely software-based, has only modest hardware requirements, and achieves an accuracy of less than 250 microseconds on unmodified commodity hardware. It does not use image content and synchronizes cameras prior to capture. The algorithm operates in two stages. In the first stage, we designate one device as the leader and synchronize each client device's clock to it by estimating network delay. Once clocks are synchronized, the second stage initiates continuous image streaming, estimates the relative phase of image timestamps between each client and the leader, and shifts the streams into alignment. We quantitatively validate our results on a multi-camera rig imaging a high-precision LED array and qualitatively demonstrate significant improvements to multi-view stereo depth estimation and stitching of dynamic scenes. We release 'libsoftwaresync', an open-source Android implementation of our system, to inspire new types of collective capture applications.
    Comment: Main: 9 pages, 10 figures. Supplemental: 3 pages, 5 figures.
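    The first stage is essentially NTP-style offset estimation over the local network: probe the leader repeatedly, keep the lowest-latency round trip, and assume the delay is symmetric. Below is a minimal Python sketch of that idea; the probe format, the leader echoing its clock as text, and filtering by minimum round-trip time are illustrative assumptions, not the authors' implementation.

        import socket
        import time

        def estimate_offset(sock, leader_addr, n_probes=100):
            """NTP-style clock offset estimation against the leader.

            Keeps the probe with the smallest round-trip time (RTT), since it
            suffered the least queuing delay and so bounds the offset tightest.
            Assumes the one-way delay is roughly RTT / 2 (symmetric network).
            """
            best_rtt = float("inf")
            best_offset = 0.0
            for _ in range(n_probes):
                t0 = time.monotonic()                  # client send time
                sock.sendto(b"ping", leader_addr)
                data, _ = sock.recvfrom(64)            # leader echoes its clock
                t1 = time.monotonic()                  # client receive time
                leader_time = float(data.decode())     # assumed wire format
                rtt = t1 - t0
                if rtt < best_rtt:
                    best_rtt = rtt
                    # Midpoint assumption: the leader stamped at (t0 + t1) / 2.
                    best_offset = leader_time - (t0 + t1) / 2.0
            return best_offset, best_rtt

        # Usage sketch (address is hypothetical):
        # sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        # offset, rtt = estimate_offset(sock, ("192.168.1.10", 5555))

    With clocks aligned this way, the second stage only has to measure the constant phase between each client's frame timestamps and the leader's, and delay capture by that amount.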

    Towards High-Frequency Tracking and Fast Edge-Aware Optimization

    This dissertation advances the state of the art in AR/VR tracking systems by increasing the tracking frequency by orders of magnitude, and proposes an efficient algorithm for edge-aware optimization. AR/VR is a natural way of interacting with computers, one in which the physical and digital worlds coexist, and we are on the cusp of a radical change in how humans perform and interact with computing. Because humans are sensitive to even small misalignments between the real and the virtual world, tracking at kilohertz frequencies becomes essential. Current vision-based systems fall short, as their tracking frequency is implicitly limited by the frame rate of the camera. This thesis presents a prototype system that tracks at frequencies orders of magnitude higher than state-of-the-art methods using multiple commodity cameras. The proposed system exploits characteristics of the camera traditionally considered flaws, namely rolling shutter and radial distortion. The experimental evaluation shows the effectiveness of the method for various degrees of motion. Furthermore, edge-aware optimization is an indispensable tool in the computer vision arsenal for accurate filtering of depth data and image-based rendering, which is increasingly used for content creation and geometry processing in AR/VR. As applications demand ever higher resolution and speed, methods must scale accordingly. This dissertation proposes such an edge-aware optimization framework, one that is efficient, accurate, and algorithmically scalable, desirable traits not found jointly in the state of the art. Experiments show the effectiveness of the framework in a multitude of computer vision tasks such as computational photography and stereo.
    Comment: PhD thesis
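    The reason rolling shutter helps rather than hurts here is that each scanline of a rolling-shutter sensor is exposed at a slightly different time, so a single frame carries not one temporal sample but roughly a thousand. A minimal sketch of the per-row timing model follows; the frame timestamp, row count, and readout time are illustrative values, not the thesis' hardware.

        import numpy as np

        def row_timestamps(frame_t0, n_rows, readout_time):
            """Capture time of each scanline in one rolling-shutter frame.

            frame_t0:     timestamp of the first row's exposure (seconds)
            n_rows:       number of scanlines on the sensor
            readout_time: time to read the whole frame top to bottom (seconds)
            """
            return frame_t0 + np.arange(n_rows) * (readout_time / n_rows)

        # Example: 1080 rows read out over 10 ms gives observations spaced
        # ~9.3 microseconds apart, an effective line rate above 100 kHz
        # within the readout window, versus 30 Hz at the frame level.
        ts = row_timestamps(frame_t0=0.0, n_rows=1080, readout_time=0.010)
        print(ts[1] - ts[0])  # inter-row interval, ~9.26e-06 s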

    Neural Global Shutter: Learn to Restore Video from a Rolling Shutter Camera with Global Reset Feature

    Most computer vision systems assume distortion-free images as input. The widely used rolling-shutter (RS) image sensors, however, suffer from geometric distortion when the camera or object moves during capture. Extensive research has been conducted on correcting RS distortions, but most existing work relies heavily on prior assumptions about scenes or motions. Moreover, the motion estimation steps are either oversimplified or computationally inefficient due to heavy flow warping, limiting their applicability. In this paper, we investigate using a rolling shutter with a global reset feature (RSGR) to restore clean global shutter (GS) videos. This feature lets us turn the rectification problem into a deblur-like one, avoiding inaccurate and costly explicit motion estimation. First, we build an optical system that captures paired RSGR/GS videos. Second, we develop a novel algorithm incorporating spatial and temporal designs to correct the spatially varying RSGR distortion. Third, we demonstrate that existing image-to-image translation algorithms can recover clean GS videos from distorted RSGR inputs, yet our algorithm achieves the best performance thanks to its specific designs. Our rendered results are not only visually appealing but also beneficial to downstream tasks. Compared to the state-of-the-art RS solution, our RSGR solution is superior in both effectiveness and efficiency. Since it is easy to realize without changing the hardware, we believe our RSGR solution can potentially replace the RS solution for taking distortion-free videos at low noise and low cost.
    Comment: CVPR2022, https://github.com/lightChaserX/neural-global-shutte
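    The geometry behind the trick: with a conventional rolling shutter, each row both starts and ends exposure with a per-row delay, so motion skews the image; with a global reset, every row starts exposure at the same instant and only the readout end staggers, so motion appears as row-dependent blur instead of displacement, which is why restoration becomes a deblurring problem. A minimal sketch of the two exposure models follows; the parameter names are illustrative, not from the paper.

        def exposure_interval(row, mode, t_exp, t_row):
            """Exposure window (start, end) of a scanline, in seconds.

            mode:  "RS"   - rolling shutter: start and end both shift by
                            row * t_row, so moving content is skewed.
                   "RSGR" - rolling shutter with global reset: all rows start
                            at t = 0 and only the readout end shifts, so lower
                            rows integrate longer and motion shows up as
                            row-varying blur rather than geometric distortion.
            t_exp: base exposure time; t_row: per-row readout delay.
            """
            if mode == "RS":
                return row * t_row, row * t_row + t_exp
            if mode == "RSGR":
                return 0.0, t_exp + row * t_row
            raise ValueError(f"unknown shutter mode: {mode}")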

    Cost-effective solution to synchronised audio-visual data capture using multiple sensors

    Applications such as surveillance and human behaviour analysis require high-bandwidth recording from multiple cameras, as well as from other sensors. In turn, sensor fusion has increased the required accuracy of synchronisation between sensors. Using commercial off-the-shelf components may compromise quality and accuracy due to several challenges: handling the combined data rate from multiple sensors; unknown offset and rate discrepancies between independent hardware clocks; the absence of trigger inputs or outputs in the hardware; and the differing methods for time-stamping the recorded data. To achieve accurate synchronisation, we centralise the synchronisation task by recording all trigger or timestamp signals with a multi-channel audio interface. For sensors that lack an external trigger signal, we let the computer that captures the sensor data periodically generate timestamp signals from its serial port output. These signals can also serve as a common time base to synchronise multiple asynchronous audio interfaces. Furthermore, we show that a consumer PC can currently capture 8-bit video data at 1024 × 1024 spatial resolution and 59.1 Hz temporal resolution from at least 14 cameras, together with 8 channels of 24-bit audio at 96 kHz. We thus improve the quality/cost ratio of multi-sensor data capture systems.
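    The serial-port trick is straightforward to reproduce: toggle a modem-control line at a known period, record that line as one channel of the audio interface, and log the host timestamp of each edge. A minimal pyserial-based sketch follows; the port name, pulse rate, and the choice of the RTS line are assumptions, not the paper's exact setup.

        import time
        import serial  # pyserial

        # Baud rate is irrelevant: we only toggle a modem-control line, whose
        # voltage swing the audio interface records as a pulse train.
        port = serial.Serial("/dev/ttyUSB0")  # port name is an assumption

        PULSE_PERIOD = 1.0    # one pulse per second (illustrative rate)
        PULSE_WIDTH = 0.01    # 10 ms high time

        while True:
            port.rts = True                        # rising edge hits the recorder
            print(f"pulse at {time.time():.6f}")   # log host time of this edge
            time.sleep(PULSE_WIDTH)
            port.rts = False
            time.sleep(PULSE_PERIOD - PULSE_WIDTH)

    Matching the logged host timestamps to the recorded edges then ties every audio sample, and through it every other sensor, to the host clock.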

    Benchmarking of mobile phone cameras

    Peer reviewed
