
    An adaptive true motion estimation algorithm for frame rate up-conversion and its hardware design

    With the advancement of video and display technologies, flat panel High Definition Television (HDTV) displays with 100 Hz, 120 Hz, and most recently 240 Hz picture rates have been introduced. However, video material is captured and broadcast at temporal resolutions ranging from 24 Hz to 60 Hz. To display these video formats correctly on high picture rate displays, new frames must be generated and inserted into the original video sequence to increase its frame rate. Therefore, Frame Rate Up-Conversion (FRUC) has become a necessity. Motion compensated FRUC algorithms provide better quality results than non-motion compensated FRUC algorithms. Motion Estimation (ME) is the process of finding motion vectors that describe the motion of objects between adjacent frames, and it is the most computationally intensive part of motion compensated FRUC algorithms. For FRUC applications, it is important to find motion vectors that represent the real motion of objects, which is called true ME. In this thesis, an Adaptive True Motion Estimation (ATME) algorithm is proposed. Compared to the 3-D Recursive Search true ME algorithm, ATME produces similar quality results with fewer calculations, or better quality results with a similar number of calculations, by adaptively using optimized sets of candidate search locations and several redundancy removal techniques. In addition, three hardware architectures of different complexity are proposed for ATME. The proposed hardware uses efficient data re-use schemes for the non-regular data flow of the ATME algorithm. Two of these architectures are implemented on a Xilinx Virtex-4 FPGA and are capable of processing ~158 and ~168 720p HD frames per second, respectively.
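    To illustrate the block-matching motion estimation that such FRUC algorithms build on, the sketch below selects the best motion vector for one block from a caller-supplied candidate set using a sum-of-absolute-differences criterion. It is a minimal Python sketch of candidate-based ME in the spirit of 3-D Recursive Search, not the ATME algorithm or hardware described in the thesis; the function names and the exhaustive candidate set in the usage example are illustrative assumptions.

```python
import numpy as np

def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return np.abs(block_a.astype(np.int32) - block_b.astype(np.int32)).sum()

def candidate_motion_vector(prev, curr, bx, by, block=16, candidates=None):
    """Pick the best motion vector for the block at (bx, by) in `curr`
    from a small candidate set evaluated against `prev` (SAD criterion).

    `candidates` plays the role of the spatial/temporal predictors used by
    candidate-based true ME schemes; here it is simply a list of (dy, dx)
    offsets supplied by the caller.
    """
    h, w = curr.shape
    cur_block = curr[by:by + block, bx:bx + block]
    best_mv, best_cost = (0, 0), None
    for dy, dx in candidates or [(0, 0)]:
        y, x = by + dy, bx + dx
        if y < 0 or x < 0 or y + block > h or x + block > w:
            continue  # candidate points outside the previous frame
        cost = sad(cur_block, prev[y:y + block, x:x + block])
        if best_cost is None or cost < best_cost:
            best_mv, best_cost = (dy, dx), cost
    return best_mv

# Toy usage: a bright square shifted by (2, 3) pixels between frames.
prev = np.zeros((64, 64), dtype=np.uint8)
curr = np.zeros((64, 64), dtype=np.uint8)
prev[10:26, 10:26] = 255
curr[12:28, 13:29] = 255
cands = [(dy, dx) for dy in range(-4, 5) for dx in range(-4, 5)]
print(candidate_motion_vector(prev, curr, 13, 12, block=16, candidates=cands))
# -> (-2, -3): the block content came from 2 rows up and 3 columns left.
```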

    Cascaded Scene Flow Prediction using Semantic Segmentation

    Given two consecutive frames from a pair of stereo cameras, 3D scene flow methods simultaneously estimate the 3D geometry and motion of the observed scene. Many existing approaches use superpixels for regularization, but may predict inconsistent shapes and motions inside rigidly moving objects. We instead assume that scenes consist of foreground objects rigidly moving in front of a static background, and use semantic cues to produce pixel-accurate scene flow estimates. Our cascaded classification framework accurately models 3D scenes by iteratively refining semantic segmentation masks, stereo correspondences, 3D rigid motion estimates, and optical flow fields. We evaluate our method on the challenging KITTI autonomous driving benchmark, and show that accounting for the motion of segmented vehicles leads to state-of-the-art performance. Comment: International Conference on 3D Vision (3DV), 2017 (oral presentation).
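    One ingredient mentioned above is estimating the 3D rigid motion of a segmented object. The sketch below is the standard SVD-based (Kabsch/Procrustes) least-squares fit of a rotation and translation to matched 3D points; it is a generic illustration that assumes point correspondences are already known, not the paper's cascaded framework, and all names in it are illustrative.

```python
import numpy as np

def fit_rigid_motion(points_t0, points_t1):
    """Least-squares rigid transform (R, t) with points_t1 ≈ R @ p + t.

    Standard SVD-based (Kabsch/Procrustes) solution; each input is an
    (N, 3) array of corresponding 3D points on one rigidly moving object.
    """
    c0 = points_t0.mean(axis=0)
    c1 = points_t1.mean(axis=0)
    H = (points_t0 - c0).T @ (points_t1 - c1)   # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c1 - R @ c0
    return R, t

# Toy usage: rotate a random point cloud 10 degrees about Z and translate it.
rng = np.random.default_rng(0)
P = rng.normal(size=(100, 3))
ang = np.deg2rad(10.0)
R_true = np.array([[np.cos(ang), -np.sin(ang), 0.0],
                   [np.sin(ang),  np.cos(ang), 0.0],
                   [0.0,          0.0,         1.0]])
t_true = np.array([0.5, -0.2, 1.0])
Q = P @ R_true.T + t_true
R_est, t_est = fit_rigid_motion(P, Q)
print(np.allclose(R_est, R_true), np.allclose(t_est, t_true))  # True True
```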

    Event-based Vision: A Survey

    Event cameras are bio-inspired sensors that differ from conventional frame cameras: instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes and output a stream of events that encode the time, location, and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz), resulting in reduced motion blur. Hence, event cameras have large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as low latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available, and the tasks they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
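    To make the working principle concrete, the sketch below simulates the idealized event-generation model described above: a pixel emits an event, carrying a timestamp, location, and polarity, each time its log brightness changes by a contrast threshold. This is a simplified Python simulation under a linear-in-time brightness assumption, not the behavior of any particular sensor or method from the survey.

```python
import numpy as np

def events_from_frames(frame_t0, frame_t1, t0, t1, contrast_threshold=0.2):
    """Generate idealized events between two intensity frames.

    A pixel fires one event per crossing of the contrast threshold C in
    log brightness; polarity is the sign of the change. Timestamps are
    linearly interpolated between t0 and t1.
    """
    eps = 1e-3
    log0 = np.log(frame_t0.astype(np.float64) + eps)
    log1 = np.log(frame_t1.astype(np.float64) + eps)
    dlog = log1 - log0
    n_events = np.floor(np.abs(dlog) / contrast_threshold).astype(int)
    events = []  # (t, x, y, polarity)
    for y, x in zip(*np.nonzero(n_events)):
        pol = 1 if dlog[y, x] > 0 else -1
        for k in range(1, n_events[y, x] + 1):
            # time of the k-th threshold crossing under the linear model
            frac = k * contrast_threshold / abs(dlog[y, x])
            events.append((t0 + frac * (t1 - t0), x, y, pol))
    events.sort()
    return events

# Toy usage: one pixel brightens (ON events), one darkens (OFF events).
f0 = np.full((4, 4), 50, dtype=np.uint8)
f1 = f0.copy()
f1[1, 2] = 200
f1[3, 0] = 10
for ev in events_from_frames(f0, f1, t0=0.0, t1=0.01):
    print(ev)
```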