278 research outputs found

    Design of a High-Speed Architecture for Stabilization of Video Captured Under Non-Uniform Lighting Conditions

    Get PDF
    Video captured in shaky conditions may lead to vibrations. A robust algorithm to immobilize the video by compensating for the vibrations from physical settings of the camera is presented in this dissertation. A very high performance hardware architecture on Field Programmable Gate Array (FPGA) technology is also developed for the implementation of the stabilization system. Stabilization of video sequences captured under non-uniform lighting conditions begins with a nonlinear enhancement process. This improves the visibility of the scene captured from physical sensing devices which have limited dynamic range. This physical limitation causes the saturated region of the image to shadow out the rest of the scene. It is therefore desirable to bring back a more uniform scene which eliminates the shadows to a certain extent. Stabilization of video requires the estimation of global motion parameters. By obtaining reliable background motion, the video can be spatially transformed to the reference sequence thereby eliminating the unintended motion of the camera. A reflectance-illuminance model for video enhancement is used in this research work to improve the visibility and quality of the scene. With fast color space conversion, the computational complexity is reduced to a minimum. The basic video stabilization model is formulated and configured for hardware implementation. Such a model involves evaluation of reliable features for tracking, motion estimation, and affine transformation to map the display coordinates of a stabilized sequence. The multiplications, divisions and exponentiations are replaced by simple arithmetic and logic operations using improved log-domain computations in the hardware modules. On Xilinx\u27s Virtex II 2V8000-5 FPGA platform, the prototype system consumes 59% logic slices, 30% flip-flops, 34% lookup tables, 35% embedded RAMs and two ZBT frame buffers. The system is capable of rendering 180.9 million pixels per second (mpps) and consumes approximately 30.6 watts of power at 1.5 volts. With a 1024×1024 frame, the throughput is equivalent to 172 frames per second (fps). Future work will optimize the performance-resource trade-off to meet the specific needs of the applications. It further extends the model for extraction and tracking of moving objects as our model inherently encapsulates the attributes of spatial distortion and motion prediction to reduce complexity. With these parameters to narrow down the processing range, it is possible to achieve a minimum of 20 fps on desktop computers with Intel Core 2 Duo or Quad Core CPUs and 2GB DDR2 memory without a dedicated hardware

    Structure and motion estimation from rolling shutter video

    Full text link
    The majority of consumer quality cameras sold today have CMOS sensors with rolling shutters. In a rolling shutter camera, images are read out row by row, and thus each row is exposed during a different time interval. A rolling-shutter exposure causes geometric image distortions when either the camera or the scene is moving, and this causes state-of-the-art structure and motion algorithms to fail. We demonstrate a novel method for solving the structure and motion problem for rolling-shutter video. The method relies on exploiting the continuity of the camera motion, both between frames, and across a frame. We demonstrate the effectiveness of our method by controlled experiments on real video sequences. We show, both visually and quantitatively, that our method outperforms standard structure and motion, and is more accurate and efficient than a two-step approach, doing image rectification and structure and motion

    Affine multi-view modelling for close range object measurement

    Get PDF
    In photogrammetry, sensor modelling with 3D point estimation is a fundamental topic of research. Perspective frame cameras offer the mathematical basis for close range modelling approaches. The norm is to employ robust bundle adjustments for simultaneous parameter estimation and 3D object measurement. In 2D to 3D modelling strategies image resolution, scale, sampling and geometric distortion are prior factors. Non-conventional image geometries that implement uncalibrated cameras are established in computer vision approaches; these aim for fast solutions at the expense of precision. The projective camera is defined in homogeneous terms and linear algorithms are employed. An attractive sensor model disembodied from projective distortions is the affine. Affine modelling has been studied in the contexts of geometry recovery, feature detection and texturing in vision, however multi-view approaches for precise object measurement are not yet widely available. This project investigates affine multi-view modelling from a photogrammetric standpoint. A new affine bundle adjustment system has been developed for point-based data observed in close range image networks. The system allows calibration, orientation and 3D point estimation. It is processed as a least squares solution with high redundancy providing statistical analysis. Starting values are recovered from a combination of implicit perspective and explicit affine approaches. System development focuses on retrieval of orientation parameters, 3D point coordinates and internal calibration with definition of system datum, sensor scale and radial lens distortion. Algorithm development is supported with method description by simulation. Initialization and implementation are evaluated with the statistical indicators, algorithm convergence and correlation of parameters. Object space is assessed with evaluation of the 3D point correlation coefficients and error ellipsoids. Sensor scale is checked with comparison of camera systems utilizing quality and accuracy metrics. For independent method evaluation, testing is implemented over a perspective bundle adjustment tool with similar indicators. Test datasets are initialized from precise reference image networks. Real affine image networks are acquired with an optical system (~1M pixel CCD cameras with 0.16x telecentric lens). Analysis of tests ascertains that the affine method results in an RMS image misclosure at a sub-pixel level and precisions of a few tenths of microns in object space

    Advanced 3D photogrammetric surface reconstruction of extensive objects by UAV camera image acquisition

    Get PDF
    This paper proposes a replicable methodology to enhance the accuracy of the photogrammetric reconstruction of large-scale objects based on the optimization of the procedures for Unmanned Aerial Vehicle (UAV) camera image acquisition. The relationships between the acquisition grid shapes, the acquisition grid geometric parameters (pitches, image rates, camera framing, flight heights), and the 3D photogrammetric surface reconstruction accuracy were studied. Ground Sampling Distance (GSD), the necessary number of photos to assure the desired overlapping, and the surface reconstruction accuracy were related to grid shapes, image rate, and camera framing at different flight heights. The established relationships allow to choose the best combination of grid shapes and acquisition grid geometric parameters to obtain the desired accuracy for the required GSD. This outcome was assessed by means of a case study related to the ancient arched brick Bridge of the Saracens in Adrano (Sicily, Italy). The reconstruction of the three-dimensional surfaces of this structure, obtained by the efficient Structure-From-Motion (SfM) algorithms of the commercial software Pix4Mapper, supported the study by validating it with experimental data. A comparison between the surface reconstruction with different acquisition grids at different flight heights and the measurements obtained with a 3D terrestrial laser and total station-theodolites allowed to evaluate the accuracy in terms of Euclidean distances

    高速ビジョンを用いたリアルタイムビデオモザイキングと安定化に関する研究

    Get PDF
    広島大学(Hiroshima University)博士(工学)Doctor of Engineeringdoctora
    corecore