92 research outputs found

    VLSI Implementation of Barrel Distortion Correction in Endoscopic Images based on Least Squares Estimation

    Get PDF
    An efficient VLSI Implementation of Barrel Distortion Correction (BDC) in Endoscopic Images based on Least Squares Estimation is presented in this paper. Computational complexity is reduced by employing Odd order polynomial, as an approximation to Back-mapping expansion polynomial. This polynomial can be solved in monomial form, by Estrin\u27s algorithm. In Estrin’s algorithm, a high order expression can be factorized in to sub-expression, which can be evaluated in parallel. In our simulation, on comparison with some existing distortion correction techniques, 75% of hardware cost and 70% of memory requirement is reduced by using TSMC 0.18μm technology

    Design of a High-Speed Architecture for Stabilization of Video Captured Under Non-Uniform Lighting Conditions

    Get PDF
    Video captured in shaky conditions may lead to vibrations. A robust algorithm to immobilize the video by compensating for the vibrations from physical settings of the camera is presented in this dissertation. A very high performance hardware architecture on Field Programmable Gate Array (FPGA) technology is also developed for the implementation of the stabilization system. Stabilization of video sequences captured under non-uniform lighting conditions begins with a nonlinear enhancement process. This improves the visibility of the scene captured from physical sensing devices which have limited dynamic range. This physical limitation causes the saturated region of the image to shadow out the rest of the scene. It is therefore desirable to bring back a more uniform scene which eliminates the shadows to a certain extent. Stabilization of video requires the estimation of global motion parameters. By obtaining reliable background motion, the video can be spatially transformed to the reference sequence thereby eliminating the unintended motion of the camera. A reflectance-illuminance model for video enhancement is used in this research work to improve the visibility and quality of the scene. With fast color space conversion, the computational complexity is reduced to a minimum. The basic video stabilization model is formulated and configured for hardware implementation. Such a model involves evaluation of reliable features for tracking, motion estimation, and affine transformation to map the display coordinates of a stabilized sequence. The multiplications, divisions and exponentiations are replaced by simple arithmetic and logic operations using improved log-domain computations in the hardware modules. On Xilinx\u27s Virtex II 2V8000-5 FPGA platform, the prototype system consumes 59% logic slices, 30% flip-flops, 34% lookup tables, 35% embedded RAMs and two ZBT frame buffers. The system is capable of rendering 180.9 million pixels per second (mpps) and consumes approximately 30.6 watts of power at 1.5 volts. With a 1024×1024 frame, the throughput is equivalent to 172 frames per second (fps). Future work will optimize the performance-resource trade-off to meet the specific needs of the applications. It further extends the model for extraction and tracking of moving objects as our model inherently encapsulates the attributes of spatial distortion and motion prediction to reduce complexity. With these parameters to narrow down the processing range, it is possible to achieve a minimum of 20 fps on desktop computers with Intel Core 2 Duo or Quad Core CPUs and 2GB DDR2 memory without a dedicated hardware

    Design of an Automated Book Reader as an Assistive Technology for Blind Persons

    Get PDF
    This dissertation introduces a novel automated book reader as an assistive technology tool for persons with blindness. The literature shows extensive work in the area of optical character recognition, but the current methodologies available for the automated reading of books or bound volumes remain inadequate and are severely constrained during document scanning or image acquisition processes. The goal of the book reader design is to automate and simplify the task of reading a book while providing a user-friendly environment with a realistic but affordable system design. This design responds to the main concerns of (a) providing a method of image acquisition that maintains the integrity of the source (b) overcoming optical character recognition errors created by inherent imaging issues such as curvature effects and barrel distortion, and (c) determining a suitable method for accurate recognition of characters that yields an interface with the ability to read from any open book with a high reading accuracy nearing 98%. This research endeavor focuses in its initial aim on the development of an assistive technology tool to help persons with blindness in the reading of books and other bound volumes. But its secondary and broader aim is to also find in this design the perfect platform for the digitization process of bound documentation in line with the mission of the Open Content Alliance (OCA), a nonprofit Alliance at making reading materials available in digital form. The theoretical perspective of this research relates to the mathematical developments that are made in order to resolve both the inherent distortions due to the properties of the camera lens and the anticipated distortions of the changing page curvature as one leafs through the book. This is evidenced by the significant increase of the recognition rate of characters and a high accuracy read-out through text to speech processing. This reasonably priced interface with its high performance results and its compatibility to any computer or laptop through universal serial bus connectors extends greatly the prospects for universal accessibility to documentation

    Evaluation of a SoC for Real-time 3D SLAM

    Get PDF
    SLAM, or Simultaneous Localization and Mapping, is the combined problem of constructing a map of an agent’s environment while localizing, or tracking that same agent’s pose in tandem. It is among the most challenging and fundamental tasks in computer vision, with applications ranging from augmented reality to robotic navigation. With the increasing capability and ubiquity of mobile computers such as cell phones, portable 3D SLAM systems are becoming feasible for widespread use. The Microsoft Hololens, Google Project Tango, and other 3D aware devices are modern day examples of the potential of SLAM and the challenges it has yet to face. The ICP, or Iterative Closest Point Algorithm, is a popular solution for retrieving the relative transformation between two scans of the same object. It has gained a resurgence in popularity due to the rise of affordable depth sensors such as the Kinect in robotics and augmented reality research. ICP, while providing a high certainty of correctness given similar point clouds, is challenging to implement in real time due to its computational complexity. In this thesis, a basic 3D SLAM algorithm is implemented and evaluated, and two proposed FPGA architectures to accelerate the Nearest Neighbor component of ICP for use in a mobile ARM-based System-on-Chip (SoC) are presented. These architectures are predicted to achieve speedups of up to 7.89x and 17.22x over a naive embedded software implementation

    Co-design hardware/software of real time vision system on FPGA for obstacle detection

    Get PDF
    La détection, localisation d'obstacles et la reconstruction de carte d'occupation 2D sont des fonctions de base pour un robot navigant dans un environnement intérieure lorsque l'intervention avec les objets se fait dans un environnement encombré. Les solutions fondées sur la vision artificielle et couramment utilisées comme SLAM (simultaneous localization and mapping) ou le flux optique ont tendance a être des calculs intensifs. Ces solutions nécessitent des ressources de calcul puissantes pour répondre à faible vitesse en temps réel aux contraintes. Nous présentons une architecture matérielle pour la détection, localisation d'obstacles et la reconstruction de cartes d'occupation 2D en temps réel. Le système proposé est réalisé en utilisant une architecture de vision sur FPGA (field programmable gates array) et des capteurs d'odométrie pour la détection, localisation des obstacles et la cartographie. De la fusion de ces deux sources d'information complémentaires résulte un modèle amelioré de l'environnement autour des robots. L'architecture proposé est un système à faible coût avec un temps de calcul réduit, un débit d'images élevé, et une faible consommation d'énergieObstacle detection, localization and occupancy map reconstruction are essential abilities for a mobile robot to navigate in an environment. Solutions based on passive monocular vision such as simultaneous localization and mapping (SLAM) or optical flow (OF) require intensive computation. Systems based on these methods often rely on over-sized computation resources to meet real-time constraints. Inverse perspective mapping allows for obstacles detection at a low computational cost under the hypothesis of a flat ground observed during motion. It is thus possible to build an occupancy grid map by integrating obstacle detection over the course of the sensor. In this work we propose hardware/software system for obstacle detection, localization and 2D occupancy map reconstruction in real-time. The proposed system uses a FPGA-based design for vision and proprioceptive sensors for localization. Fusing this information allows for the construction of a simple environment model of the sensor surrounding. The resulting architecture is a low-cost, low-latency, high-throughput and low-power system

    Power-Aware Design Methodologies for FPGA-Based Implementation of Video Processing Systems

    Get PDF
    The increasing capacity and capabilities of FPGA devices in recent years provide an attractive option for performance-hungry applications in the image and video processing domain. FPGA devices are often used as implementation platforms for image and video processing algorithms for real-time applications due to their programmable structure that can exploit inherent spatial and temporal parallelism. While performance and area remain as two main design criteria, power consumption has become an important design goal especially for mobile devices. Reduction in power consumption can be achieved by reducing the supply voltage, capacitances, clock frequency and switching activities in a circuit. Switching activities can be reduced by architectural optimization of the processing cores such as adders, multipliers, multiply and accumulators (MACS), etc. This dissertation research focuses on reducing the switching activities in digital circuits by considering data dependencies in bit level, word level and block level neighborhoods in a video frame. The bit level data neighborhood dependency consideration for power reduction is illustrated in the design of pipelined array, Booth and log-based multipliers. For an array multiplier, operands of the multipliers are partitioned into higher and lower parts so that the probability of the higher order parts being zero or one increases. The gating technique for the pipelined approach deactivates part(s) of the multiplier when the above special values are detected. For the Booth multiplier, the partitioning and gating technique is integrated into the Booth recoding scheme. In addition, a delay correction strategy is developed for the Booth multiplier to reduce the switching activities of the sign extension part in the partial products. A novel architecture design for the computation of log and inverse-log functions for the reduction of power consumption in arithmetic circuits is also presented. This also utilizes the proposed partitioning and gating technique for further dynamic power reduction by reducing the switching activities. The word level and block level data dependencies for reducing the dynamic power consumption are illustrated by presenting the design of a 2-D convolution architecture. Here the similarities of the neighboring pixels in window-based operations of image and video processing algorithms are considered for reduced switching activities. A partitioning and detection mechanism is developed to deactivate the parallel architecture for window-based operations if higher order parts of the pixel values are the same. A neighborhood dependent approach (NDA) is incorporated with different window buffering schemes. Consideration of the symmetry property in filter kernels is also applied with the NDA method for further reduction of switching activities. The proposed design methodologies are implemented and evaluated in a FPGA environment. It is observed that the dynamic power consumption in FPGA-based circuit implementations is significantly reduced in bit level, data level and block level architectures when compared to state-of-the-art design techniques. A specific application for the design of a real-time video processing system incorporating the proposed design methodologies for low power consumption is also presented. An image enhancement application is considered and the proposed partitioning and gating, and NDA methods are utilized in the design of the enhancement system. Experimental results show that the proposed multi-level power aware methodology achieves considerable power reduction. Research work is progressing In utilizing the data dependencies in subsequent frames in a video stream for the reduction of circuit switching activities and thereby the dynamic power consumption

    Overview of the Instrumentation for the Dark Energy Spectroscopic Instrument

    Get PDF

    A Full Scale Camera Calibration Technique with Automatic Model Selection – Extension and Validation

    Get PDF
    This thesis presents work on the testing and development of a complete camera calibration approach which can be applied to a wide range of cameras equipped with normal, wide-angle, fish-eye, or telephoto lenses. The full scale calibration approach estimates all of the intrinsic and extrinsic parameters. The calibration procedure is simple and does not require prior knowledge of any parameters. The method uses a simple planar calibration pattern. Closed-form estimates for the intrinsic and extrinsic parameters are computed followed by nonlinear optimization. Polynomial functions are used to describe the lens projection instead of the commonly used radial model. Statistical information criteria are used to automatically determine the complexity of the lens distortion model. In the first stage experiments were performed to verify and compare the performance of the calibration method. Experiments were performed on a wide range of lenses. Synthetic data was used to simulate real data and validate the performance. Synthetic data was also used to validate the performance of the distortion model selection which uses Information Theoretic Criterion (AIC) to automatically select the complexity of the distortion model. In the second stage work was done to develop an improved calibration procedure which addresses shortcomings of previously developed method. Experiments on the previous method revealed that the estimation of the principal point during calibration was erroneous for lenses with a large focal length. To address this issue the calibration method was modified to include additional methods to accurately estimate the principal point in the initial stages of the calibration procedure. The modified procedure can now be used to calibrate a wide spectrum of imaging systems including telephoto and verifocal lenses. Survey of current work revealed a vast amount of research concentrating on calibrating only the distortion of the camera. In these methods researchers propose methods to calibrate only the distortion parameters and suggest using other popular methods to find the remaining camera parameters. Using this proposed methodology we apply distortion calibration to our methods to separate the estimation of distortion parameters. We show and compare the results with the original method on a wide range of imaging systems

    Real-time 3-D Scene Reconstruction

    Get PDF
    This dissertation describes a complete system that captures image data from multiple stereoscopic camera pairs and reconstructs a 3-D model of the imaged scene in real-time. To achieve real-time rates, the system is organized in a distributed hierarchical fashion to maximize parallelism and uses algorithms that, in many instances, are suitable for direct implementation in digital hardware rather than software on a general purpose computer. At the lowest level of the hierarchy, image data is acquired from a single camera and processed to compensate for lens distortion and to apply rectification in preparation for stereo image processing. At the next level, data from pairs of cameras is matched to compute a dense stereoscopic disparity map from which 3-D surfaces are inferred and a mesh model is constructed. Finally, at the top level all of the individual 3-D mesh models are merged into a single 3-D model. If desired, the camera image data can be applied to the resultant 3-D model as a texture and the model re-rendered from a virtual camera viewpoint. Previous 3-D research focuses on individual steps in this process (lens distortion correction, image rectification, stereoscopic disparity computation, and model building). This dissertation considers them instead in the context of a complete end-to-end system. Traditional approaches to model building begin with an unstructured "point cloud" that is neutral with respect to how the data was acquired; this allows model building to be studied independent of data acquisition but may miss some opportunities available in a more tightly coupled interface. By taking a broader view of the problems faced by the entire system, a novel algorithm for 3-D model building has been developed that takes advantage of the organization in the dense stereoscopic disparity map to efficiently build its model. The core of this novel algorithm is a method of evaluating linear regression error to fit a series of line segments to data points in a way that can be efficiently implemented directly in hardware
    corecore