
    Advances in Stereo Vision

    Stereopsis is a vision process whose geometrical foundation has been known for a long time, ever since Wheatstone's experiments in the 19th century. Nevertheless, its inner workings in biological organisms, as well as its emulation by computer systems, have proven elusive, and stereo vision remains a very active and challenging area of research. In this volume we present a limited but relevant sample of the work being carried out in stereo vision, covering significant aspects from both the applied and the theoretical standpoints.

    Depth Estimation for 2D-to-3D Image Conversion Using Scene Feature

    In the modern era, 3D-capable hardware has grown in popularity, but the availability of 3D content has not kept pace with demand; the market is still dominated by 2D content, hence the need for 2D-to-3D conversion. In 2D-to-3D image or video conversion, depth estimation is a key and challenging step. Distinct cues can be considered during conversion, such as structure from motion, defocus, and perspective geometry. Many researchers have proposed methods to close this gap by considering one or more of these cues. In this paper, depth estimation using a scene feature is employed, with color chosen as the scene feature. Intensity information is used to estimate the depth image; hence an RGB-to-HSV conversion is performed, in which the Value (V) channel carries the intensity information. The RGB-to-HSV conversion is implemented on an FPGA. The proposed method is prototyped on a Spartan 3E FPGA development board and in MATLAB. DOI: 10.17762/ijritcc2321-8169.15068
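    The intensity cue the abstract describes can be sketched as follows: the HSV Value channel of a pixel is simply the maximum of its RGB components, and the abstract's approach (details assumed here) treats that per-pixel intensity as a proxy depth map.

```python
# Sketch of the depth-from-intensity idea: V = max(R, G, B) per pixel,
# used as a proxy depth value. The brighter-means-nearer mapping is an
# illustrative assumption, not a claim about the paper's exact method.
def rgb_to_hsv_value(r, g, b):
    """Return the HSV Value channel of an 8-bit RGB pixel."""
    return max(r, g, b)

def depth_map(image):
    """Approximate a depth map from per-pixel intensity."""
    return [[rgb_to_hsv_value(r, g, b) for (r, g, b) in row] for row in image]

img = [[(10, 200, 30), (0, 0, 255)],
       [(128, 128, 128), (255, 255, 0)]]
print(depth_map(img))  # → [[200, 255], [128, 255]]
```

    The max operation is one reason this step maps well onto an FPGA: it needs only comparators, no multipliers.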

    Design of a High-Speed Architecture for Stabilization of Video Captured Under Non-Uniform Lighting Conditions

    Video captured under shaky conditions contains unwanted vibrations. A robust algorithm to stabilize the video by compensating for the vibrations arising from the physical setting of the camera is presented in this dissertation. A very high performance hardware architecture on Field Programmable Gate Array (FPGA) technology is also developed for the implementation of the stabilization system. Stabilization of video sequences captured under non-uniform lighting conditions begins with a nonlinear enhancement process. This improves the visibility of a scene captured by physical sensing devices with limited dynamic range, a limitation that causes the saturated region of the image to shadow out the rest of the scene. It is therefore desirable to recover a more uniform scene that eliminates the shadows to a certain extent. Stabilization of video requires the estimation of global motion parameters. By obtaining reliable background motion, the video can be spatially transformed to the reference sequence, thereby eliminating the unintended motion of the camera. A reflectance-illuminance model for video enhancement is used in this research work to improve the visibility and quality of the scene. With fast color space conversion, the computational complexity is reduced to a minimum. The basic video stabilization model is formulated and configured for hardware implementation. Such a model involves evaluation of reliable features for tracking, motion estimation, and affine transformation to map the display coordinates of a stabilized sequence. The multiplications, divisions, and exponentiations are replaced by simple arithmetic and logic operations using improved log-domain computations in the hardware modules. On Xilinx's Virtex II 2V8000-5 FPGA platform, the prototype system consumes 59% of logic slices, 30% of flip-flops, 34% of lookup tables, 35% of embedded RAMs, and two ZBT frame buffers. The system is capable of rendering 180.9 million pixels per second (mpps) and consumes approximately 30.6 watts of power at 1.5 volts. With a 1024×1024 frame, the throughput is equivalent to 172 frames per second (fps). Future work will optimize the performance-resource trade-off to meet the specific needs of applications. It will further extend the model for extraction and tracking of moving objects, as the model inherently encapsulates the attributes of spatial distortion and motion prediction to reduce complexity. With these parameters to narrow down the processing range, it is possible to achieve a minimum of 20 fps on desktop computers with Intel Core 2 Duo or Quad Core CPUs and 2 GB of DDR2 memory, without dedicated hardware.
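    The log-domain trick mentioned above can be illustrated in a few lines: multiplication and division become addition and subtraction of logarithms, and exponentiation becomes a single multiply. This is only a floating-point sketch of the principle; the dissertation's hardware modules would use fixed-point log/antilog approximations, which are not shown here.

```python
import math

# Log-domain arithmetic sketch: costly operations are reduced to
# add/subtract (and one multiply for powers) in the log domain.
def log_mul(a, b):
    """a * b computed as exp(log a + log b); requires a, b > 0."""
    return math.exp(math.log(a) + math.log(b))

def log_div(a, b):
    """a / b computed as exp(log a - log b); requires a, b > 0."""
    return math.exp(math.log(a) - math.log(b))

def log_pow(a, p):
    """a ** p computed as exp(p * log a); requires a > 0."""
    return math.exp(p * math.log(a))

print(round(log_mul(3.0, 4.0), 6))   # ≈ 12.0
print(round(log_pow(2.0, 10), 6))    # ≈ 1024.0
```

    In hardware, the `log`/`exp` steps are typically lookup tables or piecewise-linear approximations, so the whole chain needs only adders and small ROMs.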

    SYSTEM-ON-A-CHIP (SOC)-BASED HARDWARE ACCELERATION FOR HUMAN ACTION RECOGNITION WITH CORE COMPONENTS

    Today, the implementation of machine vision algorithms on embedded platforms or in portable systems is growing rapidly due to the demand for machine vision in daily human life. Among the applications of machine vision, human action and activity recognition has become an active research area, and market demand for integrated smart security systems is growing rapidly. Among the available approaches, embedded vision is in the top tier; however, current embedded platforms may not be able to fully exploit the potential performance of machine vision algorithms, especially in terms of low power consumption. Complex algorithms can impose immense computation and communication demands, especially action recognition algorithms, which require various stages of preprocessing, processing, and machine learning blocks that need to operate concurrently. The market demands embedded platforms that operate with a power consumption of only a few watts. Attempts have been made to improve the performance of traditional embedded approaches by adding more powerful processors; this solution may solve the computation problem but increases power consumption. System-on-a-chip field-programmable gate arrays (SoC-FPGAs) have emerged as a major architectural approach for improving power efficiency while increasing computational performance. In a SoC-FPGA, an embedded processor and an FPGA serving as an accelerator are fabricated on the same die to simultaneously improve power consumption and performance. Still, current SoC-FPGA-based vision implementations either shy away from supporting complex and adaptive vision algorithms or operate at very limited resolutions due to the immense communication and computation demands. The aim of this research is to develop a SoC-based hardware acceleration workflow for the realization of advanced vision algorithms. Hardware acceleration can improve performance for highly complex mathematical calculations or repeated functions. The performance of a SoC system can thus be improved by using hardware acceleration to speed up the element that incurs the highest performance overhead. The outcome of this research could be used for the implementation of various vision algorithms, such as face recognition, object detection, or object tracking, on embedded platforms. The contributions of SoC-based hardware acceleration for hardware-software codesign platforms include the following: (1) development of frameworks for complex human action recognition in both 2D and 3D; (2) realization of a framework with four main implemented IPs, namely, foreground and background subtraction (foreground probability), human detection, 2D/3D point-of-interest detection and feature extraction, and OS-ELM as a machine learning algorithm for action identification; (3) use of an FPGA-based hardware acceleration method to resolve system bottlenecks and improve system performance; and (4) measurement and analysis of system specifications, such as the acceleration factor, power consumption, and resource utilization. Experimental results show that the proposed SoC-based hardware acceleration approach provides better performance in terms of the acceleration factor, resource utilization, and power consumption than all recent works. In addition, a comparison of the accuracy of the framework running on the proposed embedded platform (SoC-FPGA) with that of other PC-based frameworks shows that the proposed approach outperforms most other approaches.

    Deep Sketch-Photo Face Recognition Assisted by Facial Attributes

    In this paper, we present a deep coupled framework to address the problem of matching a sketch image against a gallery of mugshots. Face sketches contain essential information about the spatial topology and geometric details of faces while missing some important facial attributes such as ethnicity, hair, eye, and skin color. We propose a coupled deep neural network architecture that utilizes facial attributes in order to improve sketch-photo recognition performance. The proposed Attribute-Assisted Deep Convolutional Neural Network (AADCNN) method exploits the facial attributes and leverages the loss functions from the facial attribute identification and face verification tasks in order to learn rich discriminative features in a common embedding subspace. The facial attribute identification task increases the inter-personal variations by pushing apart the embedded features extracted from individuals with different facial attributes, while the verification task reduces the intra-personal variations by pulling together all the features that are related to one person. The learned discriminative features generalize well to new identities not seen in the training data. Compared to conventional sketch-photo recognition methods, the proposed architecture is able to make full use of the sketch and complementary facial attribute information to train a deep model. Extensive experiments are performed on composite (E-PRIP) and semi-forensic (IIIT-D semi-forensic) datasets. The results show the superiority of our method compared to state-of-the-art sketch-photo recognition models.
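    The push-apart/pull-together behavior described above is the standard shape of a joint objective: a verification term acts on pairs of embeddings while an identification term is added with a weighting factor. The margin, weight, and function names below are illustrative assumptions, not the paper's actual formulation.

```python
# Hedged sketch of a joint objective: a contrastive verification term on
# embedding distances plus a weighted attribute-identification term.
# margin and lam are illustrative hyperparameters, not values from the paper.
def contrastive_verification_loss(dist, same_identity, margin=1.0):
    """Pull same-identity pairs together; push different pairs beyond a margin."""
    if same_identity:
        return dist ** 2
    return max(0.0, margin - dist) ** 2

def joint_loss(attr_id_loss, verif_loss, lam=0.5):
    """Weighted sum of the attribute-identification and verification losses."""
    return attr_id_loss + lam * verif_loss

# A distant different-identity pair already beyond the margin adds no loss.
print(contrastive_verification_loss(2.0, same_identity=False))  # → 0.0
```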

    REVIEW ON DETECTION OF RICE PLANT LEAVES DISEASES USING DATA AUGMENTATION AND TRANSFER LEARNING TECHNIQUES

    The most important cereal crop in the world is rice (Oryza sativa). Over half of the world's population uses it as a staple food and energy source. Abiotic and biotic factors such as precipitation, soil fertility, temperature, pests, bacteria, and viruses impact the yield and quality of rice grain. Farmers spend a lot of time and money managing diseases, often relying on unreliable naked-eye inspection that leads to unsanitary farming practices. Advances in agricultural technology are greatly conducive to the automatic detection of pathogens in the leaves of rice plants. Several deep learning algorithms are discussed, along with approaches to computer vision problems such as image classification, object segmentation, and image analysis. The paper surveys many methods for detecting, characterizing, and estimating diseases in a range of crops. Two methods for increasing the number of images in a dataset are presented: traditional augmentation techniques and generative adversarial networks. The advantages demonstrated by prior work in deep learning are also reviewed.
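    The "traditional" augmentation techniques the review contrasts with GANs are simple label-preserving image transforms. A minimal sketch on a grid of grayscale pixel values, with all function names being illustrative:

```python
import random

# Classic data augmentation primitives: horizontal flip, 90° rotation,
# and brightness jitter, applied to a 2D list of 0-255 pixel values.
def hflip(img):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in img]

def rot90(img):
    """Rotate the image 90° clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def brightness(img, delta):
    """Shift all pixels by delta, clamped to the 0-255 range."""
    return [[min(255, max(0, p + delta)) for p in row] for row in img]

def augment(img, rng=random.Random(0)):
    """Apply one randomly chosen transform to produce a new training sample."""
    ops = [hflip, rot90, lambda im: brightness(im, rng.randint(-30, 30))]
    return rng.choice(ops)(img)

print(rot90([[1, 2], [3, 4]]))  # → [[3, 1], [4, 2]]
```

    Each transform yields a plausible new leaf image with the same disease label, which is why augmentation multiplies a small labeled dataset at almost no cost.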

    FPGA implementation and performance comparison of a Bayesian face detection system

    Face detection has primarily been a software-based effort. A hardware-based approach can provide significant speed-up over its software counterpart. Advances in transistor technology have made it possible to produce larger and faster FPGAs at more affordable prices. Through VHDL and synthesis tools it is possible to rapidly develop a hardware-based solution to face detection on an FPGA. This work analyzes and compares the performance of a feature-invariant face detection method implemented in software and on an FPGA. The primary components of the face detector were a Bayesian classifier used to segment the image into skin and non-skin pixels, and a direct least-squares elliptical fitting technique to determine whether the skin region's shape has elliptical characteristics similar to a face. The C++ implementation was benchmarked on several high performance workstations, while the VHDL implementation was synthesized for FPGAs from several Xilinx product lines. The face detector used to compare software and hardware performance had a modest correct detection rate of 48.6% and a false alarm rate of 29.7%. The elliptical shape of the region was determined to be an inaccurate criterion for filtering out non-face skin regions. The software-based face detector was capable of detecting faces within images of approximately 378x567 pixels or less at 20 frames per second on Pentium 4 and Pentium D systems. The FPGA-based implementation was capable of faster detection speeds; a speedup of 3.33 was seen on a Spartan 3 and 4.52 on a Virtex 4. The comparison shows that an FPGA-based face detector could provide a significant increase in computational speed.
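    The Bayesian skin/non-skin segmentation step described above reduces to comparing posteriors via Bayes' rule: label a pixel skin when P(skin | rgb) exceeds P(non-skin | rgb). The likelihoods and prior below are toy values for illustration, not the trained statistics from this work.

```python
# Hedged sketch of a Bayesian skin-pixel classifier. In practice the
# class-conditional likelihoods come from color histograms learned on
# labeled skin/non-skin training data; here they are toy functions.
def classify_skin(pixel, p_rgb_given_skin, p_rgb_given_nonskin, prior_skin=0.3):
    """Return True when P(skin|rgb) > P(nonskin|rgb) by Bayes' rule.

    Comparing unnormalized posteriors avoids dividing by P(rgb),
    which is the same for both classes.
    """
    num = p_rgb_given_skin(pixel) * prior_skin
    den = p_rgb_given_nonskin(pixel) * (1.0 - prior_skin)
    return num > den

# Toy likelihoods: treat red-dominant pixels as likely skin.
skin_like = lambda p: 0.9 if p[0] > p[1] and p[0] > p[2] else 0.1
nonskin_like = lambda p: 1.0 - skin_like(p)

print(classify_skin((200, 120, 100), skin_like, nonskin_like))  # → True
print(classify_skin((100, 120, 200), skin_like, nonskin_like))  # → False
```

    Because the decision is a per-pixel table lookup and compare, it parallelizes naturally on an FPGA, which is consistent with the speedups reported above.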