30 research outputs found

    CMOS-3D smart imager architectures for feature detection

    Get PDF
    This paper reports a multi-layered smart image sensor architecture for feature extraction based on detection of interest points. The architecture is conceived for 3-D integrated circuit technologies consisting of two layers (tiers) plus memory. The top tier includes sensing and processing circuitry aimed to perform Gaussian filtering and generate Gaussian pyramids in fully concurrent way. The circuitry in this tier operates in mixed-signal domain. It embeds in-pixel correlated double sampling, a switched-capacitor network for Gaussian pyramid generation, analog memories and a comparator for in-pixel analog-to-digital conversion. This tier can be further split into two for improved resolution; one containing the sensors and another containing a capacitor per sensor plus the mixed-signal processing circuitry. Regarding the bottom tier, it embeds digital circuitry entitled for the calculation of Harris, Hessian, and difference-of-Gaussian detectors. The overall system can hence be configured by the user to detect interest points by using the algorithm out of these three better suited to practical applications. The paper describes the different kind of algorithms featured and the circuitry employed at top and bottom tiers. The Gaussian pyramid is implemented with a switched-capacitor network in less than 50 μs, outperforming more conventional solutions.Xunta de Galicia 10PXIB206037PRMinisterio de Ciencia e Innovación TEC2009-12686, IPT-2011-1625-430000Office of Naval Research N00014111031

    Low Power Architectures for MPEG-4 AVC/H.264 Video Compression

    Get PDF

    Analog parallel processor solutions for video encoding

    Get PDF
    This thesis deals with Cellular Nonlinear Network (CNN) analog parallel processor networks and their implementations in current video coding standards. The target applications are low-power video encoders within 3rd generation mobile terminals. The video codecs of such mobile terminals are defined by either the MPEG-4/H.263 or H.264 video standard. All of these standards are based on the block-based hybrid approach. As block-based motion estimation (ME) is responsible for most of the power consumption of such hybrid video encoders, this thesis deals mostly with low-power ME implementations. Low-power solutions are introduced at both the algorithmic and hardware levels. On the algorithmic level, the introduced implementations are derived from a segmentation algorithm, which has previously been partly realized. The first introduced algorithm reduces the computational complexity of ME within an object-based MPEG-4 encoder. The use of this algorithm enables a 60% drop in the power consumption of Full Search ME. The second algorithm calculates a near-optimal block-size partition for H.264 motion estimation. With this algorithm, the use of computationally complex Lagrange optimization in H.264 ME is not required. The third algorithm reduces the shape bit-rate of an object-based MPEG-4 encoder. On the hardware level a CNN-type ME architecture is introduced. The architecture includes connections and circuitry to fully realize block-based ME. The analog ME implemented with this architecture is capable of lower power than comparable digital realizations. A 9×9 test chip has also been realized. Additionally implemented is a digital predictive ME realization that takes advantage of the introduced partition algorithm. Although the IC layout of the ME algorithm was drawn, the design was verified as an FPGA.reviewe

    Image Feature Extraction Acceleration

    Get PDF
    Image feature extraction is instrumental for most of the best-performing algorithms in computer vision. However, it is also expensive in terms of computational and memory resources for embedded systems due to the need of dealing with individual pixels at the earliest processing levels. In this regard, conventional system architectures do not take advantage of potential exploitation of parallelism and distributed memory from the very beginning of the processing chain. Raw pixel values provided by the front-end image sensor are squeezed into a high-speed interface with the rest of system components. Only then, after deserializing this massive dataflow, parallelism, if any, is exploited. This chapter introduces a rather different approach from an architectural point of view. We present two Application-Specific Integrated Circuits (ASICs) where the 2-D array of photo-sensitive devices featured by regular imagers is combined with distributed memory supporting concurrent processing. Custom circuitry is added per pixel in order to accelerate image feature extraction right at the focal plane. Specifically, the proposed sensing-processing chips aim at the acceleration of two flagships algorithms within the computer vision community: the Viola-Jones face detection algorithm and the Scale Invariant Feature Transform (SIFT). Experimental results prove the feasibility and benefits of this architectural solution.Ministerio de Economía y Competitividad TEC2012-38921-C02, IPT-2011- 1625-430000, IPC-20111009Junta de Andalucía TIC 2338-2013Xunta de Galicia EM2013/038Office of NavalResearch (USA) N00014141035

    Low-Power CMOS Vision Sensor for Gaussian Pyramid Extraction

    Get PDF
    This paper introduces a CMOS vision sensor chip in a standard 0.18 μm CMOS technology for Gaussian pyramid extraction. The Gaussian pyramid provides computer vision algorithms with scale invariance, which permits having the same response regardless of the distance of the scene to the camera. The chip comprises 176×120 photosensors arranged into 88×60 processing elements (PEs). The Gaussian pyramid is generated with a double-Euler switched capacitor (SC) network. Every PE comprises four photodiodes, one 8 b single-slope analog-to-digital converter, one correlated double sampling circuit, and four state capacitors with their corresponding switches to implement the double-Euler SC network. Every PE occupies 44×44 μm2 . Measurements from the chip are presented to assess the accuracy of the generated Gaussian pyramid for visual tracking applications. Error levels are below 2% full-scale output, thus making the chip feasible for these applications. Also, energy cost is 26.5 nJ/px at 2.64 Mpx/s, thus outperforming conventional solutions of imager plus microprocessor unit.Office of Naval Research, USA N00014-14-1-0355Ministerio de Economía y Competitividad TEC2015-66878- C3-1-R, TEC2015-66878-C3-3-RJunta de Andalucía TIC 2338, EM2013/038, EM2014/01

    Dynamically reconfigurable management of energy, performance, and accuracy applied to digital signal, image, and video Processing Applications

    Get PDF
    There is strong interest in the development of dynamically reconfigurable systems that can meet real-time constraints in energy/power-performance-accuracy (EPA/PPA). In this dissertation, I introduce a framework for implementing dynamically reconfigurable digital signal, image, and video processing systems. The basic idea is to first generate a collection of Pareto-optimal realizations in the EPA/PPA space. Dynamic EPA/PPA management is then achieved by selecting the Pareto-optimal implementations that can meet the real-time constraints. The systems are then demonstrated using Dynamic Partial Reconfiguration (DPR) and dynamic frequency control on FPGAs. The framework is demonstrated on: i) a dynamic pixel processor, ii) a dynamically reconfigurable 1-D digital filtering architecture, and iii) a dynamically reconfigurable 2-D separable digital filtering system. Efficient implementations of the pixel processor are based on the use of look-up tables and local-multiplexes to minimize FPGA resources. For the pixel-processor, different realizations are generated based on the number of input bits, the number of cores, the number of output bits, and the frequency of operation. For each parameters combination, there is a different pixel-processor realization. Pareto-optimal realizations are selected based on measurements of energy per frame, PSNR accuracy, and performance in terms of frames per second. Dynamic EPA/PPA management is demonstrated for a sequential list of real-time constraints by selecting optimal realizations and implementing using DPR and dynamic frequency control. Efficient FPGA implementations for the 1-D and 2-D FIR filters are based on the use a distributed arithmetic technique. Different realizations are generated by varying the number of coefficients, coefficient bitwidth, and output bitwidth. Pareto-optimal realizations are selected in the EPA space. Dynamic EPA management is demonstrated on the application of real-time EPA constraints on a digital video. The results suggest that the general framework can be applied to a variety of digital signal, image, and video processing systems. It is based on the use of offline-processing that is used to determine the Pareto-optimal realizations. Real-time constraints are met by selecting Pareto-optimal realizations pre-loaded in memory that are then implemented efficiently using DPR and/or dynamic frequency control

    2D operators on topographic and non-topographic architectures-implementation, efficiency analysis, and architecture selection methodology

    Get PDF
    Topographic and non-topographic image processing architectures and chips, developed within the CNN community recently, are analysed and compared. It is achieved on a way that the 2D operators are collected to classes according to their implementation methods on the different architectures, and the main implementation parameters of the different operator classes are compared. Based on the results, an efficient architecture selection methodology is formalized

    Flexi-WVSNP-DASH: A Wireless Video Sensor Network Platform for the Internet of Things

    Get PDF
    abstract: Video capture, storage, and distribution in wireless video sensor networks (WVSNs) critically depends on the resources of the nodes forming the sensor networks. In the era of big data, Internet of Things (IoT), and distributed demand and solutions, there is a need for multi-dimensional data to be part of the Sensor Network data that is easily accessible and consumable by humanity as well as machinery. Images and video are expected to become as ubiquitous as is the scalar data in traditional sensor networks. The inception of video-streaming over the Internet, heralded a relentless research for effective ways of distributing video in a scalable and cost effective way. There has been novel implementation attempts across several network layers. Due to the inherent complications of backward compatibility and need for standardization across network layers, there has been a refocused attention to address most of the video distribution over the application layer. As a result, a few video streaming solutions over the Hypertext Transfer Protocol (HTTP) have been proposed. Most notable are Apple’s HTTP Live Streaming (HLS) and the Motion Picture Experts Groups Dynamic Adaptive Streaming over HTTP (MPEG-DASH). These frameworks, do not address the typical and future WVSN use cases. A highly flexible Wireless Video Sensor Network Platform and compatible DASH (WVSNP-DASH) are introduced. The platform's goal is to usher video as a data element that can be integrated into traditional and non-Internet networks. A low cost, scalable node is built from the ground up to be fully compatible with the Internet of Things Machine to Machine (M2M) concept, as well as the ability to be easily re-targeted to new applications in a short time. Flexi-WVSNP design includes a multi-radio node, a middle-ware for sensor operation and communication, a cross platform client facing data retriever/player framework, scalable security as well as a cohesive but decoupled hardware and software design.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201

    Etude et implantation d'algorithmes de compression d'images dans un environnement mixte matériel et logiciel

    Get PDF
    Le sujet de cette thèse est la contribution au développement et à la conception d’un système multimédia embarqué en utilisant la méthodologie de conception conjointe logicielle/matérielle (codesign). Il en a découlé la constitution d’une bibliothèque des modules IP (Intellectual Property) pour les applications vidéo. Dans ce contexte, une plateforme matérielle d’acquisition et de restitution vidéo a été réalisée servant de préalable à l’évaluation de la méthodologie de conception en codesign et à toute étude d’algorithme de traitement vidéo. On s’est intéressé en particulier à l’étude et à l’implantation de la norme de compression vidéo H.263 de l’organisme UIT-T. La fréquence de fonctionnement de la plateforme est de 120 MHz. L’ensemble du développement est exécuté par le processeur NIOS II sous le système d’exploitation μClinux. Le codeur H.263 ainsi développé, grâce aux différents accélérateurs matériels pour le SAD, TCD/TCDI et Q/QI permet de coder des séquences vidéo [email protected] main purpose of this thesis was to contribute to the development and to the design of an embedded system for multimedia by using the HW/SW methodology (codesign). A library of flexible IP cores (Intellectual Property) for video applications was created. Within this framework, a hardware platform for video acquisition and video restitution was achieved in order to evaluate the codesign methodology approach and to study the video processing algorithm. We have studied and implemented the H.263 video encoder from the UIT-T organism. The frequency of our platform is about 120 MHz. The whole development was executed under μClinux Operating System and controlled by the NIOS II processor. The H.263 encoder was developed with different hardware accelerators for the SAD, TCD/TCDI and Q/QI operations and permits finally to code video sequences at QCIF@15Hz

    Etude et Implantation d'Algorithmes de Compression d'Images dans un Environnement Mixte Matériel et Logiciel

    Get PDF
    The main purpose of this thesis was to contribute to the development and to the design of anembedded system for multimedia by using the HW/SW methodology (codesign). A library offlexible IP cores (Intellectual Property) for video applications was created. Within thisframework, a hardware platform for video acquisition and video restitution was achieved inorder to evaluate the codesign methodology approach and to study the video processingalgorithm. We have studied and implemented the H.263 video encoder from the UIT-Torganism. The frequency of our platform is about 120 MHz. The whole development wasexecuted under μClinux Operating System and controlled by the NIOS II processor. TheH.263 encoder was developed with different hardware accelerators for the SAD, TCD/TCDIand Q/QI operations and permits finally to code video sequences at [email protected] sujet de cette thèse est la contribution au développement et à la conception d’un systèmemultimédia embarqué en utilisant la méthodologie de conception conjointelogicielle/matérielle (codesign). Il en a découlé la constitution d’une bibliothèque des modulesIP (Intellectual Property) pour les applications vidéo. Dans ce contexte, une plateformematérielle d’acquisition et de restitution vidéo a été réalisée servant de préalable àl’évaluation de la méthodologie de conception en codesign et à toute étude d’algorithme detraitement vidéo. On s’est intéressé en particulier à l’étude et à l’implantation de la norme decompression vidéo H.263 de l’organisme UIT-T. La fréquence de fonctionnement de laplateforme est de 120 MHz. L’ensemble du développement est exécuté par le processeurNIOS II sous le système d’exploitation μClinux. Le codeur H.263 ainsi développé, grâce auxdifférents accélérateurs matériels pour le SAD, TCD/TCDI et Q/QI permet de coder desséquences vidéo QCIF@15Hz
    corecore