13 research outputs found

    Low-Error and High-Throughput Discrete Cosine Transform (DCT) Design

    Get PDF
    In this paper, by operating the shifting and addition in parallel, an error-compensated adder-tree (ECAT) is proposed to deal with the truncation errors and to achieve low-error and high-throughput discrete cosine transform (DCT) design.  Many DCT architectures were proposed on systolic design to reduce the number of multipliers in the systolic design as multipliers consumes high power and occupy less area . Instead of the 12 bits used in previous works, 9-bit distributed arithmetic-precision is chosen for this work so as to meet peak-signal-to-noise-ratio (PSNR) requirements. He proposed 2-D DCT core synthesized by usingXilinx ISE 9.1, and the Xilinx XC2VP30 FPGA can achieve 792 megapixels per second (M-pels/sec) throughput rate

    Investigation of a Novel Common Subexpression Elimination Method for Low Power and Area Efficient DCT Architecture

    Get PDF
    A wide interest has been observed to find a low power and area efficient hardware design of discrete cosine transform (DCT) algorithm. This research work proposed a novel Common Subexpression Elimination (CSE) based pipelined architecture for DCT, aimed at reproducing the cost metrics of power and area while maintaining high speed and accuracy in DCT applications. The proposed design combines the techniques of Canonical Signed Digit (CSD) representation and CSE to implement the multiplier-less method for fixed constant multiplication of DCT coefficients. Furthermore, symmetry in the DCT coefficient matrix is used with CSE to further decrease the number of arithmetic operations. This architecture needs a single-port memory to feed the inputs instead of multiport memory, which leads to reduction of the hardware cost and area. From the analysis of experimental results and performance comparisons, it is observed that the proposed scheme uses minimum logic utilizing mere 340 slices and 22 adders. Moreover, this design meets the real time constraints of different video/image coders and peak-signal-to-noise-ratio (PSNR) requirements. Furthermore, the proposed technique has significant advantages over recent well-known methods along with accuracy in terms of power reduction, silicon area usage, and maximum operating frequency by 41%, 15%, and 15%, respectively

    Multi-core architectures with coarse-grained dynamically reconfigurable processors for broadband wireless access technologies

    Get PDF
    Broadband Wireless Access technologies have significant market potential, especially the WiMAX protocol which can deliver data rates of tens of Mbps. Strong demand for high performance WiMAX solutions is forcing designers to seek help from multi-core processors that offer competitive advantages in terms of all performance metrics, such as speed, power and area. Through the provision of a degree of flexibility similar to that of a DSP and performance and power consumption advantages approaching that of an ASIC, coarse-grained dynamically reconfigurable processors are proving to be strong candidates for processing cores used in future high performance multi-core processor systems. This thesis investigates multi-core architectures with a newly emerging dynamically reconfigurable processor – RICA, targeting WiMAX physical layer applications. A novel master-slave multi-core architecture is proposed, using RICA processing cores. A SystemC based simulator, called MRPSIM, is devised to model this multi-core architecture. This simulator provides fast simulation speed and timing accuracy, offers flexible architectural options to configure the multi-core architecture, and enables the analysis and investigation of multi-core architectures. Meanwhile a profiling-driven mapping methodology is developed to partition the WiMAX application into multiple tasks as well as schedule and map these tasks onto the multi-core architecture, aiming to reduce the overall system execution time. Both the MRPSIM simulator and the mapping methodology are seamlessly integrated with the existing RICA tool flow. Based on the proposed master-slave multi-core architecture, a series of diverse homogeneous and heterogeneous multi-core solutions are designed for different fixed WiMAX physical layer profiles. Implemented in ANSI C and executed on the MRPSIM simulator, these multi-core solutions contain different numbers of cores, combine various memory architectures and task partitioning schemes, and deliver high throughputs at relatively low area costs. Meanwhile a design space exploration methodology is developed to search the design space for multi-core systems to find suitable solutions under certain system constraints. Finally, laying a foundation for future multithreading exploration on the proposed multi-core architecture, this thesis investigates the porting of a real-time operating system – Micro C/OS-II to a single RICA processor. A multitasking version of WiMAX is implemented on a single RICA processor with the operating system support

    Real -time Retinex image enhancement: Algorithm and architecture optimizations

    Get PDF
    The field of digital image processing encompasses the study of algorithms applied to two-dimensional digital images, such as photographs, or three-dimensional signals, such as digital video. Digital image processing algorithms are generally divided into several distinct branches including image analysis, synthesis, segmentation, compression, restoration, and enhancement. One particular image enhancement algorithm that is rapidly gaining widespread acceptance as a near optimal solution for providing good visual representations of scenes is the Retinex.;The Retinex algorithm performs a non-linear transform that improves the brightness, contrast and sharpness of an image. It simultaneously provides dynamic range compression, color constancy, and color rendition. It has been successfully applied to still imagery---captured from a wide variety of sources including medical radiometry, forensic investigations, and consumer photography. Many potential users require a real-time implementation of the algorithm. However, prior to this research effort, no real-time version of the algorithm had ever been achieved.;In this dissertation, we research and provide solutions to the issues associated with performing real-time Retinex image enhancement. We design, develop, test, and evaluate the algorithm and architecture optimizations that we developed to enable the implementation of the real-time Retinex specifically targeting specialized, embedded digital signal processors (DSPs). This includes optimization and mapping of the algorithm to different DSPs, and configuration of these architectures to support real-time processing.;First, we developed and implemented the single-scale monochrome Retinex on a Texas Instruments TMS320C6711 floating-point DSP and attained 21 frames per second (fps) performance. This design was then transferred to the faster TMS320C6713 floating-point DSP and ran at 28 fps. Then we modified our design for the fixed-point TMS320DM642 DSP and achieved an execution rate of 70 fps. Finally, we migrated this design to the fixed-point TMS320C6416 DSP. After making several additional optimizations and exploiting the enhanced architecture of the TMS320C6416, we achieved 108 fps and 20 fps performance for the single-scale, monochrome Retinex and three-scale, color Retinex, respectively. We also applied a version of our real-time Retinex in an Enhanced Vision System. This provides a general basis for using the algorithm in other applications

    Survey of FPGA applications in the period 2000 – 2015 (Technical Report)

    Get PDF
    Romoth J, Porrmann M, Rückert U. Survey of FPGA applications in the period 2000 – 2015 (Technical Report).; 2017.Since their introduction, FPGAs can be seen in more and more different fields of applications. The key advantage is the combination of software-like flexibility with the performance otherwise common to hardware. Nevertheless, every application field introduces special requirements to the used computational architecture. This paper provides an overview of the different topics FPGAs have been used for in the last 15 years of research and why they have been chosen over other processing units like e.g. CPUs

    Hardware acceleration of the trace transform for vision applications

    Get PDF
    Computer Vision is a rapidly developing field in which machines process visual data to extract meaningful information. Digitised images in their pixels and bits serve no purpose of their own. It is only by interpreting the data, and extracting higher level information that a scene can be understood. The algorithms that enable this process are often complex, and data-intensive, limiting the processing rate when implemented in software. Hardware-accelerated implementations provide a significant performance boost that can enable real- time processing. The Trace Transform is a newly proposed algorithm that has been proven effective in image categorisation and recognition tasks. It is flexibly defined allowing the mathematical details to be tailored to the target application. However, it is highly computationally intensive, which limits its applications. Modern heterogeneous FPGAs provide an ideal platform for accelerating the Trace transform for real-time performance, while also allowing an element of flexibility, which highly suits the generality of the Trace transform. This thesis details the implementation of an extensible Trace transform architecture for vision applications, before extending this architecture to a full flexible platform suited to the exploration of Trace transform applications. As part of the work presented, a general set of architectures for large-windowed median and weighted median filters are presented as required for a number of Trace transform implementations. Finally an acceleration of Pseudo 2-Dimensional Hidden Markov Model decoding, usable in a person detection system, is presented. Such a system can be used to extract frames of interest from a video sequence, to be subsequently processed by the Trace transform. All these architectures emphasise the need for considered, platform-driven design in achieving maximum performance through hardware acceleration

    Intelligent Sensor Networks

    Get PDF
    In the last decade, wireless or wired sensor networks have attracted much attention. However, most designs target general sensor network issues including protocol stack (routing, MAC, etc.) and security issues. This book focuses on the close integration of sensing, networking, and smart signal processing via machine learning. Based on their world-class research, the authors present the fundamentals of intelligent sensor networks. They cover sensing and sampling, distributed signal processing, and intelligent signal learning. In addition, they present cutting-edge research results from leading experts

    Intelligent Circuits and Systems

    Get PDF
    ICICS-2020 is the third conference initiated by the School of Electronics and Electrical Engineering at Lovely Professional University that explored recent innovations of researchers working for the development of smart and green technologies in the fields of Energy, Electronics, Communications, Computers, and Control. ICICS provides innovators to identify new opportunities for the social and economic benefits of society.  This conference bridges the gap between academics and R&D institutions, social visionaries, and experts from all strata of society to present their ongoing research activities and foster research relations between them. It provides opportunities for the exchange of new ideas, applications, and experiences in the field of smart technologies and finding global partners for future collaboration. The ICICS-2020 was conducted in two broad categories, Intelligent Circuits & Intelligent Systems and Emerging Technologies in Electrical Engineering

    Advances in Robotics, Automation and Control

    Get PDF
    The book presents an excellent overview of the recent developments in the different areas of Robotics, Automation and Control. Through its 24 chapters, this book presents topics related to control and robot design; it also introduces new mathematical tools and techniques devoted to improve the system modeling and control. An important point is the use of rational agents and heuristic techniques to cope with the computational complexity required for controlling complex systems. Through this book, we also find navigation and vision algorithms, automatic handwritten comprehension and speech recognition systems that will be included in the next generation of productive systems developed by man
    corecore