Search CORE

47 research outputs found

A Visual Environment for Real-Time Image Processing in Hardware (VERTIPH)

Author: C. T. Johnston
D. G. Bailey
P. Lyons
Publication venue: Springer
Publication date: 01/01/2006
Field of study

Real-time video processing is an image-processing application that is ideally suited to implementation on FPGAs. We discuss the strengths and weaknesses of a number of existing languages and hardware compilers that have been developed for specifying image processing algorithms on FPGAs. We propose VERTIPH, a new multiple-view visual language that avoids the weaknesses we identify. A VERTIPH design incorporates three different views, each tailored to a different aspect of the image processing system under development; an overall architectural view, a computational view, and a resource and scheduling view

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

Hardware implementation of intelligent systems

Author: Ranga Vemuri
Sriram Govindarajan
Iyad Ouaiss
Meenakshi Kaul
Vinoo Srinivasan
Shankar Radhakrishnan
Sujatha Sundaraman
Satish Ganesan
Awartika Pandey
Preetham Lakshmikanthan
Publication venue: Springer Link
Publication date: 01/01/2001
Field of study

interior, interior vie

Crossref

Lebanese American University Repository

MIT Libraries Dome

Designing BEE: A Hardware Emulation Engine for Signal Processing in Low-Power Wireless Applications

Author
Publication venue: Springer
Publication date
Field of study

Springer - Publisher Connector

Design synthesis for dynamically reconfigurable logic systems

Author: Vasilko Milan
Publication venue
Publication date
Field of study

Dynamic reconfiguration of logic circuits has been a research problem for over four decades. While applications using logic reconfiguration in practical scenarios have been demonstrated, the design of these systems has proved to be a difficult process demanding the skills of an experienced reconfigurable logic design expert. This thesis proposes an automatic synthesis method which relieves designers of some of the difficulties associated with designing partially dynamically reconfigurable systems. A new design abstraction model for reconfigurable systems is proposed in order to support design exploration using the presented method. Given an input behavioural model, a technology server and a set of design constraints, the method will generate a reconfigurable design solution in the form of a 3D floorplan and a configuration schedule. The approach makes use of genetic algorithms. It facilitates global optimisation to accommodate multiple design objectives common in reconfigurable system design, while making realistic estimates of configuration overheads and of the potential for resource sharing between configurations. A set of custom evolutionary operators has been developed to cope with a multiple-objective search space. Furthermore, the application of a simulation technique verifying the lll results of such an automatic exploration is outlined in the thesis. The qualities of the proposed method are evaluated using a set of benchmark designs taking data from a real reconfigurable logic technology. Finally, some extensions to the proposed method and possible research directions are discussed

Bournemouth University Research Online

Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review

Author: Gujarathi Hemal S
McDonald-Maier Klaus D
Qadri Muhammad Yasir
Publication venue: 'Academy Publisher'
Publication date: 01/01/2009
Field of study

The technological evolution has increased the number of transistors for a given die area significantly and increased the switching speed from few MHz to GHz range. Such inversely proportional decline in size and boost in performance consequently demands shrinking of supply voltage and effective power dissipation in chips with millions of transistors. This has triggered substantial amount of research in power reduction techniques into almost every aspect of the chip and particularly the processor cores contained in the chip. This paper presents an overview of techniques for achieving the power efficiency mainly at the processor core level but also visits related domains such as buses and memories. There are various processor parameters and features such as supply voltage, clock frequency, cache and pipelining which can be optimized to reduce the power consumption of the processor. This paper discusses various ways in which these parameters can be optimized. Also, emerging power efficient processor architectures are overviewed and research activities are discussed which should help reader identify how these factors in a processor contribute to power consumption. Some of these concepts have been already established whereas others are still active research areas. © 2009 ACADEMY PUBLISHER

University of Essex Research Repository

CiteSeerX

Crossref

Recommended from our members

New polymorphous computing fabric.

Author: Gokhale M. (Maya)
McCabe K. P. (Kevin P.)
Wolinski C. (Christophe)
Publication venue: Los Alamos National Laboratory
Publication date: 01/01/2002
Field of study

This paper introduces a new polymorphous computing Fabric well suited to DSP and Image Processing and describes its implementation on a Configurable System on a Chip (CSOC). The architecture is highly parameterized and enables customization of the synthesized Fabric to achieve high performance for a specific class of application. For this reason it can be considered to be a generic model for hardware accelerator synthesis from a high level specification. Another important innovation is the Fabric uses a global memory concept, which gives the host processor random access to all the variables and instructions on the Fabric. The Fabric supports different computing models including MIMD, SPMD and systolic flow and permits dynamic reconfiguration. We present a specific implementation of a bank of FIR filters on a Fabric composed of 52 cells on the Altera Excalibur ARM running at 33 MHz. The theoretical performance of this Fabric is 1.8 GMACh. For the FIR application we obtain 1.6 GMAC/s real performance. Some automatic tools have been developed like the tool to provide a host access utility and assembler

UNT Digital Library

Rethinking FPGA Architectures for Deep Neural Network applications

Author: Rasoulinezhad Seyedramin
Publication venue: Faculty of Engineering, School of Electrical and Information Engineering
Publication date: 01/01/2023
Field of study

The prominence of machine learning-powered solutions instituted an unprecedented trend of integration into virtually all applications with a broad range of deployment constraints from tiny embedded systems to large-scale warehouse computing machines. While recent research confirms the edges of using contemporary FPGAs to deploy or accelerate machine learning applications, especially where the latency and energy consumption are strictly limited, their pre-machine learning optimised architectures remain a barrier to the overall efficiency and performance. Realizing this shortcoming, this thesis demonstrates an architectural study aiming at solutions that enable hidden potentials in the FPGA technology, primarily for machine learning algorithms. Particularly, it shows how slight alterations to the state-of-the-art architectures could significantly enhance the FPGAs toward becoming more machine learning-friendly while maintaining the near-promised performance for the rest of the applications. Eventually, it presents a novel systematic approach to deriving new block architectures guided by designing limitations and machine learning algorithm characteristics through benchmarking. First, through three modifications to Xilinx DSP48E2 blocks, an enhanced digital signal processing (DSP) block for important computations in embedded deep neural network (DNN) accelerators is described. Then, two tiers of modifications to FPGA logic cell architecture are explained that deliver a variety of performance and utilisation benefits with only minor area overheads. Eventually, with the goal of exploring this new design space in a methodical manner, a problem formulation involving computing nested loops over multiply-accumulate (MAC) operations is first proposed. A quantitative methodology for deriving efficient coarse-grained compute block architectures from benchmarks is then suggested together with a family of new embedded blocks, called MLBlocks

Sydney eScholarship

A unified hardware/software runtime environment for FPGA-based reconfigurable computers using BORPH

Author: Brodersen R
So HKH
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2008
Field of study

Fulltext linkThis paper explores the design and implementation of BORPH, an operating system designed for FPGA-based reconfigurable computers. Hardware designs execute as normal UNIX processes under BORPH, having access to standard OS services, such as file system support. Hardware and software components of user designs may, therefore, run as communicating processes within BORPH's runtime environment. The familiar language independent UNIX kernel interface facilitates easy design reuse and rapid application development. To develop hardware designs, a Simulink-based design flow that integrates with BORPH is employed. Performances of BORPH on two on-chip systems implemented on a BEE2 platform are compared. © 2008 ACM.link_to_subscribed_fulltex

HKU Scholars Hub

Ant colony optimization on runtime reconfigurable architectures

Author: Scheuermann Bernd
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2005
Field of study

KITopen

A Field Programmable Gate Array Architecture for Two-Dimensional Partial Reconfiguration

Author: Wang Fei
Publication venue: CORE Scholar
Publication date: 01/01/2006
Field of study

Reconfigurable machines can accelerate many applications by adapting to their needs through hardware reconfiguration. Partial reconfiguration allows the reconfiguration of a portion of a chip while the rest of the chip is busy working on tasks. Operating system models have been proposed for partially reconfigurable machines to handle the scheduling and placement of tasks. They are called OS4RC in this dissertation. The main goal of this research is to address some problems that come from the gap between OS4RC and existing chip architectures and the gap between OS4RC models and practical applications. Some existing OS4RC models are based on an impractical assumption that there is no data exchange channel between IP (Intellectual Property) circuits residing on a Field Programmable Gate Array (FPGA) chip and between an IP circuit and FPGA I/O pins. For models that do not have such an assumption, their inter-IP communication channels have severe drawbacks. Those channels do not work well with 2-D partial reconfiguration. They are not suitable for intensive data stream processing. And frequently they are very complicated to design and very expensive. To address these problems, a new chip architecture that can better support inter-IP and IP-I/O communication is proposed and a corresponding OS4RC kernel is then specified. The proposed FPGA architecture is based on an array of clusters of configurable logic blocks, with each cluster serving as a partial reconfiguration unit, and a mesh of segmented buses that provides inter-IP and IP-I/O communication channels. The proposed OS4RC kernel takes care of the scheduling, placement, and routing of circuits under the constraints of the proposed architecture. Features of the new architecture in turns reduce the kernel execution times and enable the runtime scheduling, placement and routing. The area cost and the configuration memory size of the new chip architecture are calculated and analyzed. And the efficiency of the OS4RC kernel is evaluated via simulation using three different task models

OhioLINK Electronic Thesis and Dissertation Center

CORE