1,246 research outputs found

Power efficient dataflow design for a heterogeneous smart camera architecture

Author: Bhowmik Deepayan
Garcia Paulo
Michaelson Greg
Stewart Robert
Wallace Andrew
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/09/2017

Visual attention modelling characterises the scene to segment regions of visual interest and is increasingly being used as a pre-processing step in many computer vision applications, including surveillance and security. Smart camera architectures are an emerging technology and a foundation of security and safety frameworks in modern vision systems. In this paper, we present a dataflow design of a visual saliency based camera architecture targeting a heterogeneous CPU+FPGA platform to propose a smart camera network infrastructure. The proposed design flow encompasses image processing algorithm implementation, hardware and software integration, and network connectivity through a unified model. By leveraging the properties of the dataflow paradigm, we iteratively refine the algorithm specification into a deployable solution, addressing distinct requirements at each design stage: from algorithm accuracy to hardware-software interactions, real-time execution and power consumption. Our design achieved real-time performance, and the power consumption of the optimised asynchronous design is reported at only 0.25 W. Resource usage on a Xilinx Zynq platform remains significantly low.

Crossref

Heriot Watt Pure

Stirling Online Research Repository (RIOXX)

Sheffield Hallam University Research Archive

Stirling Online Research Repository

LEGaTO: first steps towards energy-efficient toolset for heterogeneous computing

Author: Alvarez Carlos
Bautista Leonardo
Becker Tobias
Billung-Meyer Gunnar
Carpenter Paul
Christmann Wolfgang
Cristal Adrian
De La Cruz Raul
Dubhashi Devdatt
Etsion Yoav
Felber Pascal
Fetzer Christof
Gaydadjiev Georgi
Göttel Christian
Hadar Elad
Hagemeyer Jens
Jimenez Daniel
Jungeblut Thorsten
Kaiser Martin
Klawonn Frank
Krupop Stefan
Kucza Nils
Madonar Sergi
Martorell Xavier
Mihklafi Amani
Mudge Trevor
Pasin Marcelo
Pericàs Miquel
Pnevmatikatos Dionisios N.
Porrmann Mario
Port Oron
Rocha Isabelly
Salami Behzad
Salomonsson Hans
Schiavoni Valerio
Trancoso Pedro
Unsal Osman S.
vor dem Berge Micha
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018

LEGaTO is a three-year EU H2020 project which started in December 2017. The LEGaTO project will leverage task-based programming models to provide a software ecosystem for Made-in-Europe heterogeneous hardware composed of CPUs, GPUs, FPGAs and dataflow engines. The aim is to attain one order of magnitude energy savings from the edge to the converged cloud/HPC.

Crossref

UPCommons. Portal del coneixement obert de la UPC

Chalmers Research

Publications at Bielefeld University

The Octopus switch

Author: Havinga Paul Johannes Mattheus
Publication venue: University of Twente
Publication date: 01/01/2000

This chapter discusses the interconnection architecture of the Mobile Digital Companion. The approach to building a low-power handheld multimedia computer presented here is to have autonomous, reconfigurable modules such as network, video and audio devices, interconnected by a switch rather than by a bus, and to offload as much work as possible from the CPU to programmable modules placed in the data streams. Thus, communication between components is not broadcast over a bus but delivered exactly where it is needed, and work is carried out where the data passes through, bypassing the memory. The amount of buffering is minimised, and if it is required at all, it is placed right on the data path, where it is needed. A reconfigurable internal communication network switch called Octopus exploits locality of reference and eliminates wasteful data copies. The switch is implemented as a simplified ATM switch and provides Quality of Service guarantees and enough bandwidth for multimedia applications. We have built a testbed of the architecture, of which we will present performance and energy consumption characteristics.

University of Twente Research Information

Memory and information processing in neuromorphic systems

Author: Indiveri Giacomo
Liu Shih-Chii
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

A striking difference between brain-inspired neuromorphic processors and current von Neumann processor architectures is the way in which memory and processing are organized. As Information and Communication Technologies continue to address the need for increased computational power through the increase of cores within a digital processor, neuromorphic engineers and scientists can complement this need by building processor architectures where memory is distributed with the processing. In this paper we present a survey of brain-inspired processor architectures that support models of cortical networks and deep neural networks. These architectures range from serial clocked implementations of multi-neuron systems to massively parallel asynchronous ones, and from purely digital systems to mixed analog/digital systems which implement more biological-like models of neurons and synapses together with a suite of adaptation and learning mechanisms analogous to the ones found in biological nervous systems. We describe the advantages of the different approaches being pursued and present the challenges that need to be addressed for building artificial neural processing systems that can display the richness of behaviors seen in biological systems. Comment: Submitted to Proceedings of IEEE, review of recently proposed neuromorphic computing platforms and system

arXiv.org e-Print Archive

ZORA

Crypto Acceleration Using Asynchronous FPGAs

Author: Barcelo Bryce Thomas
Taylor John Alexander
Publication venue: Digital WPI
Publication date: 23/04/2008

The goal of this project, sponsored by General Dynamics C4 Systems, is to evaluate proprietary FPGA technology developed by Achronix Semiconductor Corporation and its effectiveness using a 128-bit, one clock cycle multiplier in a finite field, GF(2^128), as a test application. The testing will determine if there is a significant increase in speed that can be achieved by simple modifications of existing synchronous HDL designs using three metrics: number of LUTs, number of registers, and clock speed.

DigitalCommons@WPI
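
To make the test application above concrete, the following is a minimal software sketch of multiplication in GF(2^128). It is not the project's HDL design. The abstract does not name the reduction polynomial, so the polynomial x^128 + x^7 + x^2 + x + 1 used here is an assumption, as are the names `REDUCTION_POLY` and `gf128_mul`. Field elements are plain Python integers in which bit i holds the coefficient of x^i.

```python
# Assumed reduction polynomial: x^128 + x^7 + x^2 + x + 1.
# The abstract does not state which polynomial the project used.
REDUCTION_POLY = (1 << 128) | 0x87


def gf128_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^128) with shift-and-add.

    Addition in GF(2^n) is XOR, so each set bit of b adds the current
    (already reduced) multiple of a into the result.
    """
    if not (0 <= a < (1 << 128) and 0 <= b < (1 << 128)):
        raise ValueError("operands must be 128-bit field elements")
    result = 0
    while b:
        if b & 1:
            result ^= a          # add a * x^i for this bit of b
        b >>= 1
        a <<= 1                  # multiply a by x
        if a >> 128:             # degree reached 128: reduce
            a ^= REDUCTION_POLY
    return result
```

A hardware multiplier that finishes in one clock cycle would unroll this loop into combinational logic; the software loop only illustrates the arithmetic being computed.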

Design of Asynchronous Processor

Author: Puah Wei Boo
Publication date: 01/07/2001

There has been a resurgence of interest in asynchronous design recently. The renewed interest results from its potential to address the problems faced by the synchronous design methodology. In asynchronous methodology, there is no global clock controlling the synchronization of a circuit; instead, the data communication between each functional unit is completed through a local request-acknowledge handshake protocol. The growth in demand for high performance portable systems has accelerated asynchronous logic design techniques, which can offer better performance and lower power consumption, especially in the development of asynchronous processors for mobile and portable applications. In this thesis, the design and verification of an 8-bit asynchronous pipelined processor is presented. The developed asynchronous processor is based on Harvard architecture and uses a Reduced Instruction Set Computer (RISC) instruction set architecture. 24 instructions are supported by the processor, including register, memory, branch and jump operations. The processor has a three-stage pipeline, i.e. fetch, decode and execution. A micropipelines framework with a 2-phase signalling protocol and bundled-data approach is employed in designing complex and powerful asynchronous control circuits for the processor. Very High Speed Integrated Circuit Hardware Description Language (VHDL) is used to design and construct all parts of the asynchronous processor. Simulation, synthesis and verification of the processor are carried out using MAX+PLUS II software. The simulation results have demonstrated that the developed 8-bit asynchronous RISC processor works correctly using current Field Programmable Gate Array (FPGA) technology. This processor employs 903 logic cells and has 6144 memory bits for instruction and data memory. Each processor subsystem can operate at a different cycle time, enabling the asynchronous processor to achieve an average speed of 11.95 MHz.

Universiti Putra Malaysia Institutional Repository
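
The request-acknowledge handshake mentioned in the abstract above can be illustrated with a small behavioural model. This is a sketch of generic 2-phase (transition) signalling with bundled data, not the thesis's VHDL; the class name `TwoPhaseChannel` and its methods are hypothetical. In 2-phase signalling every toggle of the request wire announces new data and every toggle of the acknowledge wire confirms it was taken, so the channel is idle whenever the two wires are equal.

```python
class TwoPhaseChannel:
    """Behavioural model of a 2-phase bundled-data handshake channel."""

    def __init__(self):
        self.req = 0      # request wire, toggled by the sender
        self.ack = 0      # acknowledge wire, toggled by the receiver
        self.data = None  # bundled data lines

    def can_send(self) -> bool:
        # Idle when the last request has been acknowledged.
        return self.req == self.ack

    def send(self, value) -> None:
        if not self.can_send():
            raise RuntimeError("previous transfer not yet acknowledged")
        self.data = value  # data must be stable before req changes
        self.req ^= 1      # any transition on req is an event

    def has_data(self) -> bool:
        # A pending request shows up as req and ack differing.
        return self.req != self.ack

    def receive(self):
        if not self.has_data():
            raise RuntimeError("no pending request")
        value = self.data
        self.ack ^= 1      # any transition on ack completes the transfer
        return value
```

Because no wire has to return to zero between transfers, each data item costs only two signal transitions; no global clock is involved, which is the property the abstract highlights.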