Search CORE

2,506 research outputs found

A C++-embedded Domain-Specific Language for programming the MORA soft processor array

Author: Chalamalasetti S.R.
Margala M.
Purohit S.
Vanderbauwhede W.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010
Field of study

MORA is a novel platform for high-level FPGA programming of streaming vector and matrix operations, aimed at multimedia applications. It consists of soft array of pipelined low-complexity SIMD processors-in-memory (PIM). We present a Domain-Specific Language (DSL) for high-level programming of the MORA soft processor array. The DSL is embedded in C++, providing designers with a familiar language framework and the ability to compile designs using a standard compiler for functional testing before generating the FPGA bitstream using the MORA toolchain. The paper discusses the MORA-C++ DSL and the compilation route into the assembly for the MORA machine and provides examples to illustrate the programming model and performance

Enlighten

HIGH-LEVEL SYNTHESIS OF ELASTICITY: FROM MODELS TO CIRCUITS

Author: Jelodari Mamaghani Mahdi
Publication venue
Publication date: 01/08/2016
Field of study

The University of Manchester - Institutional Repository

Practical advances in asynchronous design

Author: Brunvand Erik L.
Nowick Steven
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1997
Field of study

Journal ArticleRecent practical advances in asynchronous circuit and system design have resulted in renewed interest by circuit designers. Asynchronous systems are being viewed as in increasingly viable alternative to globally synchronous system organization. This tutorial will present the current state of the art in asynchronous circuit and system design in three different areas. The first section details asynchronous control systems. The second describes a variety of approaches to asynchronous datapaths. The third section is on asynchronous and self-timed circuits applied to the design of general purpose processors

The University of Utah: J. Willard Marriott Digital Library

Practical advances in asynchronous design and in asynchronous/synchronous interfaces

Author: Brunvand Erik L.
Nowick Steven
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1999
Field of study

Journal ArticleAsynchronous systems are being viewed as an increasingly viable alternative to purely synchronous systems. This paper gives an overview of the current state of the art in practical asynchronous circuit and system design in four areas: controllers, datapaths, processors, and the design of asynchronous/synchronous interfaces

The University of Utah: J. Willard Marriott Digital Library

Recommended from our members

Synthesis and Optimization of Pipelined Packet Processors

Author: Edwards Stephen A.
Hadzic Ilija
Soviani Cristian
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2009
Field of study

We consider pipelined architectures of packet processors consisting of a sequence of simple packet-processing modules interconnected by first-in first-out buffers. We propose a new model for describing their function, an automated synthesis technique that generates efficient hardware for them, and an algorithm for computing minimum buffer sizes that allow such pipelines to achieve their maximum throughput. Our functional model provides a level of abstraction familiar to a network protocol designer; in particular, it does not require knowledge of register-transfer-level hardware design. Our synthesis tool implements the specified function in a sequential circuit that processes packet data a word at a time. Finally, our analysis technique computes the maximum throughput possible from the modules and then determines the smallest buffers that can achieve it. Experimental results conducted on industrial-strength examples suggest that our techniques are practical. Our synthesis algorithm can generate circuits that achieve 40 Gb/s on field-programmable gate arrays, equal to state-of-the-art manual implementations, and our buffer-sizing algorithm has a practically short runtime. Together, our techniques make it easier to quickly develop and deploy high-speed network switches

Columbia University Academic Commons

Recommended from our members

Module interconnection languages : a survey

Author: Neighbors James M.
Prieto-Diaz Ruben
Publication venue: eScholarship, University of California
Publication date: 01/01/1982
Field of study

eScholarship - University of California

Dataflow development of medium-grained parallel software

Author: Harley Jonathan William
Publication venue: Newcastle University
Publication date: 01/01/1993
Field of study

PhD ThesisIn the 1980s, multiple-processor computers (multiprocessors) based on conven- tional processing elements emerged as a popular solution to the continuing demand for ever-greater computing power. These machines offer a general-purpose parallel processing platform on which the size of program units which can be efficiently executed in parallel - the "grain size" - is smaller than that offered by distributed computing environments, though greater than that of some more specialised architectures. However, programming to exploit this medium-grained parallelism remains difficult. Concurrent execution is inherently complex, yet there is a lack of programming tools to support parallel programming activities such as program design, implementation, debugging, performance tuning and so on. In helping to manage complexity in sequential programming, visual tools have often been used to great effect, which suggests one approach towards the goal of making parallel programming less difficult. This thesis examines the possibilities which the dataflow paradigm has to offer as the basis for a set of visual parallel programming tools, and presents a dataflow notation designed as a framework for medium-grained parallel programming. The implementation of this notation as a programming language is discussed, and its suitability for the medium-grained level is examinedScience and Engineering Research Council of Great Britain EC ERASMUS schem

Newcastle University eTheses

High-Level Synthesis for Embedded Systems

Author: Michael Dossis
Publication venue: 'IntechOpen'
Publication date: 02/03/2012
Field of study

IntechOpen

Macroservers: An Execution Model for DRAM Processor-In-Memory Arrays

Author: Sterling Thomas L.
Zima Hans P.
Publication venue: 'California Institute of Technology Library'
Publication date: 01/01/2000
Field of study

The emergence of semiconductor fabrication technology allowing a tight coupling between high-density DRAM and CMOS logic on the same chip has led to the important new class of Processor-In-Memory (PIM) architectures. Newer developments provide powerful parallel processing capabilities on the chip, exploiting the facility to load wide words in single memory accesses and supporting complex address manipulations in the memory. Furthermore, large arrays of PIMs can be arranged into a massively parallel architecture. In this report, we describe an object-based programming model based on the notion of a macroserver. Macroservers encapsulate a set of variables and methods; threads, spawned by the activation of methods, operate asynchronously on the variables' state space. Data distributions provide a mechanism for mapping large data structures across the memory region of a macroserver, while work distributions allow explicit control of bindings between threads and data. Both data and work distributuions are first-class objects of the model, supporting the dynamic management of data and threads in memory. This offers the flexibility required for fully exploiting the processing power and memory bandwidth of a PIM array, in particular for irregular and adaptive applications. Thread synchronization is based on atomic methods, condition variables, and futures. A special type of lightweight macroserver allows the formulation of flexible scheduling strategies for the access to resources, using a monitor-like mechanism

CiteSeerX

Caltech Authors

Hardware Acceleration Using Functional Languages

Author: Hodaňová Andrea
Publication venue: Vysoké učení technické v Brně. Fakulta informačních technologií
Publication date: 01/01/2013
Field of study

Cílem této práce je prozkoumat možnosti využití funkcionálního paradigmatu pro hardwarovou akceleraci, konkrétně pro datově paralelní úlohy. Úroveň abstrakce tradičních jazyků pro popis hardwaru, jako VHDL a Verilog, přestáví stačit. Pro popis na algoritmické či behaviorální úrovni se rozmáhají jazyky původně navržené pro vývoj softwaru a modelování, jako C/C++, SystemC nebo MATLAB. Funkcionální jazyky se s těmi imperativními nemůžou měřit v rozšířenosti a oblíbenosti mezi programátory, přesto je předčí v mnoha vlastnostech, např. ve verifikovatelnosti, schopnosti zachytit inherentní paralelismus a v kompaktnosti kódu. Pro akceleraci datově paralelních výpočtů se často používají jednotky FPGA, grafické karty (GPU) a vícejádrové procesory. Praktická část této práce rozšiřuje existující knihovnu Accelerate pro počítání na grafických kartách o výstup do VHDL. Accelerate je možno chápat jako doménově specifický jazyk vestavěný do Haskellu s backendem pro prostředí NVIDIA CUDA. Rozšíření pro vysokoúrovňovou syntézu obvodů ve VHDL představené v této práci používá stejný jazyk a frontend.The aim of this thesis is to research how the functional paradigm can be used for hardware acceleration with an emphasis on data-parallel tasks. The level of abstraction of the traditional hardware description languages, such as VHDL or Verilog, is becoming to low. High-level languages from the domains of software development and modeling, such as C/C++, SystemC or MATLAB, are experiencing a boom for hardware description on the algorithmic or behavioral level. Functional Languages are not so commonly used, but they outperform imperative languages in verification, the ability to capture inherent paralellism and the compactness of code. Data-parallel task are often accelerated on FPGAs, GPUs and multicore processors. In this thesis, we use a library for general-purpose GPU programs called Accelerate and extend it to produce VHDL. Accelerate is a domain-specific language embedded into Haskell with a backend for the NVIDIA CUDA platform. We use the language and its frontend, and create a new backend for high-level synthesis of circuits in VHDL.

Digital library of Brno University of Technology

National Repository of Grey Literature