4,497 research outputs found
A novel asynchronous FPGA architecture design and its performance evaluation
ABSTRACT This paper proposes GAPLA: a Globally Asynchronous Locally Synchronous Programmable Logic Array architecture. The whole FPGA area is divided into locally synchronous blocks wrapped with asynchronous I/O interfaces. Data communications between synchronous blocks are controlled by 2-phase handshaking signals under bundled-data delay assumption. The size and shape of each locally synchronous block are programmable so that different modules in a design can be effectively implemented. By dividing the FPGA area into smaller blocks, the delays of long interconnect wires, which could easily dominate other delays in conventional FPGAs, only come into picture when there are communications between blocks. Therefore, each block could run at higher speed. The area overhead of adopting the GALS style in GAPLA architecture is estimated to be very small (about 7%). Experimental results show an up to 55% performance improvement compared to the conventional FPGAs
The AXIOM software layers
AXIOM project aims at developing a heterogeneous computing board (SMP-FPGA).The Software Layers developed at the AXIOM project are explained.OmpSs provides an easy way to execute heterogeneous codes in multiple cores. People and objects will soon share the same digital network for information exchange in a world named as the age of the cyber-physical systems. The general expectation is that people and systems will interact in real-time. This poses pressure onto systems design to support increasing demands on computational power, while keeping a low power envelop. Additionally, modular scaling and easy programmability are also important to ensure these systems to become widespread. The whole set of expectations impose scientific and technological challenges that need to be properly addressed.The AXIOM project (Agile, eXtensible, fast I/O Module) will research new hardware/software architectures for cyber-physical systems to meet such expectations. The technical approach aims at solving fundamental problems to enable easy programmability of heterogeneous multi-core multi-board systems. AXIOM proposes the use of the task-based OmpSs programming model, leveraging low-level communication interfaces provided by the hardware. Modular scalability will be possible thanks to a fast interconnect embedded into each module. To this aim, an innovative ARM and FPGA-based board will be designed, with enhanced capabilities for interfacing with the physical world. Its effectiveness will be demonstrated with key scenarios such as Smart Video-Surveillance and Smart Living/Home (domotics).Peer ReviewedPostprint (author's final draft
An A-FPGA architecture for relative timing based asynchronous designs
pre-printThis paper presents an asynchronous FPGA architecture that is capable of implementing relative timing based asynchronous designs. The architecture uses the Xilinx 7-Series architecture as a starting point and proposes modifications that would make it asynchronous design capable while keeping it fully functional for synchronous designs. Even though the architecture requires additional components, it is observed when implemented on the 64-nm node, the area of the slice was increases marginally by 7%. The architecture leaves configurable routing structures untouched and does not compromise on performance of the synchronous architecture
A Binaural Neuromorphic Auditory Sensor for FPGA: A Spike Signal Processing Approach
This paper presents a new architecture, design
flow, and field-programmable gate array (FPGA) implementation
analysis of a neuromorphic binaural auditory sensor, designed
completely in the spike domain. Unlike digital cochleae that
decompose audio signals using classical digital signal processing
techniques, the model presented in this paper processes information
directly encoded as spikes using pulse frequency modulation
and provides a set of frequency-decomposed audio information
using an address-event representation interface. In this case,
a systematic approach to design led to a generic process for
building, tuning, and implementing audio frequency decomposers
with different features, facilitating synthesis with custom features.
This allows researchers to implement their own parameterized
neuromorphic auditory systems in a low-cost FPGA in order to
study the audio processing and learning activity that takes place
in the brain. In this paper, we present a 64-channel binaural
neuromorphic auditory system implemented in a Virtex-5 FPGA
using a commercial development board. The system was excited
with a diverse set of audio signals in order to analyze its response
and characterize its features. The neuromorphic auditory system
response times and frequencies are reported. The experimental
results of the proposed system implementation with 64-channel
stereo are: a frequency range between 9.6 Hz and 14.6 kHz
(adjustable), a maximum output event rate of 2.19 Mevents/s,
a power consumption of 29.7 mW, the slices requirements
of 11 141, and a system clock frequency of 27 MHz.Ministerio de Economía y Competitividad TEC2012-37868-C04-02Junta de Andalucía P12-TIC-130
Circuit design and analysis for on-FPGA communication systems
On-chip communication system has emerged as a prominently important subject in Very-Large-
Scale-Integration (VLSI) design, as the trend of technology scaling favours logics more than interconnects.
Interconnects often dictates the system performance, and, therefore, research for new
methodologies and system architectures that deliver high-performance communication services
across the chip is mandatory. The interconnect challenge is exacerbated in Field-Programmable
Gate Array (FPGA), as a type of ASIC where the hardware can be programmed post-fabrication.
Communication across an FPGA will be deteriorating as a result of interconnect scaling. The programmable
fabrics, switches and the specific routing architecture also introduce additional latency
and bandwidth degradation further hindering intra-chip communication performance.
Past research efforts mainly focused on optimizing logic elements and functional units in FPGAs.
Communication with programmable interconnect received little attention and is inadequately understood.
This thesis is among the first to research on-chip communication systems that are built on
top of programmable fabrics and proposes methodologies to maximize the interconnect throughput
performance. There are three major contributions in this thesis: (i) an analysis of on-chip
interconnect fringing, which degrades the bandwidth of communication channels due to routing
congestions in reconfigurable architectures; (ii) a new analogue wave signalling scheme that significantly
improves the interconnect throughput by exploiting the fundamental electrical characteristics
of the reconfigurable interconnect structures. This new scheme can potentially mitigate
the interconnect scaling challenges. (iii) a novel Dynamic Programming (DP)-network to provide
adaptive routing in network-on-chip (NoC) systems. The DP-network architecture performs runtime
optimization for route planning and dynamic routing which, effectively utilizes the in-silicon
bandwidth. This thesis explores a new horizon in reconfigurable system design, in which new
methodologies and concepts are proposed to enhance the on-FPGA communication throughput
performance that is of vital importance in new technology processes
A Comprehensive Workflow for General-Purpose Neural Modeling with Highly Configurable Neuromorphic Hardware Systems
In this paper we present a methodological framework that meets novel
requirements emerging from upcoming types of accelerated and highly
configurable neuromorphic hardware systems. We describe in detail a device with
45 million programmable and dynamic synapses that is currently under
development, and we sketch the conceptual challenges that arise from taking
this platform into operation. More specifically, we aim at the establishment of
this neuromorphic system as a flexible and neuroscientifically valuable
modeling tool that can be used by non-hardware-experts. We consider various
functional aspects to be crucial for this purpose, and we introduce a
consistent workflow with detailed descriptions of all involved modules that
implement the suggested steps: The integration of the hardware interface into
the simulator-independent model description language PyNN; a fully automated
translation between the PyNN domain and appropriate hardware configurations; an
executable specification of the future neuromorphic system that can be
seamlessly integrated into this biology-to-hardware mapping process as a test
bench for all software layers and possible hardware design modifications; an
evaluation scheme that deploys models from a dedicated benchmark library,
compares the results generated by virtual or prototype hardware devices with
reference software simulations and analyzes the differences. The integration of
these components into one hardware-software workflow provides an ecosystem for
ongoing preparative studies that support the hardware design process and
represents the basis for the maturity of the model-to-hardware mapping
software. The functionality and flexibility of the latter is proven with a
variety of experimental results
- …