19,702 research outputs found
System Synthesis for Networks of Programmable Blocks
The advent of sensor networks presents untapped opportunities for synthesis.
We examine the problem of synthesis of behavioral specifications into networks
of programmable sensor blocks. The particular behavioral specification we
consider is an intuitive user-created network diagram of sensor blocks, each
block having a pre-defined combinational or sequential behavior. We synthesize
this specification to a new network that utilizes a minimum number of
programmable blocks in place of the pre-defined blocks, thus reducing network
size and hence network cost and power. We focus on the main task of this
synthesis problem, namely partitioning pre-defined blocks onto a minimum number
of programmable blocks, introducing the efficient but effective PareDown
decomposition algorithm for the task. We describe the synthesis and simulation
tools we developed. We provide results showing excellent network size
reductions through such synthesis, and significant speedups of our algorithm
over exhaustive search while obtaining near-optimal results for 15 real network
designs as well as nearly 10,000 randomly generated designs.Comment: Submitted on behalf of EDAA (http://www.edaa.com/
OPTIMAL AREA AND PERFORMANCE MAPPING OF K-LUT BASED FPGAS
FPGA circuits are increasingly used in many fields: for rapid prototyping of new products (including fast ASIC implementation), for logic emulation, for producing a small number of a device, or if a device should be reconfigurable in use (reconfigurable computing). Determining if an arbitrary, given wide, function can be implemented by a programmable logic block, unfortunately, it is generally, a very difficult problem. This problem is called the Boolean matching problem. This paper introduces a new implemented algorithm able to map, both for area and performance, combinational networks using k-LUT based FPGAs.k-LUT based FPGAs, combinational circuits, performance-driven mapping.
Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions
In the past decade, Convolutional Neural Networks (CNNs) have demonstrated
state-of-the-art performance in various Artificial Intelligence tasks. To
accelerate the experimentation and development of CNNs, several software
frameworks have been released, primarily targeting power-hungry CPUs and GPUs.
In this context, reconfigurable hardware in the form of FPGAs constitutes a
potential alternative platform that can be integrated in the existing deep
learning ecosystem to provide a tunable balance between performance, power
consumption and programmability. In this paper, a survey of the existing
CNN-to-FPGA toolflows is presented, comprising a comparative study of their key
characteristics which include the supported applications, architectural
choices, design space exploration methods and achieved performance. Moreover,
major challenges and objectives introduced by the latest trends in CNN
algorithmic research are identified and presented. Finally, a uniform
evaluation methodology is proposed, aiming at the comprehensive, complete and
in-depth evaluation of CNN-to-FPGA toolflows.Comment: Accepted for publication at the ACM Computing Surveys (CSUR) journal,
201
A Modular Programmable CMOS Analog Fuzzy Controller Chip
We present a highly modular fuzzy inference analog CMOS chip architecture with on-chip digital programmability. This chip consists of the interconnection of parameterized instances of two different kind of blocks, namely label blocks and rule blocks. The architecture realizes a lattice partition of the universe of discourse, which at the hardware level means that the fuzzy labels associated to every input (realized by the label blocks) are shared among the rule blocks. This reduces the area and power consumption and is the key point for chip modularity. The proposed architecture is demonstrated through a 16-rule two input CMOS 1-μm prototype which features an operation speed of 2.5 Mflips (2.5×10^6 fuzzy inferences per second) with 8.6 mW power consumption. Core area occupation of this prototype is of only 1.6 mm 2 including the digital control and memory circuitry used for programmability. Because of the architecture modularity the number of inputs and rules can be increased with any hardly design effort.This work was
supported in part by the Spanish C.I.C.Y.T under Contract TIC96-1392-C02-
02 (SIVA)
Packet Transactions: High-level Programming for Line-Rate Switches
Many algorithms for congestion control, scheduling, network measurement,
active queue management, security, and load balancing require custom processing
of packets as they traverse the data plane of a network switch. To run at line
rate, these data-plane algorithms must be in hardware. With today's switch
hardware, algorithms cannot be changed, nor new algorithms installed, after a
switch has been built.
This paper shows how to program data-plane algorithms in a high-level
language and compile those programs into low-level microcode that can run on
emerging programmable line-rate switching chipsets. The key challenge is that
these algorithms create and modify algorithmic state. The key idea to achieve
line-rate programmability for stateful algorithms is the notion of a packet
transaction : a sequential code block that is atomic and isolated from other
such code blocks. We have developed this idea in Domino, a C-like imperative
language to express data-plane algorithms. We show with many examples that
Domino provides a convenient and natural way to express sophisticated
data-plane algorithms, and show that these algorithms can be run at line rate
with modest estimated die-area overhead.Comment: 16 page
- …