19,702 research outputs found

    System Synthesis for Networks of Programmable Blocks

    Full text link
    The advent of sensor networks presents untapped opportunities for synthesis. We examine the problem of synthesis of behavioral specifications into networks of programmable sensor blocks. The particular behavioral specification we consider is an intuitive user-created network diagram of sensor blocks, each block having a pre-defined combinational or sequential behavior. We synthesize this specification to a new network that utilizes a minimum number of programmable blocks in place of the pre-defined blocks, thus reducing network size and hence network cost and power. We focus on the main task of this synthesis problem, namely partitioning pre-defined blocks onto a minimum number of programmable blocks, introducing the efficient but effective PareDown decomposition algorithm for the task. We describe the synthesis and simulation tools we developed. We provide results showing excellent network size reductions through such synthesis, and significant speedups of our algorithm over exhaustive search while obtaining near-optimal results for 15 real network designs as well as nearly 10,000 randomly generated designs.Comment: Submitted on behalf of EDAA (http://www.edaa.com/

    OPTIMAL AREA AND PERFORMANCE MAPPING OF K-LUT BASED FPGAS

    Get PDF
    FPGA circuits are increasingly used in many fields: for rapid prototyping of new products (including fast ASIC implementation), for logic emulation, for producing a small number of a device, or if a device should be reconfigurable in use (reconfigurable computing). Determining if an arbitrary, given wide, function can be implemented by a programmable logic block, unfortunately, it is generally, a very difficult problem. This problem is called the Boolean matching problem. This paper introduces a new implemented algorithm able to map, both for area and performance, combinational networks using k-LUT based FPGAs.k-LUT based FPGAs, combinational circuits, performance-driven mapping.

    Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions

    Get PDF
    In the past decade, Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance in various Artificial Intelligence tasks. To accelerate the experimentation and development of CNNs, several software frameworks have been released, primarily targeting power-hungry CPUs and GPUs. In this context, reconfigurable hardware in the form of FPGAs constitutes a potential alternative platform that can be integrated in the existing deep learning ecosystem to provide a tunable balance between performance, power consumption and programmability. In this paper, a survey of the existing CNN-to-FPGA toolflows is presented, comprising a comparative study of their key characteristics which include the supported applications, architectural choices, design space exploration methods and achieved performance. Moreover, major challenges and objectives introduced by the latest trends in CNN algorithmic research are identified and presented. Finally, a uniform evaluation methodology is proposed, aiming at the comprehensive, complete and in-depth evaluation of CNN-to-FPGA toolflows.Comment: Accepted for publication at the ACM Computing Surveys (CSUR) journal, 201

    A Modular Programmable CMOS Analog Fuzzy Controller Chip

    Get PDF
    We present a highly modular fuzzy inference analog CMOS chip architecture with on-chip digital programmability. This chip consists of the interconnection of parameterized instances of two different kind of blocks, namely label blocks and rule blocks. The architecture realizes a lattice partition of the universe of discourse, which at the hardware level means that the fuzzy labels associated to every input (realized by the label blocks) are shared among the rule blocks. This reduces the area and power consumption and is the key point for chip modularity. The proposed architecture is demonstrated through a 16-rule two input CMOS 1-μm prototype which features an operation speed of 2.5 Mflips (2.5×10^6 fuzzy inferences per second) with 8.6 mW power consumption. Core area occupation of this prototype is of only 1.6 mm 2 including the digital control and memory circuitry used for programmability. Because of the architecture modularity the number of inputs and rules can be increased with any hardly design effort.This work was supported in part by the Spanish C.I.C.Y.T under Contract TIC96-1392-C02- 02 (SIVA)

    Packet Transactions: High-level Programming for Line-Rate Switches

    Full text link
    Many algorithms for congestion control, scheduling, network measurement, active queue management, security, and load balancing require custom processing of packets as they traverse the data plane of a network switch. To run at line rate, these data-plane algorithms must be in hardware. With today's switch hardware, algorithms cannot be changed, nor new algorithms installed, after a switch has been built. This paper shows how to program data-plane algorithms in a high-level language and compile those programs into low-level microcode that can run on emerging programmable line-rate switching chipsets. The key challenge is that these algorithms create and modify algorithmic state. The key idea to achieve line-rate programmability for stateful algorithms is the notion of a packet transaction : a sequential code block that is atomic and isolated from other such code blocks. We have developed this idea in Domino, a C-like imperative language to express data-plane algorithms. We show with many examples that Domino provides a convenient and natural way to express sophisticated data-plane algorithms, and show that these algorithms can be run at line rate with modest estimated die-area overhead.Comment: 16 page
    • …
    corecore