44 research outputs found

    Application Specific Customization and Scalability of Soft Multiprocessors


    Implementation and Evaluation of an NoC Architecture for FPGAs

    The Networks-on-Chip (NoC) approach to designing Systems-on-Chip (SoC) is emerging as an advanced concept for overcoming the scalability and efficiency problems of traditional bus-based systems. A great deal of theoretical research has been done in this area that provides good insight and shows promising results. There is a great need for research on hardware implementations of NoC-based systems, both to determine the feasibility of implementing various topologies and protocols and to accurately determine the design tradeoffs involved. This thesis addresses the challenges of implementing an NoC-based system on FPGAs for running real benchmark applications. The NoC uses a mesh topology and a circuit-switched communication protocol. An experimental framework was developed that allows implementation of an NoC-based system from a high-level specification, using the Celoxica Handel-C hardware description language. Two test applications, charge-coupled device (CCD) and JPEG, were developed in Handel-C as benchmark applications. Both benchmarks are computationally expensive and require large amounts of data transfer, which exercises the NoC. Implementation results show that the NoC-based system gives superior area utilization and speed compared to a bus-based system running the same benchmarks.
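
    The abstract gives no source for the router itself, so the following sketch is only a rough illustration of the circuit-switched mesh idea, written in C++ rather than the Celoxica Handel-C used in the thesis; the Mesh structure, the setupCircuit routine and the XY routing order are assumptions, not details taken from the work. The point it shows is that a circuit must reserve every link along its path before any data moves, and must release its partial reservation when a hop is already occupied.

        #include <array>
        #include <cstdio>
        #include <utility>
        #include <vector>

        // Minimal sketch of circuit-switched path setup on a W x H mesh NoC,
        // assuming deterministic XY routing (route along X first, then Y).
        // A circuit succeeds only if every link on the path can be reserved.
        enum Dir { NORTH = 0, EAST = 1, SOUTH = 2, WEST = 3 };

        struct Mesh {
            int w, h;
            std::vector<std::array<bool, 4>> linkBusy;  // outgoing-link reservation per node

            Mesh(int w_, int h_) : w(w_), h(h_), linkBusy(w_ * h_) {}  // all links start free

            int id(int x, int y) const { return y * w + x; }

            // Try to reserve an XY path from (sx,sy) to (dx,dy).  Returns true and
            // holds the circuit if every hop is free, otherwise releases anything
            // reserved so far and returns false.
            bool setupCircuit(int sx, int sy, int dx, int dy) {
                std::vector<std::pair<int, Dir>> reserved;
                int x = sx, y = sy;
                auto tryHop = [&](Dir d, int& coord, int step) {
                    int node = id(x, y);
                    if (linkBusy[node][d]) return false;   // link already carries a circuit
                    linkBusy[node][d] = true;
                    reserved.push_back({node, d});
                    coord += step;
                    return true;
                };
                bool ok = true;
                while (ok && x != dx) ok = tryHop(dx > x ? EAST : WEST, x, dx > x ? 1 : -1);
                while (ok && y != dy) ok = tryHop(dy > y ? NORTH : SOUTH, y, dy > y ? 1 : -1);
                if (!ok)  // tear down the partial reservation on failure
                    for (auto& [node, d] : reserved) linkBusy[node][d] = false;
                return ok;
            }
        };

        int main() {
            Mesh mesh(4, 4);
            std::printf("circuit (0,0)->(3,2): %s\n",
                        mesh.setupCircuit(0, 0, 3, 2) ? "established" : "blocked");
            // A second circuit needing the same east-going link out of (0,0) is blocked.
            std::printf("circuit (0,0)->(3,0): %s\n",
                        mesh.setupCircuit(0, 0, 3, 0) ? "established" : "blocked");
            return 0;
        }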

    Circuit design and analysis for on-FPGA communication systems

    On-chip communication has emerged as a critically important subject in Very-Large-Scale Integration (VLSI) design, as the trend of technology scaling favours logic more than interconnect. Interconnect often dictates system performance, so research into new methodologies and system architectures that deliver high-performance communication services across the chip is mandatory. The interconnect challenge is exacerbated in the Field-Programmable Gate Array (FPGA), a type of integrated circuit whose hardware can be programmed post-fabrication. Communication across an FPGA deteriorates as a result of interconnect scaling, and the programmable fabric, switches and the specific routing architecture introduce additional latency and bandwidth degradation that further hinder intra-chip communication performance. Past research efforts mainly focused on optimizing logic elements and functional units in FPGAs; communication over the programmable interconnect has received little attention and is inadequately understood. This thesis is among the first to research on-chip communication systems built on top of programmable fabrics, and it proposes methodologies to maximize interconnect throughput. The thesis makes three major contributions: (i) an analysis of on-chip interconnect fringing, which degrades the bandwidth of communication channels due to routing congestion in reconfigurable architectures; (ii) a new analogue wave signalling scheme that significantly improves interconnect throughput by exploiting the fundamental electrical characteristics of the reconfigurable interconnect structures, and can potentially mitigate the interconnect scaling challenges; and (iii) a novel Dynamic Programming (DP) network that provides adaptive routing in network-on-chip (NoC) systems. The DP-network architecture performs runtime optimization for route planning and dynamic routing, which effectively utilizes the in-silicon bandwidth. This thesis explores a new horizon in reconfigurable system design, proposing new methodologies and concepts to enhance on-FPGA communication throughput, which is of vital importance in new technology processes.
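
    The abstract does not describe how the DP-network computes its routes, but the dynamic-programming principle behind it can be shown in a few lines. The sketch below (plain C++; the grid size, cost values and variable names are illustrative assumptions) repeatedly relaxes a per-node cost-to-destination table, the kind of value-iteration update a hardware DP-network could evaluate at every router in parallel; a node's best next hop is then simply its cheapest neighbour, so routes adapt when link costs change.

        #include <climits>
        #include <cstdio>
        #include <vector>

        // Value-iteration sketch of distributed route planning on a W x H mesh:
        // every node repeatedly updates its estimated cost to the destination as
        // min over neighbours of (link cost + neighbour's estimate).
        int main() {
            const int W = 4, H = 4;
            const int dest = 3 * W + 2;                 // destination node (x=2, y=3), arbitrary
            std::vector<int> linkCost(W * H, 1);        // uniform link cost by default
            linkCost[1 * W + 2] = 5;                    // pretend node (2,1) is congested

            std::vector<int> cost(W * H, INT_MAX / 2);  // cost-to-destination estimates
            cost[dest] = 0;

            bool changed = true;
            while (changed) {                           // iterate until the table is stable
                changed = false;
                for (int y = 0; y < H; ++y)
                    for (int x = 0; x < W; ++x) {
                        int n = y * W + x;
                        const int dx[] = {1, -1, 0, 0}, dy[] = {0, 0, 1, -1};
                        for (int d = 0; d < 4; ++d) {
                            int nx = x + dx[d], ny = y + dy[d];
                            if (nx < 0 || nx >= W || ny < 0 || ny >= H) continue;
                            int via = linkCost[ny * W + nx] + cost[ny * W + nx];
                            if (via < cost[n]) { cost[n] = via; changed = true; }
                        }
                    }
            }
            // Each node's best next hop is the neighbour with the lowest estimate, so
            // re-running the update after a cost change yields adaptive routes.
            std::printf("cost from (0,0) to destination: %d\n", cost[0]);
            return 0;
        }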

    Networks on Chips: From Research to Products

    Research on Networks on Chips (NoCs) has spanned over a decade and its results are now visible in some products. Thus, the seminal idea of using networking technology to address the chip-level interconnect problem has been shown to be correct. Moreover, as technology scales down in geometry and chips scale up in complexity, NoCs become the essential element for achieving the desired levels of performance and quality of service while curbing power consumption. Design and timing closure can only be achieved by a sophisticated set of tools that address NoC synthesis, optimization and validation.

    Experimental Evaluation of an NoC Synthesis Tool

    Rapid growth in the number of IP cores in SoCs has created the need for an effective and scalable interconnect scheme for system components: the Network-on-Chip (NoC). Designing and implementing an NoC from scratch is very time consuming and limits the NoC design space that can be explored. In this thesis we evaluate and compare the CONNECT NoC synthesis tool with manually generated NoC designs built in Altera Quartus II. Three sizes each of ring, mesh and torus NoC topologies are evaluated against two metrics: logic resource utilization and maximum clock frequency. For larger NoC sizes, the manual designs provide up to an 85% reduction in area utilization. With respect to maximum clock frequency, CONNECT provides superior results for all NoC sizes, delivering up to 80% higher clock frequencies. These results provide insight into the area-versus-frequency tradeoffs of using the CONNECT NoC synthesis tool.
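
    To make the reported percentages concrete, the short C++ sketch below shows how an area reduction and a clock-frequency gain are computed from raw synthesis results. The numbers are invented (the thesis's actual resource counts and Fmax values are not given in the abstract) and are chosen only so that the formulas reproduce the headline 85% and 80% figures.

        #include <cstdio>

        // Illustrative comparison of a manual NoC design against a CONNECT-generated one.
        // All numbers below are made up; only the formulas matter.
        struct Result {
            const char* design;
            double alms;    // logic utilization (e.g. Adaptive Logic Modules)
            double fmaxMHz; // maximum clock frequency
        };

        int main() {
            Result connect = {"CONNECT 4x4 mesh", 20000.0, 180.0};
            Result manual  = {"manual 4x4 mesh",   3000.0, 100.0};

            // "Area reduction" of the manual design relative to CONNECT.
            double areaReduction = 100.0 * (connect.alms - manual.alms) / connect.alms;
            // "Higher clock frequency" of CONNECT relative to the manual design.
            double fmaxGain = 100.0 * (connect.fmaxMHz - manual.fmaxMHz) / manual.fmaxMHz;

            std::printf("%s vs %s\n", manual.design, connect.design);
            std::printf("area reduction (manual vs CONNECT): %.1f%%\n", areaReduction);
            std::printf("clock frequency gain (CONNECT vs manual): %.1f%%\n", fmaxGain);
            return 0;
        }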

    Low power digital signal processing


    A Remote Memory Access Infrastructure for Global Address Space Programming Models in FPGAs

    We propose a shared-memory communication infrastructure that provides a common parallel programming interface for FPGA and CPU components in a heterogeneous system. Our intent is to ease the integration of reconfigurable hardware into parallel programming models like the Partitioned Global Address Space (PGAS). For this purpose, we introduce a remote memory access component based on Active Messages that implements the core API of the Berkeley GASNet communication library, and a simple controller that manages communication and synchronization for custom FPGA cores. We demonstrate how these components deliver a simple and easily configurable communication mechanism between distributed memories in a multi-FPGA system with processors as well as custom hardware nodes.

    High-Performance Reconfigurable Computing (HPRC) systems present two main challenges to application programmers: which parallel programming model to use, and how to incorporate reconfigurable hardware into a software application. The first problem, inherent to all distributed computing, is what model of the existing hardware and memory distribution to present to the application programmer. This has implications for how data is distributed and communicated across the system, how computations are synchronized, and how explicitly the programmer has to consider the physical makeup of the system. At one end, Shared Memory presents a unified address space to the programmer, similar to the one found on a single host. At the other end, Distributed Memory only lets the programmer access local memory, and all data exchange with other nodes happens explicitly through communication called Message Passing. The shared memory model is easy to program, but often leads to inefficient code, since the compiler cannot sufficiently reason about data access and communication patterns. The distributed model can produce very efficient implementations, but is very cumbersome to program. The second problem stems from the fact that most high-performance application programmers understand software and CPU-based systems, but not reconfigurable hardware. Part of that problem is being attacked by emerging tools that translate high-level language CPU code into Register-Transfer Language, with mixed results so far. However, besides an automatic synthesis path, applications also require an infrastructure for communication between software and hardware computation nodes, the equivalent of a communication API between CPU hosts. Preferably, this infrastructure should be independent of specific FPGA platforms, given the multitude of concepts and products that connect FPGAs with CPU-based host systems. Both problems point to the larger issue of increasing software and hardware complexity.
    Performance and efficiency are still the most common metrics for computing systems, but productivity, as measured by the effort required to design, debug and maintain high-performance computing applications, has been recognized as essential to continued progress towards exascale systems. In our opinion, a unified programming model and API for all components in a heterogeneous system can address both problems. In this paper, we present our vision of a C++-based application design process based on the Partitioned Global Address Space (PGAS) model. As our main contribution, we introduce an FPGA communication infrastructure compatible with GASNet [12], an existing PGAS communication library.
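
    To make the Active Message pattern the paper builds on more concrete, here is a minimal sketch of a remote put between two nodes. It is written as plain C++ and does not use the real GASNet API; the PutRequest and Node types, remotePut and the handler names are invented for illustration only.

        #include <cstdint>
        #include <cstdio>
        #include <cstring>
        #include <vector>

        // Toy model of Active Message style remote memory access between two nodes.
        // A "put" request carries a destination offset and a payload; the handler on
        // the target node writes the payload into its local memory and acknowledges.
        struct PutRequest {
            int      srcNode;
            uint32_t destOffset;
            std::vector<uint8_t> payload;
        };

        struct Node {
            int nodeId;
            std::vector<uint8_t> localMem;   // this node's slice of the global address space
            int acksReceived = 0;

            explicit Node(int id, size_t memBytes) : nodeId(id), localMem(memBytes, 0) {}

            // Handler invoked when a put request arrives: perform the write, then ack.
            void handlePut(const PutRequest& req, Node& requester) {
                std::memcpy(localMem.data() + req.destOffset,
                            req.payload.data(), req.payload.size());
                requester.handleAck();
            }
            void handleAck() { ++acksReceived; }   // requester can use this to synchronize
        };

        // "Remote put": package the data as a message and deliver it to the target's
        // handler.  In a real system this would travel over the interconnect fabric.
        void remotePut(Node& src, Node& dst, uint32_t destOffset,
                       const void* data, size_t len) {
            PutRequest req{src.nodeId, destOffset,
                           {static_cast<const uint8_t*>(data),
                            static_cast<const uint8_t*>(data) + len}};
            dst.handlePut(req, src);
        }

        int main() {
            Node cpu(0, 1024), fpga(1, 1024);
            uint32_t value = 42;
            remotePut(cpu, fpga, 16, &value, sizeof(value));   // CPU writes into FPGA memory
            uint32_t check;
            std::memcpy(&check, fpga.localMem.data() + 16, sizeof(check));
            std::printf("fpga mem[16] = %u, acks at cpu = %d\n",
                        (unsigned)check, cpu.acksReceived);
            return 0;
        }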