38 research outputs found

    Master of Science

    Get PDF
    thesisThis thesis designs, implements, and evaluates modular Open Core Protocol (OCP) interfaces for Intellectual Property (IP) cores and Network-on-Chip (NoC) that re- duces System-On-Chip (SoC) design time and enables research on di erent architectural sequencing control methods. To utilize the NoCs design time optimization feature at the boundaries, a standardized industry socket was required, which can address the SoC shorter time-to-market requirements, design issues, and also the subsequent reuse of developed IP cores. OCP is an open industry standard socket interface speci cation used in this research to enable the IP cores reusability across multiple SoC designs. This research work designs and implements clocked OCP interfaces between IP cores and On-Chip Network Fabric (NoC), in single- and multi- frequency clocked domains. The NoC interfaces between IP cores and on-chip network fabric are implemented using the standard network interface structure. It consists of back-end and front-end submodules corresponding to customized interfaces to IP cores or network fabric and OCP Master and Slave entities, respectively. A generic domain interface (DI) protocol is designed which acts as the bridge between back-end and front-end submodules for synchronization and data ow control. Clocked OCP interfaces are synthesized, placed and routed using IBM's 65nm process technology. The implemented designs are veri ed for OCP compliance using SOLV (Sonics OCP Library for Veri cation). Finally, this thesis reports the performance metrics such as design target frequency of operation, latency, area, energy per transaction, and maximum bandwidth across network on-chip for single- and multifrequency clocked designs

    Analysis of asynchronous routers for network-on-chip applications

    Get PDF
    Asynchronous circuit design has been conventionally regarded as a valid alternative to synchronous logic due to its potential for low consumption of resources, power and delay. This includes areas such as the communication infrastructure of modern multi core processors, the so-called Network-on-Chip (NoC) paradigm on which this thesis focus on. In recent times, the transistor downscaling and the increasing clock frequencies have pushed synchronous design to high static power and delay. As a result, the interest for asynchronous integrated routers and links has re-emerged, especially in fields with ultra-low power requirements such as embedded systems. In this thesis, we construct an asynchronous router using Verilog code based on architectures found in the literature. We analyze the functionality of each of the building blocks and verify the operation of the implemented routing algorithm and arbitration mechanism. In the future, the results obtained here are expected to enable a complete implementation of the router in Verilog and its posterior analysis of its scalability

    The Future of Formal Methods and GALS Design

    Get PDF
    AbstractThe System-on-Chip era has arrived, and it arrived quickly. Modular composition of components through a shared interconnect is now becoming the standard, rather than the exotic. Asynchronous interconnect fabrics and globally asynchronous locally synchronous (GALS) design has been shown to be potentially advantageous. However, the arduous road to developing asynchronous on-chip communication and interfaces to clocked cores is still nascent. This road of converting to asynchronous networks, and potentially the core intellectual property block as well, will be rocky. Asynchronous circuit design has been employed since the 1950's. However, it is doubtful that its present form will be what we will see 10 years hence. This treatise is intended to provoke debate as it projects what technologies will look like in the future, and discusses, among other aspects, the role of formal verification, education, the CAD industry, and the ever present tradeoff between greed and fear

    Doctor of Philosophy

    Get PDF
    dissertationCommunication surpasses computation as the power and performance bottleneck in forthcoming exascale processors. Scaling has made transistors cheap, but on-chip wires have grown more expensive, both in terms of latency as well as energy. Therefore, the need for low energy, high performance interconnects is highly pronounced, especially for long distance communication. In this work, we examine two aspects of the global signaling problem. The first part of the thesis focuses on a high bandwidth asynchronous signaling protocol for long distance communication. Asynchrony among intellectual property (IP) cores on a chip has become necessary in a System on Chip (SoC) environment. Traditional asynchronous handshaking protocol suffers from loss of throughput due to the added latency of sending the acknowledge signal back to the sender. We demonstrate a method that supports end-to-end communication across links with arbitrarily large latency, without limiting the bandwidth, so long as line variation can be reliably controlled. We also evaluate the energy and latency improvements as a result of the design choices made available by this protocol. The use of transmission lines as a physical interconnect medium shows promise for deep submicron technologies. In our evaluations, we notice a lower energy footprint, as well as vastly reduced wire latency for transmission line interconnects. We approach this problem from two sides. Using field solvers, we investigate the physical design choices to determine the optimal way to implement these lines for a given back-end-of-line (BEOL) stack. We also approach the problem from a system designer's viewpoint, looking at ways to optimize the lines for different performance targets. This work analyzes the advantages and pitfalls of implementing asynchronous channel protocols for communication over long distances. Finally, the innovations resulting from this work are applied to a network-on-chip design example and the resulting power-performance benefits are reported

    DVFS using clock scheduling for Multicore Systems-on-Chip and Networks-on-Chip

    Get PDF
    A modern System-on-Chip (SoC) contains processor cores, application-specific process- ing elements, memory, peripherals, all connected with a high-bandwidth and low-latency Network-on-Chip (NoC). The downside of such very high level of integration and con- nectivity is the high power consumption. In CMOS technology this is made of a dynamic and a static component. To reduce the dynamic component, Dynamic voltage and Fre- quency Scaling (DVFS) has been adopted. Although DVFS is very effective chip-wide, the power optimization of complex SoCs calls for a finer grain application of DVFS. Ideally all the main components of an SoC should be provided with a DVFS controller. An SoC with a DVFS controller per component with individual DC-DC converters and PLL/DLL circuits cannot scale in size to hundreds of components, which are in the research agenda. We present an alternative that will permit such scaling. It is possible to achieve results close to an optimum DVFS by hopping between few voltage levels and by an innovative application of clock-gating that we term as clock scheduling. We obtain an effective clock frequency by periodically killing some clock cycles of a master clock. We can apply voltage scaling for some of the periodic clock schedules which yield effective clock 1/2, 1/3, . . . By dithering between few voltages we obtain results close to an ideal DVFS system in simple pipelined circuits and in a complex example, a NoC’s switch. Again in the context of a NoC, we show how clock scheduling and voltage scaling can be automatically determined by means of a proportional-integral loop controller that keeps track of the network load. We describe in detail its implementation and all the circuit-level issues that we found. For a single switch, result shows an advantage of up to 2X over simple frequency scaling without voltage scaling. By providing each NoC’s switch with our simple DVFS controller, power saving at network level can be significantly more than what a a global DVFS controller can get. In a realistic scenario represented by network traces generated by video applications (MPEG, PIP, MWD, VoPD), we obtain an average power saving of 33%. To reduce static power, the Power-Gating (PG) technique is used and consists in switching- off power supply of unused blocks via pMOS headers or nMOS footers in series with such blocks. Even though research has been done in this field, the application of PG to NoCs has not been fully investigated. We show that it is possible to apply PG to the input buffers of a NoC switch. Their leakage power contributes about 40-50% of total NoC power, hence reducing such contribution is worthwhile. We partitioned buffers in banks and apply PG only to inactive banks. With our technique, it is possible to save about 40% in leakage power, without impact on performance

    Design and Validation of Network-on-Chip Architectures for the Next Generation of Multi-synchronous, Reliable, and Reconfigurable Embedded Systems

    Get PDF
    NETWORK-ON-CHIP (NoC) design is today at a crossroad. On one hand, the design principles to efficiently implement interconnection networks in the resource-constrained on-chip setting have stabilized. On the other hand, the requirements on embedded system design are far from stabilizing. Embedded systems are composed by assembling together heterogeneous components featuring differentiated operating speeds and ad-hoc counter measures must be adopted to bridge frequency domains. Moreover, an unmistakable trend toward enhanced reconfigurability is clearly underway due to the increasing complexity of applications. At the same time, the technology effect is manyfold since it provides unprecedented levels of system integration but it also brings new severe constraints to the forefront: power budget restrictions, overheating concerns, circuit delay and power variability, permanent fault, increased probability of transient faults. Supporting different degrees of reconfigurability and flexibility in the parallel hardware platform cannot be however achieved with the incremental evolution of current design techniques, but requires a disruptive approach and a major increase in complexity. In addition, new reliability challenges cannot be solved by using traditional fault tolerance techniques alone but the reliability approach must be also part of the overall reconfiguration methodology. In this thesis we take on the challenge of engineering a NoC architectures for the next generation systems and we provide design methods able to overcome the conventional way of implementing multi-synchronous, reliable and reconfigurable NoC. Our analysis is not only limited to research novel approaches to the specific challenges of the NoC architecture but we also co-design the solutions in a single integrated framework. Interdependencies between different NoC features are detected ahead of time and we finally avoid the engineering of highly optimized solutions to specific problems that however coexist inefficiently together in the final NoC architecture. To conclude, a silicon implementation by means of a testchip tape-out and a prototype on a FPGA board validate the feasibility and effectivenes

    The MANGO clockless network-on-chip: Concepts and implementation

    Get PDF
    corecore