1,103 research outputs found

    Design of TSV-sharing topologies for cost-effective 3D networks-on-chip

    Get PDF
    The Through-Silicon Via (TSV) technology has led to major breakthroughs in 3D stacking by providing higher speed and bandwidth, as well as lower power dissipation for the inter-layer communication. However, the current TSV fabrication suffers from a considerable area footprint and yield loss. Thus, it is necessary to restrict the number of TSVs in order to design cost-effective 3D on-chip networks. This critical issue can be addressed by clustering the network such that all of the routers within each cluster share a single TSV pillar for the vertical packet transmission. In some of the existing topologies, additional cluster routers are augmented into the mesh structure to handle the shared TSVs. However, they impose either performance degradation or power/area overhead to the system. Furthermore, the resulting architecture is no longer a mesh. In this paper, we redefine the clusters by replacing some routers in the mesh with the cluster routers, such that the mesh structure is preserved. The simulation results demonstrate a better equilibrium between performance and cost, using the proposed models

    Intelligent approaches to VLSI routing

    Get PDF
    Very Large Scale Integrated-circuit (VLSI) routing involves many large-size and complex problems and most of them have been shown to be NP-hard or NP-complete. As a result, conventional approaches, which have been successfully used to handle relatively small-size routing problems, are not suitable to be used in tackling large-size routing problems because they lead to \u27combinatorial explosion\u27 in search space. Hence, there is a need for exploring more efficient routing approaches to be incorporated into today\u27s VLSI routing system. This thesis strives to use intelligent approaches, including symbolic intelligence and computational intelligence, to solve three VLSI routing problems: Three-Dimensional (3-D) Shortest Path Connection, Switchbox Routing and Constrained Via Minimization. The 3-D shortest path connection is a fundamental problem in VLSI routing. It aims to connect two terminals of a net that are distributed in a 3-D routing space subject to technological constraints and performance requirements. Aiming at increasing computation speed and decreasing storage space requirements, we present a new A* algorithm for the 3-D shortest path connection problem in this thesis. This new A*algorithm uses an economical representation and adopts a novel back- trace technique. It is shown that this algorithm can guarantee to find a path if one exists and the path found is the shortest one. In addition, its computation speed is fast, especially when routed nets are spare. The computational complexities of this A* algorithm at the best case and the worst case are O(Ɩ) and 0(Ɩ3), respectively, where Ɩ is the shortest path length between the two terminals. Most importantly, this A\u27 algorithm is superior to other shortest path connection algorithms as it is economical in terms of storage space requirement, i.e., 1 bit/grid. The switchbox routing problem aims to connect terminals at regular intervals on the four sides of a rectangle routing region. From a computational point of view, the problem is NP-hard. Furthermore, it is extremely complicated and as the consequence no existing algorithm can guarantee to find a solution even if one exists no matter how high the complexity of the algorithm is. Previous approaches to the switch box routing problem can be divided into algorithmic approaches and knowledge-based approaches. The algorithmic approaches are efficient in computational time, but they are unsucessful at achieving high routing completion rate, especially for some dense and complicated switchbox routing problems. On the other hand, the knowledge-based approaches can achieve high routing completion rate, but they are not efficient in computation speed. In this thesis we present a hybrid approach to the switchbox routing problem. This hybrid approach is based on a new knowledge-based routing technique, namely synchronized routing, and combines some efficient algorithmic routing techniques. Experimental results show it can achieve the high routing completion rate of the knowledge-based approaches and the high efficiency of the algorithmic approaches. The constrained via minimization is an important optimization problem in VLSI routing. Its objective is to minimize the number of vias introduced in VLSI routing. From computational perspective, the constrained via minimization is NP-complete. Although for a special case where the number of wire segments splits at a via candidate is not more than three, elegant theoretical results have been obtained. For a general case in which there exist more than three wire segment splits at a via candidate few approaches have been proposed, and those approaches are only suitable for tackling some particular routing styles and are difficult or impossible to adjust to meet practical requirements. In this thesis we propose a new graph-theoretic model, namely switching graph model, for the constrained via minimization problem. The switching graph model can represent both grid-based and grid less routing problems, and allows arbitrary wire segments split at a via candidate. Then on the basis of the model, we present the first genetic algorithm for the constrained via minimization problem. This genetic algorithm can tackle various kinds of routing styles and be configured to meet practical constraints. Experimental results show that the genetic algorithm can find the optimal solutions for most cases in reasonable time

    PACKER: a switchbox router based on conflict elimination by local transformations

    Get PDF
    PACKER is an algorithm for switchbox routing, based on a novel approach. In an initial phase, the connectivity of each net is established without taking the other nets into account. In general, this gives rise to conflicts (short circuits). In the second stage, the conflicts are removed iteratively using connectivity-preserving local transformations. They reshape a net by displacing one of its segments without disconnecting it from the net. The transformations are applied in a asystematic way using a scan line technique. The results obtained by PACKER are very positive: it solves all well-known benchmark example

    An event-based architecture for solving constraint satisfaction problems

    Full text link
    Constraint satisfaction problems (CSPs) are typically solved using conventional von Neumann computing architectures. However, these architectures do not reflect the distributed nature of many of these problems and are thus ill-suited to solving them. In this paper we present a hybrid analog/digital hardware architecture specifically designed to solve such problems. We cast CSPs as networks of stereotyped multi-stable oscillatory elements that communicate using digital pulses, or events. The oscillatory elements are implemented using analog non-stochastic circuits. The non-repeating phase relations among the oscillatory elements drive the exploration of the solution space. We show that this hardware architecture can yield state-of-the-art performance on a number of CSPs under reasonable assumptions on the implementation. We present measurements from a prototype electronic chip to demonstrate that a physical implementation of the proposed architecture is robust to practical non-idealities and to validate the theory proposed.Comment: First two authors contributed equally to this wor

    On the design of a high-performance adaptive router for CC-NUMA multiprocessors

    Get PDF
    Copyright © 2003 IEEEThis work presents the design and evaluation of an adaptive packet router aimed at supporting CC-NUMA traffic. We exploit a simple and efficient packet injection mechanism to avoid deadlock, which leads to a fully adaptive routing by employing only three virtual channels. In addition, we selectively use output buffers for implementing the most utilized virtual paths in order to reduce head-of-line blocking. The careful implementation of these features has resulted in a good trade off between network performance and hardware cost. The outcome of this research is a High-Performance Adaptive Router (HPAR), which adequately balances the needs of parallel applications: minimal network latency at low loads and high throughput at heavy loads. The paper includes an evaluation process in which HPAR is compared with other adaptive routers using FIFO input buffering, with or without additional virtual channels to reduce head-of-line blocking. This evaluation contemplates both the VLSI costs of each router and their performance under synthetic and real application workloads. To make the comparison fair, all the routers use the same efficient deadlock avoidance mechanism. In all the experiments, HPAR exhibited the best response among all the routers tested. The throughput gains ranged from 10 percent to 40 percent in respect to its most direct rival, which employs more hardware resources. Other results shown that HPAR achieves up to 83 percent of its theoretical maximum throughput under random traffic and up to 70 percent when running real applications. Moreover, the observed packet latencies were comparable to those exhibited by simpler routers. Therefore, HPAR can be considered as a suitable candidate to implement packet interchange in next generations of CC-NUMA multiprocessors.Valentín Puente, José-Ángel Gregorio, Ramón Beivide, and Cruz Iz

    Design and implementation of NoC routers and their application to Prdt-based NoC\u27s

    Full text link
    With a communication-centric design style, Networks-on-Chips (NoCs) emerges as a new paradigm of Systems-on-Chips (SoCs) to overcome the limitations of bus-based communication infrastructure. An important problem in the design of NoCs is the router design, which has great impact on the cost and performance of a NoC system. This thesis is focused on the design and implementation of an optimized parameterized router which can be applied in mesh/torus-based and Perfect Recursive Diagonal Torus (PRDT)-based NoCs; In specific, the router design includes the design and implementation of two routing algorithms (vector routing and circular coded vector routing), the wormhole switching scheme, the scheduling scheme, buffering strategy, and flow control scheme. Correspondingly, the following components are designed and implemented: input controller, output controller, crossbar switch, and scheduler. Verilog HDL codes are generated and synthesized on ASIC platforms. Most components are designed in parameterized way. Performance evaluation of each component of the router in terms of timing, area, and power consumption is conducted. The efficiency of the two routing algorithms and tradeoff between computational time (tsetup) and area are analyzed; To reduce the area cost of the router design, the two major components, the crossbar switch and the scheduler, are optimized. Particularly, for crossbar switch, a comparative study of two crossbar designs is performed with the aid of Magic Layout editor, Synopsys CosmosSE and Awaves; Based on the router design, the PRDT network composed of 4x4 routers is designed and synthesized on ASIC platforms

    A survey of recent contributions of high performance NoC architectures

    Get PDF
    The Network-on-Chip (NoC) paradigm has been herald as the solution to the communication limitation that System-On-Chip (SoC) poses. However, power consumption is one of its major defects. To ensure that a high performance architecture is constructed, analyzing how power can be reduced in each area of the network is essential. Power dissipation can be reduced by adjustments to the routers, the architecture itself and the communication links. In this paper, a survey is conducted on recent contributions and techniques employed by researchers towards the reduction of power in the router architecture, network architecture and communication links
    corecore