
    Low Cost Interconnected Architecture for the Hardware Spiking Neural Networks

    This paper proposes a novel low-cost interconnected architecture (LCIA), an efficient solution for neuron interconnection in hardware spiking neural networks (SNNs). It is based on an all-to-all connection that takes each paired input and output node of a multi-layer SNN as the source and destination of a connection. The aim is to maintain efficient routing performance at low hardware overhead. A network-on-chip (NoC) router is proposed as the fundamental component of the LCIA, in which an effective scheduler is designed to address the traffic challenge posed by irregular spikes. The router can find requests rapidly, make arbitration decisions promptly, and provide equal service to different network traffic requests. Experimental results show that the LCIA can manage the intercommunication of multi-layer neural networks efficiently and has a low hardware overhead, which maintains the scalability of hardware SNNs.

    Fault-tolerant networks-on-chip routing with coarse and fine-grained look-ahead

    Fault tolerance and adaptive capabilities are challenges for modern networks-on-chip (NoC) due to the increase in physical defects in advanced manufacturing processes. Two novel adaptive routing algorithms, namely coarse-grained and fine-grained (FG) look-ahead algorithms, are proposed in this paper to enhance the fault-tolerant capabilities of 2-D mesh/torus NoC systems. These strategies use fault flag codes from neighboring nodes to obtain the status of real-time traffic in an NoC region, then calculate the path weights and choose the route on which to forward packets. This approach enables the router to minimize congestion on the adjacent connected channels and also to bypass a path with faulty channels by looking ahead at distant neighboring router paths. The novelty of the proposed routing algorithms lies in the weighted path selection strategies, which make near-optimal routing decisions to maintain NoC system performance under high fault rates. Results show that the proposed routing algorithms improve performance compared to other state-of-the-art works under various traffic loads and high fault rates. The routing algorithm with FG look-ahead capability achieves a higher throughput than the coarse-grained approach under complex fault patterns. The hardware area and power overheads of both routing approaches are relatively low, which does not prohibit scalability for large-scale NoC implementations.

    Adaptive Network on Chip Routing using the Turn Model

    To create a viable network on chip, many technical challenges need to be solved. One aspect of the solution is the routing algorithm: how to route packets from one component (e.g., a CPU core) to another without deadlock or livelock while avoiding congested or faulty routers. Routing algorithms must deal with these problems while remaining simple enough to keep the hardware cost low. We have created a simple-to-implement, deadlock-free, and livelock-free routing algorithm that addresses these challenges. This routing algorithm, Weighted Non-Minimal OddEven (WeNMOE), gathers information on the state of the network (congestion and faults) from surrounding routers. The algorithm then uses this information to estimate a routing cost and routes down the path with the lowest estimated cost. A simulator was developed and used to study the performance of the new routing algorithm and to compare it against other state-of-the-art routing algorithms. This simulator emulates bit reverse, complement, transpose, hotspot, and uniform random traffic patterns and measures the average latency of delivered packets. The simulation results showed that WeNMOE outperformed most routing algorithms. The only exception was the XY routing algorithm on uniform random and complement traffic. In these traffic patterns, the traffic load is uniformly distributed, limiting the opportunity for WeNMOE to select an improved route.
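The synthetic traffic patterns named in this abstract are usually defined as permutations of the source address. The sketch below uses the common textbook definitions, which are an assumption here because the abstract does not spell them out; the function names are illustrative and are not taken from the simulator.

```python
def bit_reverse(src: int, bits: int) -> int:
    """Destination is the source address with its bit order reversed."""
    return int(format(src, f"0{bits}b")[::-1], 2)


def bit_complement(src: int, bits: int) -> int:
    """Destination is the bitwise complement of the source address."""
    return ~src & ((1 << bits) - 1)


def transpose(src: int, bits: int) -> int:
    """Destination swaps the upper and lower halves of the address.

    Assumes an even number of address bits, so that on a square mesh
    node (x, y) sends to node (y, x).
    """
    half = bits // 2
    low = src & ((1 << half) - 1)
    high = src >> half
    return (low << half) | high


if __name__ == "__main__":
    # 16-node network, 4 address bits per node.
    for node in (0b0001, 0b0101, 0b0111):
        print(node, bit_reverse(node, 4), bit_complement(node, 4), transpose(node, 4))
```

Hotspot and uniform random traffic are not permutations and would instead draw destinations from a probability distribution.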

    Fault tolerant routing algorithm for fully- and partially-defective NoC switches

    Network-on-chip (NoC) has recently become a broad topic of research and development and is set to displace bus and crossbar approaches for system-on-chip (SoC) interconnection. NoCs meet the need of complex SoCs for an efficient communication infrastructure. To meet communication requirements even in the presence of faults, fault-tolerant routing algorithms have become one of the dominant issues for NoC systems. There has been significant work on fault-tolerant routing algorithms for NoCs, most of which supports only fully defective switches. This thesis introduces a new deadlock-free and livelock-free fault-tolerant routing algorithm that tolerates both fully and partially defective NoC switches. The proposed algorithm is an enhancement of the existing region-based approach for NoCs. The novelty of our approach is that link failures are modeled as semi-faulty switches; as a result, the faulty region is smaller and fewer healthy switches are deactivated. The algorithm does not need any virtual channels, and it does not require a routing table in every switch. A performance comparison shows the advantages of the proposed algorithm over state-of-the-art fault-tolerant routing algorithms. Because our algorithm deactivates fewer switches, it always has higher throughput and lower latency.

    Efficient Multicast Algorithms for Mesh and Torus Networks

    With the increasing popularity of multicomputers, efficient communication among their processors has become a popular area of research. A multicomputer is a computer system with multiple processors; it has high computational power and can perform multiple tasks concurrently. Mesh and torus are among the network topologies commonly used in building multicomputer systems. Their performance depends heavily on the underlying network communication, such as multicast. Multicast is a communication method in which a message is sent from a source node to a certain number of destinations. Two major parameters are used to evaluate multicast: time, which is how long the multicast process takes to deliver the message to all destinations, and traffic, which is the number of links used in the process. Research indicates that, in general, it is NP-complete to find an optimal multicast algorithm that is efficient in both time and traffic. This thesis suggests two new algorithms to achieve multicast in mesh and torus networks. Extensive simulations of these algorithms show that in practice they perform better than existing ones.
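The two evaluation parameters, time and traffic, can be made concrete with a small baseline. The sketch below is not one of the thesis's algorithms: it assumes a 2D mesh, routes a separate dimension-order (XY) path from the source to every destination, counts traffic as the number of distinct directed links used, and approximates time as the longest hop count, assuming one hop per step and no contention. All names are hypothetical.

```python
def xy_path(src, dst):
    """Dimension-order route on a 2D mesh: move along x first, then along y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def multicast_metrics(src, dests):
    """Return (time, traffic) for the union of XY paths to all destinations.

    time    = hop count to the farthest destination (assumed one hop per step)
    traffic = number of distinct directed links used by the whole multicast
    """
    links = set()
    time = 0
    for dst in dests:
        path = xy_path(src, dst)
        time = max(time, len(path) - 1)
        links.update(zip(path, path[1:]))  # shared links are counted once
    return time, len(links)


if __name__ == "__main__":
    print(multicast_metrics((0, 0), [(2, 0), (2, 2), (0, 1)]))
```

A baseline like this makes the trade-off visible: sharing links between destinations lowers traffic, but forcing more sharing can lengthen some paths and so raise time.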

    Multi-Objective Routing for Distributed Controllers

    A long-term goal of future naval shipboard power systems is the ability to manage energy flow with sufficient flexibility to accommodate future platform requirements such as better survivability, continuity, and support of pulsed and other demanding loads. To facilitate scalable, low-latency global distributed system control, each control module can include an integrated network interface connected through multiple channels to a direct, multi-hop network topology. In this work, we focus on a 2D torus, in which control nodes are arranged in a regular 2D grid, with each node connected through point-to-point connections to its four immediate neighbors. An important advantage of 2D tori is their redundant topology, in which there is more than one minimal path between any source and destination as long as they do not share the same row or column in the grid. For the static, all-to-one traffic pattern used by a central controller, the number of minimal routing tables grows as O(N!N2). This dissertation presents a novel approach to generating routing tables that achieve two performance objectives: (1) minimal control period latency, the lower bound of which is the round-trip latency of the messages exchanged between the controller and the node having the longest route, and (2) minimal latency jitter. Our approach relies on creating a large system of integer linear algebra equations describing (i) the functionality of a network and (ii) the constraints needed for perfect load balance and low jitter. We use the Gurobi ILA solver to find a satisfying assignment of all Boolean variables representing where packets are scheduled to be in a certain timeframe. Experimental results show that our software pipeline generates routing tables that (i) are guaranteed to have perfect load balance regardless of the shape and size of the network and (ii) have lower jitter than any of the randomly generated routing tables that we simulated.
    Our software also has an option to generate routing tables that allow packets to follow non-minimum-hop-count paths and to be held in the source nodes for some time instead of immediately rushing to the master node. This helps packets avoid congested areas and, as the results show, achieves up to a 2x improvement in jitter.
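The path redundancy of a 2D torus described in this abstract can be checked with a short calculation. This is a minimal sketch, assuming a width x height torus with at least three nodes per dimension and (x, y) node coordinates; the function names are hypothetical and are not part of the dissertation's software pipeline.

```python
from math import comb


def torus_neighbors(x, y, width, height):
    """Four immediate neighbors of (x, y), with wrap-around links."""
    return [((x + 1) % width, y), ((x - 1) % width, y),
            (x, (y + 1) % height), (x, (y - 1) % height)]


def ring_offset(a, b, size):
    """Minimal hops from a to b on a ring, and how many directions achieve it."""
    forward = (b - a) % size
    backward = size - forward
    hops = min(forward, backward)
    # Both directions tie only when b is exactly half-way around the ring.
    ways = 2 if forward == backward else 1
    return hops, ways


def minimal_paths(src, dst, width, height):
    """Return (hop count, number of distinct minimal paths) between two nodes."""
    dx, wx = ring_offset(src[0], dst[0], width)
    dy, wy = ring_offset(src[1], dst[1], height)
    # Choose where the dx horizontal hops go among the dx + dy total hops,
    # times the number of tied wrap-around directions per dimension.
    return dx + dy, comb(dx + dy, dx) * wx * wy


if __name__ == "__main__":
    print(minimal_paths((0, 0), (1, 1), 5, 5))  # different row and column
    print(minimal_paths((0, 0), (0, 3), 5, 5))  # shared column
```

On a 5x5 torus the first pair has two minimal paths, while the second pair, which shares a column, has only one, matching the condition stated in the abstract.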

    On Fault Tolerance Methods for Networks-on-Chip

    Technology scaling has proceeded into dimensions in which the reliability of manufactured devices is becoming endangered. The decrease in reliability is a consequence of physical limitations, a relative increase in variations, and decreasing noise margins, among others. A promising solution for bringing the reliability of circuits back to a desired level is the use of design methods that introduce tolerance against possible faults in an integrated circuit. This thesis studies and presents fault tolerance methods for network-on-chip (NoC), a design paradigm targeted at very large systems-on-chip. In a NoC, resources such as processors and memories are connected to a communication network, comparable to the Internet. Fault tolerance in such a system can be achieved at many abstraction levels. The thesis studies the origin of faults in modern technologies and explains their classification into transient, intermittent, and permanent faults. A survey of fault tolerance methods is presented to demonstrate the diversity of available methods. Networks-on-chip are approached by exploring their main design choices: the selection of a topology, routing protocol, and flow control method. Fault tolerance methods for NoCs are studied at different layers of the OSI reference model. The data link layer provides a reliable communication link over a physical channel. Error control coding is an efficient fault tolerance method at this abstraction level, especially against transient faults. Error control coding methods suitable for on-chip communication are studied and their implementations presented. Error control coding loses its effectiveness in the presence of intermittent and permanent faults, so other solutions against them are presented. The introduction of spare wires and split transmissions is shown to provide good tolerance against intermittent and permanent errors, and their combination with error control coding is illustrated.
    At the network layer, positioned above the data link layer, fault tolerance can be achieved through the design of fault-tolerant network topologies and routing algorithms. Both of these approaches are presented in the thesis, together with realizations in both categories. The thesis concludes that an optimal fault tolerance solution contains carefully co-designed elements from different abstraction levels.
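As a concrete instance of error control coding at the data link layer, the sketch below uses a Hamming(7,4) code, which corrects any single flipped bit in a 7-bit codeword, as a transient fault on one wire would cause. The choice of code is an assumption made for illustration; the abstract does not say which codes the thesis studies.

```python
def hamming74_encode(data):
    """Encode 4 data bits as the 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit; return (data bits, error position or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # 1-based index of the flipped bit
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], position


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    word[4] ^= 1  # simulate a transient single-bit fault on the link
    print(hamming74_decode(word))
```

A permanent fault on a wire corrupts the same bit position in every transfer and uses up the code's correction capacity, which is consistent with the abstract's point that coding alone loses effectiveness against intermittent and permanent faults.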

    Run-time management of many-core SoCs: A communication-centric approach

    Single-core performance hit the power and complexity limits at the beginning of this century, moving the industry towards the design of multi- and many-core systems-on-chip (SoCs). The on-chip communication between the cores plays a critical role in the performance of these SoCs, with power dissipation, communication latency, scalability to many cores, and reliability against transistor failures as the main design challenges. Accordingly, we dedicate this thesis to the communication-centered management of many-core SoCs, with the goal of advancing the state of the art in addressing these challenges. To this end, we contribute to the on-chip communication of many-core SoCs in three main directions. First, we start with a synthesizable SoC with full system simulation. We demonstrate the importance of the networking overhead in a practical system and propose a sophisticated network interface (NI) that offloads the work from SW to HW. Our results show around 5x and up to 50x higher network performance compared to previous works. As the second direction of this thesis, we study the significance of run-time application mapping. We demonstrate that contiguous application mapping not only improves the network latency (by 23%) and power dissipation (by 50%), but also improves the system throughput (by 3%) and the quality of service (QoS) of soft real-time applications (up to 100x fewer deadline misses). In addition, our hierarchical run-time application mapping provides 99.41% successful mapping when up to 8 links are broken. As the final direction of the thesis, we propose a fault-tolerant routing algorithm, maze-routing. It is the first-in-class algorithm that provides guaranteed delivery, a fully distributed solution, low area overhead (by 16x), and instantaneous reconfiguration (vs. 40K cycles of downtime in previous works), all at the same time.
    Besides the individual goals of each contribution, we ensure where applicable that our solutions scale to extreme network sizes such as 12x12 and 16x16. This thesis concludes that the communication overhead and its optimization play a significant role in the performance of many-core SoCs.

    A reconfigurable routing algorithm for a fault-tolerant 2D-Mesh Network-on-Chip

    In this paper we present a reconfigurable routing algorithm for a 2D-Mesh Network-on-Chip (NoC) dedicated to fault-tolerant Massively Parallel Multi-Processor Systems on Chip (MP2-SoC). The routing algorithm can be dynamically reconfigured to adapt to the modification of the micro-network topology caused by a faulty router. This algorithm has been implemented in a reconfigurable version of the DSPIN micro-network and evaluated in terms of performance (penalty on the network saturation threshold) and cost (extra silicon area occupied by the reconfigurable version of the router).