86 research outputs found

    Analyzing Traffic and Multicast Switch Issues in an ATM Network.

    This dissertation addresses two problems related to ATM networks. First, we consider packetized voice and video sources as the incoming traffic to an ATM multiplexer and propose modeling methods for both individual and aggregated traffic sources. These methods are then used to analyze performance parameters such as buffer occupancy, cell loss probability, and cell delay. The results obtained for different buffer sizes and numbers of voice and video sources are analyzed and compared with those generated by existing techniques. Second, we study the priority handling feature for time-critical services in an ATM multicast switch. For this, we propose a non-blocking copy network and priority handling algorithms. We then analyze the copy network using an analytical method and simulation. The analysis considers both priority and non-priority cells for two different output reservation schemes. The performance parameters, based on cell delay, delay jitter, and cell loss probability, are studied for different buffer sizes and fan-outs under various input traffic loads. Our results show that the proposed copy network provides better performance for priority cells, while the performance for non-priority cells is only slightly inferior compared with the scenario in which the network does not consider priority handling. We also study the fault-tolerant behavior of the copy network, especially for the broadcast banyan network subsection, and present a routing scheme that preserves the non-blocking property under a specific pattern of connection assignments. The fault-tolerant characteristic can be quantified using the full access probability. Since computing the full access probability of a general network is known to be NP-hard, we provide a new bounding technique based on minimal cuts to compute the full access probability of the copy network. Our study of fault-tolerant multi-stage interconnection networks having either an extra stage or chaining shows that the proposed technique provides tighter bounds than those given by existing approaches. We also apply our bounding method to compute the full access probability of the fault-tolerant copy network.
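
    As a rough illustration of the kind of performance analysis described above (not the dissertation's actual models), the sketch below simulates an ATM multiplexer fed by two-state on/off sources and estimates the cell loss probability and mean buffer occupancy; the function name and all numeric parameters are hypothetical placeholders.

```python
import random

def simulate_atm_mux(n_sources=50, p_off_to_on=0.1, p_on_to_off=0.2,
                     buffer_size=50, cells_per_slot=20,
                     slots=200_000, seed=1):
    """Slotted-time simulation of an ATM multiplexer fed by on/off sources.

    An ON source emits one cell per slot; the multiplexer serves up to
    cells_per_slot cells per slot from a buffer holding buffer_size cells.
    All numeric parameters are illustrative placeholders.
    Returns (cell_loss_probability, mean_buffer_occupancy).
    """
    rng = random.Random(seed)
    on = [rng.random() < 0.5 for _ in range(n_sources)]
    queue = arrived = lost = 0
    occupancy_sum = 0

    for _ in range(slots):
        # Two-state (on/off) Markov transition for each voice/video source.
        on = [rng.random() >= p_on_to_off if s else rng.random() < p_off_to_on
              for s in on]
        cells = sum(on)                                  # cells arriving this slot
        arrived += cells
        lost += max(0, cells - (buffer_size - queue))    # overflow cells are dropped
        queue = min(buffer_size, queue + cells)
        queue = max(0, queue - cells_per_slot)           # serve the output link
        occupancy_sum += queue

    return lost / arrived, occupancy_sum / slots

if __name__ == "__main__":
    p_loss, mean_q = simulate_atm_mux()
    print(f"cell loss probability ~ {p_loss:.2e}, mean buffer occupancy ~ {mean_q:.1f}")
```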

    A combinatorial method for the evaluation of yield of fault-tolerant systems-on-chip

    In this paper we develop a combinatorial method for the evaluation of yield of fault-tolerant systems-on-chip. The method assumes that defects are produced according to a model in which defects are lethal and affect given components of the system following a distribution common to all defects. The distribution of the number of defects is arbitrary. The method is based on the formulation of the yield as 1 minus the probability that a given boolean function with multiple-valued variables has value 1. That probability is computed by analyzing a ROMDD (reduced ordered multiple-valued decision diagram) representation of the function. For efficiency reasons, we first build a coded ROBDD (reduced ordered binary decision diagram) representation of the function and then transform that coded ROBDD into the ROMDD required by the method. We present numerical experiments showing that the method is able to cope with quite large systems in moderate CPU times.
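
    The following sketch illustrates the underlying yield formulation on a toy chip (four cores of which one may fail, plus a non-redundant interconnect), using brute-force inclusion-exclusion over defect placements rather than the paper's ROBDD/ROMDD construction; the component list, hit probabilities, and Poisson defect-count distribution are assumptions made only for this example.

```python
from itertools import combinations
from math import exp, factorial

# Toy fault-tolerant chip: four identical cores (one may be lost thanks to a
# spare) plus a non-redundant interconnect.  Components, spare policy, and the
# Poisson defect-count distribution are illustrative assumptions only.
COMPONENTS = ["core0", "core1", "core2", "core3", "noc"]
HIT_PROB = {"core0": 0.22, "core1": 0.22, "core2": 0.22, "core3": 0.22, "noc": 0.12}

def system_ok(killed):
    """Structure function: the chip works if the interconnect survives and
    at most one core is killed."""
    if "noc" in killed:
        return False
    return sum(c in killed for c in COMPONENTS[:4]) <= 1

def poisson_pmf(k, lam):
    return exp(-lam) * lam ** k / factorial(k)

def yield_estimate(lam=0.5, max_defects=30):
    """Yield = P(structure function stays 1) under the 'lethal defect' model:
    each defect independently lands on one component with HIT_PROB."""
    total = 0.0
    names = list(COMPONENTS)
    for k in range(max_defects + 1):
        pk = poisson_pmf(k, lam)
        ok_given_k = 0.0
        # Sum, over every surviving hit-set S, the probability that the set of
        # hit components is exactly S, computed by inclusion-exclusion.
        for r in range(len(names) + 1):
            for S in combinations(names, r):
                if not system_ok(set(S)):
                    continue
                p_exact = 0.0
                for t in range(len(S) + 1):
                    for T in combinations(S, t):
                        mass = sum(HIT_PROB[c] for c in T)
                        p_exact += (-1) ** (len(S) - len(T)) * mass ** k
                ok_given_k += p_exact
        total += pk * ok_given_k
    return total

if __name__ == "__main__":
    print(f"estimated yield ~ {yield_estimate():.4f}")
```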

    Combinatorial methods for the evaluation of yield and operational reliability of fault-tolerant systems-on-chip

    In this paper we develop combinatorial methods for the evaluation of yield and operational reliability of fault-tolerant systems-on-chip. The method for yield computation assumes that defects are produced according to a model in which defects are lethal and affect given components of the system following a distribution common to all defects; the method for the computation of operational reliability also assumes that the fault-tree function of the system is increasing. The distribution of the number of defects is arbitrary. The methods are based on the formulation of, respectively, the yield and the operational reliability as the probability that a given boolean function with multiple-valued variables has value 1. That probability is computed by analyzing a ROMDD (reduced ordered multiple-valued decision diagram) representation of the function. For efficiency reasons, a coded ROBDD (reduced ordered binary decision diagram) representation of the function is built first and then transformed into the ROMDD required by the methods. We present numerical experiments showing that the methods are able to cope with quite large systems in moderate CPU times.
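
    As a simple stand-in for the operational-reliability side (which assumes an increasing fault-tree function), the sketch below evaluates a k-out-of-n structure function by dynamic programming; the survival probabilities are illustrative placeholders and this is not the ROMDD-based method used in the paper.

```python
def k_out_of_n_reliability(survival_probs, k):
    """Operational reliability of a k-out-of-n system (an increasing structure
    function): probability that at least k of the n components still work.
    The per-component survival probabilities passed in are placeholders."""
    # dp[j] = probability that exactly j of the components processed so far work.
    dp = [1.0]
    for p in survival_probs:
        new = [0.0] * (len(dp) + 1)
        for j, prob in enumerate(dp):
            new[j] += prob * (1 - p)      # this component has failed
            new[j + 1] += prob * p        # this component still works
        dp = new
    return sum(dp[k:])

if __name__ == "__main__":
    # e.g. six processing elements, at least four must survive the mission
    print(k_out_of_n_reliability([0.95, 0.95, 0.92, 0.9, 0.9, 0.88], k=4))
```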

    Design and evaluation of the Hamal parallel computer

    Thesis (Ph.D.) by J.B. Grossman, Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, February 2003. Includes bibliographical references (p. 145-152). This electronic version was submitted by the student author; the certified thesis is available in the Institute Archives and Special Collections. Parallel shared-memory machines with hundreds or thousands of processor-memory nodes have been built; in the future we will see machines with millions or even billions of nodes. Associated with such large systems is a new set of design challenges. Many problems must be addressed by an architecture in order for it to be successful; of these, we focus on three in particular. First, a scalable memory system is required. Second, the network messaging protocol must be fault-tolerant. Third, the overheads of thread creation, thread management and synchronization must be extremely low. This thesis presents the complete system design for Hamal, a shared-memory architecture which addresses these concerns and is directly scalable to one million nodes. Virtual memory and distributed objects are implemented in a manner that requires neither inter-node synchronization nor the storage of globally coherent translations at each node. We develop a lightweight fault-tolerant messaging protocol that guarantees message delivery and idempotence across a discarding network. A number of hardware mechanisms provide efficient support for massive multithreading and fine-grained synchronization. Experiments are conducted in simulation, using a trace-driven network simulator to investigate the messaging protocol and a cycle-accurate simulator to evaluate the Hamal architecture. We determine implementation parameters for the messaging protocol which optimize performance. A discarding network is easier to design and can be clocked at a higher rate, and we find that with this protocol its performance can approach that of a non-discarding network. Our simulations of Hamal demonstrate the effectiveness of its thread management and synchronization primitives. In particular, we find register-based synchronization to be an extremely efficient mechanism which can be used to implement a software barrier with a latency of only 523 cycles on a 512 node machine.
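
    The sketch below shows one generic way to obtain guaranteed delivery and idempotence over a discarding network, using sequence numbers, retransmission of unacknowledged messages, and duplicate suppression at the receiver; it is an illustrative protocol skeleton, not Hamal's actual messaging protocol, and all class and parameter names are invented for the example.

```python
import random

class Sender:
    """Sends numbered messages and retransmits them until acknowledged."""
    def __init__(self):
        self.next_seq = 0
        self.unacked = {}                  # seq -> payload, kept until acked

    def send(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = payload
        return (seq, payload)

    def on_ack(self, seq):
        self.unacked.pop(seq, None)

    def retransmissions(self):
        return list(self.unacked.items())  # resend everything still unacked

class Receiver:
    """Delivers each message exactly once; duplicates are re-acked but dropped."""
    def __init__(self):
        self.delivered = set()
        self.log = []

    def on_message(self, seq, payload):
        if seq not in self.delivered:      # idempotence: apply at most once
            self.delivered.add(seq)
            self.log.append(payload)
        return seq                         # always acknowledge

def run(loss=0.4, rounds=20, seed=7):
    """Drive sender and receiver over a channel that discards packets and acks."""
    rng = random.Random(seed)
    s, r = Sender(), Receiver()
    for i in range(5):
        s.send(f"msg-{i}")
    for _ in range(rounds):
        for seq, payload in s.retransmissions():
            if rng.random() < loss:        # the network discards the packet
                continue
            ack = r.on_message(seq, payload)
            if rng.random() >= loss:       # the ack may be discarded too
                s.on_ack(ack)
    return r.log, s.unacked

if __name__ == "__main__":
    delivered, pending = run()
    print("delivered:", delivered, "still unacked:", pending)
```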

    Routing on the Channel Dependency Graph: A New Approach to Deadlock-Free, Destination-Based, High-Performance Routing for Lossless Interconnection Networks

    In the pursuit of ever-increasing compute power, and with Moore's law slowly coming to an end, high-performance computing has started to scale out to larger systems. Alongside the increasing system size, the interconnection network is growing to accommodate and connect tens of thousands of compute nodes. These networks have a large influence on the total cost, application performance, energy consumption, and overall system efficiency of the supercomputer. Unfortunately, state-of-the-art routing algorithms, which define the packet paths through the network, do not utilize this important resource efficiently. Topology-aware routing algorithms become increasingly inapplicable due to irregular topologies, which either are irregular by design or, more often, are the result of hardware failures. Exchanging faulty network components potentially requires whole-system downtime, further increasing the cost of the failure. This management approach becomes more and more impractical due to the scale of today's networks and the accompanying steady decrease of the mean time between failures. Alternative methods of operating and maintaining these high-performance interconnects, in terms of both hardware and software management, are necessary to mitigate the negative effects experienced by scientific applications executed on the supercomputer. However, existing topology-agnostic routing algorithms either suffer from poor load balancing or are not bounded in the number of virtual channels needed to resolve deadlocks in the routing tables. The fail-in-place strategy, a well-established method for storage systems in which only critical component failures are repaired, is a feasible solution for current and future HPC interconnects as well as other large-scale installations such as data center networks. However, an appropriate combination of topology and routing algorithm is required to minimize the throughput degradation for the entire system. This thesis contributes a network simulation toolchain to facilitate the process of finding a suitable combination, either during system design or while the system is in operation. On top of this foundation, a key contribution is a novel scheduling-aware routing, which reduces fault-induced throughput degradation while improving overall network utilization. The scheduling-aware routing performs frequent property-preserving routing updates to optimize the path balancing for simultaneously running batch jobs. The increased deployment of lossless interconnection networks, in conjunction with fail-in-place modes of operation and topology-agnostic, scheduling-aware routing algorithms, necessitates new solutions to the routing-deadlock problem. Therefore, this thesis further advances the state of the art by introducing a novel concept of routing on the channel dependency graph, which allows the design of a universally applicable destination-based routing capable of optimizing the path balancing without exceeding a given number of virtual channels, which are a common hardware limitation. This disruptive innovation enables implicit deadlock avoidance during path calculation, instead of solving both problems separately as all previous solutions do.
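
    To make the central notion concrete, the sketch below builds a channel dependency graph from a set of routes and applies the classical acyclicity check for deadlock freedom (Dally and Seitz); it is a toy illustration of the concept only, not the thesis's algorithm, which routes on the dependency graph itself while respecting a virtual-channel budget.

```python
def channel_dependency_graph(routes):
    """Build the channel dependency graph: an edge c1 -> c2 exists whenever
    some route occupies channel c1 and then immediately requests channel c2."""
    deps = {}
    for route in routes:                      # a route is an ordered channel list
        for c1, c2 in zip(route, route[1:]):
            deps.setdefault(c1, set()).add(c2)
            deps.setdefault(c2, set())
    return deps

def is_deadlock_free(deps):
    """Dally/Seitz condition: the routing is deadlock-free if the channel
    dependency graph is acyclic.  Plain DFS cycle detection."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {c: WHITE for c in deps}

    def dfs(c):
        colour[c] = GREY
        for nxt in deps[c]:
            if colour[nxt] == GREY:           # back edge -> cycle -> deadlock
                return False
            if colour[nxt] == WHITE and not dfs(nxt):
                return False
        colour[c] = BLACK
        return True

    return all(dfs(c) for c in deps if colour[c] == WHITE)

if __name__ == "__main__":
    # Toy unidirectional 4-node ring; channels are (src, dst) router pairs.
    # Routes that wrap around the ring create the classic cyclic dependency.
    ring_routes = [[(0, 1), (1, 2)], [(1, 2), (2, 3)],
                   [(2, 3), (3, 0)], [(3, 0), (0, 1)]]
    cdg = channel_dependency_graph(ring_routes)
    print("deadlock-free?", is_deadlock_free(cdg))
```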

    An efficient cutset approach for evaluating communication-network reliability with heterogeneous link-capacities

    Driving the Network-on-Chip Revolution to Remove the Interconnect Bottleneck in Nanoscale Multi-Processor Systems-on-Chip

    The sustained demand for faster, more powerful chips has been met by the availability of chip manufacturing processes allowing for the integration of increasing numbers of computation units onto a single die. The resulting outcome, especially in the embedded domain, has often been called System-on-Chip (SoC) or Multi-Processor System-on-Chip (MPSoC). MPSoC design brings to the foreground a large number of challenges, one of the most prominent of which is the design of the chip interconnection. With the number of on-chip blocks presently ranging in the tens, and quickly approaching the hundreds, the novel issue of how best to provide on-chip communication resources is clearly felt. Networks-on-Chip (NoCs) are the most comprehensive and scalable answer to this design concern. By bringing large-scale networking concepts to the on-chip domain, they guarantee a structured answer to present and future communication requirements. The point-to-point connection and packet switching paradigms they involve are also of great help in minimizing wiring overhead and physical routing issues. However, as with any technology of recent inception, NoC design is still an evolving discipline. Several main areas of interest require deep investigation for NoCs to become viable solutions:
    • The design of the NoC architecture needs to strike the best tradeoff among performance, features and the tight area and power constraints of the on-chip domain.
    • Simulation and verification infrastructure must be put in place to explore, validate and optimize the NoC performance.
    • NoCs offer a huge design space, thanks to their extreme customizability in terms of topology and architectural parameters. Design tools are needed to prune this space and pick the best solutions.
    • Even more so given their global, distributed nature, it is essential to evaluate the physical implementation of NoCs to assess their suitability for next-generation designs and their area and power costs.
    This dissertation performs a design space exploration of network-on-chip architectures, in order to point out the trade-offs associated with the design of each individual network building block and with the design of the network topology overall. The design space exploration is preceded by a comparative analysis of state-of-the-art interconnect fabrics with themselves and with early network-on-chip prototypes. The ultimate objective is to point out the key advantages that NoC realizations provide with respect to state-of-the-art communication infrastructures and the challenges that lie ahead in order to make this new interconnect technology come true. Among the latter, technology-related challenges are emerging that call for dedicated design techniques at all levels of the design hierarchy, in particular leakage power dissipation and the containment of process variations and their effects. The achievement of the above objectives was enabled by means of a NoC simulation environment for cycle-accurate modelling and simulation and by means of a back-end facility for the study of NoC physical implementation effects. Overall, all the results provided by this work have been validated on actual silicon layout.
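
    A minimal flavour of such topology-level design space exploration is sketched below: it compares an 8x8 mesh and an 8x8 torus by average hop count under a uniform-traffic assumption using plain BFS; the topologies, sizes, and metric are illustrative choices only, not the dissertation's cycle-accurate environment.

```python
from collections import deque
from itertools import product

def mesh(n, wrap=False):
    """Adjacency of an n x n mesh of NoC routers (torus if wrap=True)."""
    adj = {(x, y): set() for x, y in product(range(n), repeat=2)}
    for x, y in adj:
        for dx, dy in ((1, 0), (0, 1)):
            nx, ny = x + dx, y + dy
            if wrap:
                nx, ny = nx % n, ny % n
            if (nx, ny) in adj and (nx, ny) != (x, y):
                adj[(x, y)].add((nx, ny))
                adj[(nx, ny)].add((x, y))
    return adj

def avg_hops(adj):
    """Average shortest-path hop count (uniform traffic), computed by BFS."""
    total = pairs = 0
    for src in adj:
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        total += sum(d for node, d in dist.items() if node != src)
        pairs += len(adj) - 1
    return total / pairs

if __name__ == "__main__":
    for wrap, name in ((False, "8x8 mesh"), (True, "8x8 torus")):
        print(name, "average hops:", round(avg_hops(mesh(8, wrap)), 2))
```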

    Small-world interconnection networks for large parallel computer systems

    The use of small-world graphs as interconnection networks of multicomputers is proposed and analysed in this work. Small-world interconnection networks are constructed by adding (or modifying) edges to an underlying local graph. Graphs with a rich local structure but a large diameter are shown to be the most suitable candidates for the underlying graph. Generation models based on random and deterministic wiring processes are proposed and analysed. For the random case, basic properties such as degree, diameter, average length and bisection width are analysed, and the results show that a fast transition from a large diameter to a small diameter occurs as the number of new edges introduced is increased. Random traffic analysis on these networks is undertaken, and it is shown that although the average latency experiences a similar reduction, networks with a small number of shortcuts have a tendency to saturate as most of the traffic flows through a small number of links. An analysis of the congestion of the networks corroborates this result and provides a way of estimating the minimum number of shortcuts required to avoid saturation. To overcome these problems, deterministic wiring is proposed and analysed. A Linear Feedback Shift Register is used to introduce shortcuts in the LFSR graphs. A simple routing algorithm has been constructed for the LFSR graphs and extended with a greedy local optimisation technique. It has been shown that a small search depth gives good results and is less costly to implement than a full shortest-path algorithm. The Hilbert graph, on the other hand, provides some additional characteristics, such as support for incremental expansion, efficient layout in two-dimensional space (using two layers), and a small fixed degree of four. Small-world hypergraphs have also been studied. In particular, incomplete hypermeshes have been introduced and analysed, and it has been shown that they outperform the complete traditional implementations under a constant pinout argument. Since it has been shown that complete hypermeshes outperform the mesh, the torus, low-dimensional m-ary d-cubes (with and without bypass channels), and multi-stage interconnection networks (when realistic decision times are accounted for and with a constant pinout), it follows that incomplete hypermeshes outperform them as well.
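
    The diameter and latency transition described for random wiring can be reproduced in miniature with the sketch below, which adds random shortcut edges to a ring lattice and measures the average shortest-path distance; the graph size and shortcut counts are arbitrary illustrative values, not those studied in the thesis.

```python
import random
from collections import deque

def ring_lattice(n, k=2):
    """Underlying local graph: a ring where each node links to its k nearest
    neighbours on each side (rich local structure, large diameter)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for d in range(1, k + 1):
            adj[i].add((i + d) % n)
            adj[(i + d) % n].add(i)
    return adj

def add_shortcuts(adj, count, seed=3):
    """Small-world construction: add `count` random shortcut edges."""
    rng = random.Random(seed)
    nodes = list(adj)
    added = 0
    while added < count:
        a, b = rng.sample(nodes, 2)
        if b not in adj[a]:
            adj[a].add(b)
            adj[b].add(a)
            added += 1
    return adj

def avg_distance(adj):
    """Average shortest-path distance over all node pairs, by BFS."""
    total = pairs = 0
    for s in adj:
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        total += sum(dist.values())
        pairs += len(adj) - 1
    return total / pairs

if __name__ == "__main__":
    for shortcuts in (0, 8, 32, 128):
        g = add_shortcuts(ring_lattice(512), shortcuts)
        print(f"{shortcuts:4d} shortcuts -> average distance {avg_distance(g):.1f}")
```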

    How Physicality Enables Trust: A New Era of Trust-Centered Cyberphysical Systems

    Multi-agent cyberphysical systems enable new capabilities in efficiency, resilience, and security. The unique characteristics of these systems prompt a reevaluation of their security concepts, including their vulnerabilities and the mechanisms to mitigate these vulnerabilities. This survey paper examines how advancements in wireless networking, coupled with the sensing and computing in cyberphysical systems, can foster novel security capabilities. This study delves into three main themes related to securing multi-agent cyberphysical systems. First, we discuss the threats that are particularly relevant to multi-agent cyberphysical systems given the potential lack of trust between agents. Second, we present prospects for sensing, contextual awareness, and authentication, enabling the inference and measurement of "inter-agent trust" for these systems. Third, we elaborate on the application of quantifiable trust notions to enable "resilient coordination," where "resilient" signifies sustained functionality amid attacks on multi-agent cyberphysical systems. We refer to the capability of cyberphysical systems to self-organize and coordinate to achieve a task as autonomy. This survey unveils the cyberphysical character of future interconnected systems as a pivotal catalyst for realizing robust, trust-centered autonomy in tomorrow's world.
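
    As one toy instance of coordination hardened by a quantifiable trust notion (an illustrative construction, not a method taken from the survey), the sketch below discards the least-trusted neighbour report and averages the rest with trust weights, bounding the influence of a misbehaving agent.

```python
def trust_weighted_consensus(values, trust, own, alpha=0.5, drop=1):
    """One update step of a toy resilient-consensus rule: discard the `drop`
    least-trusted neighbour reports, then move toward a trust-weighted average
    of the remaining ones.  Names and parameters are illustrative only."""
    ranked = sorted(values, key=lambda agent: trust[agent])
    kept = ranked[drop:]                        # drop least-trusted reports
    weight = sum(trust[a] for a in kept)
    if weight == 0:
        return own                              # no trustworthy input: hold value
    avg = sum(trust[a] * values[a] for a in kept) / weight
    return (1 - alpha) * own + alpha * avg

if __name__ == "__main__":
    # Three honest neighbours near 10.0 and one adversary reporting 1000.0
    # with low measured trust; the update stays close to the honest values.
    reports = {"a": 9.8, "b": 10.1, "c": 10.3, "mal": 1000.0}
    trust = {"a": 0.9, "b": 0.8, "c": 0.85, "mal": 0.05}
    print(trust_weighted_consensus(reports, trust, own=10.0))
```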