468 research outputs found

    Communication algorithms for isotropic tasks in hypercubes and wraparound meshes

    Cover title. Includes bibliographical references (p. 29-30). Research supported by the NSF (NSF-ECS-8519058) and by the ARO (DAAL03-86-K-0171). By Emmanouel A. Varvarigos and Dimitri P. Bertsekas.

    Algorithms for security in robotics and networks

    The dissertation presents algorithms for robotics and security. The first chapter gives an overview of the area of visibility-based pursuit-evasion. The following two chapters introduce two specific algorithms in that area. The algorithms are based on research done together with Dr. Giora Slutzki and Dr. Steven LaValle. Chapter 2 presents a polynomial-time algorithm for clearing a polygon by a single 1-searcher. The result is extended to a polynomial-time algorithm for a pair of 1-searchers in Chapter 3. Chapters 4 and 5 contain joint research with Dr. Srini Tridandapani, Dr. Jason Jue, and Dr. Michael Borella in the area of computer networks. Chapter 4 presents a method of providing privacy over an insecure channel that does not require encryption. Chapter 5 gives approximate bounds for the link utilization in multicast traffic.

    Survey On Fault Tolerance In Grid Computing


    Efficient All-to-All Collective Communication Schedules for Direct-Connect Topologies

    The all-to-all collective communication primitive is widely used in machine learning (ML) and high performance computing (HPC) workloads, and optimizing its performance is of interest to both the ML and HPC communities. All-to-all is a particularly challenging workload that can severely strain the underlying interconnect bandwidth at scale. This is mainly because the number of messages that must be serviced simultaneously scales quadratically, and the messages themselves are large. This paper takes a holistic approach to optimizing the performance of all-to-all collective communications on supercomputer-scale direct-connect interconnects. We address several algorithmic and practical challenges in developing efficient and bandwidth-optimal all-to-all schedules for any topology, lowering the schedules to various backends and fabrics that may or may not expose additional forwarding bandwidth, establishing an upper bound on all-to-all throughput, and exploring novel topologies that deliver near-optimal all-to-all performance.
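The quadratic message count mentioned in the abstract above can be made concrete with a small sketch. It uses the classic linear-shift schedule, which is an assumption chosen for illustration and not the schedule developed in the paper; the function names `message_count` and `shift_schedule` are hypothetical.

```python
def message_count(n):
    """Number of point-to-point messages in an all-to-all among n nodes."""
    return n * (n - 1)


def shift_schedule(n):
    """Linear-shift all-to-all schedule (illustrative, not the paper's).

    Returns n - 1 steps. In step s, every node i sends its block to
    node (i + s) % n, so each step is a permutation of the nodes and
    no node receives two messages in the same step.
    """
    return [[(i, (i + s) % n) for i in range(n)] for s in range(1, n)]
```

For example, `shift_schedule(4)` yields three steps of four transfers each, matching `message_count(4)` transfers in total. The schedule ignores topology entirely, which is exactly the part the paper addresses.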

    Simulation Of Multi-core Systems And Interconnections And Evaluation Of Fat-Mesh Networks

    Simulators are very important in computer architecture research because they enable the exploration of new architectures and detailed performance evaluation without building costly physical hardware. Simulation is even more critical for studying future many-core architectures, as it provides the opportunity to assess computer systems that do not yet exist. In this thesis, a multiprocessor simulator is presented based on a cycle-accurate architecture simulator called SESC. The shared L2 cache system is extended into a distributed shared cache (DSC) with a directory-based cache coherency protocol. A mesh network module is extended and integrated into SESC to replace the bus for scalable inter-processor communication. While these efforts complete an extended multiprocessor simulation infrastructure, two interconnection enhancements are proposed and evaluated. The first is a novel non-uniform fat-mesh network structure, similar in idea to the fat-tree. This non-uniform mesh network takes advantage of the average traffic pattern, typically all-to-all in DSC, to dedicate additional links to connections with heavy traffic (e.g., near the center) and fewer links to those with lighter traffic (e.g., near the periphery). Two fat-mesh schemes are implemented based on different routing algorithms. Analytical fat-mesh models are constructed by presenting the expressions for the traffic requirements of personalized all-to-all traffic. Performance improvements over the uniform mesh are demonstrated in the results from the simulator. The second enhancement is a hybrid network consisting of one packet switching plane and multiple circuit switching planes. The circuit switching planes provide fast paths between neighbors with heavy communication traffic. A compiler technique that abstracts the symbolic expressions of benchmarks' communication patterns can be used to help facilitate circuit establishment.
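The observation that central mesh links carry more all-to-all traffic than peripheral ones can be checked with a brute-force count. The sketch below assumes dimension-ordered (XY) routing on a k x k mesh without wraparound; the thesis does not name its routing algorithms in this abstract, so XY routing and the function name `xy_link_loads` are assumptions made for illustration.

```python
from collections import Counter


def xy_link_loads(k):
    """Count packets per directed link for personalized all-to-all traffic.

    Assumes a k x k mesh and XY routing: each packet first travels along
    x in the source row, then along y in the destination column.
    Keys are ((x, y), (next_x, next_y)) directed links.
    """
    loads = Counter()
    nodes = [(x, y) for x in range(k) for y in range(k)]
    for sx, sy in nodes:
        for dx, dy in nodes:
            if (sx, sy) == (dx, dy):
                continue
            x, y = sx, sy
            while x != dx:  # x dimension first
                nx = x + (1 if dx > x else -1)
                loads[((x, y), (nx, y))] += 1
                x = nx
            while y != dy:  # then y dimension
                ny = y + (1 if dy > y else -1)
                loads[((x, y), (x, ny))] += 1
                y = ny
    return loads
```

Under these assumptions the eastbound link leaving column c carries (c + 1)(k - c - 1)k packets, which peaks in the middle of the row. That imbalance is the kind of pattern a fat-mesh would exploit by adding links where the load is highest.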

    Real-Time Energy Price-Aware Anycast RWA for Scheduled Lightpath Demands in Optical Data Center Networks

    The energy consumed by data center networks, together with the power needed to transport data to users, is considerable and constitutes a significant portion of their costs. Hence, the development of energy-efficient schemes is crucial to addressing this problem. Our research considers the fixed window traffic allocation model and the anycast routing scheme to select the best option for the destination node. Proper routing schemes and an appropriate combination of replicas can address energy utilization and at the same time help reduce costs for the data centers. We have also considered the real-time pricing model (in which prices change every hour) to select routes for the lightpaths. Hence, we propose an ILP to handle the energy-aware routing and wavelength assignment (RWA) problem for the fixed window scheduled traffic model, with the objective of minimizing the overall electricity costs of a data center network by reducing the actual power consumption and using low-cost resources whenever possible.


    The limitations of bus-based interconnections in scalability, latency, bandwidth, and power consumption create a communication bottleneck when they must support a very large number of on-chip resources. These challenges can be efficiently addressed with the implementation of a network-on-chip (NoC) system. This book gives a detailed analysis of various on-chip communication architectures and covers different areas of NoCs such as potentials, architecture, technical challenges, optimization, design explorations, and research directions. In addition, it discusses current and future trends that could make a meaningful contribution to the research and design of on-chip communications and NoC systems.

    Optimal transmission schedules for lightwave networks embedded with de Bruijn graphs

    We consider the problem of embedding a virtual de Bruijn topology, both directed and undirected, in a physical optical passive star time and wavelength division multiplexed (TWDM) network and constructing a schedule to transmit packets along all edges of the virtual topology in the shortest possible time. We develop general graph theoretic results and algorithms and use these to build optimal embeddings and optimal transmission schedules, assuming certain conditions on the network parameters. We prove our transmission schedules are optimal over all possible embeddings. As a general framework we use a model of the passive star network with fixed tuned receivers and tunable transmitters. Our transmission schedules are optimal regardless of the tuning time. Our results are also applicable to models with one or more fixed tuned transmitters per node. We give results that minimize the number of tunings needed. For the directed de Bruijn topology a single fixed tuning of the transmitter suffices. For the undirected de Bruijn topology two tunings per cycle (or two fixed tuned transmitters per node) suffice, and we prove this is the minimum possible.
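To illustrate why a single fixed transmitter tuning can be enough for the directed de Bruijn topology, here is a minimal sketch. It uses the standard definition of the directed de Bruijn graph on d^n nodes and one simple receiver-wavelength assignment; that assignment, the parameter `num_wavelengths`, and all function names are assumptions made for illustration and are not necessarily the embedding constructed in the paper.

```python
def de_bruijn_successors(x, d, n):
    """Out-neighbors of node x in the directed de Bruijn graph B(d, n).

    Nodes are the integers 0 .. d**n - 1 (n-digit strings in base d);
    x has an edge to (x * d + a) mod d**n for each digit a.
    """
    return [(x * d + a) % d**n for a in range(d)]


def receiver_wavelength(v, d, num_wavelengths):
    """Assumed fixed receiver wavelength: drop the last digit of v."""
    return (v // d) % num_wavelengths


def transmitter_tunings(d, n, num_wavelengths):
    """Map each node to the set of wavelengths it must transmit on."""
    return {
        x: {receiver_wavelength(v, d, num_wavelengths)
            for v in de_bruijn_successors(x, d, n)}
        for x in range(d**n)
    }
```

All out-neighbors of a node differ only in their last digit, so under this assignment they listen on the same wavelength and every set returned by `transmitter_tunings` has exactly one element. The sketch says nothing about schedule length or optimality, which are the paper's actual results.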

    Processor allocation strategies for modified hypercubes

    Parallel processing has been widely accepted to be the future of high speed computing. Among the various parallel architectures proposed or implemented, the hypercube has shown a lot of promise because of its powerful properties, such as regular topology, fault tolerance, low diameter, simple routing, and the ability to efficiently emulate other architectures. The major drawback of the hypercube network is that it cannot be expanded in practice, because the number of communication ports for each processor grows as the logarithm of the total number of processors in the system. Therefore, once a hypercube supercomputer of a certain dimensionality has been built, any future expansions can be accomplished only by replacing the VLSI chips. This is an undesirable feature, and much work has been under way to eliminate this obstacle and provide a platform for easier expansion. Modified hypercubes (MHs) have been proposed as the building blocks of hypercube-based systems supporting incremental growth techniques without introducing extra resources for individual hypercubes. However, processor allocation on MHs proves to be a challenge due to a slight deviation in their topology from that of the standard hypercube network. This thesis addresses the issue of processor allocation on MHs and proposes various strategies which are based, partially or entirely, on table look-up approaches. A study of the various task allocation strategies for standard hypercubes is conducted and their suitability for MHs is evaluated. It is shown that the proposed strategies have a perfect subcube recognition ability and a superior performance. Existing processor allocation strategies for pure hypercube networks are demonstrated to be ineffective for MHs, in light of their inability to recognize all available subcubes. A comparative analysis that involves the buddy strategy and the new strategies is carried out using simulation results.
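The difference between the buddy strategy and perfect subcube recognition can be sketched on a standard hypercube. The code below is illustrative only: it is not the table look-up strategy proposed in the thesis and does not model modified hypercubes. The function names and the `free` list representation are assumptions.

```python
from itertools import combinations


def buddy_allocate(free, n, k):
    """Buddy strategy on an n-cube: find the first aligned block of
    2**k consecutive addresses whose nodes are all free, else None."""
    size = 1 << k
    for start in range(0, 1 << n, size):
        block = list(range(start, start + size))
        if all(free[v] for v in block):
            return block
    return None


def exhaustive_allocate(free, n, k):
    """Check every k-dimensional subcube: choose which k address bits
    vary, fix the remaining bits, and test whether all nodes are free."""
    for dims in combinations(range(n), k):
        mask = sum(1 << d for d in dims)
        for base in range(1 << n):
            if base & mask:
                continue  # base must have the varying bits cleared
            nodes = []
            for bits in range(1 << k):
                v = base
                for i, d in enumerate(dims):
                    if (bits >> i) & 1:
                        v |= 1 << d
                nodes.append(v)
            if all(free[v] for v in nodes):
                return sorted(nodes)
    return None
```

With a 2-cube in which nodes 1 and 3 are busy (`free = [True, False, True, False]`), nodes 0 and 2 still form a free 1-subcube. The exhaustive search finds it, while the buddy strategy only inspects the blocks {0, 1} and {2, 3} and reports failure. The exhaustive search is exponentially more expensive, which is one motivation for cheaper recognition schemes.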