134 research outputs found

    Scheduled routing for the NuMesh

Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1994. Includes bibliographical references (leaves 66-68). By Milan Singh Minsky. M.S.

    Towards Distributed Quantum Computing by Qubit and Gate Graph Partitioning Techniques

Distributed quantum computing is motivated by the difficulty of building large-scale, individual quantum computers. To solve that problem, a large quantum circuit is partitioned and distributed to small quantum computers for execution. Partitions running on different quantum computers share quantum information using entangled Bell pairs. However, entanglement generation and purification introduce both a runtime and a memory overhead on distributed quantum computing. In this paper we study that trade-off by proposing two techniques for partitioning large quantum circuits and distributing them to small quantum computers. Our techniques map a quantum circuit to a graph representation. We study two approaches: one that considers only gate teleportation, and another that considers both gate and state teleportation to achieve the distributed execution. Then we apply the METIS graph partitioning algorithm to obtain the partitions and the number of entanglement requests between them. We use the SeQUeNCe quantum communication simulator to measure the time required to generate all the entanglements needed to execute the distributed circuit. We find that the best partitioning technique will depend on the specific circuit of interest. Comment: Presented at IEEE Quantum Week 2023 (QCE23).
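A toy sketch of the circuit-to-graph mapping and cut-counting step described in this abstract is given below. The gate list and qubit labels are hypothetical, and networkx's Kernighan-Lin bisection stands in for the METIS partitioner the authors use; the cross-partition gate weight then approximates the number of entanglement requests under gate teleportation.

```python
# Hypothetical sketch: count the Bell-pair (entanglement) requests implied by
# splitting a circuit's two-qubit-gate interaction graph across two QPUs.
# Kernighan-Lin bisection stands in for the METIS partitioner used in the paper.
import networkx as nx
from networkx.algorithms.community import kernighan_lin_bisection

# Each tuple is a two-qubit gate acting on a pair of (hypothetical) qubit labels.
two_qubit_gates = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (4, 5), (5, 6), (2, 4)]

# Gate-teleportation view: qubits are nodes, edge weight = number of gates
# coupling the pair, i.e. remote gates needed if the pair is split.
g = nx.Graph()
for a, b in two_qubit_gates:
    w = g[a][b]["weight"] + 1 if g.has_edge(a, b) else 1
    g.add_edge(a, b, weight=w)

part_a, part_b = kernighan_lin_bisection(g, weight="weight")

# Every gate whose qubits land in different partitions needs one entangled
# Bell pair (one entanglement request) under gate teleportation.
requests = sum(d["weight"] for u, v, d in g.edges(data=True)
               if (u in part_a) != (v in part_a))
print(f"partition A: {sorted(part_a)}, partition B: {sorted(part_b)}")
print(f"entanglement requests across the cut: {requests}")
```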

    Space and time adaptation for parallel applications via data over-partitioning.

Adaptive resource allocation is a new capability for running parallel applications. It is used to obtain better space and time sharing according to the current workload, to schedule around obstacles through reservation, and to cope with the lack of accurate predictability on heterogeneous resources. Implementing resource adaptation is potentially very expensive if total remapping or partitioning from scratch has to be performed. Existing popular run-time systems include AMPI and Dome. AMPI, which uses huge numbers of threads per MPI process to implement resource adaptation, suffers from frequent thread switches and loss of cache locality; Dome, an object-based migration environment, suffers from a lack of general language support. When resource adaptation occurs, load balancing techniques are used to allocate the workload fairly across processors, so that each processor takes roughly the same time to execute the processes assigned to it, giving the best performance and maximizing resource utilization. This thesis proposes a novel approach, Adaptive Time/space sharing via Over-Partitioning (ATOP), to implement resource adaptation with lower time overhead. The total workload is represented by a data graph. ATOP over-partitions the graph to create a certain number of workload pieces, or partitions, while processing the partitions assigned to a processor as one data collection in a single MPI process. Typically, the number of partitions is set equal to the number of processors potentially allocated. This approach is feasible for applications using 2^n processors. In cases where the over-partitioning approach does not perform well, or non-fitting numbers of resources need to be chosen, ATOP still provides the alternative option of repartitioning from scratch. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2004 .H36. Source: Masters Abstracts International, Volume: 43-03, page: 0876. Adviser: A. C. Sodan. Thesis (M.Sc.)--University of Windsor (Canada), 2004.
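As a rough, hypothetical illustration of the over-partitioning idea (not ATOP's actual implementation; the helper names and workload are invented), the sketch below cuts a workload into 2^n pieces and remaps whole pieces when the processor count shrinks, avoiding a repartition from scratch.

```python
# Hypothetical sketch of over-partitioning: cut the workload into more pieces
# than processors (here 2**n), so a later change in processor count only
# remaps whole pieces instead of repartitioning from scratch.
def over_partition(work_items, n):
    """Split work_items into 2**n contiguous partitions (toy data-graph stand-in)."""
    k = 2 ** n
    chunk = -(-len(work_items) // k)  # ceiling division
    return [work_items[i * chunk:(i + 1) * chunk] for i in range(k)]

def map_partitions(partitions, num_procs):
    """Assign partitions round-robin; each processor treats its share as one collection."""
    assignment = {p: [] for p in range(num_procs)}
    for i, part in enumerate(partitions):
        assignment[i % num_procs].extend(part)
    return assignment

items = list(range(32))           # stand-in workload
parts = over_partition(items, 3)  # 8 partitions

print(map_partitions(parts, 8))   # initial allocation: one partition per processor
print(map_partitions(parts, 4))   # after shrinking: two partitions per processor, no re-cut
```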

    Deadlock avoidance with virtual channels

High Performance Computing is a rapidly evolving area of computer science that aims to solve complicated computational problems by combining computational nodes connected through high-speed networks. This work concentrates on the network problems that appear in such systems, and focuses in particular on deadlock, which can decrease the efficiency of communication or even destroy the balance and paralyze the network. The goal of this work is deadlock avoidance through the use of virtual channels in the switches of the network where the problem appears. Deadlock avoidance ensures that no data will be lost inside the network, at the cost of increased latency for the served packets, due to the extra computation that the switches have to perform to apply the policy.
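The abstract does not specify the exact virtual-channel policy, so the sketch below shows one standard scheme as a stand-in: dateline routing on a ring, where packets switch from VC0 to VC1 after crossing a designated link so that no cyclic channel dependency (and hence no deadlock) can form. The node count, dateline position, and function names are illustrative assumptions.

```python
# Hypothetical sketch of deadlock avoidance with virtual channels on a ring:
# packets start on VC0 and move to VC1 once they cross a fixed "dateline"
# link, breaking the cyclic channel dependency that causes deadlock.
NUM_NODES = 8
DATELINE = 0  # crossing the link from node 0 to node 1 switches the packet to VC1

def next_hop_and_vc(current_node, current_vc, dest_node):
    """Return (next_node, vc) for minimal clockwise routing on the ring."""
    if current_node == dest_node:
        return None, current_vc          # delivered
    next_node = (current_node + 1) % NUM_NODES
    vc = 1 if current_node == DATELINE else current_vc
    return next_node, vc

# Route a packet from node 6 to node 2, printing the channels it occupies.
node, vc = 6, 0
while node != 2:
    nxt, vc = next_hop_and_vc(node, vc, 2)
    print(f"link {node}->{nxt} on VC{vc}")
    node = nxt
```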

    Design Space Exploration for MPSoC Architectures

Multiprocessor system-on-chip (MPSoC) designs utilize the available technology and communication architectures to meet the requirements of upcoming applications. In an MPSoC, the communication platform is both the key enabler and the key differentiator for realizing efficient MPSoCs. It provides product differentiation to meet a diverse, multi-dimensional set of design constraints, including performance, power, energy, reconfigurability, scalability, cost, reliability and time-to-market. The communication resources of a single interconnection platform cannot be fully utilized by all kinds of applications; for example, providing high communication bandwidth for applications that are computation intensive but not data intensive is often infeasible in practice. This thesis performs architecture-level design space exploration towards efficient and scalable resource utilization for MPSoC communication architectures. In order to meet the performance requirements within the design constraints, careful selection of the MPSoC communication platform and resource-aware partitioning and mapping of the application play an important role. To enhance the utilization of communication resources, a variety of techniques can be used, such as resource sharing, multicast to avoid re-transmission of identical data, and adaptive routing. For implementation, these techniques should be customized according to the platform architecture. To address the resource utilization of MPSoC communication platforms, a variety of architectures with different design parameters and performance levels are selected, namely Segmented bus (SegBus), Network-on-Chip (NoC) and Three-Dimensional NoC (3D-NoC). Average packet latency and power consumption are the evaluation parameters for the proposed techniques. In conventional computing architectures, a fault on a component makes the connected fault-free components inoperative; a resource sharing approach can utilize the fault-free components to retain system performance by reducing the impact of faults. Design space exploration also helps narrow down the selection of an MPSoC architecture that can meet the performance requirements within the design constraints.
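A minimal, hypothetical sketch of the kind of architecture-level exploration described here: candidate interconnects (SegBus, NoC, 3D-NoC) are filtered against a power budget and ranked by average packet latency. The numbers, budget, and selection model are invented for illustration and are not results from the thesis.

```python
# Hypothetical design-space-exploration sketch: score a few interconnect
# candidates with made-up latency/power figures and keep those meeting a
# power budget; pick the lowest-latency feasible one.
candidates = [
    {"name": "SegBus", "avg_latency_cycles": 42.0, "power_mw": 18.0},
    {"name": "NoC",    "avg_latency_cycles": 17.0, "power_mw": 35.0},
    {"name": "3D-NoC", "avg_latency_cycles": 11.0, "power_mw": 48.0},
]

POWER_BUDGET_MW = 40.0  # illustrative design constraint

feasible = [c for c in candidates if c["power_mw"] <= POWER_BUDGET_MW]
best = min(feasible, key=lambda c: c["avg_latency_cycles"])
print(f"best architecture under {POWER_BUDGET_MW} mW: {best['name']}")
```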

    A methodology to implement real-time applications on reconfigurable circuits

Special Issue: Engineering of Configurable Systems. This paper presents an extension of our AAA rapid prototyping methodology for the optimized implementation of real-time applications onto reconfigurable circuits. This extension is based on a unified model of factorized data dependence graphs, used both to specify the application algorithm and to deduce the possible implementations onto reconfigurable hardware in terms of graph transformations. This transformation flow has been implemented in SynDEx, a system-level CAD software tool.
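The following sketch illustrates, under loose assumptions, what a factorized data dependence graph and one graph transformation might look like: a repeated operation is stored once with a repetition factor, and a "defactorize" transformation unrolls it into explicit instances. The data structures and names are hypothetical and are not SynDEx's actual model.

```python
# Hypothetical sketch of a factorized data-dependence graph: a repeated
# operation is stored once with a repetition factor, and a transformation
# expands it into explicit instances (illustrative names only).
factorized_graph = {
    "sensor": {"deps": [], "repeat": 1},
    "filter": {"deps": ["sensor"], "repeat": 4},   # 4 identical filter stages
    "merge":  {"deps": ["filter"], "repeat": 1},
}

def defactorize(graph):
    """Expand repeated nodes into explicit instances, rewriting dependencies."""
    flat = {}
    for name, node in graph.items():
        instances = [name] if node["repeat"] == 1 else [
            f"{name}_{i}" for i in range(node["repeat"])]
        for inst in instances:
            deps = []
            for d in node["deps"]:
                r = graph[d]["repeat"]
                deps.extend([d] if r == 1 else [f"{d}_{i}" for i in range(r)])
            flat[inst] = deps
    return flat

print(defactorize(factorized_graph))
```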

    Multi-Provider Service Chain Embedding With Nestor

Network function (NF) virtualization decouples NFs from the underlying middlebox hardware and promotes their deployment on virtualized network infrastructures. This essentially paves the way for the migration of NFs into clouds (i.e., NF-as-a-Service), achieving a drastic reduction of middlebox investment and operational costs for enterprises. In this context, service chains (expressing middlebox policies in the enterprise network) should be mapped onto datacenter networks, ensuring correctness, resource efficiency, and compliance with the provider's policy. The network service embedding (NSE) problem is further exacerbated by two challenging aspects: 1) traffic scaling caused by certain NFs (e.g., caches and WAN optimizers) and 2) NF location dependencies. Traffic scaling requires resource reservations different from the ones specified in the service chain, whereas NF location dependencies, in conjunction with the limited geographic footprint of NF providers (NFPs), raise the need for NSE across multiple NFPs. In this paper, we present a holistic solution to the multi-provider NSE problem. We decompose NSE into: 1) NF-graph partitioning performed by a centralized coordinator and 2) NF-subgraph mapping onto datacenter networks. We present linear programming formulations to derive near-optimal solutions for both problems. We address the challenging aspect of traffic scaling by introducing a new service model that supports demand transformations. We also define topology abstractions for NF-graph partitioning. Furthermore, we discuss the steps required to embed service chains across multiple NFPs, using our NSE orchestrator (Nestor). We perform an evaluation study of multi-provider NSE with emphasis on NF-graph partitioning optimizations tailored to the client and NFPs. Our evaluation results further uncover significant savings in terms of service cost and resource consumption due to the demand transformations.
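To make the demand-transformation idea concrete, the sketch below recomputes per-link bandwidth reservations along a service chain in which each NF scales the traffic it forwards (e.g., a WAN optimizer compresses, a cache absorbs requests). The chain, scaling ratios, and function names are illustrative assumptions, not Nestor's actual service model.

```python
# Hypothetical sketch of traffic scaling ("demand transformation") along a
# service chain: each NF scales the traffic it forwards, so per-link bandwidth
# reservations must be recomputed hop by hop (ratios are illustrative only).
service_chain = [
    ("firewall",      1.0),   # forwards traffic unchanged
    ("wan_optimizer", 0.6),   # compresses: outputs 60% of its input
    ("cache",         0.3),   # serves most requests locally
]

def per_link_demands(ingress_mbps, chain):
    """Return the bandwidth demand on each link of the chain, after scaling."""
    demands, rate = [], ingress_mbps
    for nf, ratio in chain:
        demands.append((f"-> {nf}", rate))  # demand on the link entering the NF
        rate *= ratio                       # NF transforms the traffic volume
    demands.append(("-> egress", rate))
    return demands

for link, mbps in per_link_demands(1000.0, service_chain):
    print(f"{link}: {mbps:.0f} Mb/s")
```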

    FPGA dynamic and partial reconfiguration : a survey of architectures, methods, and applications

Dynamic and partial reconfiguration are key differentiating capabilities of field programmable gate arrays (FPGAs). While they have been studied extensively in academic literature, they find limited use in deployed systems. We review FPGA reconfiguration, looking at architectures built for the purpose and the properties of modern commercial architectures. We then investigate design flows and identify the key challenges in making reconfigurable FPGA systems easier to design. Finally, we look at applications where reconfiguration has found use, and propose new areas where this capability places FPGAs in a unique position for adoption.