
    On Rearrangement of Items Stored in Stacks

    There are $n \ge 2$ stacks, each filled with $d$ items, and one empty stack. Every stack has capacity $d > 0$. A robot arm, in one stack operation (step), may pop one item from the top of a non-empty stack and subsequently push it onto a stack not at capacity. In a {\em labeled} problem, all $nd$ items are distinguishable and are initially randomly scattered in the $n$ stacks. The items must be rearranged using pop-and-pushes so that in the end, the $k^{\rm th}$ stack holds items $(k-1)d+1, \ldots, kd$, in that order, from the top to the bottom for all $1 \le k \le n$. In an {\em unlabeled} problem, the $nd$ items are of $n$ types of $d$ each. The goal is to rearrange items so that items of type $k$ are located in the $k^{\rm th}$ stack for all $1 \le k \le n$. In carrying out the rearrangement, a natural question is to find the least number of required pop-and-pushes. Our main contributions are: (1) an algorithm for restoring the order of $n^2$ items stored in an $n \times n$ table using only $2n$ column and row permutations, and its generalization, and (2) an algorithm with a guaranteed upper bound of $O(nd)$ steps for solving both versions of the stack rearrangement problem when $d \le \lceil cn \rceil$ for an arbitrary fixed positive number $c$. In terms of the required number of steps, the labeled and unlabeled versions have lower bounds of $\Omega(nd + nd\frac{\log d}{\log n})$ and $\Omega(nd)$, respectively.
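
    To make the model concrete, the sketch below (a minimal illustration under the abstract's definitions, not the paper's $O(nd)$ algorithm; all function names are hypothetical) represents the n + 1 stacks as Python lists, implements the single legal pop-and-push step, and checks the labeled goal configuration.

    # Minimal model of the pop-and-push rearrangement problem described
    # above (illustrative only; this is not the paper's algorithm).

    def make_instance(n, d, items):
        """n filled stacks of depth d plus one empty buffer stack.
        Stacks are Python lists whose last element is the top."""
        assert len(items) == n * d
        stacks = [list(items[k * d:(k + 1) * d]) for k in range(n)]
        stacks.append([])                      # the extra, initially empty stack
        return stacks

    def pop_and_push(stacks, src, dst, capacity):
        """One robot-arm step: pop the top of a non-empty stack and push
        it onto a stack that is not yet at capacity."""
        assert stacks[src] and len(stacks[dst]) < capacity
        stacks[dst].append(stacks[src].pop())

    def labeled_goal(stacks, n, d):
        """Stack k must hold items (k-1)d+1, ..., kd from top to bottom,
        i.e. the list (bottom first) reads kd, kd-1, ..., (k-1)d+1."""
        return (all(stacks[k - 1] == list(range(k * d, (k - 1) * d, -1))
                    for k in range(1, n + 1))
                and stacks[n] == [])

    # Tiny example with n = 3, d = 2: start in the goal configuration,
    # make two legal moves via the buffer stack, and check the goal again.
    n, d = 3, 2
    stacks = make_instance(n, d, [2, 1, 4, 3, 6, 5])
    pop_and_push(stacks, 0, 3, d)       # item 1 goes onto the buffer stack
    pop_and_push(stacks, 3, 0, d)       # and back: 2 steps, goal restored
    print(labeled_goal(stacks, n, d))   # True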

    On the acceleration of wavefront applications using distributed many-core architectures

    In this paper we investigate the use of distributed graphics processing unit (GPU)-based architectures to accelerate pipelined wavefront applications, a ubiquitous class of parallel algorithms used for the solution of a number of scientific and engineering applications. Specifically, we employ a recently developed port of the LU solver (from the NAS Parallel Benchmark suite) to investigate the performance of these algorithms on high-performance computing solutions from NVIDIA (Tesla C1060 and C2050) as well as on traditional clusters (AMD/InfiniBand and IBM BlueGene/P). Benchmark results are presented for problem classes A to C, and a recently developed performance model is used to provide projections for problem classes D and E, the latter of which represents a billion-cell problem. Our results demonstrate that while the theoretical performance of GPU solutions will far exceed that of many traditional technologies, the sustained application performance is currently comparable for scientific wavefront applications. Finally, a breakdown of the GPU solution is conducted, exposing PCIe overheads and decomposition constraints. A new k-blocking strategy is proposed to improve the future performance of this class of algorithms on GPU-based architectures.
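
    As context for how such kernels parallelise, a wavefront sweep visits the cells of a grid along diagonal hyperplanes: every cell depends only on neighbours computed on the previous diagonal, so all cells on one diagonal can be updated concurrently. The sketch below is a minimal 2D illustration of that dependency pattern, not the NAS LU kernel or the paper's k-blocking scheme; the update rule and boundary values are hypothetical.

    import numpy as np

    # Minimal 2D wavefront sweep (illustrative only).
    # grid[i, j] depends on grid[i-1, j] and grid[i, j-1], so all cells on
    # a common anti-diagonal i + j = const are mutually independent and
    # could be updated in parallel (e.g. one GPU kernel launch per diagonal).

    def wavefront_sweep(n):
        grid = np.zeros((n, n))
        grid[0, :] = 1.0                     # hypothetical boundary values
        grid[:, 0] = 1.0
        for diag in range(2, 2 * n - 1):     # sweep the interior anti-diagonals
            for i in range(max(1, diag - n + 1), min(diag, n)):
                j = diag - i
                grid[i, j] = 0.5 * (grid[i - 1, j] + grid[i, j - 1])
        return grid

    print(wavefront_sweep(5))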

    Cold Matter Assembled Atom-by-Atom

    The realization of large-scale fully controllable quantum systems is an exciting frontier in modern physical science. We use atom-by-atom assembly to implement a novel platform for the deterministic preparation of regular arrays of individually controlled cold atoms. In our approach, a measurement and feedback procedure eliminates the entropy associated with probabilistic trap occupation and results in defect-free arrays of over 50 atoms in less than 400 ms. The technique is based on fast, real-time control of 100 optical tweezers, which we use to arrange atoms in desired geometric patterns and to maintain these configurations by replacing lost atoms with surplus atoms from a reservoir. This bottom-up approach enables controlled engineering of scalable many-body systems for quantum information processing, quantum simulations, and precision measurements.
    Comment: 12 pages, 9 figures, 3 movies as ancillary file
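
    The measurement-and-feedback step can be pictured as: image the array, identify which tweezer sites actually trapped an atom, and pair surplus atoms with empty target sites. The code below is only a schematic of that generic idea (greedy pairing, hypothetical names and loading probability), not the authors' real-time control system.

    import random

    # Schematic of a generic measure-and-rearrange feedback step
    # (illustrative only; not the authors' control system).

    def rearrangement_moves(occupied, targets):
        """Given the measured set of occupied tweezer sites and the desired
        target sites, pair each empty target with a surplus occupied site."""
        empty_targets = [s for s in targets if s not in occupied]
        surplus = [s for s in occupied if s not in targets]
        # greedy pairing; a real system would minimize total move distance
        return list(zip(surplus, empty_targets))

    # Example: 100 tweezers, each site loads with ~50% probability,
    # and the desired pattern is the first 50 sites filled.
    random.seed(0)
    occupied = {s for s in range(100) if random.random() < 0.5}
    targets = set(range(50))
    for src, dst in rearrangement_moves(occupied, targets):
        occupied.discard(src)
        occupied.add(dst)
    print(targets <= occupied)   # True when enough atoms were loaded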

    A Comprehensive Analysis of Swarming-based Live Streaming to Leverage Client Heterogeneity

    Due to missing IP multicast support on an Internet scale, over-the-top media streams are delivered with the help of overlays, as used by content delivery networks and their peer-to-peer (P2P) extensions. In this context, mesh/pull-based swarming plays an important role, either as a pure streaming approach or in combination with tree/push mechanisms. However, the impact of realistic client populations with heterogeneous resources is not yet fully understood. In this technical report, we contribute to closing this gap by mathematically analysing the most basic scheduling mechanisms, latest deadline first (LDF) and earliest deadline first (EDF), in a continuous-time Markov chain framework and combining them into a simple, yet powerful, mixed strategy to leverage inherent differences in client resources. The main contributions are twofold: (1) a mathematical framework for swarming on random graphs is proposed, with a focus on LDF and EDF strategies in heterogeneous scenarios; (2) a mixed strategy, named SchedMix, is proposed that leverages peer heterogeneity. The proposed strategy, SchedMix, is shown to outperform the other two strategies using different abstractions: a mean-field theoretic analysis of buffer probabilities, simulations of a stochastic model on random graphs, and a full-stack implementation of a P2P streaming system.
    Comment: Technical report and supplementary material to http://ieeexplore.ieee.org/document/7497234
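
    The two baseline schedulers differ only in which missing chunk a peer requests next from its buffer window: EDF asks for the missing chunk closest to its playback deadline, LDF for the one furthest from it. Below is a minimal sketch of that selection rule (illustrative only, with hypothetical names; the SchedMix combination itself is not reproduced here).

    # Chunk selection for mesh/pull swarming (illustrative sketch only).
    # The buffer window holds chunk ids between the playback position and
    # the newest chunk; missing chunks closer to playback expire sooner.

    def select_chunk(missing, playback_pos, policy):
        """Return the next chunk id to request under EDF or LDF."""
        if not missing:
            return None
        if policy == "EDF":                 # earliest deadline first
            return min(missing, key=lambda c: c - playback_pos)
        if policy == "LDF":                 # latest deadline first
            return max(missing, key=lambda c: c - playback_pos)
        raise ValueError(policy)

    missing = {101, 104, 109}               # chunks not yet downloaded
    print(select_chunk(missing, playback_pos=100, policy="EDF"))   # 101
    print(select_chunk(missing, playback_pos=100, policy="LDF"))   # 109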