342 research outputs found

    A lower bound for linear approximate compaction

    Get PDF
    The {\em λ\lambda-approximate compaction} problem is: given an input array of nn values, each either 0 or 1, place each value in an output array so that all the 1's are in the first (1+λ)k(1+\lambda)k array locations, where kk is the number of 1's in the input. λ\lambda is an accuracy parameter. This problem is of fundamental importance in parallel computation because of its applications to processor allocation and approximate counting. When λ\lambda is a constant, the problem is called {\em Linear Approximate Compaction} (LAC). On the CRCW PRAM model, %there is an algorithm that solves approximate compaction in \order{(\log\log n)^3} time for λ=1loglogn\lambda = \frac{1}{\log\log n}, using n(loglogn)3\frac{n}{(\log\log n)^3} processors. Our main result shows that this is close to the best possible. Specifically, we prove that LAC requires %Ω(loglogn)\Omega(\log\log n) time using \order{n} processors. We also give a tradeoff between λ\lambda and the processing time. For ϵ<1\epsilon < 1, and λ=nϵ\lambda = n^{\epsilon}, the time required is Ω(log1ϵ)\Omega(\log \frac{1}{\epsilon})

    On a compaction theorem of ragde

    No full text
    Ragde demonstrated that in constant time a PRAM with nn processors can move at most kk items, stored in distinct cells of an array of size nn, to distinct cells in an array of size at most k4k^4. We show that the exponent of 4 in the preceding sentence can be replaced by any constant greater than~2

    Fast deterministic processor allocation

    No full text
    Interval allocation has been suggested as a possible formalization for the PRAM of the (vaguely defined) processor allocation problem, which is of fundamental importance in parallel computing. The interval allocation problem is, given nn nonnegative integers x1,,xnx_1,\ldots,x_n, to allocate nn nonoverlapping subarrays of sizes x1,,xnx_1,\ldots,x_n from within a base array of O(j=1nxj)O(\sum_{j=1}^n x_j) cells. We show that interval allocation problems of size nn can be solved in O((loglogn)3)O((\log\log n)^3) time with optimal speedup on a deterministic CRCW PRAM. In addition to a general solution to the processor allocation problem, this implies an improved deterministic algorithm for the problem of approximate summation. For both interval allocation and approximate summation, the fastest previous deterministic algorithms have running times of Θ(logn/loglogn)\Theta({{\log n}/{\log\log n}}). We also describe an application to the problem of computing the connected components of an undirected graph

    Compiling for an Heterogeneous Vector Image Processor

    No full text
    International audienceWe present a new compilation strategy, implemented at a small cost, to optimize image applications developed on top of a high level image processing library for an heterogeneous processor with a vector image processing accelerator. The library provides the semantics of the image computations. The pipelined structure of the accelerator allows to compute whole expressions with dozens of elementary image instructions, but is constrained as intermediate image values cannot be extracted. We adapted standard compilation techniques to perform this task automatically. Our strategy is implemented in PIPS, a source-to-source compiler which greatly reduces the development cost as standard phases are reused and parameterized for the target. Experiments were run on the hardware functional simulator. We compile 1217 cases, from elementary tests to full applications. All are optimal but a few which are mostly within a mere accelerator call of optimality. Our contribu- tions include: 1) a general low cost compilation strategy for image processing applications, based on the semantics provided by library calls, which improves locality by an order of magnitude; 2) a specific heuristic to minimize execution time on the target vector accelerator; 3) numerous experiments that show the effectiveness of our strategy

    An Arbitrary CRCW PRAM Algorithm for Sorting Integers Into a LinkedList and Chaining on a Trie

    Get PDF
    Title from PDF of title page viewed June 1, 2020Thesis advisor: Yijie HanVitaIncludes bibliographical references (pages 22-23)Thesis (M.S.)--School of Computing and Engineering. University of Missouri--Kansas City, 2020The research work comprises of two parts. Part one is using an Arbitrary CRCW PRAM algorithm for sorting integers into a linked list. There are various algorithms and techniques to sort the integers in LinkedList. Arbitrary CRCW PRAM model, being the weakest model is able to sort n integers in a LinkedList in “constant time” using nlogm processors and if we use nt processors, then it can be sorted in O(loglogm/logt) time by converting Arbitrary CRCW PRAM model to Priority CRCW PRAM model. Part two is Chaining on a Trie. This research paper solves the problem of chaining on a Trie by providing more efficient complexity. This Algorithm takes “constant time” using n(logm+1) processors to chain the nodes on a Trie for n input integers on the Arbitrary CRCW PRAM model.Introduction -- Sort integers into a linked list -- Chaining on a Trie --Conclusio

    Introduction to the scheduling problem

    Full text link
    corecore