
    Parallel scheduling of recursively defined arrays

    A new method of automatic generation of concurrent programs, which constructs arrays defined by sets of recursive equations, is described. It is assumed that the time of computation of an array element is a linear combination of its indices, and integer programming is used to seek a succession of hyperplanes along which array elements can be computed concurrently. The method can be used to schedule equations involving variable-length dependency vectors and mutually recursive arrays. Portions of the work reported here have been implemented in the PS automatic program generation system.
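The hyperplane idea can be illustrated with a minimal wavefront evaluator (a generic sketch under assumed boundary conditions, not the paper's PS system): for the recurrence A[i][j] = A[i-1][j] + A[i][j-1], every element on the hyperplane i + j = t depends only on elements of earlier hyperplanes, so each hyperplane's elements can be computed concurrently.

```python
# Illustrative wavefront evaluation of a recursively defined array:
#   A[i][j] = A[i-1][j] + A[i][j-1], with boundaries A[i][0] = A[0][j] = 1.
# All elements on the hyperplane i + j = t depend only on hyperplanes
# t-1 and earlier, so the inner loop below could run in parallel.

def wavefront_eval(n):
    A = [[0] * n for _ in range(n)]
    for t in range(2 * n - 1):              # sweep hyperplanes i + j = t
        # every (i, j) on this hyperplane is independent of the others
        for i in range(max(0, t - n + 1), min(t, n - 1) + 1):
            j = t - i
            if i == 0 or j == 0:
                A[i][j] = 1                 # assumed boundary condition
            else:
                A[i][j] = A[i - 1][j] + A[i][j - 1]
    return A
```

With these boundaries the array holds binomial coefficients (A[i][j] = C(i+j, i)), which makes the sweep order easy to check against a sequential fill.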

    Semi-spectral Chebyshev method in Quantum Mechanics

    Traditionally, finite difference and finite element methods have been regarded by many as the basic tools for obtaining numerical solutions to a variety of quantum mechanical problems emerging in atomic, nuclear, and particle physics, astrophysics, quantum chemistry, etc. In recent years, however, an alternative technique based on semi-spectral methods has attracted considerable attention. The purpose of this work is first to provide the necessary tools and subsequently to examine the efficiency of this method in quantum mechanical applications. Restricting our interest to time-independent two-body problems, we obtained the continuous and discrete spectrum solutions of the underlying Schroedinger or Lippmann-Schwinger equations in both coordinate and momentum space. In all of the numerically studied examples we had no difficulty in achieving machine accuracy, and the semi-spectral method showed exponential convergence combined with excellent numerical stability.
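A minimal sketch of the kind of method described, not code from the paper: a Chebyshev collocation discretization of the 1-D time-independent Schroedinger equation for the harmonic oscillator, -psi'' + x^2 psi = E psi, whose exact eigenvalues are E_n = 2n + 1. The grid size N and box half-width L are illustrative assumptions.

```python
# Hedged sketch: Chebyshev collocation (semi-spectral) solution of
#   -psi''(x) + x^2 psi(x) = E psi(x)   (exact eigenvalues E_n = 2n + 1).
import numpy as np

def cheb(N):
    """Chebyshev differentiation matrix and nodes on [-1, 1] (Trefethen-style)."""
    x = np.cos(np.pi * np.arange(N + 1) / N)
    c = np.hstack([2.0, np.ones(N - 1), 2.0]) * (-1.0) ** np.arange(N + 1)
    dX = x[:, None] - x[None, :]
    D = np.outer(c, 1.0 / c) / (dX + np.eye(N + 1))
    D -= np.diag(D.sum(axis=1))             # negative-sum trick for the diagonal
    return D, x

def oscillator_levels(N=64, L=8.0, num=4):
    """Lowest eigenvalues of -d^2/dx^2 + x^2 on (-L, L) with Dirichlet BCs."""
    D, x = cheb(N)
    D2 = (D @ D)[1:-1, 1:-1] / L**2         # 2nd derivative on interior nodes
    y = L * x[1:-1]                         # physical grid, scaled from [-1, 1]
    H = -D2 + np.diag(y**2)                 # collocation Hamiltonian
    return np.sort(np.linalg.eigvals(H).real)[:num]
```

The exponential convergence mentioned in the abstract shows up here as the lowest eigenvalues matching 1, 3, 5, 7 to many digits already at modest N.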

    On Tail Index Estimation for Dependent, Heterogenous Data

    In this paper we analyze the asymptotic properties of the popular distribution tail index estimator by B. Hill (1975) for possibly heavy-tailed, heterogeneous, dependent processes. We prove the Hill estimator is weakly consistent for processes with extremes that form mixingale sequences, and asymptotically normal for processes with extremes that are near-epoch-dependent on the extremes of a mixing process. Our limit theory covers infinitely many ARFIMA and FIGARCH processes, stochastic recurrence equations, and simple bilinear processes. Moreover, we develop a simple non-parametric kernel estimator of the asymptotic variance of the Hill estimator, and prove its consistency for extremal-NED processes.
    Keywords: Hill estimator; regular variation; infinite variance; near epoch dependence; mixingale; kernel estimator; tail array sum.
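The estimator under study is simple to state: from the k largest order statistics it estimates gamma = 1/alpha, the reciprocal tail index. A minimal illustration on an exact Pareto sample (the sample size, k, and seed are arbitrary choices, not from the paper):

```python
# Illustrative sketch of the Hill (1975) tail index estimator.
import numpy as np

def hill_estimator(x, k):
    """Hill estimate of gamma = 1/alpha from the top k order statistics."""
    xs = np.sort(x)[::-1]                     # descending order
    return np.mean(np.log(xs[:k]) - np.log(xs[k]))

# Exact Pareto tail P(X > x) = x^(-alpha), via inverse transform sampling.
rng = np.random.default_rng(0)
alpha = 2.0
x = (1.0 - rng.uniform(size=100_000)) ** (-1.0 / alpha)
gamma_hat = hill_estimator(x, k=1000)         # should be near 1/alpha = 0.5
```

For an exact Pareto sample the estimator is unbiased with standard error roughly gamma / sqrt(k), so with k = 1000 the estimate sits close to 0.5; the paper's contribution is what happens when the data are instead dependent and heterogeneous.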

    Parallelization of dynamic programming recurrences in computational biology

    The rapid growth of biosequence databases over the last decade has led to a performance bottleneck in the applications analyzing them. In particular, over the last five years the DNA sequencing capacity of next-generation sequencers has been doubling every six months as costs have plummeted. The data produced by these sequencers is overwhelming traditional compute systems. We believe that in the future compute performance, not sequencing, will become the bottleneck in advancing genome science. In this work, we investigate novel computing platforms to accelerate dynamic programming algorithms, which are popular in bioinformatics workloads. We study algorithm-specific hardware architectures that exploit fine-grained parallelism in dynamic programming kernels using field-programmable gate arrays (FPGAs). We advocate a high-level synthesis approach, using the recurrence equation abstraction to represent dynamic programming and polyhedral analysis to exploit parallelism. We suggest a novel technique within the polyhedral model to optimize for throughput by pipelining independent computations on an array. This design technique improves on the state of the art, which builds latency-optimal arrays. We also suggest a method to dynamically switch between a family of designs using FPGA reconfiguration to achieve a significant performance boost. We have used polyhedral methods to parallelize the Nussinov RNA folding algorithm to build a family of accelerators that can trade resources for parallelism and are 15-130x faster than a modern dual-core CPU implementation. A Zuker RNA folding accelerator we built on a single workstation with four Xilinx Virtex 4 FPGAs outperforms 198 3 GHz Intel Core 2 Duo processors. Furthermore, our design running on a single FPGA is an order of magnitude faster than competing implementations on similar-generation FPGAs and graphics processors. Our work is a step toward the goal of automated synthesis of hardware accelerators for dynamic programming algorithms.
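A software sketch of the fine-grained parallelism being exploited (an illustration of the standard Nussinov recurrence, not the authors' hardware design): the base-pair maximization table can be filled along anti-diagonals j - i = d, and all cells on one anti-diagonal are mutually independent, which is what a systolic FPGA array or a pool of threads can exploit.

```python
# Nussinov base-pair maximization DP, filled by anti-diagonals.
# Cells on a given anti-diagonal read only earlier diagonals, so the
# inner loop over i is fully parallel.

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def nussinov(seq):
    n = len(seq)
    N = [[0] * n for _ in range(n)]
    for d in range(1, n):                    # anti-diagonal j - i = d
        for i in range(n - d):               # independent cells: parallelizable
            j = i + d
            best = max(N[i + 1][j], N[i][j - 1])
            if (seq[i], seq[j]) in PAIRS:    # pair ends i and j
                best = max(best, (N[i + 1][j - 1] if d > 1 else 0) + 1)
            for k in range(i + 1, j):        # bifurcation term
                best = max(best, N[i][k] + N[k + 1][j])
            N[i][j] = best
    return N[0][n - 1]
```

The bifurcation term makes each cell's work grow with d, which is why throughput-oriented pipelining of the independent cells, rather than latency-optimal scheduling alone, pays off in hardware.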