
    Optimizing the double description method for normal surface enumeration

    Many key algorithms in 3-manifold topology involve the enumeration of normal surfaces, which is based upon the double description method for finding the vertices of a convex polytope. Typically we are only interested in a small subset of these vertices, thus opening the way for substantial optimization. Here we give an account of the vertex enumeration problem as it applies to normal surfaces, and present new optimizations that yield strong improvements in both running time and memory consumption. The resulting algorithms are tested using the freely available software package Regina.
    Comment: 27 pages, 12 figures; v2: Removed the 3^n bound from Section 3.3, fixed the projective equation in Lemma 4.4, clarified "most triangulations" in the introduction to section 5; v3: replace -ise with -ize for Mathematics of Computation (note that this changes the title of the paper
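To make the double description method named above concrete, here is a minimal, unoptimized sketch in Python. It assumes the cone is the non-negative orthant cut down by homogeneous linear equations, and it uses the standard combinatorial adjacency test. The function names are hypothetical, and none of the paper's optimizations (nor Regina's actual implementation) are reproduced here.

```python
from functools import reduce
from math import gcd


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def zero_set(ray):
    """Indices of vanishing coordinates, i.e. the facets x_i >= 0 the ray lies on."""
    return frozenset(i for i, x in enumerate(ray) if x == 0)


def primitive(ray):
    """Scale an integer ray down by the gcd of its entries."""
    g = reduce(gcd, ray)
    return tuple(x // g for x in ray)


def adjacent(u, v, rays):
    """Combinatorial test: no third ray lies on every facet shared by u and v."""
    shared = zero_set(u) & zero_set(v)
    return not any(w not in (u, v) and shared <= zero_set(w) for w in rays)


def intersect_hyperplane(rays, h):
    """One double description step: extreme rays of cone(rays) cut by h.x = 0."""
    pos = [r for r in rays if dot(r, h) > 0]
    neg = [r for r in rays if dot(r, h) < 0]
    new_rays = [r for r in rays if dot(r, h) == 0]
    for u in pos:
        for v in neg:
            if adjacent(u, v, rays):
                a, b = dot(u, h), -dot(v, h)  # both strictly positive
                # b*u + a*v is a non-negative combination that lies on h.x = 0
                combo = tuple(b * x + a * y for x, y in zip(u, v))
                new_rays.append(primitive(combo))
    return new_rays


def enumerate_rays(dim, equations):
    """Start from the unit vectors of the orthant and apply one equation at a time."""
    rays = [tuple(int(i == j) for j in range(dim)) for i in range(dim)]
    for h in equations:
        rays = intersect_hyperplane(rays, h)
    return rays


# Toy cone: x >= 0, x0 + x1 = x2 + x3, x0 = x1 (not a real matching system).
print(sorted(enumerate_rays(4, [(1, 1, -1, -1), (1, -1, 0, 0)])))
```

The intermediate ray lists can grow much larger than the final answer, which is the cost that optimizations of this method aim to reduce.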

    Simulation of Field Theories in Wavelet Representation

    The field is expanded in a wavelet series and the wavelet coefficients are varied in a simulation of the 2D $\phi^4$ field theory. The drastically reduced autocorrelations result in a substantial decrease of computing requirements, compared to those in local Metropolis simulations. A large part of the improvement is shown to be the result of an additional freedom in the choice of the allowed range of change at the Metropolis update of wavelet components, namely the range can be optimized independently for all wavelet sizes.
    Comment: 10 pages, LaTeX with 8 figures, Swansea preprint SWAT/3
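The idea of a Metropolis update with a separate proposal range per wavelet size can be sketched as follows. This is only an illustration under stated assumptions: it uses a 1D periodic chain rather than the paper's 2D lattice, Haar wavelets as the basis, a particular normalization of the action, and hypothetical function and parameter names (`m2`, `lam`, `step`). The action is recomputed in full for every proposal to keep the code short.

```python
import math
import random


def action(phi, m2, lam):
    """Lattice phi^4 action on a periodic 1D chain (simplification of the 2D case)."""
    n = len(phi)
    s = 0.0
    for x in range(n):
        d = phi[(x + 1) % n] - phi[x]
        s += 0.5 * d * d + 0.5 * m2 * phi[x] ** 2 + lam * phi[x] ** 4
    return s


def haar_wavelet(n, level, k):
    """Unit-norm Haar wavelet of support 2**level starting at site k * 2**level."""
    size = 2 ** level
    half = size // 2
    amp = 1.0 / math.sqrt(size)
    w = [0.0] * n
    for i in range(size):
        w[k * size + i] = amp if i < half else -amp
    return w


def sweep(phi, m2, lam, step, rng):
    """One Metropolis sweep over all modes; step[level] is that level's proposal range."""
    n = len(phi)
    levels = n.bit_length() - 1  # assumes n is a power of two
    accepted = [0] * (levels + 1)
    proposed = [0] * (levels + 1)
    s_old = action(phi, m2, lam)
    for level in range(levels + 1):
        if level == 0:
            # Index 0 holds the constant mode, which no Haar wavelet can change.
            modes = [[1.0 / math.sqrt(n)] * n]
        else:
            modes = [haar_wavelet(n, level, k) for k in range(n >> level)]
        for w in modes:
            delta = rng.uniform(-step[level], step[level])
            trial = [p + delta * wi for p, wi in zip(phi, w)]
            s_new = action(trial, m2, lam)
            proposed[level] += 1
            # Symmetric proposal, so the usual Metropolis acceptance rule applies.
            if s_new <= s_old or rng.random() < math.exp(s_old - s_new):
                phi[:] = trial
                s_old = s_new
                accepted[level] += 1
    return accepted, proposed


rng = random.Random(0)
field = [0.0] * 8
acc, prop = sweep(field, m2=-1.0, lam=0.25, step=[0.5, 1.0, 1.0, 1.0], rng=rng)
print(acc, prop)
```

Tuning each entry of `step` separately, for example toward a target acceptance rate per level, is the extra freedom the abstract refers to.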

    Format Abstraction for Sparse Tensor Algebra Compilers

    This paper shows how to build a sparse tensor algebra compiler that is agnostic to tensor formats (data layouts). We develop an interface that describes formats in terms of their capabilities and properties, and show how to build a modular code generator where new formats can be added as plugins. We then describe six implementations of the interface that compose to form the dense, CSR/CSF, COO, DIA, ELL, and HASH tensor formats and countless variants thereof. With these implementations at hand, our code generator can generate code to compute any tensor algebra expression on any combination of the aforementioned formats. To demonstrate our technique, we have implemented it in the taco tensor algebra compiler. Our modular code generator design makes it simple to add support for new tensor formats, and the performance of the generated code is competitive with hand-optimized implementations. Furthermore, by extending taco to support a wider range of formats specialized for different application and data characteristics, we can improve end-user application performance. For example, if input data is provided in the COO format, our technique allows computing a single matrix-vector multiplication directly with the data in COO, which is up to 3.6$\times$ faster than by first converting the data to CSR.
    Comment: Presented at OOPSLA 201
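The COO-versus-CSR example in the abstract can be illustrated with a small hand-written sketch. The Python below is not code generated by taco, and the function names are hypothetical. It only shows why a single matrix-vector product can run directly on COO triples, whereas the CSR route first needs a conversion pass over the same data.

```python
def spmv_coo(n_rows, rows, cols, vals, x):
    """y = A @ x computed directly from COO triples, in whatever order they arrive."""
    y = [0.0] * n_rows
    for r, c, v in zip(rows, cols, vals):
        y[r] += v * x[c]
    return y


def coo_to_csr(n_rows, rows, cols, vals):
    """Convert COO to CSR with a counting sort on the row index."""
    row_ptr = [0] * (n_rows + 1)
    for r in rows:
        row_ptr[r + 1] += 1
    for i in range(n_rows):
        row_ptr[i + 1] += row_ptr[i]
    col_idx = [0] * len(vals)
    csr_vals = [0.0] * len(vals)
    next_slot = row_ptr[:-1]  # slicing copies, so row_ptr itself is not modified
    for r, c, v in zip(rows, cols, vals):
        p = next_slot[r]
        col_idx[p] = c
        csr_vals[p] = v
        next_slot[r] += 1
    return row_ptr, col_idx, csr_vals


def spmv_csr(row_ptr, col_idx, vals, x):
    """y = A @ x for a CSR matrix: one dot product per row."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for p in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[p] * x[col_idx[p]]
        y.append(acc)
    return y


# A = [[1, 0, 2], [0, 0, 3], [4, 5, 0]], stored as unsorted COO triples.
rows = [2, 0, 1, 0, 2]
cols = [0, 2, 2, 0, 1]
vals = [4.0, 2.0, 3.0, 1.0, 5.0]
x = [1.0, 2.0, 3.0]

print(spmv_coo(3, rows, cols, vals, x))
print(spmv_csr(*coo_to_csr(3, rows, cols, vals), x))
```

For a single product, the conversion pass touches every nonzero before any arithmetic happens, so it is pure overhead. It only pays off when the CSR matrix is reused many times.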

    Fast Computation of Fourier Integral Operators

    We introduce a general purpose algorithm for rapidly computing certain types of oscillatory integrals which frequently arise in problems connected to wave propagation and general hyperbolic equations. The problem is to evaluate numerically a so-called Fourier integral operator (FIO) of the form $\int e^{2\pi i \Phi(x,\xi)} a(x,\xi) \hat{f}(\xi) \,\mathrm{d}\xi$ at points given on a Cartesian grid. Here, $\xi$ is a frequency variable, $\hat{f}(\xi)$ is the Fourier transform of the input $f$, $a(x,\xi)$ is an amplitude and $\Phi(x,\xi)$ is a phase function, which is typically as large as $|\xi|$; hence the integral is highly oscillatory at high frequencies. Because an FIO is a dense matrix, a naive matrix vector product with an input given on a Cartesian grid of size $N$ by $N$ would require $O(N^4)$ operations. This paper develops a new numerical algorithm which requires $O(N^{2.5} \log N)$ operations, and as low as $O(\sqrt{N})$ in storage space. It operates by localizing the integral over polar wedges with small angular aperture in the frequency plane. On each wedge, the algorithm factorizes the kernel $e^{2\pi i \Phi(x,\xi)} a(x,\xi)$ into two components: 1) a diffeomorphism which is handled by means of a nonuniform FFT and 2) a residual factor which is handled by numerical separation of the spatial and frequency variables. The key to the complexity and accuracy estimates is that the separation rank of the residual kernel is provably independent of the problem size. Several numerical examples demonstrate the efficiency and accuracy of the proposed methodology. We also discuss the potential of our ideas for various applications such as reflection seismology.
    Comment: 31 pages, 3 figure
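The $O(N^4)$ cost of the naive approach can be seen in a direct-summation sketch. The discretization below is an assumption made for illustration only: spatial points $x = (i_1/N, i_2/N)$, integer frequencies $0, \dots, N-1$ in each direction, and user-supplied `phase` and `amplitude` callables. It is the baseline the abstract describes, not the paper's fast algorithm.

```python
import cmath
import math


def naive_fio(f_hat, phase, amplitude):
    """Direct evaluation of sum_xi exp(2*pi*i*Phi(x, xi)) * a(x, xi) * f_hat(xi).

    f_hat is an N-by-N nested list. Four nested loops of length N give the
    O(N^4) operation count of the dense matrix-vector product.
    """
    n = len(f_hat)
    out = [[0j] * n for _ in range(n)]
    for i1 in range(n):
        for i2 in range(n):
            x = (i1 / n, i2 / n)
            total = 0j
            for k1 in range(n):
                for k2 in range(n):
                    xi = (k1, k2)
                    kernel = cmath.exp(2j * math.pi * phase(x, xi)) * amplitude(x, xi)
                    total += kernel * f_hat[k1][k2]
            out[i1][i2] = total
    return out


def plane_phase(x, xi):
    """Phi(x, xi) = x . xi, for which the FIO reduces to an inverse Fourier sum."""
    return x[0] * xi[0] + x[1] * xi[1]


def unit_amplitude(x, xi):
    return 1.0


# Sanity check input: a single frequency at xi = (1, 0) on a 4-by-4 grid.
N = 4
delta = [[0j] * N for _ in range(N)]
delta[1][0] = 1 + 0j
result = naive_fio(delta, plane_phase, unit_amplitude)
print(result[1][0])
```

With the plane phase the output is the pure oscillation $e^{2\pi i\, i_1/N}$, which gives a quick correctness check before trying a phase that is genuinely nonlinear in $x$.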