    Fast multi-dimensional scattered data approximation with Neumann boundary conditions

    An important problem in applications is the approximation of a function $f$ from a finite set of randomly scattered data $f(x_j)$. A common and powerful approach is to construct a trigonometric least squares approximation based on the set of exponentials $\{e^{2\pi i kx}\}$. This leads to fast numerical algorithms, but suffers from disturbing boundary effects due to the underlying periodicity assumption on the data, an assumption that is rarely satisfied in practice. To overcome this drawback we impose Neumann boundary conditions on the data. This implies the use of cosine polynomials $\cos(\pi kx)$ as basis functions. We show that scattered data approximation using cosine polynomials leads to a least squares problem involving certain Toeplitz+Hankel matrices. We derive estimates on the condition number of these matrices. Unlike other Toeplitz+Hankel matrices, the Toeplitz+Hankel matrices arising in our context cannot be diagonalized by the discrete cosine transform, but they still allow a fast matrix-vector multiplication via DCT which gives rise to fast conjugate gradient type algorithms. We show how the results can be generalized to higher dimensions. Finally we demonstrate the performance of the proposed method by applying it to a two-dimensional geophysical scattered data problem.
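
As a rough illustration of the least squares problem described in this abstract, the following one-dimensional sketch fits coefficients of $\cos(\pi kx)$ to scattered samples and solves the normal equations with a plain conjugate gradient loop. The function names, the sample points, and the test function are assumptions made for this example. The matrix-vector product here is dense; the fast DCT-based multiplication mentioned in the abstract is not reproduced. The product-to-sum identity for cosines is what yields the Toeplitz+Hankel structure in `cosine_normal_matrix`.

```python
import math

def cosine_normal_matrix(xs, degree):
    """Normal-equations matrix A^T A for the basis cos(pi*k*x), k = 0..degree."""
    # t[m] = 0.5 * sum_j cos(pi*m*x_j). By the product-to-sum identity,
    # (A^T A)[k][l] = t[|k-l|] (Toeplitz part) + t[k+l] (Hankel part).
    t = [0.5 * sum(math.cos(math.pi * m * x) for x in xs)
         for m in range(2 * degree + 1)]
    return [[t[abs(k - l)] + t[k + l] for l in range(degree + 1)]
            for k in range(degree + 1)]

def cosine_rhs(xs, fs, degree):
    """Right-hand side A^T f of the normal equations."""
    return [sum(math.cos(math.pi * k * x) * f for x, f in zip(xs, fs))
            for k in range(degree + 1)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(matrix, b, tol=1e-10, max_iter=200):
    """Textbook CG for a symmetric positive definite system (dense matvec)."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - M x with x = 0
    p = list(b)          # first search direction
    rs = dot(r, r)
    target = (tol ** 2) * dot(b, b)
    for _ in range(max_iter):
        if rs <= target:
            break
        mp = [dot(row, p) for row in matrix]
        alpha = rs / dot(p, mp)
        x = [xi + alpha * pv for xi, pv in zip(x, p)]
        r = [ri - alpha * mv for ri, mv in zip(r, mp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pv for ri, pv in zip(r, p)]
        rs = rs_new
    return x

def fit_cosine_series(xs, fs, degree):
    """Least squares coefficients c_k of sum_k c_k cos(pi*k*x)."""
    return conjugate_gradient(cosine_normal_matrix(xs, degree),
                              cosine_rhs(xs, fs, degree))

# Hypothetical scattered points in [0, 1) and samples of a known cosine sum.
xs = [(j * 0.6180339887) % 1.0 for j in range(1, 41)]
fs = [1.0 + 2.0 * math.cos(math.pi * x) - 0.5 * math.cos(3 * math.pi * x)
      for x in xs]
coeffs = fit_cosine_series(xs, fs, degree=3)
```

Because the samples lie exactly in the span of the basis, `coeffs` should come out close to the coefficients used to generate `fs`.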

    Wavelets and Fast Numerical Algorithms

    Wavelet based algorithms in numerical analysis are similar to other transform methods in that vectors and operators are expanded into a basis and the computations take place in this new system of coordinates. However, due to the recursive definition of wavelets, their controllable localization in both space and wave number (time and frequency) domains, and the vanishing moments property, wavelet based algorithms exhibit new and important properties. For example, the multiresolution structure of the wavelet expansions brings about an efficient organization of transformations on a given scale and of interactions between different neighbouring scales. Moreover, wide classes of operators which naively would require a full (dense) matrix for their numerical description have sparse representations in wavelet bases. For these operators sparse representations lead to fast numerical algorithms, and thus address a critical numerical issue. We note that wavelet based algorithms provide a systematic generalization of the Fast Multipole Method (FMM) and its descendants. These topics will be the subject of the lecture. Starting from the notion of multiresolution analysis, we will consider the so-called non-standard form (which achieves decoupling among the scales) and the associated fast numerical algorithms. Examples of non-standard forms of several basic operators (e.g. derivatives) will be computed explicitly.

    High-level synthesis optimization for blocked floating-point matrix multiplication

    In the last decade floating-point matrix multiplication on FPGAs has been studied extensively and efficient architectures as well as detailed performance models have been developed. By design these IP cores take a fixed footprint, which does not necessarily optimize the use of all available resources. Moreover, the low-level architectures are not easily amenable to a parameterized synthesis. In this paper high-level synthesis is used to fine-tune the configuration parameters in order to achieve the highest performance with maximal resource utilization. An exploration strategy is presented to optimize the use of critical resources (DSPs, memory) for any given FPGA. To account for the limited memory size on the FPGA, a block-oriented matrix multiplication is organized such that the block summation is done on the CPU while the block multiplication occurs on the logic fabric simultaneously. The communication overhead between the CPU and the FPGA is minimized by streaming the blocks in a Gray code ordering scheme which maximizes the data reuse for consecutive block matrix product calculations. Using high-level synthesis optimization, the programmable logic operates at 93% of the theoretical peak performance and the combined CPU-FPGA design achieves 76% of the available hardware processing speed for the floating-point multiplication of 2K by 2K matrices.
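
The block-oriented scheme in this abstract can be pictured with a small software-only sketch. The abstract does not spell out its Gray code ordering, so the traversal below is an assumption: a reflected (boustrophedon) order over block indices in which consecutive block products differ in exactly one index, so one operand block or the output block is always reused. The names `reflected_block_order` and `blocked_matmul` are hypothetical, and nothing here models the CPU/FPGA split or streaming.

```python
def reflected_block_order(nb):
    """Visit all block triples (i, j, k) so that neighbours differ in one index.

    Assumed illustration of a Gray-code-like ordering: the sweep direction of
    the inner indices is reversed each time an outer index advances.
    """
    order = []
    j_forward = True
    k_forward = True
    for i in range(nb):
        js = range(nb) if j_forward else range(nb - 1, -1, -1)
        for j in js:
            ks = range(nb) if k_forward else range(nb - 1, -1, -1)
            for k in ks:
                order.append((i, j, k))
            k_forward = not k_forward   # next k sweep starts where this one ended
        j_forward = not j_forward       # next j sweep starts where this one ended
    return order

def blocked_matmul(a, b, block):
    """Multiply square matrices (lists of lists) by accumulating block products."""
    n = len(a)
    assert n % block == 0, "matrix size must be a multiple of the block size"
    nb = n // block
    c = [[0.0] * n for _ in range(n)]
    for bi, bj, bk in reflected_block_order(nb):
        # Block multiplication: C[bi][bj] += A[bi][bk] * B[bk][bj]
        for i in range(bi * block, (bi + 1) * block):
            for j in range(bj * block, (bj + 1) * block):
                acc = 0.0
                for k in range(bk * block, (bk + 1) * block):
                    acc += a[i][k] * b[k][j]
                c[i][j] += acc          # block summation step
    return c

# Small usage example with integer-valued entries.
a = [[float(i + j) for j in range(4)] for i in range(4)]
b = [[float(i * j + 1) for j in range(4)] for i in range(4)]
c = blocked_matmul(a, b, block=2)
```

When only `k` changes between consecutive steps the output block is reused; when `j` changes the block of `a` is reused; when `i` changes the block of `b` is reused.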

    Dominance Product and High-Dimensional Closest Pair under $L_\infty$

    Given a set $S$ of $n$ points in $\mathbb{R}^d$, the Closest Pair problem is to find a pair of distinct points in $S$ at minimum distance. When $d$ is constant, there are efficient algorithms that solve this problem, and fast approximate solutions for general $d$. However, obtaining an exact solution in very high dimensions seems to be much less understood. We consider the high-dimensional $L_\infty$ Closest Pair problem, where $d=n^r$ for some $r > 0$, and the underlying metric is $L_\infty$. We improve and simplify previous results for $L_\infty$ Closest Pair, showing that it can be solved by a deterministic strongly-polynomial algorithm that runs in $O(DP(n,d)\log n)$ time, and by a randomized algorithm that runs in $O(DP(n,d))$ expected time, where $DP(n,d)$ is the time bound for computing the dominance product for $n$ points in $\mathbb{R}^d$. The dominance product is a matrix $D$ such that $D[i,j] = \bigl| \{k \mid p_i[k] \leq p_j[k]\} \bigr|$; this is the number of coordinates at which $p_j$ dominates $p_i$. For integer coordinates from some interval $[-M, M]$, we obtain an algorithm that runs in $\tilde{O}\left(\min\{Mn^{\omega(1,r,1)},\, DP(n,d)\}\right)$ time, where $\omega(1,r,1)$ is the exponent of multiplying an $n \times n^r$ matrix by an $n^r \times n$ matrix. We also give slightly better bounds for $DP(n,d)$, by using more recent rectangular matrix multiplication bounds. Computing the dominance product itself is an important task, since it is applied in many algorithms as a major black-box ingredient, such as algorithms for APBP (all pairs bottleneck paths), and variants of APSP (all pairs shortest paths).
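
To make the definition of the dominance product concrete, here is a direct evaluation of $D[i,j]$ from the formula in the abstract. This is only the naive $O(n^2 d)$ baseline, not the matrix-multiplication-based method behind the $DP(n,d)$ bound; the function name and the sample points are made up for the example.

```python
def dominance_product(points):
    """Return D with D[i][j] = number of coordinates k where p_i[k] <= p_j[k]."""
    n = len(points)
    return [[sum(1 for a, b in zip(points[i], points[j]) if a <= b)
             for j in range(n)]
            for i in range(n)]

# Three hypothetical points in R^3.
points = [(1, 5, 3), (2, 4, 3), (0, 6, 7)]
D = dominance_product(points)
# D == [[3, 2, 2], [2, 3, 2], [1, 1, 3]]
```

Each diagonal entry equals $d$, since every point dominates itself in all coordinates.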

    Lower Bounds for Online Integer Multiplication and Convolution in the Cell-Probe Model
