    Efficient Algorithms for Clustering and Interpolation of Large Spatial Data Sets

    Categorizing, analyzing, and integrating large spatial data sets are of great importance in various areas such as image processing, pattern recognition, remote sensing, and the life sciences. For example, NASA alone is faced with huge data sets gathered from around the globe on a daily basis to help scientists better understand our planet. Many approaches for accurately clustering, interpolating, and integrating these data sets are very computationally expensive. The focus of my PhD thesis is the development of efficient implementations of data clustering and interpolation methods for large spatial data sets, and the application of these methods to geostatistics and remote sensing. In particular, I have developed fast implementations of the ISODATA clustering and kriging interpolation algorithms. These implementations derive their efficiency from both exact and approximate computational techniques from computational geometry and scientific computing. My work on the ISODATA clustering algorithm employs the kd-tree data structure and the filtering algorithm to speed up distance and nearest neighbor calculations. In the case of kriging interpolation, I applied techniques from scientific computing including iterative methods, tapering, fast multipole methods, and nearest neighbor searching. I also present an application of the kriging interpolation method to the problem of data fusion of remotely sensed data.
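
    Since the abstract names the kd-tree and the filtering algorithm, here is a minimal sketch of that idea, assuming SciPy and synthetic data; it illustrates the technique only and is not the thesis implementation.

        # Sketch: accelerating the cluster-assignment step of an
        # ISODATA/k-means iteration with a k-d tree built over the
        # centres, avoiding the brute-force point-by-centre distance table.
        import numpy as np
        from scipy.spatial import cKDTree

        def assign_with_kdtree(points, centers):
            tree = cKDTree(centers)
            dists, labels = tree.query(points)  # nearest centre per point
            return labels, dists

        rng = np.random.default_rng(0)
        pts = rng.random((100_000, 2))    # large spatial data set (synthetic)
        ctrs = rng.random((16, 2))        # current cluster centres
        labels, dists = assign_with_kdtree(pts, ctrs)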

    A visual workspace for constructing hybrid MDS algorithms and coordinating multiple views

    Data can be distinguished according to volume, variable types and distribution, and each of these characteristics imposes constraints upon the choice of applicable algorithms for their visualisation. This has led to an abundance of often disparate algorithmic techniques. Previous work has shown that a hybrid algorithmic approach can be successful in addressing the impact of data volume on the feasibility of multidimensional scaling (MDS). This paper presents a system and framework in which a user can easily explore algorithms as well as their hybrid conjunctions and the data flowing through them. Visual programming and a novel algorithmic architecture let the user semi-automatically define data flows and the co-ordination of multiple views of algorithmic and visualisation components. We propose that our approach has two main benefits: significant improvements in run times of MDS algorithms can be achieved, and intermediate views of the data and the visualisation program structure can provide greater insight and control over the visualisation process.
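
    To make the hybrid approach concrete: run the expensive MDS only on a small sample, then place the remaining points cheaply relative to the sampled anchors. A minimal sketch, assuming scikit-learn and simple inverse-distance placement; the paper's actual algorithmic components and view coordination differ.

        # Sketch of a hybrid MDS pipeline: full MDS on a sample, cheap
        # anchor-based placement for the rest. Illustrative only.
        import numpy as np
        from sklearn.manifold import MDS

        def hybrid_mds(X, sample_size=200, seed=0):
            rng = np.random.default_rng(seed)
            idx = rng.choice(len(X), size=sample_size, replace=False)
            layout = MDS(n_components=2, random_state=seed).fit_transform(X[idx])
            out = np.empty((len(X), 2))
            out[idx] = layout
            for i in np.setdiff1d(np.arange(len(X)), idx):
                d = np.linalg.norm(X[idx] - X[i], axis=1)
                k = np.argsort(d)[:3]                  # three nearest anchors
                w = 1.0 / (d[k] + 1e-12)               # inverse-distance weights
                out[i] = (layout[k] * w[:, None]).sum(0) / w.sum()
            return out

        X = np.random.default_rng(1).random((2000, 10))
        Y = hybrid_mds(X)                              # 2-D layout of all points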

    Interpolating point spread function anisotropy

    Planned wide-field weak lensing surveys are expected to reduce the statistical errors on the shear field to unprecedented levels. In contrast, systematic errors like those induced by the convolution with the point spread function (PSF) will not benefit from that scaling effect and will require very accurate modeling and correction. While numerous methods have been devised to carry out the PSF correction itself, modeling of the PSF shape and its spatial variations across the instrument field of view has, so far, attracted much less attention. This step is nevertheless crucial because the PSF is only known at star positions while the correction has to be performed at any position on the sky. A reliable interpolation scheme is therefore mandatory and a popular approach has been to use low-order bivariate polynomials. In the present paper, we evaluate four other classical spatial interpolation methods based on splines (B-splines), inverse distance weighting (IDW), radial basis functions (RBF) and ordinary Kriging (OK). These methods are tested on the Star-challenge part of the GRavitational lEnsing Accuracy Testing 2010 (GREAT10) simulated data and are compared with the classical polynomial fitting (Polyfit). We also test all our interpolation methods independently of the way the PSF is modeled, by interpolating the GREAT10 star fields themselves (i.e., the PSF parameters are known exactly at star positions). We find in that case RBF to be the clear winner, closely followed by the other local methods, IDW and OK. The global methods, Polyfit and B-splines, are largely behind, especially in fields with (ground-based) turbulent PSFs. In fields with non-turbulent PSFs, all interpolators reach a variance on PSF systematics $\sigma_{sys}^2$ better than the $1\times10^{-7}$ upper bound expected by future space-based surveys, with the local interpolators performing better than the global ones.
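
    For concreteness, a hedged sketch of the winning local method: interpolating a PSF shape parameter, known only at star positions, to arbitrary field positions with radial basis functions. It assumes SciPy's RBFInterpolator and toy data, not the GREAT10 fields.

        # Sketch: RBF interpolation of a PSF ellipticity component across
        # the field of view, from star positions to arbitrary positions.
        import numpy as np
        from scipy.interpolate import RBFInterpolator

        rng = np.random.default_rng(1)
        star_xy = rng.uniform(0, 1, size=(500, 2))      # star positions
        e1_stars = 0.05 * np.sin(3 * star_xy[:, 0])     # toy PSF parameter

        rbf = RBFInterpolator(star_xy, e1_stars, kernel='thin_plate_spline')
        galaxy_xy = rng.uniform(0, 1, size=(10, 2))     # where the PSF is needed
        e1_galaxies = rbf(galaxy_xy)                    # interpolated values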

    Machine Learning for Neuroimaging with Scikit-Learn

    Statistical machine learning methods are increasingly used for neuroimaging data analysis. Their main virtue is their ability to model high-dimensional datasets, e.g. multivariate analysis of activation images or resting-state time series. Supervised learning is typically used in decoding or encoding settings to relate brain images to behavioral or clinical observations, while unsupervised learning can uncover hidden structures in sets of images (e.g. resting state functional MRI) or find sub-populations in large cohorts. By considering different functional neuroimaging applications, we illustrate how scikit-learn, a Python machine learning library, can be used to perform some key analysis steps. Scikit-learn contains a very large set of statistical learning algorithms, both supervised and unsupervised, and its application to neuroimaging data provides a versatile tool to study the brain.
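
    As a concrete taste of the decoding setting described above, a minimal scikit-learn pipeline predicting a two-condition label from simulated voxel features; the data here are random stand-ins, not a real neuroimaging experiment.

        # Sketch: cross-validated decoding with scikit-learn on toy data.
        import numpy as np
        from sklearn.svm import LinearSVC
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        X = rng.standard_normal((120, 5000))   # 120 scans x 5000 voxels (toy)
        y = rng.integers(0, 2, 120)            # two experimental conditions

        clf = make_pipeline(StandardScaler(), LinearSVC())
        scores = cross_val_score(clf, X, y, cv=5)  # ~chance here, by design
        print(scores.mean())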

    A virtual workspace for hybrid multidimensional scaling algorithms

    In visualising multidimensional data, it is well known that different types of data may require different algorithms to process them. Data sets might be distinguished according to volume, variable types and distribution, and each of these characteristics imposes constraints upon the choice of applicable algorithms for their visualisation. Previous work has shown that a hybrid algorithmic approach can be successful in addressing the impact of data volume on the feasibility of multidimensional scaling (MDS). This suggests that hybrid combinations of appropriate algorithms might also successfully address other characteristics of data. This paper presents a system and framework in which a user can easily explore hybrid algorithms and the data flowing through them. Visual programming and a novel algorithmic architecture let the user semi-automatically define data flows and the co-ordination of multiple views.

    Sparse optical flow regularisation for real-time visual tracking

    Optical flow can greatly improve the robustness of visual tracking algorithms. While dense optical flow algorithms have various applications, they cannot be used for real-time solutions without resorting to GPU calculations. Furthermore, most optical flow algorithms fail in challenging lighting environments due to the violation of the brightness constraint. We propose a simple but effective iterative regularisation scheme for real-time, sparse optical flow algorithms that is shown to be robust to sudden illumination changes and can handle large displacements. The algorithm proves to outperform well-known techniques on real-life video sequences, while being much faster to calculate. Our solution increases the robustness of a real-time particle-filter-based tracking application, consuming only a fraction of the available CPU power. Furthermore, a new and realistic optical flow dataset with annotated ground truth is created and made freely available for research purposes.
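
    For orientation, a hedged sketch of the kind of sparse, pyramidal Lucas-Kanade flow such a tracker builds on, using OpenCV; the paper's iterative regularisation step itself is not reproduced here, and the frame filenames are hypothetical.

        # Sketch: sparse pyramidal Lucas-Kanade optical flow with OpenCV.
        import cv2

        # Hypothetical consecutive frames from a tracking sequence.
        prev = cv2.imread('frame0.png', cv2.IMREAD_GRAYSCALE)
        curr = cv2.imread('frame1.png', cv2.IMREAD_GRAYSCALE)

        # Select good features in the previous frame, track into the current.
        pts = cv2.goodFeaturesToTrack(prev, maxCorners=200,
                                      qualityLevel=0.01, minDistance=7)
        nxt, status, err = cv2.calcOpticalFlowPyrLK(prev, curr, pts, None)
        flow = (nxt - pts)[status.ravel() == 1]   # displacement per feature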

    Solving Inverse Problems with Piecewise Linear Estimators: From Gaussian Mixture Models to Structured Sparsity

    A general framework for solving image inverse problems is introduced in this paper. The approach is based on Gaussian mixture models, estimated via a computationally efficient MAP-EM algorithm. A dual mathematical interpretation of the proposed framework with structured sparse estimation is described, which shows that the resulting piecewise linear estimate stabilizes the estimation when compared to traditional sparse inverse problem techniques. This interpretation also suggests an effective, dictionary-motivated initialization for the MAP-EM algorithm. We demonstrate that in a number of image inverse problems, including inpainting, zooming, and deblurring, the same algorithm produces results that are equal to, often significantly better than, or within a very small margin of the best published ones, at a lower computational cost.
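
    To spell out the piecewise linear estimate for a single degraded patch: each mixture component induces a Wiener-type linear filter, and the component that best explains the observation is selected. A toy numpy sketch under the model y = Ax + n (mixture weights omitted for brevity), not the paper's full MAP-EM loop.

        # Sketch: MAP estimate under a Gaussian mixture prior,
        # y = A x + n with n ~ N(0, sigma2 I) and x ~ sum_k N(mu_k, Sig_k).
        import numpy as np

        def gmm_map_estimate(y, A, sigma2, mus, Sigs):
            best, best_ll = None, -np.inf
            for mu, Sig in zip(mus, Sigs):
                C = A @ Sig @ A.T + sigma2 * np.eye(len(y))  # cov of y given k
                x_hat = mu + Sig @ A.T @ np.linalg.solve(C, y - A @ mu)
                r = y - A @ mu
                _, logdet = np.linalg.slogdet(C)
                ll = -0.5 * (logdet + r @ np.linalg.solve(C, r))
                if ll > best_ll:                  # keep best-fitting component
                    best, best_ll = x_hat, ll
            return best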

    A Multiscale Pyramid Transform for Graph Signals

    Multiscale transforms designed to process analog and discrete-time signals and images cannot be directly applied to analyze high-dimensional data residing on the vertices of a weighted graph, as they do not capture the intrinsic geometric structure of the underlying graph data domain. In this paper, we adapt the Laplacian pyramid transform for signals on Euclidean domains so that it can be used to analyze high-dimensional data residing on the vertices of a weighted graph. Our approach is to study existing methods and develop new methods for the four fundamental operations of graph downsampling, graph reduction, and filtering and interpolation of signals on graphs. Equipped with appropriate notions of these operations, we leverage the basic multiscale constructs and intuitions from classical signal processing to generate a transform that yields both a multiresolution of graphs and an associated multiresolution of a graph signal on the underlying sequence of graphs.
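
    For reference, one analysis/synthesis level of the classical Laplacian pyramid on a 1-D Euclidean signal, the construct the paper adapts to graphs; a toy sketch with trivial smoothing and zero-order interpolation, not the graph transform itself.

        # Sketch: one Laplacian pyramid level -- coarse approximation plus
        # prediction residual, with perfect reconstruction.
        import numpy as np

        def lp_analysis(x):
            coarse = 0.5 * (x[0::2] + x[1::2])     # smooth + downsample
            up = np.repeat(coarse, 2)              # zero-order interpolation
            return coarse, x - up                  # residual as detail

        def lp_synthesis(coarse, detail):
            return np.repeat(coarse, 2) + detail

        x = np.sin(np.linspace(0, 4 * np.pi, 64))
        c, d = lp_analysis(x)
        assert np.allclose(lp_synthesis(c, d), x)  # exact reconstruction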

    Algorithmic patterns for $\mathcal{H}$-matrices on many-core processors

    In this work, we consider the reformulation of hierarchical ($\mathcal{H}$) matrix algorithms for many-core processors, with a model implementation on graphics processing units (GPUs). $\mathcal{H}$-matrices approximate specific dense matrices, e.g., from discretized integral equations or kernel ridge regression, leading to log-linear time complexity in dense matrix-vector products. The parallelization of $\mathcal{H}$-matrix operations on many-core processors is difficult due to the complex nature of the underlying algorithms. While previous algorithmic advances for many-core hardware focused on accelerating existing $\mathcal{H}$-matrix CPU implementations with many-core processors, we here aim to rely entirely on that processor type. As our main contribution, we introduce the parallel algorithmic patterns needed to map the full $\mathcal{H}$-matrix construction and the fast matrix-vector product onto many-core hardware. Crucial ingredients are space-filling curves, parallel tree traversal and batching of linear algebra operations. The resulting model GPU implementation, hmglib, is, to the best of the authors' knowledge, the first entirely GPU-based open-source $\mathcal{H}$-matrix library of this kind. We conclude this work with an in-depth performance analysis and a comparative performance study against a standard $\mathcal{H}$-matrix library, highlighting profound speedups of our many-core parallel approach.
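
    To illustrate one named ingredient, the space-filling curve: points ordered along a Z-order (Morton) curve become memory-local, which eases parallel cluster-tree construction. A pure-Python sketch; hmglib's CUDA implementation differs.

        # Sketch: Z-order (Morton) ordering of 2-D points.
        import numpy as np

        def morton_key(ix, iy, bits=16):
            key = 0
            for b in range(bits):                 # interleave coordinate bits
                key |= ((ix >> b) & 1) << (2 * b)
                key |= ((iy >> b) & 1) << (2 * b + 1)
            return key

        rng = np.random.default_rng(0)
        pts = rng.random((1000, 2))
        grid = (pts * (2**16 - 1)).astype(np.uint32)   # quantise to a grid
        keys = [morton_key(int(x), int(y)) for x, y in grid]
        pts_sorted = pts[np.argsort(keys)]   # spatially local points adjacent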