
    A conjugate gradient algorithm for the astrometric core solution of Gaia

    The ESA space astrometry mission Gaia, planned to be launched in 2013, has been designed to make angular measurements on a global scale with micro-arcsecond accuracy. A key component of the data processing for Gaia is the astrometric core solution, which must implement an efficient and accurate numerical algorithm to solve the resulting, extremely large least-squares problem. The Astrometric Global Iterative Solution (AGIS) is a framework that makes it possible to implement a range of different iterative solution schemes suitable for a scanning astrometric satellite. In order to find a computationally efficient and numerically accurate iteration scheme for the astrometric solution, compatible with the AGIS framework, we study an adaptation of the classical conjugate gradient (CG) algorithm and compare it to the so-called simple iteration (SI) scheme that was previously known to converge for this problem, although very slowly. The different schemes are implemented within a software test bed for AGIS known as AGISLab, which allows one to define, simulate and study scaled astrometric core solutions. After successful testing in AGISLab, the CG scheme has also been implemented in AGIS. The two algorithms, CG and SI, eventually converge to identical solutions, to within the numerical noise (of the order of 0.00001 micro-arcsec). These solutions are independent of the starting values (initial star catalogue), and we conclude that they are equivalent to a rigorous least-squares estimation of the astrometric parameters. The CG scheme converges up to a factor of four faster than SI in the tested cases, and in particular spatially correlated truncation errors are damped out much more efficiently with the CG scheme.
    Comment: 24 pages, 16 figures. Accepted for publication in Astronomy & Astrophysics.
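
    The classical CG iteration the abstract adapts can be stated compactly. Below is a minimal NumPy sketch of CG applied to the normal equations N x = A^T b of a least-squares problem, with the matrix applied only through a matrix-vector product, as needed when the system is too large to form explicitly. The function names and the toy problem are illustrative, not the AGIS implementation.

        import numpy as np

        def conjugate_gradient(apply_N, rhs, x0, tol=1e-10, max_iter=100):
            # CG for N x = rhs with N symmetric positive definite,
            # e.g. N = A^T A from a least-squares problem.
            x = x0.copy()
            r = rhs - apply_N(x)               # initial residual
            p = r.copy()                       # first search direction
            rs_old = r @ r
            for _ in range(max_iter):
                Np = apply_N(p)
                alpha = rs_old / (p @ Np)      # optimal step along p
                x += alpha * p
                r -= alpha * Np
                rs_new = r @ r
                if np.sqrt(rs_new) < tol:
                    break
                p = r + (rs_new / rs_old) * p  # next conjugate direction
                rs_old = rs_new
            return x

        # Toy least-squares problem min ||A x - b||^2 via N = A^T A.
        rng = np.random.default_rng(0)
        A = rng.standard_normal((200, 50))
        b = rng.standard_normal(200)
        x = conjugate_gradient(lambda v: A.T @ (A @ v), A.T @ b, np.zeros(50))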

    Filling in CMB map missing data using constrained Gaussian realizations

    For analyzing maps of the cosmic microwave background sky, it is necessary to mask out the region around the galactic equator, where the parasitic foreground emission is strongest, as well as the brightest compact sources. Since many of the analyses of the data, particularly those searching for non-Gaussianity of a primordial origin, are most straightforwardly carried out on full-sky maps, it is of great interest to develop efficient algorithms for filling in the missing information in a plausible way. We explore practical algorithms for filling in based on constrained Gaussian realizations. Although carrying out such realizations is in principle straightforward, for finely pixelized maps such as will be required for the Planck analysis a direct brute-force method is not numerically tractable. We present some concrete solutions to this problem, both on a spatially flat sky with periodic boundary conditions and on the pixelized sphere. One approach is to solve the linear system with an appropriately preconditioned conjugate gradient method. While this approach was successfully implemented on a rectangular domain with periodic boundary conditions and worked even for very wide masked regions, we found that the method failed on the pixelized sphere, for reasons that we explain here. We present an approach that works for full-sky pixelized maps on the sphere, involving a kernel-based multi-resolution Laplace solver followed by a series of conjugate gradient corrections near the boundary of the mask.
    Comment: 22 pages, 14 figures, minor changes, a few missing references added.
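
    The "in principle straightforward" brute-force realization the abstract refers to is the standard Gaussian conditional: masked pixels are drawn from their distribution given the observed pixels. A small NumPy sketch on a toy 1-D "map" follows; the covariance model and mask are invented for illustration, and it is exactly this dense solve that becomes intractable for finely pixelized spheres, motivating the paper's preconditioned CG and multi-resolution methods.

        import numpy as np

        rng = np.random.default_rng(1)

        # Toy 1-D "map" with a smooth Gaussian covariance; the indices in
        # `miss` play the role of the masked galactic region.
        n = 64
        grid = np.arange(n)
        Sigma = np.exp(-0.5 * ((grid[:, None] - grid[None, :]) / 4.0) ** 2)
        Sigma += 1e-6 * np.eye(n)                  # jitter for stability

        sky = np.linalg.cholesky(Sigma) @ rng.standard_normal(n)
        miss = np.arange(25, 40)                   # masked pixels
        obs = np.setdiff1d(grid, miss)             # observed pixels

        # Gaussian conditional of the masked pixels given the observed ones.
        Soo = Sigma[np.ix_(obs, obs)]
        Smo = Sigma[np.ix_(miss, obs)]
        Smm = Sigma[np.ix_(miss, miss)]
        mean = Smo @ np.linalg.solve(Soo, sky[obs])
        cov = Smm - Smo @ np.linalg.solve(Soo, Smo.T)

        # One constrained realization: conditional mean plus a fluctuation.
        chol = np.linalg.cholesky(cov + 1e-6 * np.eye(len(miss)))
        fill = mean + chol @ rng.standard_normal(len(miss))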

    Learning Output Kernels for Multi-Task Problems

    Simultaneously solving multiple related learning tasks is beneficial under a variety of circumstances, but the prior knowledge necessary to correctly model task relationships is rarely available in practice. In this paper, we develop a novel kernel-based multi-task learning technique that automatically reveals structural inter-task relationships. Building on the framework of output kernel learning (OKL), we introduce a method that jointly learns multiple functions and a low-rank multi-task kernel by solving a non-convex regularization problem. Optimization is carried out via a block coordinate descent strategy, where each subproblem is solved using suitable conjugate gradient (CG) type iterative methods for linear operator equations. The effectiveness of the proposed approach is demonstrated on pharmacological and collaborative filtering data.
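
    The "linear operator equations" in the inner subproblems are plausibly of the Sylvester-like form K C L + λC = Y, coupling an input kernel K with the current output (task) kernel L; this is a guess at the structure rather than the authors' exact formulation. A minimal NumPy sketch of CG run directly on a matrix-valued unknown:

        import numpy as np

        def cg_operator(apply_op, B, tol=1e-6, max_iter=500):
            # CG for Op(C) = B, where Op is a symmetric positive definite
            # linear operator on matrices (Frobenius inner product).
            C = np.zeros_like(B)
            R = B - apply_op(C)
            P = R.copy()
            rs = np.sum(R * R)
            for _ in range(max_iter):
                Q = apply_op(P)
                alpha = rs / np.sum(P * Q)
                C += alpha * P
                R -= alpha * Q
                rs_new = np.sum(R * R)
                if np.sqrt(rs_new) < tol:
                    break
                P = R + (rs_new / rs) * P
                rs = rs_new
            return C

        # Inner OKL-style subproblem: solve K C L + lam * C = Y for the
        # coefficients C, given input kernel K and task kernel L (both PSD).
        rng = np.random.default_rng(2)
        n, T, lam = 100, 5, 0.1
        X = rng.standard_normal((n, 3))
        K = np.exp(-0.5 * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
        U = rng.standard_normal((T, 2))
        L = U @ U.T + 0.1 * np.eye(T)        # low-rank-plus-ridge task kernel
        Y = rng.standard_normal((n, T))

        C = cg_operator(lambda M: K @ M @ L + lam * M, Y)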

    Extension of Wirtinger's Calculus to Reproducing Kernel Hilbert Spaces and the Complex Kernel LMS

    Over the last decade, kernel methods for nonlinear processing have been used successfully in the machine learning community. The primary mathematical tool employed in these methods is the notion of the Reproducing Kernel Hilbert Space. However, so far, the emphasis has been on batch techniques. It is only recently that online techniques have been considered in the context of adaptive signal processing tasks. Moreover, these efforts have been focused only on real-valued data sequences. To the best of our knowledge, no adaptive kernel-based strategy has been developed, so far, for complex-valued signals. Furthermore, although real reproducing kernels are used in an increasing number of machine learning problems, complex kernels have not yet been used, in spite of their potential interest in applications that deal with complex signals, with communications being a typical example. In this paper, we present a general framework to attack the problem of adaptive filtering of complex signals, using either real reproducing kernels, taking advantage of a technique called complexification of real RKHSs, or complex reproducing kernels, highlighting the use of the complex Gaussian kernel. In order to derive gradients of operators that need to be defined on the associated complex RKHSs, we employ the powerful tool of Wirtinger's Calculus, which has recently attracted attention in the signal processing community. To this end, in this paper, the notion of Wirtinger's calculus is extended, for the first time, to include complex RKHSs, and we use it to derive several realizations of the Complex Kernel Least-Mean-Square (CKLMS) algorithm. Experiments verify that the CKLMS offers significant performance improvements over several linear and nonlinear algorithms when dealing with nonlinearities.
    Comment: 15 pages (double column), preprint of article accepted in IEEE Trans. Sig. Proc.
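
    To make the complexification route concrete, here is a minimal kernel-LMS sketch on complex data: a real Gaussian kernel evaluated on the complex samples viewed as points of R^2, with complex expansion coefficients updated LMS-style from the a-priori error. This is a schematic member of the CKLMS family under those assumptions, not the paper's exact derivation; the toy channel is invented.

        import numpy as np

        def gauss(a, b, sigma=1.0):
            # Real Gaussian kernel on complex scalars viewed as points of R^2.
            d = a - b
            v = np.array([d.real, d.imag])
            return np.exp(-(v @ v) / (2 * sigma ** 2))

        def cklms(x, d, mu=0.5, sigma=1.0):
            # Kernel LMS via complexification: real kernel, complex weights.
            centers, coeffs, err = [], [], []
            for n in range(len(d)):
                y = sum(c * gauss(xc, x[n], sigma)
                        for c, xc in zip(coeffs, centers))
                e = d[n] - y                   # complex a-priori error
                err.append(e)
                centers.append(x[n])           # grow the kernel expansion
                coeffs.append(mu * e)          # LMS-style coefficient
            return np.array(err)

        # Toy nonlinear complex channel identification.
        rng = np.random.default_rng(3)
        x = rng.standard_normal(300) + 1j * rng.standard_normal(300)
        noise = 0.05 * (rng.standard_normal(300)
                        + 1j * rng.standard_normal(300))
        d = x * np.abs(x) + noise
        e = cklms(x, d)                        # |e| should decay over time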

    FALKON: An Optimal Large Scale Kernel Method

    Kernel methods provide a principled way to perform nonlinear, nonparametric learning. They rely on solid functional analytic foundations and enjoy optimal statistical properties. However, at least in their basic form, they have limited applicability in large scale scenarios because of stringent computational requirements in terms of time and especially memory. In this paper, we take a substantial step in scaling up kernel methods, proposing FALKON, a novel algorithm that can efficiently process millions of points. FALKON is derived by combining several algorithmic principles, namely stochastic subsampling, iterative solvers and preconditioning. Our theoretical analysis shows that optimal statistical accuracy is achieved requiring essentially O(n) memory and O(n√n) time. An extensive experimental analysis on large scale datasets shows that, even with a single machine, FALKON outperforms previous state-of-the-art solutions, which exploit parallel/distributed architectures.
    Comment: NIPS 2017.
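
    Two of the three ingredients, Nyström-style stochastic subsampling and a CG iterative solver, fit in a short sketch: pick m ≪ n centers and solve the subsampled kernel ridge system (K_nm^T K_nm + λn K_mm) α = K_nm^T y iteratively. FALKON's distinctive Cholesky-based preconditioner is omitted here, so this is a simplified illustration of the setup, not the algorithm itself.

        import numpy as np

        def rbf(A, B, gamma=0.5):
            sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
            return np.exp(-gamma * sq)

        rng = np.random.default_rng(4)
        n, m, lam = 2000, 100, 1e-3
        X = rng.standard_normal((n, 5))
        y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)

        idx = rng.choice(n, size=m, replace=False)     # Nystrom subsampling
        Knm = rbf(X, X[idx])
        Kmm = rbf(X[idx], X[idx]) + 1e-8 * np.eye(m)

        # SPD system (Knm^T Knm + lam*n*Kmm) alpha = Knm^T y, solved by
        # plain (unpreconditioned) CG with matrix-free products.
        H = lambda v: Knm.T @ (Knm @ v) + lam * n * (Kmm @ v)
        b = Knm.T @ y
        alpha, r = np.zeros(m), b.copy()
        p, rs = r.copy(), r @ r
        for _ in range(200):
            Hp = H(p)
            step = rs / (p @ Hp)
            alpha += step * p
            r -= step * Hp
            rs_new = r @ r
            if np.sqrt(rs_new) < 1e-8 * np.sqrt(b @ b):
                break
            p = r + (rs_new / rs) * p
            rs = rs_new

        y_hat = Knm @ alpha                            # in-sample predictions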

    Multi-GPU maximum entropy image synthesis for radio astronomy

    The maximum entropy method (MEM) is a well-known deconvolution technique in radio interferometry. This method solves a non-linear optimization problem with an entropy regularization term. Other heuristics such as CLEAN are faster but highly user-dependent. Nevertheless, MEM has the following advantages: it is unsupervised, it has a statistical basis, and it offers better resolution and better image quality under certain conditions. This work presents a high performance GPU version of non-gridding MEM, which is tested using real and simulated data. We propose a single-GPU and a multi-GPU implementation for single and multi-spectral data, respectively. We also make use of the Peer-to-Peer and Unified Virtual Addressing features of newer GPUs, which make it possible to exploit multiple GPUs transparently and efficiently. Several ALMA data sets are used to demonstrate the effectiveness in imaging and to evaluate GPU performance. The results show that speedups of 1000 to 5000 times over a sequential version can be achieved, depending on data and image size. This makes it possible to reconstruct the HD142527 CO(6-5) short baseline data set in 2.1 minutes, instead of the 2.5 days taken by a sequential version on CPU.
    Comment: 11 pages, 13 figures.
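
    The optimization problem MEM solves can be illustrated on a toy 1-D deconvolution: minimize a chi-squared data-fidelity term plus a negative-entropy penalty Σ I_i log(I_i / m_i) under positivity. The NumPy sketch below uses plain projected gradient descent and a symmetric beam (so convolution serves as its own adjoint up to edge effects); the real non-gridding GPU implementation is far more involved.

        import numpy as np

        rng = np.random.default_rng(5)

        # Toy 1-D sky: a few point sources convolved with a beam, plus noise.
        n = 128
        true = np.zeros(n)
        true[[30, 70, 75]] = [1.0, 0.6, 0.8]
        beam = np.exp(-0.5 * (np.arange(-10, 11) / 3.0) ** 2)
        conv = lambda img: np.convolve(img, beam, mode="same")
        data = conv(true) + 0.01 * rng.standard_normal(n)

        # Minimize 0.5*||conv(I) - data||^2 + lam * sum(I * log(I / m))
        # by projected gradient descent, keeping the brightness positive.
        lam, m_def, step = 0.01, 1e-3, 0.01
        img = np.full(n, 0.1)
        for _ in range(5000):
            resid = conv(img) - data
            grad = conv(resid) + lam * (np.log(img / m_def) + 1.0)
            img = np.clip(img - step * grad, 1e-8, None)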

    Generalization Error Bounds of Gradient Descent for Learning Over-parameterized Deep ReLU Networks

    Empirical studies show that gradient-based methods can learn deep neural networks (DNNs) with very good generalization performance in the over-parameterization regime, where DNNs can easily fit a random labeling of the training data. Very recently, a line of work has explained in theory that, with over-parameterization and proper random initialization, gradient-based methods can find the global minima of the training loss for DNNs. However, existing generalization error bounds are unable to explain the good generalization performance of over-parameterized DNNs. The major limitation of most existing generalization bounds is that they are based on uniform convergence and are independent of the training algorithm. In this work, we derive an algorithm-dependent generalization error bound for deep ReLU networks, and show that under certain assumptions on the data distribution, gradient descent (GD) with proper random initialization is able to train a sufficiently over-parameterized DNN to achieve arbitrarily small generalization error. Our work sheds light on explaining the good generalization performance of over-parameterized deep neural networks.
    Comment: 27 pages. This version simplifies the proof and improves the presentation of Version 3. In AAAI 2020.
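
    The over-parameterization regime the abstract describes is easy to exhibit: with width far exceeding the sample count, plain GD from random initialization drives the training loss toward zero even on random labels. A minimal NumPy sketch of a two-layer ReLU network (first layer trained, output layer fixed, a common simplification in this literature; not the paper's exact setting):

        import numpy as np

        rng = np.random.default_rng(6)

        # Width m far exceeds sample count n: the over-parameterized regime.
        n, dim, m, lr = 20, 5, 2000, 0.1
        X = rng.standard_normal((n, dim))
        X /= np.linalg.norm(X, axis=1, keepdims=True)     # unit-norm inputs
        y = rng.choice([-1.0, 1.0], size=n)               # random labels

        W = rng.standard_normal((m, dim)) / np.sqrt(dim)  # random init
        a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)  # fixed top layer

        for _ in range(5000):
            H = np.maximum(X @ W.T, 0.0)       # ReLU features, shape (n, m)
            err = H @ a - y
            # gradient of 0.5*||H a - y||^2 with respect to W only
            G = ((err[:, None] * (H > 0)) * a[None, :]).T @ X
            W -= lr * G

        loss = 0.5 * np.sum((np.maximum(X @ W.T, 0.0) @ a - y) ** 2)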