A conjugate gradient algorithm for the astrometric core solution of Gaia
The ESA space astrometry mission Gaia, planned to be launched in 2013, has
been designed to make angular measurements on a global scale with
micro-arcsecond accuracy. A key component of the data processing for Gaia is
the astrometric core solution, which must implement an efficient and accurate
numerical algorithm to solve the resulting, extremely large least-squares
problem. The Astrometric Global Iterative Solution (AGIS) is a framework that
allows a range of different iterative solution schemes suitable for a scanning
astrometric satellite to be implemented. In order to find a computationally
efficient and numerically accurate iteration scheme for the astrometric
solution, compatible with the AGIS framework, we study an adaptation of the
classical conjugate gradient (CG) algorithm, and compare it to the so-called
simple iteration (SI) scheme that was previously known to converge for this
problem, although very slowly. The different schemes are implemented within a
software test bed for AGIS known as AGISLab, which allows scaled astrometric
core solutions to be defined, simulated and studied. After successful testing in
AGISLab, the CG scheme has also been implemented in AGIS. The two algorithms CG
and SI eventually converge to identical solutions, to within the numerical
noise (of the order of 0.00001 micro-arcsec). These solutions are independent
of the starting values (initial star catalogue), and we conclude that they are
equivalent to a rigorous least-squares estimation of the astrometric
parameters. The CG scheme converges up to a factor four faster than SI in the
tested cases, and in particular spatially correlated truncation errors are much
more efficiently damped out with the CG scheme.
Comment: 24 pages, 16 figures. Accepted for publication in Astronomy &
Astrophysic
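The contrast between CG and a slowly converging fixed-point scheme can be illustrated on a toy least-squares problem. The sketch below is not the AGIS or AGISLab implementation: the abstract does not specify the SI scheme, so a plain Richardson iteration stands in for it, and the small quadratic-fit problem (`J`, `y`) and all function names are hypothetical.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normal_equations(J, y):
    """Form N = J^T J and b = J^T y for the problem min ||J x - y||^2."""
    n = len(J[0])
    N = [[sum(row[i] * row[j] for row in J) for j in range(n)] for i in range(n)]
    b = [sum(row[i] * yi for row, yi in zip(J, y)) for i in range(n)]
    return N, b


def conjugate_gradient(N, b, tol=1e-8, max_iter=1000):
    """Classical CG for N x = b, with N symmetric positive definite."""
    x = [0.0] * len(b)
    r = list(b)   # residual b - N x for the starting guess x = 0
    p = list(r)   # first search direction
    rs = dot(r, r)
    for k in range(max_iter):
        if rs ** 0.5 < tol:
            return x, k
        Np = matvec(N, p)
        alpha = rs / dot(p, Np)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, Np)]
        rs_new = dot(r, r)
        beta = rs_new / rs
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x, max_iter


def simple_iteration(N, b, tol=1e-8, max_iter=100000):
    """Richardson iteration x <- x + w (b - N x), an assumed stand-in for SI.

    The step w = 1 / trace(N) keeps w * eigenvalue in (0, 1) for a positive
    definite N of size >= 2, so the iteration converges, but slowly.
    """
    w = 1.0 / sum(N[i][i] for i in range(len(b)))
    x = [0.0] * len(b)
    for k in range(max_iter):
        r = [bi - qi for bi, qi in zip(b, matvec(N, x))]
        if dot(r, r) ** 0.5 < tol:
            return x, k
        x = [xi + w * ri for xi, ri in zip(x, r)]
    return x, max_iter


# Hypothetical toy problem: fit 1 + 2 t + 3 t^2 at t = 0, 1, 2, 3.
ts = (0.0, 1.0, 2.0, 3.0)
J = [[1.0, t, t * t] for t in ts]
y = [1.0 + 2.0 * t + 3.0 * t * t for t in ts]
N, b = normal_equations(J, y)
x_cg, it_cg = conjugate_gradient(N, b)
x_si, it_si = simple_iteration(N, b)
print("CG iterations:", it_cg, "simple iterations:", it_si)
```

Both schemes reach the same least-squares solution here; only the iteration counts differ. The gap on this toy problem says nothing about the factor reported in the abstract, which refers to the actual astrometric solution.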
Filling in CMB map missing data using constrained Gaussian realizations
For analyzing maps of the cosmic microwave background sky, it is necessary to
mask out the region around the galactic equator where the parasitic foreground
emission is strongest as well as the brightest compact sources. Since many of
the analyses of the data, particularly those searching for non-Gaussianity of a
primordial origin, are most straightforwardly carried out on full-sky maps, it
is of great interest to develop efficient algorithms for filling in the missing
information in a plausible way. We explore practical algorithms for filling in
based on constrained Gaussian realizations. Although carrying out such
realizations is in principle straightforward, for finely pixelized maps, as will
be required for the Planck analysis, a direct brute-force method is not
numerically tractable. We present some concrete solutions to this problem, both
on a spatially flat sky with periodic boundary conditions and on the pixelized
sphere. One approach is to solve the linear system with an appropriately
preconditioned conjugate gradient method. While this approach was successfully
implemented on a rectangular domain with periodic boundary conditions and
worked even for very wide masked regions, we found that the method failed on
the pixelized sphere for reasons that we explain here. We present an approach
that works for full-sky pixelized maps on the sphere involving a kernel-based
multi-resolution Laplace solver followed by a series of conjugate gradient
corrections near the boundary of the mask.
Comment: 22 pages, 14 figures, minor changes, a few missing references adde
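For orientation, a constrained Gaussian realization is commonly written in the Hoffman-Ribak form below. This is a sketch under assumptions not stated in the abstract: a zero-mean Gaussian field with known covariance $C$, noise-free observed pixels (index $o$), masked pixels (index $m$), and an unconstrained random realization $r$ drawn from the same covariance. The paper's own notation and formulation may differ.

```latex
% Fill-in values on the masked pixels, given the observed pixels x_o:
\begin{equation}
  x_m \;=\; r_m \;+\; C_{mo}\, C_{oo}^{-1}\,\bigl(x_o - r_o\bigr)
\end{equation}
% The expensive step is the linear solve
\begin{equation}
  C_{oo}\, z \;=\; x_o - r_o ,
\end{equation}
% after which x_m = r_m + C_{mo} z.
```

Under these assumptions the result has mean $C_{mo} C_{oo}^{-1} x_o$ and covariance $C_{mm} - C_{mo} C_{oo}^{-1} C_{om}$. The solve for $z$ is the kind of large linear system for which a preconditioned conjugate gradient method is a natural candidate.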
Learning Output Kernels for Multi-Task Problems
Simultaneously solving multiple related learning tasks is beneficial under a
variety of circumstances, but the prior knowledge necessary to correctly model
task relationships is rarely available in practice. In this paper, we develop a
novel kernel-based multi-task learning technique that automatically reveals
structural inter-task relationships. Building on the framework of output
kernel learning (OKL), we introduce a method that jointly learns multiple
functions and a low-rank multi-task kernel by solving a non-convex
regularization problem. Optimization is carried out via a block coordinate
descent strategy, where each subproblem is solved using suitable conjugate
gradient (CG) type iterative methods for linear operator equations. The
effectiveness of the proposed approach is demonstrated on pharmacological and
collaborative filtering data
Extension of Wirtinger's Calculus to Reproducing Kernel Hilbert Spaces and the Complex Kernel LMS
Over the last decade, kernel methods for nonlinear processing have
successfully been used in the machine learning community. The primary
mathematical tool employed in these methods is the notion of the Reproducing
Kernel Hilbert Space. However, so far, the emphasis has been on batch
techniques. Only recently have online techniques been considered in
the context of adaptive signal processing tasks. Moreover, these efforts have
focused only on real-valued data sequences. To the best of our knowledge,
no adaptive kernel-based strategy has so far been developed for
complex-valued signals. Furthermore, although real reproducing kernels are used in
an increasing number of machine learning problems, complex kernels have not
yet been used, in spite of their potential interest in applications that deal
with complex signals, with communications being a typical example. In this
paper, we present a general framework to attack the problem of adaptive
filtering of complex signals, using either real reproducing kernels, taking
advantage of a technique called complexification of real RKHSs, or
complex reproducing kernels, highlighting the use of the complex Gaussian
kernel. In order to derive gradients of operators that need to be defined on
the associated complex RKHSs, we employ the powerful tool of Wirtinger's
Calculus, which has recently attracted attention in the signal processing
community. To this end, in this paper, the notion of Wirtinger's calculus is
extended, for the first time, to include complex RKHSs and is used to derive
several realizations of the Complex Kernel Least-Mean-Square (CKLMS) algorithm.
Experiments verify that the CKLMS offers significant performance improvements
over several linear and nonlinear algorithms, when dealing with nonlinearities.
Comment: 15 pages (double column), preprint of article accepted in IEEE Trans.
Sig. Pro
FALKON: An Optimal Large Scale Kernel Method
Kernel methods provide a principled way to perform nonlinear, nonparametric
learning. They rely on solid functional analytic foundations and enjoy optimal
statistical properties. However, at least in their basic form, they have
limited applicability in large scale scenarios because of stringent
computational requirements in terms of time and especially memory. In this
paper, we take a substantial step in scaling up kernel methods, proposing
FALKON, a novel algorithm that can efficiently process millions of
points. FALKON is derived by combining several algorithmic principles, namely
stochastic subsampling, iterative solvers and preconditioning. Our theoretical
analysis shows that optimal statistical accuracy is achieved requiring
essentially $O(n)$ memory and $O(n\sqrt{n})$ time. An extensive experimental
analysis on large scale datasets shows that, even with a single machine, FALKON
outperforms previous state of the art solutions, which exploit
parallel/distributed architectures.
Comment: NIPS 201
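To make the combination of subsampling and iterative solvers concrete, the following is a sketch of the linear system that arises in kernel ridge regression restricted to $M \ll n$ subsampled centers (a Nyström-type approximation). All symbols are assumptions introduced here for illustration: $K_{nM}$ is the kernel matrix between the $n$ training points and the $M$ centers $\tilde x_j$, $K_{MM}$ is the kernel matrix among the centers, $\lambda$ is the regularization parameter and $y$ the vector of training targets. The abstract does not give the exact formulation or the preconditioner.

```latex
% Model restricted to the subsampled centers:
\begin{equation}
  \hat f(x) \;=\; \sum_{j=1}^{M} \alpha_j \, k(x, \tilde x_j)
\end{equation}
% Regularized empirical risk:
\begin{equation}
  \min_{\alpha \in \mathbb{R}^M} \;
  \frac{1}{n}\,\bigl\| K_{nM}\,\alpha - y \bigr\|^2
  \;+\; \lambda\, \alpha^{\top} K_{MM}\, \alpha
\end{equation}
% Setting the gradient to zero gives an M x M linear system:
\begin{equation}
  \bigl( K_{nM}^{\top} K_{nM} + \lambda n\, K_{MM} \bigr)\, \alpha
  \;=\; K_{nM}^{\top} y
\end{equation}
```

An iterative solver such as conjugate gradient needs only matrix-vector products with $K_{nM}$ and $K_{nM}^{\top}$, costing $O(nM)$ kernel evaluations per iteration, and these can be computed in blocks without storing $K_{nM}$. A preconditioner then reduces the number of iterations required.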
Multi-GPU maximum entropy image synthesis for radio astronomy
The maximum entropy method (MEM) is a well known deconvolution technique in
radio-interferometry. This method solves a non-linear optimization problem with
an entropy regularization term. Other heuristics such as CLEAN are faster but
highly user dependent. Nevertheless, MEM has the following advantages: it is
unsupervised, it has a statistical basis, it has a better resolution and better
image quality under certain conditions. This work presents a high performance
GPU version of non-gridding MEM, which is tested using real and simulated data.
We propose a single-GPU and a multi-GPU implementation for single and
multi-spectral data, respectively. We also make use of the Peer-to-Peer and
Unified Virtual Addressing features of newer GPUs, which allow multiple GPUs to
be exploited transparently and efficiently. Several ALMA data sets are used to
demonstrate the effectiveness in imaging and to evaluate GPU performance. The
results show that a speedup of 1000 to 5000 times over a sequential
version can be achieved, depending on data and image size. This allows the
HD142527 CO(6-5) short baseline data set to be reconstructed in 2.1 minutes,
instead of the 2.5 days taken by a sequential version on CPU.
Comment: 11 pages, 13 figure
Generalization Error Bounds of Gradient Descent for Learning Over-parameterized Deep ReLU Networks
Empirical studies show that gradient-based methods can learn deep neural
networks (DNNs) with very good generalization performance in the
over-parameterization regime, where DNNs can easily fit a random labeling of
the training data. Very recently, a line of work explains in theory that with
over-parameterization and proper random initialization, gradient-based methods
can find the global minima of the training loss for DNNs. However, existing
generalization error bounds are unable to explain the good generalization
performance of over-parameterized DNNs. The major limitation of most existing
generalization bounds is that they are based on uniform convergence and are
independent of the training algorithm. In this work, we derive an
algorithm-dependent generalization error bound for deep ReLU networks, and show
that under certain assumptions on the data distribution, gradient descent (GD)
with proper random initialization is able to train a sufficiently
over-parameterized DNN to achieve arbitrarily small generalization error. Our
work sheds light on explaining the good generalization performance of
over-parameterized deep neural networks.
Comment: 27 pages. This version simplifies the proof and improves the
presentation in Version 3. In AAAI 202