A randomised primal-dual algorithm for distributed radio-interferometric imaging
Next generation radio telescopes, like the Square Kilometre Array, will
acquire an unprecedented amount of data for radio astronomy. The development of
fast, parallelisable or distributed algorithms for handling such large-scale
data sets is of prime importance. Motivated by this, we investigate herein a
convex optimisation algorithmic structure, based on primal-dual
forward-backward iterations, for solving the radio interferometric imaging
problem. It can encompass any convex prior of interest. It allows for the
distributed processing of the measured data and introduces further flexibility
by employing a probabilistic approach for the selection of the data blocks used
at a given iteration. We study the reconstruction performance with respect to
the data distribution and we propose the use of nonuniform probabilities for
the randomised updates. Our simulations show the feasibility of the
randomisation given a limited computing infrastructure as well as important
computational advantages when compared to state-of-the-art algorithmic
structures.
Comment: 5 pages, 3 figures, Proceedings of the European Signal Processing Conference (EUSIPCO) 2016. Related journal publication available at https://arxiv.org/abs/1601.0402
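To make the algorithmic structure described above concrete, here is a minimal NumPy sketch of a randomised primal-dual forward-backward iteration for an imaging problem with a simple ℓ1 prior on the image and per-block ℓ2-ball data-fidelity constraints, where each data block is visited only with a prescribed probability. It is an illustration of the idea, not the authors' exact algorithm; all names and parameters (Phi, y, eps, probs, tau, sigma, lam) are assumptions.

```python
# Minimal sketch (assumed setup, not the paper's exact algorithm):
#   min_x  lam * ||x||_1   s.t.  ||Phi_i x - y_i||_2 <= eps_i  for each data block i,
# solved with a primal-dual forward-backward iteration whose dual (data) blocks
# are updated at random with per-block probabilities.
import numpy as np

def soft_threshold(z, t):
    """Proximity operator of t * ||.||_1 (the sparsity prior)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def project_l2_ball(r, centre, radius):
    """Project r onto the l2 ball {z : ||z - centre|| <= radius} (data fidelity)."""
    d = r - centre
    n = np.linalg.norm(d)
    return r if n <= radius else centre + d * (radius / n)

def randomised_pdfb(Phi, y, eps, probs, n, lam=1e-3, tau=0.49, sigma=0.49,
                    n_iter=200, seed=0):
    """Phi: list of per-block measurement matrices, y: per-block data,
    eps: per-block l2-ball radii, probs: per-block update probabilities."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    v = [np.zeros_like(yi) for yi in y]            # one dual variable per data block
    for _ in range(n_iter):
        # primal step: forward step through the duals, backward (prox) step on the prior
        back = sum(P.T @ vi for P, vi in zip(Phi, v))
        x_new = soft_threshold(x - tau * back, tau * lam)
        x_bar = 2.0 * x_new - x                    # extrapolated point used by the dual steps
        # randomised dual steps: block i is updated only with probability probs[i]
        for i, (P, yi, ei, pi) in enumerate(zip(Phi, y, eps, probs)):
            if rng.random() < pi:
                r = v[i] / sigma + P @ x_bar
                v[i] = sigma * (r - project_l2_ball(r, yi, ei))   # via the Moreau identity
        x = x_new
    return x
```

Setting all probabilities to 1 recovers the deterministic primal-dual forward-backward structure; lowering a block's probability reduces how often that block's data must be processed per iteration, which is the flexibility the abstract exploits for limited computing infrastructures.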
High-performance Kernel Machines with Implicit Distributed Optimization and Randomization
To fully utilize "big data", one often needs "big models". Such models tend to
grow with the complexity and size of the training data, and do not make strong
parametric assumptions upfront about the nature of
the underlying statistical dependencies. Kernel methods fit this need well, as
they constitute a versatile and principled statistical methodology for solving
a wide range of non-parametric modelling problems. However, their high
computational costs (in storage and time) pose a significant barrier to their
widespread adoption in big data applications.
We propose an algorithmic framework and high-performance implementation for
massive-scale training of kernel-based statistical models, based on combining
two key technical ingredients: (i) distributed general purpose convex
optimization, and (ii) the use of randomization to improve the scalability of
kernel methods. Our approach is based on a block-splitting variant of the
Alternating Direction Method of Multipliers (ADMM), carefully reconfigured to handle
very large random feature matrices, while exploiting hybrid parallelism
typically found in modern clusters of multicore machines. Our implementation
supports a variety of statistical learning tasks by enabling several loss
functions, regularization schemes, kernels, and layers of randomized
approximations for both dense and sparse datasets, in a highly extensible
framework. We evaluate the ability of our framework to learn models on data
from applications, and provide a comparison against existing sequential and
parallel libraries.
Comment: Work presented at MMDS 2014 (June 2014) and JSM 201
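As a concrete illustration of the two ingredients this abstract combines, the sketch below builds random Fourier features approximating a Gaussian kernel and then fits a ridge-regression model with a simple consensus-ADMM loop over data partitions. It is a toy, single-process stand-in for the distributed, block-splitting implementation described above; all names and parameters (D, sigma, lam, rho, the four "blocks") are illustrative assumptions.

```python
# Minimal sketch (assumed setup, not the paper's implementation): random Fourier
# features for a Gaussian kernel + consensus-ADMM ridge regression over data blocks.
import numpy as np

def random_fourier_features(X, D, sigma, rng):
    """Map X (n x d) to D features so that z(x).z(x') ~ exp(-||x - x'||^2 / (2 sigma^2))."""
    d = X.shape[1]
    W = rng.normal(scale=1.0 / sigma, size=(d, D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

def consensus_admm_ridge(Z_blocks, y_blocks, lam=1.0, rho=1.0, n_iter=50):
    """Solve min_w sum_i ||Z_i w - y_i||^2 + lam ||w||^2 with per-block local solves."""
    D = Z_blocks[0].shape[1]
    N = len(Z_blocks)
    z = np.zeros(D)
    w = [np.zeros(D) for _ in range(N)]
    u = [np.zeros(D) for _ in range(N)]
    # Pre-factorise each block's local system (2 Z_i^T Z_i + rho I) once.
    chol = [np.linalg.cholesky(2.0 * Zi.T @ Zi + rho * np.eye(D)) for Zi in Z_blocks]
    for _ in range(n_iter):
        for i, (Zi, yi) in enumerate(zip(Z_blocks, y_blocks)):
            rhs = 2.0 * Zi.T @ yi + rho * (z - u[i])
            w[i] = np.linalg.solve(chol[i].T, np.linalg.solve(chol[i], rhs))  # local solve
        z = rho * sum(wi + ui for wi, ui in zip(w, u)) / (2.0 * lam + N * rho)  # consensus
        for i in range(N):
            u[i] += w[i] - z                                                    # dual update
    return z

# Tiny usage example on synthetic data, pretending four workers each hold one block.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=400)
Z = random_fourier_features(X, D=200, sigma=1.0, rng=rng)
blocks = np.array_split(np.arange(400), 4)
w = consensus_admm_ridge([Z[b] for b in blocks], [y[b] for b in blocks])
```

In a genuinely distributed setting the per-block w-updates would run on separate workers, with only the consensus variable z and the block contributions exchanged between iterations; the randomized feature map is what keeps the per-block problems small enough for this to scale.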