
    Distributed Regression in Sensor Networks: Training Distributively with Alternating Projections

    Wireless sensor networks (WSNs) have attracted considerable attention in recent years and motivate a host of new challenges for distributed signal processing. The problem of distributed or decentralized estimation has often been considered in the context of parametric models. However, the success of parametric methods is limited by the appropriateness of the strong statistical assumptions they make. In this paper, a more flexible nonparametric model for distributed regression is considered that is applicable in a variety of WSN applications, including field estimation. Starting with the standard regularized kernel least-squares estimator, a message-passing algorithm for distributed estimation in WSNs is derived. The algorithm can be viewed as an instantiation of the successive orthogonal projection (SOP) algorithm. Various practical aspects of the algorithm are discussed, and several numerical simulations validate the potential of the approach.
    Comment: To appear in the Proceedings of the SPIE Conference on Advanced Signal Processing Algorithms, Architectures and Implementations XV, San Diego, CA, July 31 - August 4, 2005
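    For context on the starting point the abstract names, the following is a minimal numpy sketch of the centralized regularized kernel least-squares estimator (solving alpha = (K + lam*n*I)^{-1} y); the paper's contribution, a distributed message-passing SOP variant, is not reproduced here. The RBF kernel, the lengthscale, and the regularization weight lam are illustrative assumptions, not the paper's settings.

    ```python
    import numpy as np

    def rbf_kernel(X1, X2, lengthscale=0.5):
        """Squared-exponential kernel matrix between two sets of 1-D inputs."""
        d2 = (X1[:, None] - X2[None, :]) ** 2
        return np.exp(-0.5 * d2 / lengthscale**2)

    def kernel_ridge_fit(X, y, lam=1e-2, lengthscale=0.5):
        """Regularized kernel least squares: solve (K + lam*n*I) alpha = y."""
        n = len(X)
        K = rbf_kernel(X, X, lengthscale)
        return np.linalg.solve(K + lam * n * np.eye(n), y)

    # Toy field-estimation data: a noisy sine observed at scattered sensor locations.
    rng = np.random.default_rng(0)
    X = rng.uniform(0, 1, 40)
    y = np.sin(2 * np.pi * X) + 0.1 * rng.standard_normal(40)

    alpha = kernel_ridge_fit(X, y)
    x_test = np.linspace(0, 1, 5)
    f_hat = rbf_kernel(x_test, X) @ alpha   # predicted field values at test points
    ```

    In the centralized form above, all sensor measurements are gathered in one place; the message-passing algorithm instead lets each sensor update the shared estimate using only its local data, with the SOP view interpreting each local update as a projection.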

    Replica theory for learning curves for Gaussian processes on random graphs

    Statistical physics approaches can be used to derive accurate predictions for the performance of inference methods learning from potentially noisy data, as quantified by the learning curve, defined as the average error versus the number of training examples. We analyse a challenging problem in the area of non-parametric inference, where an effectively infinite number of parameters has to be learned: specifically, Gaussian process regression. When the inputs are vertices on a random graph and the outputs are noisy function values, we show that replica techniques can be used to obtain exact performance predictions in the limit of large graphs. The covariance of the Gaussian process prior is defined by a random walk kernel, the discrete analogue of squared exponential kernels on continuous spaces. Conventionally this kernel is normalised only globally, so that the prior variance can differ between vertices; as a more principled alternative we consider local normalisation, where the prior variance is uniform.
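    A minimal numpy sketch of the two normalisations the abstract contrasts, using the common form K proportional to (I - L/a)^p of the random walk kernel, where L is the normalized graph Laplacian; the hyperparameter names a and p and the Erdős–Rényi test graph are assumptions for illustration, not necessarily the paper's setup.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    n, p_edge = 200, 0.05
    A = (rng.random((n, n)) < p_edge).astype(float)
    A = np.triu(A, 1)
    A = A + A.T                                      # symmetric adjacency, no self-loops
    deg = A.sum(axis=1)
    deg[deg == 0] = 1.0                              # guard against isolated vertices
    Dinv_sqrt = np.diag(1.0 / np.sqrt(deg))
    L = np.eye(n) - Dinv_sqrt @ A @ Dinv_sqrt        # normalized graph Laplacian

    a, p = 2.0, 10                                   # assumed kernel hyperparameters (a >= 2)
    K = np.linalg.matrix_power(np.eye(n) - L / a, p) # random walk kernel

    # Global normalisation: unit *average* prior variance; it can still differ per vertex.
    K_global = K / np.mean(np.diag(K))

    # Local normalisation: unit prior variance at *every* vertex (uniform, as in the abstract).
    d = np.sqrt(np.diag(K))
    K_local = K / np.outer(d, d)
    ```

    Comparing np.diag(K_global) with np.diag(K_local) makes the difference concrete: the former varies across vertices (typically with degree), while the latter is identically one.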

    Understanding and Comparing Scalable Gaussian Process Regression for Big Data

    As a non-parametric Bayesian model that produces informative predictive distributions, the Gaussian process (GP) has been widely used in various fields such as regression, classification and optimization. The cubic complexity of the standard GP, however, leads to poor scalability, which poses challenges in the era of big data. Hence, various scalable GPs have been developed in the literature to improve scalability while retaining desirable prediction accuracy. This paper investigates the methodological characteristics and performance of representative global and local scalable GPs, including sparse approximations and local aggregations, from four main perspectives: scalability, capability, controllability and robustness. Numerical experiments on two toy examples and five real-world datasets with up to 250K points offer the following findings. In terms of scalability, most of the scalable GPs have a time complexity that is linear in the training size. In terms of capability, the sparse approximations capture long-term spatial correlations, while the local aggregations capture local patterns but suffer from over-fitting in some scenarios. In terms of controllability, the performance of sparse approximations can be improved by simply increasing the inducing set size, but this is not the case for local aggregations. In terms of robustness, local aggregations are robust to various initializations of hyperparameters due to the local attention mechanism. Finally, we highlight that a proper hybrid of global and local scalable GPs may be a promising way to improve both model capability and scalability for big data.
    Comment: 25 pages, 15 figures, preprint submitted to KB
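    To make the "increase the inducing size" knob concrete, here is a minimal Subset-of-Regressors-style sparse GP predictive mean with m inducing points, whose cost is O(n m^2) rather than the O(n^3) of an exact GP. This is one of the simplest sparse approximations, not necessarily among the variants the paper compares, and the names (sor_predict, noise, ell) are illustrative assumptions.

    ```python
    import numpy as np

    def rbf(X1, X2, ell=0.5):
        """Squared-exponential kernel matrix for 1-D inputs."""
        return np.exp(-0.5 * (X1[:, None] - X2[None, :]) ** 2 / ell**2)

    def sor_predict(X, y, Z, x_star, noise=0.1, ell=0.5):
        """Subset-of-Regressors sparse GP mean: O(n m^2) instead of O(n^3)."""
        Kzx = rbf(Z, X, ell)                                  # m x n
        Kzz = rbf(Z, Z, ell)                                  # m x m
        Sigma = Kzx @ Kzx.T + noise**2 * (Kzz + 1e-8 * np.eye(len(Z)))
        alpha = np.linalg.solve(Sigma, Kzx @ y)               # m-dimensional weights
        return rbf(x_star, Z, ell) @ alpha

    rng = np.random.default_rng(2)
    n, m = 5000, 30                     # n training points, m inducing points (m << n)
    X = rng.uniform(0, 1, n)
    y = np.sin(2 * np.pi * X) + 0.1 * rng.standard_normal(n)
    Z = np.linspace(0, 1, m)            # inducing inputs on a grid
    x_star = np.linspace(0, 1, 7)
    print(sor_predict(X, y, Z, x_star))
    ```

    The controllability finding corresponds to raising m here: a larger inducing set enlarges the m x m system and improves capability, while the cost still grows only linearly in n.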