14 research outputs found

    Computational Limits of A Distributed Algorithm For Smoothing Spline

    In this paper, we explore the statistical versus computational trade-off to address a basic question in the application of a distributed algorithm: what is the minimal computational cost of obtaining statistical optimality? In the smoothing spline setup, we observe a phase transition phenomenon for the number of deployed machines, which serves as a simple proxy for computing cost. Specifically, a sharp upper bound for the number of machines is established: when the number is below this bound, statistical optimality (in terms of nonparametric estimation or testing) is achievable; otherwise, statistical optimality becomes impossible. These sharp bounds partly capture intrinsic computational limits of the distributed algorithm considered in this paper, and turn out to be fully determined by the smoothness of the regression function. As a side remark, we argue that sample splitting may be viewed as an alternative form of regularization, playing a role similar to that of the smoothing parameter.
    Comment: To appear in Journal of Machine Learning Research
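
The abstract does not spell out the distributed algorithm, so the following is only a minimal sketch of the usual divide-and-conquer pattern it alludes to: split the sample across machines, fit a penalized smoother on each subsample, and average the fits. It assumes a Gaussian-kernel ridge regression as a stand-in for the smoothing spline; the function names, the `bandwidth` parameter, and the choice of kernel are hypothetical and not taken from the paper.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=0.1):
    # Pairwise Gaussian kernel between 1-D point sets a and b.
    # Assumption: a stand-in kernel, not the spline kernel of the paper.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / bandwidth) ** 2)

def fit_krr(x, y, lam, bandwidth=0.1):
    # Kernel ridge regression on one subsample: solve (K + n*lam*I) alpha = y.
    n = len(x)
    K = gaussian_kernel(x, x, bandwidth)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)
    return lambda t: gaussian_kernel(np.asarray(t, dtype=float), x, bandwidth) @ alpha

def divide_and_conquer(x, y, n_machines, lam, bandwidth=0.1, seed=0):
    # Randomly split the data into n_machines subsamples, fit each
    # independently, and return the pointwise average of the local fits.
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(len(x)), n_machines)
    fits = [fit_krr(x[p], y[p], lam, bandwidth) for p in parts]
    return lambda t: np.mean([f(t) for f in fits], axis=0)
```

In this sketch `n_machines` plays the role of the number of deployed machines: increasing it makes each local fit cheaper but gives each machine fewer observations, which is the trade-off the abstract describes.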

    Improving the Simultaneous Application of the DSN-PC and NOAA GFS Datasets

    Our surface-based sensor network, called Distributed Sensor Network for Prediction Calculations (DSN-PC), has limitations in terms of vertical atmospheric data. While efforts are being made to approximate these upper-air parameters from the surface level, as a first step it was necessary to test the network’s capability of making distributed computations by applying a hybrid approach. We accessed public databases such as the NOAA Global Forecast System (GFS), and the initial values for the 2-dimensional computational grid were produced by using both DSN-PC measurements and NOAA GFS data for each grid point. However, while the latter consists of assimilated and initialized (smoothed) data, the stations of the DSN-PC network provide raw measurements, which can cause numerical instability due to measurement errors or local weather phenomena. Previously we simultaneously interpolated both DSN-PC and GFS data. As a step forward, we wanted our network to have a more significant role in the production of the initial values. Therefore it was necessary to apply 2D smoothing algorithms to the initial conditions. We found a significant difference in numerical stability between calculations with raw and smoothed initial data. Applying the smoothing algorithms greatly improved the prediction reliability compared to the cases when raw data were used. The size of the grid portion used for smoothing has a significant impact on the quality of the forecasts and is worth further investigation. We could verify the viability of direct integration of DSN-PC data, since it provided forecast errors similar to the previous approach. In this paper we present one simple method for smoothing our initial data and the results of the weather prediction calculations.
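
The abstract does not say which 2D smoothing algorithm was applied, so the following is only an illustrative sketch of one simple option: a moving-average (box) filter over a square window of grid points. The function name `box_smooth`, the `half_width` parameter, and the edge-replication boundary handling are assumptions made for illustration, not the authors' method.

```python
import numpy as np

def box_smooth(field, half_width=1):
    # Moving-average smoothing of a 2-D grid (hypothetical helper).
    # Each output value is the mean over a (2*half_width + 1)^2 window.
    field = np.asarray(field, dtype=float)
    # Assumption: replicate edge values so border points get a full window.
    padded = np.pad(field, half_width, mode="edge")
    size = 2 * half_width + 1
    rows, cols = field.shape
    out = np.zeros_like(field)
    # Sum the shifted copies of the padded grid, one per window offset.
    for di in range(size):
        for dj in range(size):
            out += padded[di:di + rows, dj:dj + cols]
    return out / size**2
```

Here `half_width` corresponds to the size of the grid portion used for smoothing, which the abstract identifies as having a significant impact on forecast quality.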