Computational Limits of A Distributed Algorithm For Smoothing Spline
In this paper, we explore the statistical versus computational trade-off to
address a basic question in the application of a distributed algorithm: what is
the minimal computational cost of obtaining statistical optimality? In the
smoothing spline setup, we observe a phase transition phenomenon for the number
of deployed machines, which serves as a simple proxy for computing cost.
Specifically, a sharp upper bound for the number of machines is established:
when the number is below this bound, statistical optimality (in terms of
nonparametric estimation or testing) is achievable; otherwise, statistical
optimality becomes impossible. These sharp bounds partly capture intrinsic
computational limits of the distributed algorithm considered in this paper, and
turn out to be fully determined by the smoothness of the regression function.
As a side remark, we argue that sample splitting may be viewed as an
alternative form of regularization, playing a role similar to that of the
smoothing parameter.
Comment: To appear in Journal of Machine Learning Research
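The divide-and-conquer scheme the abstract refers to (split the sample across machines, fit locally, average the local fits) can be illustrated with a minimal sketch. This is not the paper's estimator: a Gaussian-kernel Nadaraya-Watson smoother stands in for the smoothing spline to keep the example short, and the function names, data, bandwidth, and number of machines are all hypothetical choices.

```python
import math
import random

def kernel_smoother(xs, ys, bandwidth):
    """Fit a Nadaraya-Watson smoother (a stand-in for a smoothing spline)."""
    def predict(x):
        weights = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in xs]
        return sum(w * y for w, y in zip(weights, ys)) / sum(weights)
    return predict

def divide_and_conquer(xs, ys, num_machines, bandwidth):
    """Give each 'machine' an interleaved subsample, then average the local fits."""
    local_fits = [
        kernel_smoother(xs[k::num_machines], ys[k::num_machines], bandwidth)
        for k in range(num_machines)
    ]
    def predict(x):
        return sum(fit(x) for fit in local_fits) / len(local_fits)
    return predict

# Hypothetical data: a sine curve observed with bounded noise.
rng = random.Random(0)
n = 200
xs = [i / (n - 1) for i in range(n)]
ys = [math.sin(2 * math.pi * x) + rng.uniform(-0.1, 0.1) for x in xs]

# Assumed settings: 4 machines, bandwidth 0.05 shared by all local fits.
estimate = divide_and_conquer(xs, ys, num_machines=4, bandwidth=0.05)
value_at_peak = estimate(0.25)  # the true function value here is 1.0
```

Raising `num_machines` shrinks each local sample, which is the cost the abstract trades off against statistical accuracy; the sketch makes no attempt to reproduce the paper's bound on that number.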
Improving the Simultaneous Application of the DSN-PC and NOAA GFS Datasets
Our surface-based sensor network, the Distributed Sensor Network for Prediction Calculations (DSN-PC), has inherent limitations in terms of vertical atmospheric data. While efforts are being made to approximate these upper-air parameters from surface-level measurements, as a first step it was necessary to test the network's capability for distributed computation by applying a hybrid approach. We accessed public databases such as the NOAA Global Forecast System (GFS), and the initial values for the 2-dimensional computational grid were produced using both DSN-PC measurements and NOAA GFS data for each grid point. However, whereas the latter consists of assimilated and initialized (smoothed) data, the stations of the DSN-PC network provide raw measurements, which can cause numerical instability due to measurement errors or local weather phenomena. Previously, we interpolated DSN-PC and GFS data simultaneously. As a step forward, we wanted our network to play a more significant role in producing the initial values. It was therefore necessary to apply 2D smoothing algorithms to the initial conditions. We found a significant difference in numerical stability between calculations with raw and with smoothed initial data. Applying the smoothing algorithms greatly improved prediction reliability compared with the cases in which raw data were used. The size of the grid portion used for smoothing has a significant impact on the quality of the forecasts and is worth further investigation. We could verify the viability of directly integrating DSN-PC data, since it produced forecast errors similar to those of the previous approach. In this paper we present one simple method for smoothing our initial data and the results of the weather prediction calculations.
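The abstract does not say which 2D smoothing method was used. As an illustration only, the sketch below assumes a plain moving-average (box) filter, where `radius` plays the role of "the size of the grid portion used for smoothing". The function name, the border handling, and the sample pressure field are hypothetical.

```python
def smooth_grid(grid, radius=1):
    """Replace each cell with the mean of its (2*radius+1)^2 neighbourhood.

    Cells near the border average over only the neighbours that exist.
    """
    rows, cols = len(grid), len(grid[0])
    smoothed = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total, count = 0.0, 0
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < rows and 0 <= nj < cols:
                        total += grid[ni][nj]
                        count += 1
            smoothed[i][j] = total / count
    return smoothed

# Hypothetical 3x3 pressure field (hPa) with one outlying raw reading in the centre.
raw = [
    [1000.0, 1000.0, 1000.0],
    [1000.0, 1009.0, 1000.0],
    [1000.0, 1000.0, 1000.0],
]
smoothed = smooth_grid(raw, radius=1)
```

With these inputs the 9 hPa spike at the centre is reduced to 1 hPa above its surroundings (the centre value becomes 1001.0), which is the kind of damping of raw station noise the abstract describes. A larger `radius` smooths more strongly but also flattens genuine local features.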