241 research outputs found

    Asymptotic normality for weighted sums of linear processes

    We establish asymptotic normality of weighted sums of linear processes with general triangular-array weights when the innovations in the linear process are martingale differences. The results are obtained under minimal conditions on the weights and innovations. We also obtain weak convergence of weighted partial-sum processes. The results are applicable to linear processes with short memory, long memory, or seasonal long-memory behavior. In particular, they apply to GARCH and ARCH(∞) models and to their squares. They are also useful in deriving asymptotic normality of kernel-type estimators of a nonparametric regression function with short- or long-memory moving-average errors.
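
    As a quick numerical illustration of the central limit behavior described above, the sketch below simulates weighted sums of an MA(1) linear process with iid Gaussian innovations and standardizes them by the exact variance; the triangular-array weights w_{n,i} = i/n, the MA coefficient, and the sample sizes are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, theta, reps = 500, 0.6, 4000
w = np.arange(1, n + 1) / n          # triangular-array weights w_{n,i} = i/n

# Exact variance of S_n = sum_i w_i X_i for the MA(1) process
# X_i = eps_i + theta * eps_{i-1}: gamma(0) = 1 + theta^2, gamma(1) = theta.
gamma0, gamma1 = 1 + theta**2, theta
var = gamma0 * np.sum(w**2) + 2 * gamma1 * np.sum(w[:-1] * w[1:])

stats = []
for _ in range(reps):
    eps = rng.standard_normal(n + 1)
    x = eps[1:] + theta * eps[:-1]   # MA(1) linear process
    stats.append(np.dot(w, x) / np.sqrt(var))
stats = np.asarray(stats)

# The standardized weighted sums should be approximately N(0, 1).
print(stats.mean(), stats.std())
```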

    Design Issues for Generalized Linear Models: A Review

    Generalized linear models (GLMs) have been used quite effectively in the modeling of a mean response under nonstandard conditions, where discrete as well as continuous data distributions can be accommodated. The choice of design for a GLM is a very important task in the development and building of an adequate model. However, one major problem that handicaps the construction of a GLM design is its dependence on the unknown parameters of the fitted model. Several approaches have been proposed in the past 25 years to solve this problem. These approaches, however, have provided only partial solutions that apply in only some special cases, and the problem, in general, remains largely unresolved. The purpose of this article is to focus attention on the aforementioned dependence problem. We provide a survey of various existing techniques dealing with the dependence problem. This survey includes discussions concerning locally optimal designs, sequential designs, Bayesian designs, and the quantile dispersion graph approach for comparing designs for GLMs.

    Comment: Published at http://dx.doi.org/10.1214/088342306000000105 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
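
    The dependence problem is easy to see concretely: in a logistic GLM, the Fisher information of a design depends on the unknown parameters, so a "locally optimal" design is only optimal at a guessed parameter value. A minimal sketch (the helper `logistic_info`, the three-point design, and the parameter values are illustrative assumptions):

```python
import numpy as np

def logistic_info(design_points, beta):
    """Fisher information of a logistic regression with mean
    E[y] = 1/(1 + exp(-(b0 + b1*x))) over the given design points.
    Note that it depends on the unknown parameter vector beta."""
    M = np.zeros((2, 2))
    for x in design_points:
        eta = beta[0] + beta[1] * x
        p = 1.0 / (1.0 + np.exp(-eta))
        f = np.array([1.0, x])
        M += p * (1 - p) * np.outer(f, f)   # GLM weight p(1-p) at this point
    return M

design = [-1.0, 0.0, 1.0]
# The D-optimality criterion det(M) changes with the assumed parameters:
for beta in ([0.0, 1.0], [2.0, 1.0]):
    print(beta, np.linalg.det(logistic_info(design, beta)))
```

    The same design can be nearly D-optimal at one β and poor at another, which is exactly why locally optimal, sequential, and Bayesian approaches were developed.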

    Bias-Variance Tradeoff in a Sliding Window Implementation of the Stochastic Gradient Algorithm

    This paper provides a framework for analyzing stochastic gradient algorithms in a mean squared error (MSE) sense using the asymptotic normality result for the stochastic gradient descent (SGD) iterates. We perform this analysis by taking the asymptotic normality result and applying it to the finite-iteration case. Specifically, we look at problems where the gradient estimators are biased but have reduced variance, and compare the iterates generated by these gradient estimators to the iterates generated by the SGD algorithm. We use the work of Fabian to characterize the mean and the variance of the distribution of the iterates in terms of the bias and the covariance matrix of the gradient estimators. We introduce the sliding window SGD (SW-SGD) algorithm, with a proof of convergence, which incurs a lower MSE than the SGD algorithm on quadratic and convex problems. Lastly, we present numerical results showing the effectiveness of this framework and the superiority of SW-SGD over SGD.
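
    The abstract does not spell out the SW-SGD update, but one plausible reading of a sliding-window scheme is to average the last w stochastic gradients: the resulting direction is biased (it mixes in stale gradients) but has reduced variance, matching the bias-variance tradeoff analyzed above. A toy sketch on a one-dimensional quadratic (`sw_sgd` and all parameters are illustrative assumptions, not the paper's algorithm):

```python
import random
from collections import deque

def sw_sgd(window, steps=2000, lr=0.05, target=3.0, seed=1):
    """Sliding-window SGD sketch for f(t) = E[(t - z)^2]/2, z ~ N(target, 1).
    The update direction is the average of the last `window` noisy gradients:
    biased (stale gradients) but lower-variance than plain SGD (window=1)."""
    rng = random.Random(seed)
    theta, grads = 0.0, deque(maxlen=window)
    for _ in range(steps):
        z = rng.gauss(target, 1.0)
        grads.append(theta - z)              # noisy gradient of the quadratic
        theta -= lr * sum(grads) / len(grads)
    return theta

print(sw_sgd(window=1), sw_sgd(window=10))   # plain SGD vs. windowed variant
```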

    Point estimation, stochastic approximation, and robust Kalman filtering

    Caption title. Includes bibliographical references (p. 23-25). Supported by the U.S. Air Force Office of Scientific Research under grants AFOSR-85-0227 and AFOSR-89-0276. Authors: Sanjoy K. Mitter and Irvin C. Schick.

    Constant Step Size Least-Mean-Square: Bias-Variance Trade-offs and Optimal Sampling Distributions

    We consider the least-squares regression problem and provide a detailed asymptotic analysis of the performance of averaged constant-step-size stochastic gradient descent (a.k.a. least-mean-squares). In the strongly convex case, we provide an asymptotic expansion up to explicit exponentially decaying terms. Our analysis leads to new insights into stochastic approximation algorithms: (a) it gives a tighter bound on the allowed step-size; (b) the generalization error may be divided into a variance term, which decays as O(1/n) independently of the step-size γ, and a bias term, which decays as O(1/(γ²n²)); (c) when allowing non-uniform sampling, the choice of a good sampling density depends on whether the variance or bias term dominates. In particular, when the variance term dominates, optimal sampling densities do not lead to much gain, while when the bias term dominates, we can choose larger step-sizes that lead to significant improvements.
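
    The algorithm analyzed here, constant-step-size least-mean-squares with Polyak-Ruppert averaging of the iterates, can be sketched in a few lines; the data model, step size, and dimensions below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
d, n, gamma = 3, 20000, 0.05
theta_star = np.array([1.0, -2.0, 0.5])   # ground-truth regression vector

theta = np.zeros(d)
avg = np.zeros(d)
for k in range(1, n + 1):
    x = rng.standard_normal(d)
    y = x @ theta_star + 0.1 * rng.standard_normal()
    theta -= gamma * (x @ theta - y) * x   # constant-step-size LMS update
    avg += (theta - avg) / k               # running Polyak-Ruppert average
print(avg)
```

    The averaged iterate attains the O(1/n) variance term from the analysis even though the un-averaged iterate keeps fluctuating at a scale set by the constant step size γ.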

    Efficiency of the stochastic approximation method

    We study practical aspects of the stochastic approximation (SA) method. Specifically, we investigate how efficiency depends on the coefficients that generate the step length of the optimization algorithm, as well as on the type and level of the corresponding noise. Efficiency is measured by the mean value of the objective function at the final estimates of the algorithm, over a specified number of replications. This paper provides suggestions on how to choose these coefficients in order to achieve better performance of the stochastic approximation algorithm.
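
    A common parameterization of the SA step length is the gain sequence a_k = a/(k+1+A)^α, whose coefficients a, A, and α are exactly the kind of tuning choices studied here. A minimal Robbins-Monro sketch (the target root, noise level, and coefficient values are illustrative assumptions):

```python
import random

def robbins_monro(a, A, alpha, steps=5000, seed=3):
    """Robbins-Monro SA for the root of g(t) = t - 2 observed with noise.
    The gain sequence a_k = a/(k+1+A)^alpha controls the step length;
    the coefficients a, A, alpha govern finite-sample efficiency."""
    rng = random.Random(seed)
    theta = 0.0
    for k in range(steps):
        noisy_g = (theta - 2.0) + rng.gauss(0.0, 1.0)   # noisy measurement
        theta -= a / (k + 1 + A) ** alpha * noisy_g
    return theta

print(robbins_monro(a=1.0, A=50, alpha=0.602))
```

    Larger A damps the early (noisy) steps, while α trades off transient speed against asymptotic accuracy, which is why the choice of these coefficients matters in practice.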

    Cyclic Stochastic Optimization: Generalizations, Convergence, and Applications in Multi-Agent Systems

    Stochastic approximation (SA) is a powerful class of iterative algorithms for nonlinear root-finding that can be used for minimizing a loss function, L(θ), with respect to a parameter vector θ, when only noisy observations of L(θ) or its gradient are available (through the natural connection between root-finding and minimization); SA algorithms can be thought of as stochastic line search methods where the entire parameter vector is updated at each iteration. The cyclic approach to SA is a variant of SA procedures where θ is divided into multiple subvectors that are updated one at a time in a cyclic manner. This dissertation focuses on studying the asymptotic properties of cyclic SA and of the generalized cyclic SA (GCSA) algorithm, a variant of cyclic SA where the subvector to update may be selected according to a random variable or according to a predetermined pattern, and where the noisy update direction can be based on the updates of any SA algorithm (e.g., stochastic gradient, Kiefer–Wolfowitz, or simultaneous perturbation SA). The convergence of GCSA, asymptotic normality of GCSA (related to rate of convergence), and efficiency of GCSA relative to its non-cyclic counterpart are investigated both analytically and numerically. Specifically, conditions are obtained for the convergence with probability one of the GCSA iterates and for the asymptotic normality of the normalized iterates of a special case of GCSA. Further, an analytic expression is given for the asymptotic relative efficiency (when efficiency is defined in terms of mean squared error) between a special case of GCSA and its non-cyclic counterpart. Finally, an application of the cyclic SA scheme to a multi-agent stochastic optimization problem is investigated. This dissertation also contains two appendices. 
The first appendix generalizes Theorem 2.2 in Fabian (1968) (a seminal paper in the SA literature that derives general conditions for the asymptotic normality of SA procedures) to make the result more applicable to some modern applications of SA including (but not limited to) the GCSA algorithm, certain root-finding SA algorithms, and certain second-order SA algorithms. The second appendix considers the problem of determining the presence and location of a static object within an area of interest by combining information from multiple sensors using a maximum-likelihood-based approach
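
    The cyclic idea, updating one subvector of θ at a time with a noisy gradient direction, can be sketched on a toy separable quadratic; this is only an illustration of the cyclic pattern, not the GCSA algorithm of the dissertation (`cyclic_sa` and all constants are assumptions):

```python
import random

def cyclic_sa(steps=4000, seed=4):
    """Cyclic SA sketch: the parameter vector is split into two subvectors
    (here, scalars) updated one at a time, each with a noisy stochastic
    gradient of L(theta) = sum_i (theta_i - target_i)^2 / 2."""
    rng = random.Random(seed)
    target = [1.0, -3.0]
    theta = [0.0, 0.0]
    for k in range(steps):
        i = k % 2                             # deterministic cyclic pattern
        noisy_grad = (theta[i] - target[i]) + rng.gauss(0.0, 1.0)
        gain = 1.0 / (k // 2 + 10)            # decaying SA gain per subvector
        theta[i] -= gain * noisy_grad
    return theta

print(cyclic_sa())
```

    In GCSA the subvector to update may instead be chosen at random, and each subvector update may use a different SA direction (e.g., Kiefer-Wolfowitz or simultaneous perturbation); the deterministic alternation above is just the simplest special case.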