501 research outputs found

    From Averaging to Acceleration, There is Only a Step-size

    We show that accelerated gradient descent, averaged gradient descent and the heavy-ball method for non-strongly-convex problems may be reformulated as constant parameter second-order difference equation algorithms, where stability of the system is equivalent to convergence at rate $O(1/n^2)$, where $n$ is the number of iterations. We provide a detailed analysis of the eigenvalues of the corresponding linear dynamical system, showing various oscillatory and non-oscillatory behaviors, together with a sharp stability result with explicit constants. We also consider the situation where noisy gradients are available, where we extend our general convergence result, which suggests an alternative algorithm (i.e., with different step sizes) that exhibits the good aspects of both averaging and acceleration.
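
    The second-order difference equation view in the abstract above can be illustrated with a minimal sketch. It assumes a one-dimensional quadratic f(x) = h x^2 / 2 and the standard heavy-ball recursion x_{k+1} = x_k - gamma f'(x_k) + beta (x_k - x_{k-1}); the parameterisation, the function names and the chosen values of `h`, `gamma` and `beta` are illustrative assumptions, not the paper's exact formulation. The recursion is stable when both roots of its characteristic polynomial lie inside the unit circle, and it oscillates when those roots are complex.

```python
import cmath


def heavy_ball_eigenvalues(h, gamma, beta):
    """Roots of lambda^2 - (1 + beta - gamma*h) * lambda + beta = 0.

    This is the characteristic polynomial of the heavy-ball recursion
    on f(x) = h * x**2 / 2 (an assumed toy objective).
    """
    trace = 1.0 + beta - gamma * h
    disc = cmath.sqrt(trace * trace - 4.0 * beta)
    return (trace + disc) / 2.0, (trace - disc) / 2.0


def run_heavy_ball(h, gamma, beta, x0, n_steps):
    """Iterate x_{k+1} = x_k - gamma*h*x_k + beta*(x_k - x_{k-1})."""
    prev, cur = x0, x0
    for _ in range(n_steps):
        prev, cur = cur, cur - gamma * h * cur + beta * (cur - prev)
    return cur


# Hypothetical parameter values chosen so that the roots are complex.
lam1, lam2 = heavy_ball_eigenvalues(h=1.0, gamma=0.5, beta=0.5)
is_stable = max(abs(lam1), abs(lam2)) < 1.0
is_oscillatory = abs(lam1.imag) > 0.0
x_final = run_heavy_ball(h=1.0, gamma=0.5, beta=0.5, x0=1.0, n_steps=200)
```

    With these assumed values the roots form a complex-conjugate pair of modulus below one, so the iterates spiral in towards the minimiser at zero.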

    Harder, Better, Faster, Stronger Convergence Rates for Least-Squares Regression

    We consider the optimization of a quadratic objective function whose gradients are only accessible through a stochastic oracle that returns the gradient at any given point plus a zero-mean finite variance random error. We present the first algorithm that achieves jointly the optimal prediction error rates for least-squares regression, both in terms of forgetting of initial conditions in $O(1/n^2)$, and in terms of dependence on the noise and dimension $d$ of the problem, as $O(d/n)$. Our new algorithm is based on averaged accelerated regularized gradient descent, and may also be analyzed through finer assumptions on initial conditions and the Hessian matrix, leading to dimension-free quantities that may still be small while the "optimal" terms above are large. In order to characterize the tightness of these new bounds, we consider an application to non-parametric regression and use the known lower bounds on the statistical performance (without computational limits), which happen to match our bounds obtained from a single pass on the data and thus show optimality of our algorithm in a wide variety of particular trade-offs between bias and variance.
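
    To make the stochastic-oracle setting in the abstract above concrete, the sketch below runs a single pass of plain stochastic gradient descent on a least-squares objective and returns the running average of the iterates. It is a deliberately simplified stand-in: it omits the acceleration and regularization that the abstract's algorithm relies on, and the synthetic data, step size and all names are assumptions made for illustration.

```python
import random


def averaged_sgd_least_squares(samples, step_size, dim):
    """Single-pass constant-step SGD on squared loss with iterate averaging.

    Simplified stand-in: no acceleration and no regularization.
    """
    w = [0.0] * dim
    w_avg = [0.0] * dim
    for n, (x, y) in enumerate(samples, start=1):
        # Stochastic gradient of 0.5 * (<w, x> - y)^2 is residual * x.
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        w = [wi - step_size * residual * xi for wi, xi in zip(w, x)]
        # Running mean of the iterates seen so far.
        w_avg = [a + (wi - a) / n for a, wi in zip(w_avg, w)]
    return w_avg


# Synthetic data (assumed): y = <w_true, x> + zero-mean Gaussian noise.
rng = random.Random(0)
w_true = [1.0, -2.0]
samples = []
for _ in range(20000):
    x = [rng.uniform(-1.0, 1.0) for _ in w_true]
    y = sum(wi * xi for wi, xi in zip(w_true, x)) + rng.gauss(0.0, 0.1)
    samples.append((x, y))

w_hat = averaged_sgd_least_squares(samples, step_size=0.25, dim=2)
```

    Each sample is used once, which mirrors the single pass over the data mentioned in the abstract; the averaged iterate `w_hat` should land close to `w_true` on this toy problem.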

    Optimal Rates of Statistical Seriation

    Given a matrix, the seriation problem consists in permuting its rows in such a way that all its columns have the same shape, for example, that they are monotone increasing. We propose a statistical approach to this problem where the matrix of interest is observed with noise and study the corresponding minimax rate of estimation of the matrices. Specifically, when the columns are either unimodal or monotone, we show that the least squares estimator is optimal up to logarithmic factors and adapts to matrices with a certain natural structure. Finally, we propose a computationally efficient estimator in the monotonic case and study its performance both theoretically and experimentally. Our work is at the intersection of shape constrained estimation and recent work that involves permutation learning, such as graph denoising and ranking. Comment: v2 corrects an error in Lemma A.1, v3 corrects appendix F on unimodal regression where the bounds now hold with polynomial probability rather than exponentia

    Saddle-to-Saddle Dynamics in Diagonal Linear Networks

    In this paper we fully describe the trajectory of gradient flow over diagonal linear networks in the limit of vanishing initialisation. We show that the limiting flow successively jumps from one saddle of the training loss to another until reaching the minimum $\ell_1$-norm solution. This saddle-to-saddle dynamics translates to an incremental learning process, as each saddle corresponds to the minimiser of the loss constrained to an active set outside of which the coordinates must be zero. We explicitly characterise the visited saddles as well as the jumping times through a recursive algorithm reminiscent of the LARS algorithm used for computing the Lasso path. Our proof leverages a convenient arc-length time-reparametrisation which makes it possible to keep track of the heteroclinic transitions between the jumps. Our analysis requires negligible assumptions on the data, applies to both under- and overparametrised settings, and covers complex cases where there is no monotonicity of the number of active coordinates. We provide numerical experiments to support our findings.

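    Contributions of modelling the effects of toxicants on the individual and the population in aquatic ecotoxicology (Apports de la modélisation des effets des toxiques sur l'individu et la population en écotoxicologie aquatique)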

    In general, the results of ecotoxicology bioassays are analysed with statistical methods, and the estimated parameters have no biological meaning. Modelling appeared more recently in ecotoxicology and has lately enjoyed renewed interest. It is currently developing in two complementary directions, which we present here together with their main contributions. On the one hand, effects on individuals are modelled in order to give biological meaning to the parameters of toxicity tests, to integrate confounding factors that arise during the tests (for example, variations in the exposure concentration), or to determine the modes of action of compounds. On the other hand, since the ecosystem is the object of study par excellence in ecotoxicology, modelling is used to infer effects at the population level from tests carried out on individuals. Until now, classical approaches based on the Euler equation or the diagonalisation of Leslie matrices have been used and have allowed a better definition of the parameters to look for in toxicity tests. Other approaches need to be developed to gain relevance with respect to the field (notably spatial heterogeneity of pollution and habitats).
    Traditional analysis of toxicity tests provides toxicity parameters that are estimated with purely statistical methods. Consequently, these parameters do not have any intrinsic biological meaning and these methods provide no information about the mode of action of the tested chemicals. It is also difficult for these methods to change scale from the individual level to the population level, or to account for temporal and spatial heterogeneity. Modelling is an important tool in ecotoxicology and recently it appears to have gained more interest. Developments in modelling are currently expanding in two directions: modelling effects at the individual level, and applying toxicity data obtained at the individual level to responses at the population level. The objective of the current study was to present these two complementary modelling approaches together with the opportunities they offer.
    Modelling at the individual level provides parameters that are biologically relevant. Modelling also facilitates the formulation and the testing of hypotheses concerning toxicity processes (physiological mode of action and kinetics). Confounding factors such as time, varying exposure concentrations, or feeding can also be incorporated into models. In this paper, two kinds of models were examined: biochemistry-based models (Hill models) and energy-based models (Dynamic Energy Budget models). In the Hill approach, effects are modelled as the interaction between chemicals and receptors in the organisms, which leads to a relationship between concentration and effects close to the logistic equation often used in toxicity test analysis. In the energy-based approach, models are built on the dynamic energy budget theory, in which energy derived from food is used for maintenance, growth and reproduction. The effect of compounds is then described as a change in one of the parameters describing these physiological functions. Kinetics are taken into account by a one-compartment model. The uptake rate is proportional to the exposure concentration, whereas the elimination rate is proportional to the concentration in the tissue. This model is simple but is relevant for many organisms and compounds (KOOIJMAN and BEDAUX, 1996). As time is taken into account through kinetic modelling, the estimation of the other parameters, such as the No Effect Concentration, does not depend on the exposure duration. An energy-relevant model has many advantages. First, observed effect profiles are more in agreement with expectations (KOOIJMAN and BEDAUX, 1996). Second, it becomes possible to account for the fact that an effect on survival increases the amount of food consumed per surviving organism, which in turn partly compensates for the negative effects of pollutants. Third, it allows for the examination of effects at the population level on density and biomass, complementary to the usual study of population growth rate.
    Most of the recent modelling research is related to deriving effects at the population level from effects at the individual level, because ecosystems are the target of ecotoxicology. Until recently, classical approaches, like the Euler equation or Leslie matrices, were used with population growth rates as endpoints. They provide interesting tools to determine the impact of life cycle parameters at the population level and to decide which level of effects has to be assessed. Even a simple approach such as that proposed by CALOW et al. (1997), separating the population into two different classes, juveniles and adults, can produce very interesting results. For instance, the authors showed that in populations for which females die just after reproduction, juvenile survival had much more importance than for populations where females can reproduce several times during their lifetime. The opposite is true concerning adult survival. However, these approaches do have some limits that make complementary approaches necessary to fully understand the effects of pollutants at the population level. First, they do not account for effects on the carrying capacity. SIBLY (1999) pointed out that there is a need for ecological studies on the effects of pollutants that measure their effects on density dependence and carrying capacity. Indeed, an effect on population growth rate only accounts for a risk of disappearance for the population, but cannot help in the understanding of effects on biomass or density. Effects on the carrying capacity can have substantial effects at the ecosystem level, especially when studying species that constitute a food resource for other species. Second, more complex tools have to be developed to take into account spatial heterogeneity of pollution and habitats in order to be relevant from an ecosystem point of view. Indeed, it has been shown that uncontaminated sites can be significantly disturbed if they are connected, through the migration of organisms, with contaminated sites (SPROMBERG et al., 1998).
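
    The one-compartment kinetics described in the abstract above (uptake proportional to the exposure concentration, elimination proportional to the tissue concentration) can be written as dC/dt = k_uptake * c_ext - k_elim * C. The sketch below is a minimal illustration under assumptions: the symbol names, the rate constants and the constant-exposure scenario are hypothetical and are not taken from the paper. It compares the closed-form solution for constant exposure with a simple forward-Euler integration that also accepts a time-varying exposure.

```python
import math


def tissue_concentration(t, c_ext, k_uptake, k_elim, c0=0.0):
    """Closed-form solution of dC/dt = k_uptake*c_ext - k_elim*C
    for a constant exposure concentration c_ext."""
    steady_state = k_uptake * c_ext / k_elim
    return steady_state + (c0 - steady_state) * math.exp(-k_elim * t)


def simulate_euler(c_ext_fn, k_uptake, k_elim, t_end, dt=0.001, c0=0.0):
    """Forward-Euler integration; c_ext_fn(t) may vary with time."""
    c = c0
    n_steps = int(round(t_end / dt))
    for i in range(n_steps):
        t = i * dt
        c += dt * (k_uptake * c_ext_fn(t) - k_elim * c)
    return c


# Hypothetical rate constants and exposure level, in arbitrary units.
closed_form = tissue_concentration(t=5.0, c_ext=2.0, k_uptake=0.5, k_elim=1.0)
numeric = simulate_euler(lambda t: 2.0, k_uptake=0.5, k_elim=1.0, t_end=5.0)
```

    Passing a non-constant `c_ext_fn` to `simulate_euler` gives one way to represent the varying exposure concentrations that the abstract lists among the confounding factors.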