    Approximating Data with weighted smoothing Splines

    Given a data set (t_i, y_i), i = 1, ..., n, with the t_i in [0,1], non-parametric regression is concerned with the problem of specifying a suitable function f_n: [0,1] -> R such that the data can be reasonably approximated by the points (t_i, f_n(t_i)), i = 1, ..., n. If a data set exhibits large variations in local behaviour, for example large peaks as in spectroscopy data, then the method must be able to adapt to the local changes in smoothness. Whilst many methods are able to accomplish this, they are less successful at adapting derivatives. In this paper we show how the goal of local adaptivity of the function and its first and second derivatives can be attained in a simple manner using weighted smoothing splines. A residual-based concept of approximation is used which forces local adaptivity of the regression function, together with a global regularization which makes the function as smooth as possible subject to the approximation constraints.

    Nonparametric Regression, Confidence Regions and Regularization

    In this paper we offer a unified approach to the problem of nonparametric regression on the unit interval. It is based on a universal, honest and non-asymptotic confidence region which is defined by a set of linear inequalities involving the values of the functions at the design points. Interest will typically centre on certain simplest functions in that region, where simplicity can be defined in terms of shape (number of local extremes, intervals of convexity/concavity) or smoothness (bounds on derivatives) or a combination of both. Once some form of regularization has been decided upon, the confidence region can be used to provide honest non-asymptotic confidence bounds which are less informative but conceptually much simpler.

    Shifts in hexapod diversification and what Haldane could have said

    Data on species richness and taxon age are assembled for the extant hexapod orders (insects and their six-legged relatives). Coupled with estimates of phylogenetic relatedness, and simple statistical null models, these data are used to locate where, on the hexapod tree, significant changes in the rate of cladogenesis (speciation-minus-extinction rate) have occurred. Significant differences are found between many successive pairs of sister taxa near the base of the hexapod tree, all of which are attributable to a shift in diversification rate after the origin of the Neoptera (insects with wing flexion) and before the origin of the Holometabola (insects with complete metamorphosis). No other shifts are identifiable amongst supraordinal taxa. Whilst the Coleoptera have probably diversified faster than either of their putative sister lineages, they do not stand out relative to other closely related clades. These results suggest that any Creator had a fondness for a much more inclusive clade than the Coleoptera, definitely as large as the Eumetabola (Holometabola plus bugs and their relatives), and possibly as large as the entire Neoptera. Simultaneous, hence probable causative events are discussed, of which the origin of wing flexion has been the focus of much attention

    Long range financial data and model choice

    Long range financial data, as typified by the daily returns of the Standard and Poor's index, exhibit common features such as heavy tails, long range memory of the absolute values and clustering of periods of high and low volatility. These and other features are often referred to as stylized facts, and parametric models for such data are required to reproduce them in some sense. Typically this is done by simulating some data sets under the model and demonstrating that the simulations also exhibit the stylized facts. Nevertheless, when the parameters of such models are to be estimated, recourse is very often taken to likelihood, either in the form of maximum likelihood or Bayes. In this paper we expound a method of determining parameter values which depends solely on the ability of the model to reproduce the relevant features of the data set. We introduce a new measure of the volatility of the volatility and show how it can be combined with the distribution of the returns and the autocorrelation of the absolute returns to determine parameter values. We also give a parametric model for such data and show that it can reproduce the required features.
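One of the stylized facts named in this abstract, the autocorrelation of the absolute returns, can be illustrated with a minimal sketch. The toy model below (Gaussian returns whose volatility switches between two levels in fixed blocks) is a hypothetical stand-in chosen for illustration only; it is not the parametric model of the paper, and the function names, block length, volatility levels and seed are all assumptions.

```python
import random


def autocorrelation(xs, lag):
    """Sample autocorrelation of the sequence xs at a positive lag."""
    n = len(xs)
    m = sum(xs) / n
    var = sum((x - m) ** 2 for x in xs)
    cov = sum((xs[t] - m) * (xs[t + lag] - m) for t in range(n - lag))
    return cov / var


def simulate_returns(n=2000, block=50, sigmas=(0.5, 2.0), seed=0):
    """Hypothetical toy model (not the paper's): Gaussian returns whose
    volatility cycles through `sigmas`, changing every `block` steps."""
    rng = random.Random(seed)
    return [
        sigmas[(t // block) % len(sigmas)] * rng.gauss(0.0, 1.0)
        for t in range(n)
    ]


returns = simulate_returns()
acf_raw = autocorrelation(returns, 1)
acf_abs = autocorrelation([abs(r) for r in returns], 1)
print(f"lag-1 autocorrelation of returns:          {acf_raw:+.3f}")
print(f"lag-1 autocorrelation of absolute returns: {acf_abs:+.3f}")
```

In a series with volatility clustering, the returns themselves should be close to uncorrelated while their absolute values are clearly positively correlated. A feature-matching approach of the kind the abstract describes would compare such summaries between real and simulated data.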

    Approximating data (Approximating data and statistical procedures)

    Stochastic models approximate data and are not true representations of the same. Statistical procedures make use of approximate stochastic models to facilitate the analysis of data.

    The granular silo as a continuum plastic flow: the hour-glass vs the clepsydra

    The granular silo is one of the many interesting illustrations of the thixotropic property of granular matter: a rapid flow develops at the outlet, propagating upwards through a dense shear flow while material at the bottom corners of the container remains static. For large enough outlets, the discharge flow is continuous; however, by contrast with the clepsydra, for which the flow velocity depends on the height of fluid left in the container, the discharge rate of granular silos is constant. Implementing a plastic rheology in a 2D Navier-Stokes solver (following the mu(I)-rheology or a constant friction), we simulate the continuum counterpart of the granular silo. Doing so, we obtain a constant flow rate during the discharge and recover the Beverloo scaling independently of the initial filling height of the silo. We show that lowering the value of the coefficient of friction leads to a transition toward a different behavior, similar to that of a viscous fluid, and where the filling height becomes active in the discharge process. The pressure field shows that large enough values of the coefficient of friction (≃ 0.3) allow for a low-pressure cavity to form above the outlet, and can thus explain the Beverloo scaling. In conclusion, the difference between the discharge of an hourglass and a clepsydra seems to reside in the existence or not of a plastic yield stress. Comment: 6 pages, 6 figures
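The contrast drawn in this abstract between the clepsydra and the silo can be made concrete with two textbook formulas. The sketch below assumes Torricelli's law for the fluid and the usual empirical form of the Beverloo law for grains. The function names, the default constants c = 0.58 and k = 1.5, and the exponent 2.5 (three-dimensional case; 1.5 is the usual two-dimensional analogue) are illustrative assumptions and are not taken from the paper.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def torricelli_rate(height, outlet_area):
    """Volume flow rate of an ideal fluid (clepsydra): grows with the
    square root of the fluid height above the outlet."""
    return outlet_area * math.sqrt(2.0 * G * height)


def beverloo_rate(outlet_diameter, grain_diameter, bulk_density,
                  c=0.58, k=1.5, exponent=2.5):
    """Mass flow rate of grains from the empirical Beverloo law.
    There is no filling-height argument: the rate does not depend on it.
    c, k and exponent are assumed typical values, not fitted ones."""
    effective = outlet_diameter - k * grain_diameter
    if effective <= 0:
        raise ValueError("outlet too small relative to the grain size")
    return c * bulk_density * math.sqrt(G) * effective ** exponent


# Quadrupling the fluid height doubles the clepsydra discharge rate.
print(torricelli_rate(0.4, 1e-4) / torricelli_rate(0.1, 1e-4))

# The granular rate is fixed by the outlet and grain geometry alone.
print(beverloo_rate(outlet_diameter=0.02, grain_diameter=0.001,
                    bulk_density=1500.0))
```

The design point is the signature itself: the fluid rate takes the height as an argument, while the granular rate takes only the outlet and grain sizes.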

    Approximating data with weighted smoothing splines

    Given a data set (t_i, y_i), i = 1, ..., n, with the t_i ∈ [0, 1], non-parametric regression is concerned with the problem of specifying a suitable function f_n : [0, 1] → R such that the data can be reasonably approximated by the points (t_i, f_n(t_i)), i = 1, ..., n. A common desideratum is that the function f_n be smooth, but the path towards this goal is often the indirect one of assuming a “true” data generating function f and then measuring performance by the expected mean square. The approach taken in this paper is a different one. We specify precisely what we mean by a function f_n being an adequate approximation to the data and then, using weighted splines, we try to maximize the smoothness given the approximation constraints.

    Breakdown and groups

    The concept of breakdown point was introduced by Hodges (1967) and Hampel (1968, 1971) and still plays an important, though at times controversial, role in robust statistics. It has proved most successful in the context of location, scale and regression problems. In this paper we argue that this success is intimately connected to the fact that the translation and affine groups act on the sample space and give rise to a definition of equivariance for statistical functionals. For such functionals a nontrivial upper bound for the breakdown point can be shown. In the absence of such a group structure a breakdown point of one is attainable, and this is perhaps the decisive reason why the concept of breakdown point in other situations has not proved as successful. Even if a natural group is present, it is often not sufficiently large to allow a nontrivial upper bound for the breakdown point. One exception to this is the problem of the autocorrelation structure of time series, where we derive a nontrivial upper breakdown point using the group of realizable linear filters. The paper is formulated in an abstract manner to emphasize the role of the group and the resulting equivariance structure.
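For readers unfamiliar with the breakdown point in the location problem, a minimal sketch may help. It uses the finite-sample replacement version of the concept, which is an assumption made here for illustration and not necessarily the definition used in the paper. The helper `contaminate` and the sample values are hypothetical.

```python
from statistics import mean, median


def contaminate(sample, k, value):
    """Replace the first k observations by an arbitrary outlier value."""
    return [value] * k + list(sample[k:])


sample = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
outlier = 1e12

# A single bad observation is enough to carry the mean away.
one_bad = contaminate(sample, 1, outlier)
print("mean with 1 of 9 replaced:  ", mean(one_bad))

# The median stays within the range of the clean data while fewer
# than half of the observations are replaced ...
four_bad = contaminate(sample, 4, outlier)
print("median with 4 of 9 replaced:", median(four_bad))

# ... and breaks down once a majority has been replaced.
five_bad = contaminate(sample, 5, outlier)
print("median with 5 of 9 replaced:", median(five_bad))

# Translation equivariance, the group property the abstract refers to:
# shifting every observation by c shifts the median by c.
c = 10.0
shifted = [x + c for x in sample]
print(median(shifted) == median(sample) + c)
```

Both estimators are translation equivariant, which is the setting in which the abstract says a nontrivial upper bound for the breakdown point exists. In this sample the median only fails once a majority of the points is replaced, whereas the mean fails after a single replacement.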