
    Inference for variograms

    The empirical variogram is a standard tool in the investigation and modelling of spatial covariance. However, its properties can be difficult to identify and exploit in the context of exploring the characteristics of individual datasets. This is particularly true when seeking to move beyond description towards inferential statements about the structure of the spatial covariance which may be present. A robust form of empirical variogram based on a fourth-root transformation is used. This takes advantage of the normal approximation, which gives an excellent description of the variation exhibited on this scale. Calculations of the mean, variance and covariance of the binned empirical variogram then allow useful computations, such as confidence intervals, to be added to the underlying estimator. The comparison of variograms for different datasets provides an illustration of this. The suitability of simplifying assumptions such as isotropy and stationarity can then also be investigated through the construction of appropriate test statistics, and the distributional calculations required for the associated p-values can be performed through quadratic-form methods. Examples of the use of these methods in assessing the form of spatial covariance present in datasets are shown, both through hypothesis tests and in graphical form. A simulation study explores the properties of the tests, while pollution data on mosses in Galicia (North-West Spain) provide a real-data illustration.
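    A minimal sketch of the binned empirical variogram on the fourth-root scale, in pure NumPy. The function name, the 1-D coordinates, the binning scheme and the naive standard errors are all illustrative assumptions; the paper derives the full mean, variance and covariance calculations, which this sketch does not reproduce.

```python
import numpy as np

def binned_variogram_fourth_root(coords, values, bin_edges):
    """Binned empirical variogram on the fourth-root scale.

    Each squared increment (z_i - z_j)^2 is replaced by its fourth
    root, i.e. |z_i - z_j|^(1/2), and the transformed increments are
    averaged within distance bins.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)
    dists, incs = [], []
    for i in range(n):
        for j in range(i + 1, n):
            dists.append(abs(coords[i] - coords[j]))  # 1-D locations assumed
            incs.append(abs(values[i] - values[j]) ** 0.5)
    dists, incs = np.array(dists), np.array(incs)
    means, ses = [], []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        m = incs[(dists >= lo) & (dists < hi)]
        means.append(m.mean() if m.size else np.nan)
        # naive standard error, ignoring correlation between pairs
        ses.append(m.std(ddof=1) / np.sqrt(m.size) if m.size > 1 else np.nan)
    return np.array(means), np.array(ses)
```

    Pointwise intervals of the form mean ± 2·se then mimic the confidence intervals described above, though the paper's calculations account for the covariance between bins and between pairs, which this naive sketch ignores.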

    Nonparametric Kernel Smoothing Methods. The sm library in Xlisp-Stat

    In this paper we describe the Xlisp-Stat version of the sm library, software for applying nonparametric kernel smoothing methods. The original version of the sm library was written by Bowman and Azzalini in S-Plus and is documented in their book Applied Smoothing Techniques for Data Analysis (1997), which is also the main reference for a complete description of the statistical methods implemented. The sm library provides kernel smoothing methods for obtaining nonparametric estimates of density functions and regression curves for different data structures. Smoothing techniques may be employed as a descriptive graphical tool for exploratory data analysis; they can also serve inferential purposes, for instance when a nonparametric estimate is used for checking a proposed parametric model. The Xlisp-Stat version includes some extensions to the original sm library, mainly in the area of local likelihood estimation for generalized linear models. The Xlisp-Stat version of the sm library has been written following an object-oriented approach, which should allow experienced Xlisp-Stat users to easily implement their own methods and new research ideas into the built-in prototypes.
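    The two core estimators such a library provides can be sketched in a few lines. This is a generic Gaussian-kernel illustration, not the sm API; the function names and the fixed bandwidth h are assumptions.

```python
import numpy as np

def kde(x_grid, data, h):
    """Gaussian kernel density estimate evaluated at the points x_grid."""
    u = (x_grid[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(data) * h * np.sqrt(2 * np.pi))

def nw_regression(x_grid, x, y, h):
    """Nadaraya-Watson kernel regression estimate at the points x_grid."""
    u = (x_grid[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * u**2)          # kernel weights for each grid point
    return (w * y[None, :]).sum(axis=1) / w.sum(axis=1)
```

    In sm, bandwidth choice, reference bands and the inferential extras sit on top of estimators of exactly this shape.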

    Assessment of apparent nonstationarity in time series of annual inflow, daily precipitation, and atmospheric circulation indices: A case study from southwest Western Australia

    The southwest region of Western Australia has experienced a sustained sequence of low annual inflows to major water supply dams over the past 30 years. Until recently, the dominant interpretation of this phenomenon has been predicated on the existence of one or more sharp breaks (change or jump points), with inflows fluctuating around relatively constant levels between them. This paper revisits this interpretation. To understand the mechanisms behind the changes, we also analyze daily precipitation series at multiple sites in the vicinity and time series for several indices of regional atmospheric circulation that may be considered as drivers of regional precipitation. We focus on the winter half-year for the region (May to October), as up to 80% of annual precipitation occurs during this "season". We find that the decline in the annual inflow is in fact more consistent with a smooth declining trend than with a sequence of sharp breaks; that the decline is associated with decreases both in the frequency of daily precipitation occurrence and in wet-day amounts; and that the decline in regional precipitation is strongly associated with a marked decrease in moisture content in the lower troposphere, an increase in regionally averaged sea level pressure in the first half of the season, and intraseasonal changes in the regional north-south sea level pressure gradient. Overall, our approach provides an integrated understanding of the linkages between declining dam inflows, declining precipitation, and changes in regional atmospheric circulation that favor drier conditions.

    Variable selection and sensitivity analysis using dynamic trees, with an application to computer code performance tuning

    We investigate an application in the automatic tuning of computer codes, an area of research that has come to prominence alongside the recent rise of distributed scientific processing and heterogeneity in high-performance computing environments. Here, the response function is nonlinear and noisy and may not be smooth or stationary. Clearly needed are variable selection, decomposition of influence, and analysis of main and secondary effects for both real-valued and binary inputs and outputs. Our contribution is a novel set of tools for variable selection and sensitivity analysis based on the recently proposed dynamic tree model. We argue that this approach is uniquely well suited to the demands of our motivating example. In illustrations on benchmark data sets, we show that the new techniques are faster and offer richer feature sets than do similar approaches in the static tree and computer experiment literature. We apply the methods in code-tuning optimization, examination of a cold-cache effect, and detection of transformation errors.

    Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/12-AOAS590
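    As a loose illustration of tree-style variable screening, and emphatically not the authors' dynamic tree model: a one-split regression "stump" per input scores each variable by the best variance reduction a single split achieves. All names here are illustrative.

```python
import numpy as np

def stump_gain(x, y, n_cuts=32):
    """Best variance reduction from a single split on x.

    A one-node regression tree ('stump') per input gives a cheap,
    tree-flavoured screening score: inputs whose best single split
    removes more response variance rank as more relevant.
    """
    base = y.var()
    best = 0.0
    for c in np.quantile(x, np.linspace(0.05, 0.95, n_cuts)):
        left, right = y[x <= c], y[x > c]
        if left.size == 0 or right.size == 0:
            continue
        pooled = (left.size * left.var() + right.size * right.var()) / y.size
        best = max(best, base - pooled)
    return best
```

    Ranking inputs by stump_gain gives only a crude screen; the dynamic tree machinery described above goes much further (sequential updating, main and interaction effects, binary and real-valued inputs).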

    Hierarchical and multidimensional smoothing with applications to longitudinal and mortality data

    This thesis is concerned with two themes: (a) smooth mixed models in hierarchical settings with applications to grouped longitudinal data and (b) multi-dimensional smoothing with reference to the modelling and forecasting of mortality data. In part (a), we examine a popular method to smooth models for longitudinal data, which consists of expressing the model as a mixed model. This approach is particularly appealing when truncated polynomials are used as a basis for the smoothing, as the mixed model representation is almost immediate. We show that this approach can lead to a severely biased estimate of the group and subject effects, and to confidence intervals with undesirable properties. We use penalization to investigate an alternative approach with either B-spline or truncated polynomial bases and show that this new approach does not suffer from the same defects. Our models are defined in terms of B-splines or truncated polynomials with appropriate penalties, but we re-parametrize them as mixed models and this gives access to fitting with standard procedures. In part (b), we first demonstrate the adverse impact of over-dispersion (and heterogeneity) in the modelling of mortality data, and describe the resolution of this problem through a two-stage smoothing of mean and dispersion effects via penalized quasi-likelihoods. Next, we propose a method for the joint modelling of several mortality tables (e.g. male and female mortality in Demography, mortality by lives and by amounts in Life Insurance, etc.) and describe how this joint approach leads to the classification and simple comparison of these tables. Finally, we deal with the smooth modelling of mortality improvement factors, which are two-dimensional correlated data; here we first form a basic flexible model incorporating the correlation structure, and then extend this model to cope with cohort and period shock effects.

    Engineering and Physical Sciences Research Council (EPSRC)
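    The penalized truncated-polynomial smoothing mentioned above can be sketched as a penalized least-squares fit with a ridge penalty on the knot coefficients only. This is a minimal sketch under assumed names, knots and penalty weight; the thesis' re-parametrization as a mixed model (which gives access to standard fitting procedures) is not shown.

```python
import numpy as np

def trunc_poly_basis(x, knots, degree=1):
    """Design matrix [1, x, ..., x^d, (x - k_1)_+^d, ..., (x - k_K)_+^d]."""
    cols = [x**p for p in range(degree + 1)]
    cols += [np.maximum(x - k, 0.0) ** degree for k in knots]
    return np.column_stack(cols)

def penalized_smooth(x, y, knots, lam, degree=1):
    """Penalized least squares: ridge penalty on the knot coefficients
    only, leaving the polynomial part unpenalized."""
    B = trunc_poly_basis(x, knots, degree)
    P = np.zeros(B.shape[1])
    P[degree + 1:] = lam               # penalize only the (x - k)_+ terms
    beta = np.linalg.solve(B.T @ B + np.diag(P), B.T @ y)
    return B @ beta
```

    Because only the knot terms are penalized, a large lam shrinks the fit towards a plain degree-d polynomial rather than towards zero.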

    Statistical analysis of freshwater parameters monitored at different temporal resolutions

    Nowadays, it is of great importance in ecological and environmental studies to investigate prominent features of environmental determinants using appropriate statistical approaches. The initial motivation for this work was provided by the enthusiasm of limnologists, biologists and statisticians interested in exploring and investigating certain features of time series of environmental parameters in freshwater, recorded at different temporal resolutions. This thesis introduces a variety of statistical techniques which are used to provide sufficient information on the features of interest in the environmental variables in freshwater. Chapter 1 gives the background of the work, explores the details of the locations of the case studies, presents several statistical and ecological issues and outlines the aims and objectives of the thesis. Chapter 2 provides a review of some commonly used statistical modelling approaches to model trend and seasonality. All the modelling approaches are then applied to low temporal resolution (monthly) temperature and chlorophyll measurements from 1987-2005 for the north and south basins of Loch Lomond, Scotland. An investigation into the influence of temperature and nutrients on the variability of log chlorophyll is also carried out. Chapter 3 extends the modelling for temperature in Chapter 2 with the use of a mixed-effects model with different error structures for temperature data at a moderate temporal resolution (1- and 3-hourly data) in the north, mid and south basins. Three approaches are proposed to estimate the position of a sharp change in the gradient of temperature (the thermocline) in the deeper basins, using the maximum relative rate of change, changepoint regression and derivatives of a smooth curve. Chapter 4 investigates several features in semi-continuous environmental variables (15- and 30-minute data). The temporal pattern of temperature, pH, conductivity and barometric pressure, and the evidence of similarity of the signals of pH and conductivity, is determined using wavelets. The time taken for pH and conductivity to return to `baseline levels' (the recovery period) following extreme discharge is determined for different thresholds of `extreme discharge' for the Rivers Charr and Drumtee Burn, Scotland, and models for the recovery period are proposed and fitted. Model validation is carried out for the River Charr and the occurrence of clusters of extreme discharge for both rivers is investigated using the extremal index. Chapter 5 summarises the main findings of this thesis and suggests several potential areas for future work.
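    The simplest of the three thermocline estimators mentioned, the maximum rate of change of temperature with depth, can be sketched as follows. The function name is illustrative, and the thesis' other two approaches (changepoint regression and derivatives of a smooth curve) are not shown.

```python
import numpy as np

def thermocline_depth(depth, temp):
    """Depth at which |dT/dz| is largest -- the simplest reading of
    the 'maximum rate of change' approach to locating a thermocline."""
    grad = np.gradient(np.asarray(temp, dtype=float),
                       np.asarray(depth, dtype=float))
    return depth[np.argmax(np.abs(grad))]
```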

    Conditional covariance estimation for dimension reduction and sensitivity analysis

    This thesis focuses on the estimation of conditional covariance matrices and their applications, in particular to dimension reduction and sensitivity analysis. In Chapter 2 we work in a high-dimensional nonlinear regression setting in which we wish to use the sliced inverse regression methodology. Using a functional operator depending on the joint density, we apply a Taylor decomposition around a preliminary estimator of that density. We prove two things: our estimator is asymptotically normal with a variance depending only on the linear part, and this variance is efficient from the Cramér-Rao point of view. In Chapter 3 we study the estimation of conditional covariance matrices, first coordinate by coordinate; these parameters depend on the unknown joint density, which we replace by a kernel estimator. We prove that the mean squared error of the nonparametric estimator converges at a parametric rate if the joint distribution belongs to a class of smooth functions; otherwise we obtain a slower rate that depends on the regularity of the joint density. For the estimator of the full matrix, we apply a "banding" regularization. Finally, in Chapter 4, we use our results to estimate Sobol (sensitivity) indices. These indices measure the influence of the inputs on the output in complex models. The advantage of our implementation is that the Sobol indices can be estimated without resorting to costly Monte-Carlo methods. Illustrations are presented in the chapter to show the capabilities of our estimator.
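    A first-order Sobol index is S_i = Var(E[Y | X_i]) / Var(Y). A crude plug-in estimate, in the spirit of (but far simpler than) the kernel-based conditional covariance estimators of the thesis, approximates the conditional mean within quantile bins of X_i, so no pick-freeze Monte-Carlo design is needed. Names and the bin count are illustrative.

```python
import numpy as np

def sobol_first_order(x, y, n_bins=20):
    """Plug-in estimate of S = Var(E[Y | X]) / Var(Y) by binning X.

    The conditional mean of Y is approximated within quantile bins of
    X; the between-bin variance of these means estimates Var(E[Y|X]).
    """
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    ybar, var_y = y.mean(), y.var()
    var_cond = 0.0
    for b in range(n_bins):
        yb = y[idx == b]
        if yb.size:
            var_cond += (yb.size / y.size) * (yb.mean() - ybar) ** 2
    return var_cond / var_y
```

    For an influential input the index is close to 1, for an irrelevant one close to 0 (with a small positive bias of order n_bins / n from the binning).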