11 research outputs found

    Modeling regionalized volumetric differences in protein-ligand binding cavities

    Identifying elements of protein structures that create differences in protein-ligand binding specificity is an essential method for explaining the molecular mechanisms underlying preferential binding. In some cases, influential mechanisms can be visually identified by experts in structural biology, but subtler mechanisms, whose significance may only be apparent from the analysis of many structures, are harder to find. To assist this process, we present a geometric algorithm and two statistical models for identifying significant structural differences in protein-ligand binding cavities. We demonstrate these methods in an analysis of sequentially nonredundant structural representatives of the canonical serine proteases and the enolase superfamily. Here, we observed that statistically significant structural variations identified experimentally established determinants of specificity. We also observed that an analysis of individual regions inside cavities can reveal areas where small differences in shape can correspond to differences in specificity.

    On Parametric and Nonparametric Methods for Dependent Data

    In recent years, there has been a surge of research interest in the analysis of time series and spatial data. While on one hand more and more sophisticated models are being developed, on the other hand the resulting theory and estimation process has become more and more involved. This dissertation addresses the development of statistical inference procedures for data exhibiting dependencies of varied form and structure. In the first work, we consider estimation of the mean squared prediction error (MSPE) of the best linear predictor of (possibly) nonlinear functions of finitely many future observations in a stationary time series. We develop a resampling methodology for estimating the MSPE when the unknown parameters in the best linear predictor are estimated. Further, we propose a bias corrected MSPE estimator based on the bootstrap and establish its second order accuracy. Finite sample properties of the method are investigated through a simulation study. The next work considers nonparametric inference on spatial data. In this work the asymptotic distribution of the Discrete Fourier Transformation (DFT) of spatial data under pure and mixed increasing domain spatial asymptotic structures is studied under both deterministic and stochastic spatial sampling designs. The deterministic design is specified by a scaled version of the integer lattice in ℝ^d while the data-sites under the stochastic spatial design are generated by a sequence of independent random vectors, with a possibly nonuniform density. A detailed account of the asymptotic joint distribution of the DFTs of the spatial data is given which, among other things, highlights the effects of the geometry of the sampling region and the spatial sampling density on the limit distribution. Further, it is shown that in both deterministic and stochastic design cases, for "asymptotically distant" frequencies, the DFTs are asymptotically independent, but this property may be destroyed if the frequencies are "asymptotically close". Some important implications of the main results are also given.
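
The DFT of spatial data at a single frequency can be sketched as follows. This is a minimal illustration and not the dissertation's estimator: the function name `spatial_dft`, the `n ** -0.5` scaling, and the toy 2x2 lattice are assumptions made here for the example, and the scaling used in the asymptotic theory may differ.

```python
import cmath

def spatial_dft(sites, values, freq):
    """DFT of observations `values` taken at locations `sites`, at frequency `freq`.

    sites  : list of d-dimensional coordinate tuples (lattice or irregular)
    values : list of real observations, one per site
    freq   : d-dimensional frequency tuple
    The n ** -0.5 scaling is an assumption made for this sketch.
    """
    n = len(values)
    total = 0j
    for site, z in zip(sites, values):
        # Inner product of the frequency with the site coordinates.
        phase = sum(w * x for w, x in zip(freq, site))
        total += z * cmath.exp(-1j * phase)
    return total / n ** 0.5

# Toy data on a 2x2 piece of the integer lattice (hypothetical values).
sites = [(0, 0), (1, 0), (0, 1), (1, 1)]
values = [1.0, 2.0, 3.0, 4.0]
dft_zero = spatial_dft(sites, values, (0.0, 0.0))
dft_pi = spatial_dft(sites, values, (cmath.pi, 0.0))
```

Because the sum runs over arbitrary coordinates, the same function covers both a lattice design and randomly generated data-sites.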

    Regridding Uncertainty for Statistical Downscaling of Solar Radiation

    Initial steps in statistical downscaling involve comparing observed data with output from regional climate models (RCMs). This comparison requires (1) regridding RCM output from their native grids and at differing spatial resolutions to a common grid in order to be comparable to observed data and (2) bias correcting RCM data, via quantile mapping, for example, for future modeling and analysis. The uncertainty associated with (1) is not always considered for downstream operations in (2). This work examines this uncertainty, which is not often made available to the user of a regridded data product. This analysis is applied to RCM solar radiation data from the NA-CORDEX data archive and observed data from the National Solar Radiation Database housed at the National Renewable Energy Lab. A case study of the mentioned methods over California is presented. Comment: 16 pages, 5 figures, submitted to: Advances in Statistical Climatology, Meteorology and Oceanography
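
Quantile mapping, mentioned in step (2), can be sketched in its simplest empirical form as follows. This is an illustrative sketch under stated assumptions, not the procedure used in the paper: the function name `empirical_quantile_map`, the nearest-rank quantile rule, and the toy numbers are all hypothetical.

```python
from bisect import bisect_right

def empirical_quantile_map(model_hist, observed, model_values):
    """Bias-correct model values by matching empirical quantiles.

    Each model value is assigned its rank in the historical model sample,
    then replaced by the observed value at the same (nearest-rank) quantile.
    """
    model_sorted = sorted(model_hist)
    obs_sorted = sorted(observed)
    n_model, n_obs = len(model_sorted), len(obs_sorted)
    corrected = []
    for x in model_values:
        # Number of historical model values <= x (empirical CDF times n_model).
        rank = bisect_right(model_sorted, x)
        # Nearest-rank index ceil(rank * n_obs / n_model) - 1, in integer arithmetic.
        idx = (rank * n_obs + n_model - 1) // n_model - 1
        idx = min(n_obs - 1, max(0, idx))
        corrected.append(obs_sorted[idx])
    return corrected

# Toy example: the model runs low by a factor of ten (made-up numbers).
model_hist = [1, 2, 3, 4, 5]
observed = [10, 20, 30, 40, 50]
mapped = empirical_quantile_map(model_hist, observed, [0, 3, 10])
```

Values outside the historical model range are clamped to the smallest or largest observation, which is one common but not universal choice.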

    New and Fast Block Bootstrap-Based Prediction Intervals for GARCH (1,1) Process with Application to Exchange Rates

    In this paper, we propose a new bootstrap algorithm to obtain prediction intervals for the generalized autoregressive conditionally heteroscedastic (GARCH(1,1)) process, which can be applied to construct prediction intervals for future returns and volatilities. The advantages of the proposed method are twofold: it (a) often exhibits improved performance and (b) is computationally more efficient compared to other available resampling methods. The superiority of this method over the other resampling method-based prediction intervals is explained with Spearman's rank correlation coefficient. The finite sample properties of the proposed method are also illustrated by an extensive simulation study and a real-world example.
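
For orientation, a GARCH(1,1) conditional variance follows the recursion sigma2[t+1] = omega + alpha * r[t]**2 + beta * sigma2[t]. The sketch below computes it and builds a one-step-ahead return interval with a plain i.i.d. residual bootstrap. It is not the block bootstrap algorithm proposed in the paper, whose details are not given in this abstract. The function names, the assumption that the parameters are already known, and the toy returns are all hypothetical.

```python
import random

def garch_variances(returns, omega, alpha, beta):
    """Conditional variances for each return, plus the one-step-ahead forecast."""
    # Initialize at the unconditional variance (assumes alpha + beta < 1).
    sigma2 = [omega / (1.0 - alpha - beta)]
    for r in returns:
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2

def bootstrap_return_interval(returns, omega, alpha, beta,
                              level=0.95, n_boot=2000, seed=0):
    """Percentile interval for the next return via an i.i.d. residual bootstrap."""
    sigma2 = garch_variances(returns, omega, alpha, beta)
    # Standardized residuals; zip stops before the final forecast variance.
    z = [r / s2 ** 0.5 for r, s2 in zip(returns, sigma2)]
    rng = random.Random(seed)
    scale = sigma2[-1] ** 0.5
    draws = sorted(scale * rng.choice(z) for _ in range(n_boot))
    lower = draws[int((1.0 - level) / 2.0 * n_boot)]
    upper = draws[int((1.0 + level) / 2.0 * n_boot) - 1]
    return lower, upper

# Toy returns and parameters (made up for illustration).
toy_returns = [0.5, -1.0, 0.3, -0.2, 1.1]
variances = garch_variances([0.0], 0.1, 0.1, 0.8)
lower, upper = bootstrap_return_interval(toy_returns, 0.1, 0.1, 0.8)
```

In practice the parameters would be estimated, and a bootstrap that accounts for that estimation step would resample and refit rather than treat them as fixed.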

    A general frequency domain method for assessing spatial covariance structures
