    Bandwidth choice for nonparametric classification

    It is shown that, for kernel-based classification with univariate distributions and two populations, optimal bandwidth choice has a dichotomous character. If the two densities cross at just one point, where their curvatures have the same signs, then minimum Bayes risk is achieved using bandwidths which are an order of magnitude larger than those which minimize pointwise estimation error. On the other hand, if the curvature signs are different, or if there are multiple crossing points, then bandwidths of conventional size are generally appropriate. The range of different modes of behavior is narrower in multivariate settings. There, the optimal size of bandwidth is generally the same as that which is appropriate for pointwise density estimation. These properties motivate empirical rules for bandwidth choice. Comment: Published at http://dx.doi.org/10.1214/009053604000000959 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
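
As a rough illustration of the setting described in this abstract, the sketch below builds a two-population kernel classifier from Gaussian kernel density estimates and compares test error across a few bandwidths. It is not the paper's procedure or its bandwidth rule: the Gaussian kernel, the simulated normal populations, the equal priors, the candidate bandwidths, and all function names are assumptions made for the example.

```python
import math
import random


def gaussian_kde(sample, h):
    """Return a function that estimates the density of `sample` with bandwidth h."""
    norm = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)

    return density


def kernel_classifier(sample0, sample1, h, prior0=0.5):
    """Assign x to the population whose prior-weighted density estimate is larger."""
    f0 = gaussian_kde(sample0, h)
    f1 = gaussian_kde(sample1, h)

    def classify(x):
        return 0 if prior0 * f0(x) >= (1.0 - prior0) * f1(x) else 1

    return classify


def error_rate(classify, test0, test1):
    """Fraction of held-out points assigned to the wrong population."""
    wrong = sum(classify(x) != 0 for x in test0)
    wrong += sum(classify(x) != 1 for x in test1)
    return wrong / (len(test0) + len(test1))


# Hypothetical data: two unit-variance normal populations (an assumption).
rng = random.Random(0)
train0 = [rng.gauss(-1.0, 1.0) for _ in range(200)]
train1 = [rng.gauss(1.0, 1.0) for _ in range(200)]
test0 = [rng.gauss(-1.0, 1.0) for _ in range(200)]
test1 = [rng.gauss(1.0, 1.0) for _ in range(200)]

# Empirical test error for a few candidate bandwidths.
errors = {
    h: error_rate(kernel_classifier(train0, train1, h), test0, test1)
    for h in (0.05, 0.3, 1.0)
}
print(errors)
```

Comparing held-out error over a bandwidth grid is only one simple empirical way to pick a bandwidth for classification; the abstract's point is that the error-minimizing bandwidth need not match the one that is best for estimating each density pointwise.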

    First-Principles Study of Electronic Structure in α-(BEDT-TTF)₂I₃ at Ambient Pressure and with Uniaxial Strain

    Within the framework of density functional theory, we calculate the electronic structure of α-(BEDT-TTF)₂I₃ at 8 K and room temperature at ambient pressure and with uniaxial strain along the a- and b-axes. We confirm the existence of anisotropic Dirac cone dispersion near the chemical potential. We also extract the orthogonal tight-binding parameters to analyze physical properties. An investigation of the electronic structure near the chemical potential clarifies that the effect of uniaxial strain along the a-axis differs from that along the b-axis. The carrier densities show T^2 dependence at low temperatures, which may explain the experimental findings not only qualitatively but also quantitatively. Comment: 10 pages, 7 figures

    Local generalised method of moments: an application to point process-based rainfall models

    Long series of simulated rainfall are required at point locations for a range of applications, including hydrological studies. Clustered point process-based rainfall models have been used for generating such simulations for many decades. These models suffer from a major limitation, however: their stationarity. Although seasonality can be allowed for by fitting separate models for each calendar month or season, the models are unsuitable in their basic form for climate impact studies. In this paper, we develop new methodology to address this limitation. We extend the current fitting approach by allowing the discrete covariate, calendar month, to be replaced or supplemented with continuous covariates that are more directly related to the incidence and nature of rainfall. The covariate-dependent model parameters are estimated for each time interval using a kernel-based nonparametric approach within a generalised method-of-moments framework. An empirical study demonstrates the new methodology using a time series of 5-min rainfall data. The study considers both local mean and local linear approaches. While asymptotic results are included, the focus is on developing useable methodology for a complex model that can only be solved numerically. Issues including the choice of weighting matrix, the estimation of parameter uncertainty, and bandwidth and model selection are considered from this perspective.

    The importance of scale in spatially varying coefficient modeling

    While spatially varying coefficient (SVC) models have attracted considerable attention in applied science, they have been criticized as being unstable. The objective of this study is to show that capturing the "spatial scale" of each data relationship is crucial to making SVC modeling more stable, and that doing so adds flexibility. Here, the analytical properties of six SVC models are summarized in terms of their characterization of scale. Models are examined through a series of Monte Carlo simulation experiments to assess the extent to which spatial scale influences model stability and the accuracy of their SVC estimates. The following models are studied: (i) geographically weighted regression (GWR) with a fixed distance or (ii) an adaptive distance bandwidth (GWRa), (iii) flexible bandwidth GWR (FB-GWR) with fixed distance or (iv) adaptive distance bandwidths (FB-GWRa), (v) eigenvector spatial filtering (ESF), and (vi) random effects ESF (RE-ESF). Results reveal that the SVC models designed to capture scale dependencies in local relationships (FB-GWR, FB-GWRa and RE-ESF) most accurately estimate the simulated SVCs, with RE-ESF being the most computationally efficient. Conversely, GWR and ESF, where SVC estimates are naively assumed to operate at the same spatial scale for each relationship, perform poorly. Results also confirm that the adaptive bandwidth GWR models (GWRa and FB-GWRa) are superior to their fixed bandwidth counterparts (GWR and FB-GWR).

    Iterated smoothed bootstrap confidence intervals for population quantiles

    This paper investigates the effects of smoothed bootstrap iterations on coverage probabilities of smoothed bootstrap and bootstrap-t confidence intervals for population quantiles, and establishes the optimal kernel bandwidths at various stages of the smoothing procedures. The conventional smoothed bootstrap and bootstrap-t methods have been known to yield one-sided coverage errors of orders O(n^{-1/2}) and o(n^{-2/3}), respectively, for intervals based on the sample quantile of a random sample of size n. We sharpen the latter result to O(n^{-5/6}) with proper choices of bandwidths at the bootstrapping and Studentization steps. We show further that calibration of the nominal coverage level by means of the iterated bootstrap succeeds in reducing the coverage error of the smoothed bootstrap percentile interval to the order O(n^{-2/3}) and that of the smoothed bootstrap-t interval to O(n^{-58/57}), provided that bandwidths are selected of appropriate orders. Simulation results confirm our asymptotic findings, suggesting that the iterated smoothed bootstrap-t method yields the most accurate coverage. On the other hand, the iterated smoothed bootstrap percentile interval has the advantage of being shorter and more stable than the bootstrap-t intervals. Comment: Published at http://dx.doi.org/10.1214/009053604000000878 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
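
For readers unfamiliar with the basic building block behind this abstract, the sketch below computes a plain (non-iterated) smoothed bootstrap percentile interval for the median. It is a simplified illustration, not the paper's method: the Gaussian smoothing kernel, the two-sided interval, the fixed bandwidth `h`, the simulated data, and the function name are all assumptions, and neither the iteration (calibration) step nor the bootstrap-t Studentization is included.

```python
import random
import statistics


def smoothed_bootstrap_median_ci(data, h, level=0.90, n_boot=1000, seed=0):
    """Two-sided percentile interval for the median via the smoothed bootstrap.

    Each resampled point is an observation drawn with replacement plus
    Gaussian noise scaled by the bandwidth h, which amounts to sampling
    from a Gaussian kernel density estimate of the data.
    """
    rng = random.Random(seed)
    n = len(data)
    medians = []
    for _ in range(n_boot):
        resample = [rng.choice(data) + h * rng.gauss(0.0, 1.0) for _ in range(n)]
        medians.append(statistics.median(resample))
    medians.sort()
    alpha = 1.0 - level
    lower = medians[int((alpha / 2.0) * n_boot)]
    upper = medians[min(n_boot - 1, int((1.0 - alpha / 2.0) * n_boot))]
    return lower, upper


# Hypothetical sample: 100 standard normal draws (true median 0).
data_rng = random.Random(1)
data = [data_rng.gauss(0.0, 1.0) for _ in range(100)]
lower, upper = smoothed_bootstrap_median_ci(data, h=0.3)
print(lower, upper)
```

The bandwidth `h=0.3` here is arbitrary; the abstract's results concern how bandwidths of the right order at each stage, combined with iteration, reduce coverage error.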
