Bandwidth choice for nonparametric classification
It is shown that, for kernel-based classification with univariate
distributions and two populations, optimal bandwidth choice has a dichotomous
character. If the two densities cross at just one point, where their curvatures
have the same signs, then minimum Bayes risk is achieved using bandwidths which
are an order of magnitude larger than those which minimize pointwise estimation
error. On the other hand, if the curvature signs are different, or if there are
multiple crossing points, then bandwidths of conventional size are generally
appropriate. The range of different modes of behavior is narrower in
multivariate settings. There, the optimal size of bandwidth is generally the
same as that which is appropriate for pointwise density estimation. These
properties motivate empirical rules for bandwidth choice.
Comment: Published at http://dx.doi.org/10.1214/009053604000000959 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
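As a rough illustration of the setting in this abstract, the sketch below classifies univariate points into one of two populations by comparing Gaussian kernel density estimates that share a bandwidth h. The function names, the Gaussian kernel, the simulated normal samples and the trial bandwidths are assumptions made for illustration; they are not taken from the paper, and the sketch does not implement its empirical bandwidth rules.

```python
import numpy as np

def gaussian_kde(x, sample, h):
    """Gaussian kernel density estimate at points x with bandwidth h."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    u = (x[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(sample) * h * np.sqrt(2 * np.pi))

def kernel_classify(x, sample0, sample1, h, prior1=0.5):
    """Return 1 where the prior-weighted estimate for population 1 is larger, else 0."""
    f0 = gaussian_kde(x, sample0, h)
    f1 = gaussian_kde(x, sample1, h)
    return (prior1 * f1 > (1 - prior1) * f0).astype(int)

# Hypothetical training data: two normal populations whose densities cross once.
rng = np.random.default_rng(0)
sample0 = rng.normal(-1.0, 1.0, 200)
sample1 = rng.normal(1.0, 1.0, 200)

# Independent test points with known labels, used to estimate the error rate.
test_x = np.concatenate([rng.normal(-1.0, 1.0, 2000), rng.normal(1.0, 1.0, 2000)])
test_y = np.repeat([0, 1], 2000)

for h in (0.1, 0.4, 1.0):
    error = np.mean(kernel_classify(test_x, sample0, sample1, h) != test_y)
    print(f"h={h}: estimated misclassification rate {error:.3f}")
```

Comparing the printed error rates across bandwidths gives a feel for how classification risk, rather than pointwise density error, responds to h.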
First-Principles Study of Electronic Structure in α-(BEDT-TTF)₂I₃ at Ambient Pressure and with Uniaxial Strain
Within the framework of the density functional theory, we calculate the
electronic structure of α-(BEDT-TTF)₂I₃ at 8 K and room temperature
at ambient pressure and with uniaxial strain along the a- and b-axes. We
confirm the existence of anisotropic Dirac cone dispersion near the chemical
potential. We also extract the orthogonal tight-binding parameters to analyze
physical properties. An investigation of the electronic structure near the
chemical potential clarifies that the effect of uniaxial strain along the a-axis
differs from that along the b-axis. The carrier densities show T^2
dependence at low temperatures, which may explain the experimental findings not
only qualitatively but also quantitatively.
Comment: 10 pages, 7 figures
Local generalised method of moments: an application to point process-based rainfall models
Long series of simulated rainfall are required at point locations for a range of applications, including hydrological studies. Clustered point process-based rainfall models have been used for generating such simulations for many decades. These models suffer from a major limitation, however: their stationarity. Although seasonality can be allowed for by fitting separate models for each calendar month or season, the models are unsuitable in their basic form for climate impact studies. In this paper, we develop new methodology to address this limitation. We extend the current fitting approach by allowing the discrete covariate, calendar month, to be replaced or supplemented with continuous covariates that are more directly related to the incidence and nature of rainfall. The covariate-dependent model parameters are estimated for each time interval using a kernel-based nonparametric approach within a generalised method-of-moments framework. An empirical study demonstrates the new methodology using a time series of 5-min rainfall data. The study considers both local mean and local linear approaches. While asymptotic results are included, the focus is on developing useable methodology for a complex model that can only be solved numerically. Issues including the choice of weighting matrix, estimation of parameter uncertainty, and bandwidth and model selection are considered from this perspective.
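The distinction between the local mean and local linear approaches mentioned in this abstract can be sketched for a single moment. The code below estimates a covariate-dependent mean at a target covariate value z0 using Gaussian kernel weights. It is only a minimal sketch: the function names, the Gaussian kernel and the toy data are assumptions, and the full method of the paper (a generalised method-of-moments fit of a point process rainfall model) is not reproduced here.

```python
import numpy as np

def kernel_weights(z, z0, h):
    """Gaussian kernel weights centred at the target covariate value z0."""
    return np.exp(-0.5 * ((z - z0) / h) ** 2)

def local_mean(y, z, z0, h):
    """Kernel-weighted average of y around z0 (local constant fit)."""
    w = kernel_weights(z, z0, h)
    return np.sum(w * y) / np.sum(w)

def local_linear(y, z, z0, h):
    """Intercept of a kernel-weighted straight-line fit centred at z0."""
    w = kernel_weights(z, z0, h)
    X = np.column_stack([np.ones_like(z), z - z0])
    WX = X * w[:, None]
    beta = np.linalg.solve(X.T @ WX, WX.T @ y)
    return beta[0]

# Toy data (hypothetical): a statistic that is exactly linear in the covariate.
z = np.linspace(0.0, 1.0, 101)
y = 2.0 + 3.0 * z

# At the boundary z0 = 0 the local mean is pulled upward by one-sided data,
# while the local linear fit recovers the true value 2.0.
print(local_mean(y, z, 0.0, 0.2), local_linear(y, z, 0.0, 0.2))
```

The boundary behaviour shown here is one common reason for considering a local linear fit alongside a local mean.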
The importance of scale in spatially varying coefficient modeling
While spatially varying coefficient (SVC) models have attracted considerable
attention in applied science, they have been criticized as being unstable. The
objective of this study is to show that capturing the "spatial scale" of each
data relationship is crucially important to make SVC modeling more stable, and
in doing so, adds flexibility. Here, the analytical properties of six SVC
models are summarized in terms of their characterization of scale. Models are
examined through a series of Monte Carlo simulation experiments to assess the
extent to which spatial scale influences model stability and the accuracy of
their SVC estimates. The following models are studied: (i) geographically
weighted regression (GWR) with a fixed distance or (ii) an adaptive distance
bandwidth (GWRa), (iii) flexible bandwidth GWR (FB-GWR) with fixed distance or
(iv) adaptive distance bandwidths (FB-GWRa), (v) eigenvector spatial filtering
(ESF), and (vi) random effects ESF (RE-ESF). Results reveal that the SVC models
designed to capture scale dependencies in local relationships (FB-GWR, FB-GWRa
and RE-ESF) most accurately estimate the simulated SVCs, where RE-ESF is the
most computationally efficient. Conversely, GWR and ESF, where SVC estimates are
naively assumed to operate at the same spatial scale for each relationship,
perform poorly. Results also confirm that the adaptive bandwidth GWR models
(GWRa and FB-GWRa) are superior to their fixed bandwidth counterparts (GWR and
FB-GWR).
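To make the fixed versus adaptive bandwidth distinction concrete, the sketch below computes geographically weighted regression coefficients at a single target location. It assumes a Gaussian distance kernel and, in the adaptive case, sets the kernel width to the distance of the k-th nearest observation. The function name and its parameters are hypothetical and are not taken from the study or from any GWR package.

```python
import numpy as np

def gwr_coefficients(coords, X, y, target, bandwidth, adaptive=False):
    """Weighted least squares fit at one location.

    bandwidth is a distance if adaptive is False, and a neighbour count k
    if adaptive is True (an illustrative convention, assumed here).
    """
    d = np.linalg.norm(coords - target, axis=1)
    if adaptive:
        # Kernel width follows local data density: distance to the k-th neighbour.
        h = np.sort(d)[int(bandwidth) - 1]
    else:
        h = float(bandwidth)
    h = max(h, 1e-12)  # guard against a zero width when k = 1 at a data point
    w = np.exp(-0.5 * (d / h) ** 2)
    Xd = np.column_stack([np.ones(len(y)), X])  # add an intercept column
    XtW = Xd.T * w
    return np.linalg.solve(XtW @ Xd, XtW @ y)

# Hypothetical data with spatially constant coefficients (intercept 1, slope 2).
rng = np.random.default_rng(1)
coords = rng.uniform(0.0, 10.0, size=(100, 2))
x = rng.normal(size=100)
y = 1.0 + 2.0 * x

target = np.array([5.0, 5.0])
print(gwr_coefficients(coords, x, y, target, bandwidth=2.0))
print(gwr_coefficients(coords, x, y, target, bandwidth=20, adaptive=True))
```

A flexible bandwidth variant would use a separate bandwidth for each coefficient, which this single-bandwidth sketch does not attempt.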
Iterated smoothed bootstrap confidence intervals for population quantiles
This paper investigates the effects of smoothed bootstrap iterations on
coverage probabilities of smoothed bootstrap and bootstrap-t confidence
intervals for population quantiles, and establishes the optimal kernel
bandwidths at various stages of the smoothing procedures. The conventional
smoothed bootstrap and bootstrap-t methods have been known to yield one-sided
coverage errors of orders O(n^{-1/2}) and o(n^{-2/3}), respectively, for
intervals based on the sample quantile of a random sample of size n. We sharpen
the latter result to O(n^{-5/6}) with proper choices of bandwidths at the
bootstrapping and Studentization steps. We show further that calibration of the
nominal coverage level by means of the iterated bootstrap succeeds in reducing
the coverage error of the smoothed bootstrap percentile interval to the order
O(n^{-2/3}) and that of the smoothed bootstrap-t interval to O(n^{-58/57}),
provided that bandwidths are selected of appropriate orders. Simulation results
confirm our asymptotic findings, suggesting that the iterated smoothed
bootstrap-t method yields the most accurate coverage. On the other hand, the
iterated smoothed bootstrap percentile method interval has the advantage of
being shorter and more stable than the bootstrap-t intervals.
Comment: Published at http://dx.doi.org/10.1214/009053604000000878 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
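For orientation, a basic smoothed bootstrap percentile interval for a quantile can be sketched as follows: each resample is drawn with replacement and perturbed by Gaussian noise scaled by a bandwidth h, which amounts to sampling from a Gaussian kernel density estimate. This is a minimal, non-iterated, two-sided sketch. The function name, the Gaussian kernel and the example bandwidth are assumptions, and neither the bandwidth orders derived in the paper nor its calibration and Studentization steps are implemented.

```python
import numpy as np

def smoothed_bootstrap_quantile_ci(sample, p, h, level=0.95, n_boot=2000, rng=None):
    """Percentile interval for the p-th quantile from a smoothed bootstrap."""
    rng = np.random.default_rng(rng)
    n = len(sample)
    # Resample with replacement, then add kernel noise with bandwidth h.
    idx = rng.integers(0, n, size=(n_boot, n))
    resamples = sample[idx] + h * rng.standard_normal((n_boot, n))
    q_star = np.quantile(resamples, p, axis=1)
    alpha = 1.0 - level
    lower, upper = np.quantile(q_star, [alpha / 2, 1 - alpha / 2])
    return lower, upper

# Hypothetical data: interval for the median of a standard normal sample.
rng = np.random.default_rng(2)
sample = rng.normal(size=200)
lower, upper = smoothed_bootstrap_quantile_ci(sample, p=0.5, h=0.2, rng=3)
print(lower, upper)
```

Setting h to a very small value makes the procedure behave like the ordinary, unsmoothed bootstrap, which is a convenient check when experimenting with bandwidths.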