Bandwidth choice for nonparametric classification
It is shown that, for kernel-based classification with univariate
distributions and two populations, optimal bandwidth choice has a dichotomous
character. If the two densities cross at just one point, where their curvatures
have the same signs, then minimum Bayes risk is achieved using bandwidths which
are an order of magnitude larger than those which minimize pointwise estimation
error. On the other hand, if the curvature signs are different, or if there are
multiple crossing points, then bandwidths of conventional size are generally
appropriate. The range of different modes of behavior is narrower in
multivariate settings. There, the optimal size of bandwidth is generally the
same as that which is appropriate for pointwise density estimation. These
properties motivate empirical rules for bandwidth choice.
(Published at http://dx.doi.org/10.1214/009053604000000959 in the Annals of Statistics, http://www.imstat.org/aos/, by the Institute of Mathematical Statistics, http://www.imstat.org.)
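The classification rule the abstract studies can be sketched as follows: estimate each population's density with a kernel estimator and assign a point to the population with the larger estimated (prior-weighted) density. A minimal sketch with a Gaussian kernel; the function names, bandwidths, and the equal-prior default are illustrative, not the paper's:

```python
import numpy as np

def kde(x, data, h):
    """Gaussian kernel density estimate at points x (1-D data, bandwidth h)."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    u = (x - data.reshape(1, -1)) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(data) * h * np.sqrt(2 * np.pi))

def classify(x, sample0, sample1, h0, h1, prior0=0.5):
    """Assign each point to the population with the larger estimated posterior."""
    p0 = prior0 * kde(x, sample0, h0)
    p1 = (1 - prior0) * kde(x, sample1, h1)
    return (p1 > p0).astype(int)

rng = np.random.default_rng(0)
a = rng.normal(-1.0, 1.0, 500)   # population 0
b = rng.normal(+1.0, 1.0, 500)   # population 1
labels = classify([-2.0, 2.0], a, b, h0=0.4, h1=0.4)
```

The abstract's point is that the bandwidths `h0`, `h1` minimizing the Bayes risk of this rule can differ in order of magnitude from those minimizing pointwise density-estimation error, depending on how the two densities cross.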
Bias in nearest-neighbor hazard estimation
In nonparametric curve estimation, the smoothing parameter is critical for performance. To estimate the hazard rate, we compare nearest-neighbor selectors that minimize the quadratic, the Kullback-Leibler, and the uniform loss. These measures result in a rule of thumb, a cross-validation, and a plug-in selector. A Monte Carlo simulation within the three-parameter exponentiated Weibull distribution indicates that a counterfactual normal distribution, as an input to the selector, provides a good rule of thumb. If bias is the main concern, minimizing the uniform loss yields the best results, but at the cost of very high variability. Cross-validation has a similar bias to the rule of thumb, but also with high variability.
Keywords: hazard rate, kernel smoothing, bandwidth selection, nearest-neighbor bandwidth, rule of thumb, plug-in, cross-validation, credit risk
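The object being smoothed here is the hazard rate, which can be estimated as a kernel density estimate divided by the empirical survival function, with the local bandwidth set by a nearest-neighbor rule. A minimal sketch under simplifying assumptions (Gaussian kernel, no censoring; the function names and the choice k = 200 are illustrative, not the selectors compared in the paper):

```python
import numpy as np

def nn_bandwidth(t, data, k):
    """Nearest-neighbor bandwidth: distance from t to its k-th nearest observation."""
    return np.sort(np.abs(data - t))[k - 1]

def hazard_nn(t, data, k):
    """Hazard estimate f_hat(t) / (1 - F_hat(t)) with a k-NN local bandwidth."""
    n = len(data)
    h = nn_bandwidth(t, data, k)
    u = (t - data) / h
    f = np.exp(-0.5 * u**2).sum() / (n * h * np.sqrt(2 * np.pi))  # Gaussian KDE
    surv = (data > t).mean()            # empirical survival function
    return f / max(surv, 1.0 / n)       # guard against division by zero

rng = np.random.default_rng(1)
data = rng.exponential(1.0, 2000)       # true hazard is constant, equal to 1
est = hazard_nn(1.0, data, k=200)
```

The bandwidth selectors compared in the abstract differ in how they choose `k`: by a normal-reference rule of thumb, by cross-validation, or by a plug-in criterion.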
Functional limit laws for the increments of the quantile process; with applications
We establish a functional limit law of the logarithm for the increments of
the normed quantile process based upon a random sample of size $n$. We
extend a limit law obtained by Deheuvels and Mason (12), showing that their
results hold uniformly over the bandwidth $h$, restricted to vary in
$[a_n, b_n]$, where $a_n$ and $b_n$ are
appropriate non-random sequences. We treat the case where the sample
observations follow possibly non-uniform distributions. As a consequence of our
theorems, we provide uniform limit laws for nearest-neighbor density
estimators, in the spirit of those given by Deheuvels and Mason (13) for
kernel-type estimators.
(Published at http://dx.doi.org/10.1214/07-EJS099 in the Electronic Journal of Statistics, http://www.i-journals.org/ejs/, by the Institute of Mathematical Statistics, http://www.imstat.org.)
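The nearest-neighbor density estimators that the limit laws cover have a very simple one-dimensional form: $\hat f(x) = k / (2 n R_k(x))$, where $R_k(x)$ is the distance from $x$ to its $k$-th nearest observation. A minimal sketch (function name and parameter choices are illustrative):

```python
import numpy as np

def knn_density(x, data, k):
    """1-D k-nearest-neighbor density estimate: f_hat(x) = k / (2 n R_k(x)),
    where R_k(x) is the distance from x to the k-th nearest observation."""
    n = len(data)
    r = np.sort(np.abs(data - x))[k - 1]
    return k / (2.0 * n * r)

rng = np.random.default_rng(2)
u = rng.uniform(0.0, 1.0, 10000)   # Uniform(0,1): true density is 1 on (0,1)
est = knn_density(0.5, u, k=1000)
```

The random bandwidth here is $R_k(x)$ itself, which is why uniform-in-bandwidth limit laws of the kind established in the paper are needed to analyze such estimators.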
Population Synthesis via k-Nearest Neighbor Crossover Kernel
The recent development of multi-agent simulations brings about a need for
population synthesis. It is a task of reconstructing the entire population from
a sampling survey of limited size (1% or so), supplying the initial conditions
from which simulations begin. This paper presents a new kernel density
estimator for this task. Our method is an analogue of the classical
Breiman-Meisel-Purcell estimator, but employs novel techniques that harness the
huge degree of freedom which is required to model high-dimensional nonlinearly
correlated datasets: the crossover kernel, the k-nearest neighbor restriction
of the kernel construction set and the bagging of kernels. The performance as a
statistical estimator is examined through real and synthetic datasets. We
provide an "optimization-free" parameter selection rule for our method, a
theory of how our method works and a computational cost analysis. To
demonstrate the usefulness as a population synthesizer, our method is applied
to a household synthesis task for an urban micro-simulator.
(10 pages, 4 figures, IEEE International Conference on Data Mining (ICDM) 201.)
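The crossover idea can be illustrated loosely: new records are produced by mixing coordinates drawn from the nearest neighbors of a randomly chosen seed record. This is a hypothetical small-data sketch of that idea only, not the paper's estimator (which additionally uses kernel construction and bagging); all names and the choice k = 5 are illustrative:

```python
import numpy as np

def synthesize(sample, n_out, k=5, rng=None):
    """Sketch of crossover-style synthesis: each output record takes each of
    its coordinates from one of the k nearest neighbors (Euclidean distance)
    of a randomly chosen seed record."""
    rng = rng if rng is not None else np.random.default_rng()
    sample = np.asarray(sample, dtype=float)
    n, d = sample.shape
    out = np.empty((n_out, d))
    for i in range(n_out):
        seed = rng.integers(n)
        dist = np.linalg.norm(sample - sample[seed], axis=1)
        nbrs = np.argsort(dist)[:k]         # k nearest records (incl. the seed)
        donors = rng.choice(nbrs, size=d)   # pick one donor record per coordinate
        out[i] = sample[donors, np.arange(d)]
    return out

rng = np.random.default_rng(3)
survey = rng.normal(size=(100, 3))          # stand-in for a small sample survey
pop = synthesize(survey, n_out=1000, k=5, rng=rng)
```

Restricting donors to nearest neighbors keeps synthetic records close to the observed joint distribution, which is the role the k-NN restriction plays in the method described above.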
Regression Discontinuity Designs Using Covariates
We study regression discontinuity designs when covariates are included in the
estimation. We examine local polynomial estimators that include discrete or
continuous covariates in an additive separable way, but without imposing any
parametric restrictions on the underlying population regression functions. We
recommend a covariate-adjustment approach that retains consistency under
intuitive conditions, and characterize the potential for estimation and
inference improvements. We also present new covariate-adjusted mean squared
error expansions and robust bias-corrected inference procedures, with
heteroskedasticity-consistent and cluster-robust standard errors. An empirical
illustration and an extensive simulation study are presented. All methods are
implemented in \texttt{R} and \texttt{Stata} software packages.
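The covariate-adjusted estimator described above can be sketched as a kernel-weighted local linear regression: fit intercept, treatment indicator, and running-variable slopes on each side of the cutoff, with the covariates entering additively and linearly. A minimal sketch under those assumptions (triangular kernel, sharp design; function name, bandwidth, and simulated data are illustrative, not the packages' implementation):

```python
import numpy as np

def rd_estimate(y, x, z, h, c=0.0):
    """Sharp RD jump at cutoff c: weighted least squares with a triangular
    kernel of bandwidth h, local linear in the running variable x on each
    side, covariates z entering additively and linearly."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    z = np.asarray(z, float)
    if z.ndim == 1:
        z = z[:, None]
    d = (x >= c).astype(float)                        # treatment indicator
    w = np.clip(1.0 - np.abs(x - c) / h, 0.0, None)   # triangular kernel weights
    keep = w > 0
    X = np.column_stack([np.ones(len(x)), d, x - c, d * (x - c), z])[keep]
    sw = np.sqrt(w[keep])
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y[keep] * sw, rcond=None)
    return beta[1]                                    # coefficient on d = jump at c

rng = np.random.default_rng(4)
n = 5000
x = rng.uniform(-1, 1, n)
z = rng.normal(size=n)
y = 1.0 + 2.0 * (x >= 0) + 0.5 * x + z + 0.3 * rng.normal(size=n)
tau = rd_estimate(y, x, z, h=0.5)   # true jump is 2
```

Including `z` absorbs covariate variation in the outcome, which is the source of the precision gains the paper characterizes; the paper's bandwidth selection and robust bias-corrected inference are not sketched here.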