Search CORE

3 research outputs found

Optimal choice of $k$ for $k$ -nearest neighbor regression

Author: Azadkia Mona
Publication venue
Publication date: 16/02/2020
Field of study

The

k

-nearest neighbor algorithm (

k

-NN) is a widely used non-parametric method for classification and regression. We study the mean squared error of the

k

-NN estimator when

k

is chosen by leave-one-out cross-validation (LOOCV). Although it was known that this choice of

k

is asymptotically consistent, it was not known previously that it is an optimal

k

. We show, with high probability, the mean squared error of this estimator is close to the minimum mean squared error using the

k

-NN estimate, where the minimum is over all choices of

k

arXiv.org e-Print Archive

A Locally Adaptive Interpretable Regression

Author: Munkhdalai Lkhagvadorj
Munkhdalai Tsendsuren
Ryu Keun Ho
Publication venue
Publication date: 28/04/2022
Field of study

Machine learning models with both good predictability and high interpretability are crucial for decision support systems. Linear regression is one of the most interpretable prediction models. However, the linearity in a simple linear regression worsens its predictability. In this work, we introduce a locally adaptive interpretable regression (LoAIR). In LoAIR, a metamodel parameterized by neural networks predicts percentile of a Gaussian distribution for the regression coefficients for a rapid adaptation. Our experimental results on public benchmark datasets show that our model not only achieves comparable or better predictive performance than the other state-of-the-art baselines but also discovers some interesting relationships between input and target variables such as a parabolic relationship between CO2 emissions and Gross National Product (GNP). Therefore, LoAIR is a step towards bridging the gap between econometrics, statistics, and machine learning by improving the predictive ability of linear regression without depreciating its interpretability

arXiv.org e-Print Archive

Minimax Rate Optimal Adaptive Nearest Neighbor Classification and Regression

Author: Lai Lifeng
Zhao Puning
Publication venue
Publication date: 22/10/2019
Field of study

k Nearest Neighbor (kNN) method is a simple and popular statistical method for classification and regression. For both classification and regression problems, existing works have shown that, if the distribution of the feature vector has bounded support and the probability density function is bounded away from zero in its support, the convergence rate of the standard kNN method, in which k is the same for all test samples, is minimax optimal. On the contrary, if the distribution has unbounded support, we show that there is a gap between the convergence rate achieved by the standard kNN method and the minimax bound. To close this gap, we propose an adaptive kNN method, in which different k is selected for different samples. Our selection rule does not require precise knowledge of the underlying distribution of features. The new proposed method significantly outperforms the standard one. We characterize the convergence rate of the proposed adaptive method, and show that it matches the minimax lower bound

arXiv.org e-Print Archive