Kernel-based Information Criterion
This paper introduces Kernel-based Information Criterion (KIC) for model
selection in regression analysis. The novel kernel-based complexity measure in
KIC efficiently computes the interdependency between parameters of the model
using a variable-wise variance and yields selection of better, more robust
regressors. Experimental results show superior performance on both simulated
and real data sets compared to Leave-One-Out Cross-Validation (LOOCV),
kernel-based Information Complexity (ICOMP), and maximum log of marginal
likelihood in Gaussian Process Regression (GPR).
Probability density estimation with tunable kernels using orthogonal forward regression
A generalized or tunable-kernel model is proposed for probability density function estimation based on an orthogonal forward regression procedure. Each stage of the density estimation process determines a tunable kernel, namely, its center vector and diagonal covariance matrix, by minimizing a leave-one-out test criterion. The kernel mixing weights of the constructed sparse density estimate are finally updated using the multiplicative nonnegative quadratic programming algorithm to ensure the nonnegativity and unity constraints, and this weight-updating process additionally has the desired ability to further reduce the model size. The proposed tunable-kernel model has advantages, in terms of model generalization capability and model sparsity, over the standard fixed-kernel model that restricts kernel centers to the training data points and employs a single common kernel variance for every kernel. On the other hand, it does not optimize all the model parameters together and thus avoids the problems of high-dimensional ill-conditioned nonlinear optimization associated with the conventional finite mixture model. Several examples are included to demonstrate the ability of the proposed tunable-kernel model to effectively construct a very compact and accurate density estimate
Sparse kernel density construction using orthogonal forward regression with leave-one-out test score and local regularization
The paper presents an efficient construction algorithm for obtaining sparse kernel density estimates based on a regression approach that directly optimizes model generalization capability. Computational efficiency of the density construction is ensured using an orthogonal forward regression, and the algorithm incrementally minimizes the leave-one-out test score. A local regularization method is incorporated naturally into the density construction process to further enforce sparsity. An additional advantage of the proposed algorithm is that it is fully automatic: the user is not required to specify any criterion to terminate the density construction procedure. This is in contrast to an existing state-of-the-art kernel density estimation method using the support vector machine (SVM), where the user is required to specify some critical algorithm parameter. Several examples are included to demonstrate the ability of the proposed algorithm to effectively construct a very sparse kernel density estimate with comparable accuracy to that of the full sample optimized Parzen window density estimate. Our experimental results also demonstrate that the proposed algorithm compares favourably with the SVM method, in terms of both test accuracy and sparsity, for constructing kernel density estimates
Kernel density construction using orthogonal forward regression
An automatic algorithm is derived for constructing kernel density estimates based on a regression approach that directly optimizes generalization capability. Computational efficiency of the density construction is ensured using an orthogonal forward regression, and the algorithm incrementally minimizes the leave-one-out test score. Local regularization is incorporated into the density construction process to further enforce sparsity. Examples are included to demonstrate the ability of the proposed algorithm to effectively construct a very sparse kernel density estimate with comparable accuracy to that of the full sample Parzen window density estimate
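The abstracts above all hinge on the leave-one-out test score as the driver of kernel density construction. As a minimal sketch of that idea only (not the papers' orthogonal forward regression or regularization machinery), the following compares Gaussian Parzen-window bandwidths by the leave-one-out log-likelihood, scoring each sample under a density built from all the other samples; the candidate bandwidth grid is an arbitrary illustration:

```python
import math
import random

def gauss(u, h):
    # Gaussian kernel with standard deviation (bandwidth) h
    return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2 * math.pi))

def loo_log_likelihood(xs, h):
    # Sum over i of log density at xs[i], estimated from the other n-1 points
    n = len(xs)
    total = 0.0
    for i, xi in enumerate(xs):
        p = sum(gauss(xi - xj, h) for j, xj in enumerate(xs) if j != i) / (n - 1)
        total += math.log(p)
    return total

random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(100)]

# Pick the bandwidth that maximizes the leave-one-out score
candidates = [0.1, 0.3, 0.5, 1.0]
best_h = max(candidates, key=lambda h: loo_log_likelihood(data, h))
```

In the papers this criterion is minimized incrementally, kernel by kernel, inside an orthogonal forward regression loop; the sketch only shows why the score favors bandwidths that generalize rather than memorize.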
Classification of Rural and Urban Regions of Semarang Regency Using a Support Vector Machine (SVM)
This research carries out classification based on the status of rural and urban regions, which reflects differences in characteristics and conditions between regions in Indonesia, using the Support Vector Machine (SVM) method. Classification on this problem works by building separating functions that involve a kernel function to map the input data into a higher-dimensional space. The Sequential Minimal Optimization (SMO) algorithm is used in the training process on the rural- and urban-region data to obtain the optimal separating function (hyperplane). To determine the kernel function and parameters suited to the data, a grid search combined with leave-one-out cross-validation is used. The best classification accuracy obtained with the SVM is 90%, using the Radial Basis Function (RBF) kernel with parameters C = 100 and γ = 2^-5
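The selection procedure described above (a grid search scored by leave-one-out cross-validation) can be sketched independently of the SVM itself. In the illustration below, a simple RBF Parzen-window classifier stands in for the SMO-trained SVM, since training an SVM is out of scope for a sketch; the data, labels, and γ grid are synthetic stand-ins, not the paper's regional data:

```python
import math
import random

def rbf(a, b, gamma):
    # RBF kernel between two feature vectors
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def loo_accuracy(X, y, gamma):
    # Leave-one-out CV: classify each point by kernel-weighted class vote
    # over all remaining points, then count correct predictions
    correct = 0
    for i in range(len(X)):
        score = {}
        for j in range(len(X)):
            if j != i:
                score[y[j]] = score.get(y[j], 0.0) + rbf(X[i], X[j], gamma)
        correct += max(score, key=score.get) == y[i]
    return correct / len(X)

random.seed(0)
X = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(60)]
y = [int(a + b > 0) for a, b in X]  # toy binary labels

# Grid search over the kernel parameter, scored by LOO accuracy
grid = [2**-5, 2**-3, 2**-1, 2]
best_gamma = max(grid, key=lambda g: loo_accuracy(X, y, g))
```

With an actual SVM, the inner classifier would be retrained per held-out point and the grid would also range over C, but the selection loop has the same shape.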
Concentration inequalities for leave-one-out cross validation
In this article we prove that estimator stability is enough to show that
leave-one-out cross validation is a sound procedure, by providing concentration
bounds in a general framework. In particular, we provide concentration bounds
beyond Lipschitz continuity assumptions on the loss or on the estimator. In
order to obtain our results, we rely on random variables with distribution
satisfying the logarithmic Sobolev inequality, providing us a relatively rich
class of distributions. We illustrate our method by considering several
interesting examples, including linear regression, kernel density estimation,
and stabilized / truncated estimators such as stabilized kernel regression
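One of the examples mentioned above, linear regression, is a case where leave-one-out cross-validation is exactly computable without refitting: for ordinary least squares, the LOO residual equals the ordinary residual divided by one minus the point's leverage. The sketch below checks that standard identity for simple linear regression against brute-force refitting, on synthetic data:

```python
import random

def fit_line(xs, ys):
    # Ordinary least squares for y = a + b*x
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

random.seed(1)
xs = [random.uniform(0, 10) for _ in range(30)]
ys = [2.0 + 0.5 * x + random.gauss(0, 1) for x in xs]

a, b = fit_line(xs, ys)
n = len(xs)
mx = sum(xs) / n
sxx = sum((x - mx) ** 2 for x in xs)

# Shortcut: LOO residual = residual / (1 - leverage), no refitting needed
loo_shortcut = []
for x, y in zip(xs, ys):
    leverage = 1 / n + (x - mx) ** 2 / sxx
    loo_shortcut.append((y - (a + b * x)) / (1 - leverage))

# Brute force: refit without point i, then predict the held-out point
loo_brute = []
for i in range(n):
    ai, bi = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
    loo_brute.append(ys[i] - (ai + bi * xs[i]))
```

The two residual lists agree to floating-point precision, which is why LOOCV is cheap for linear smoothers even though it looks like n refits.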
Approximate inference of the bandwidth in multivariate kernel density estimation
Kernel density estimation is a popular and widely used non-parametric method for data-driven density estimation. Its appeal lies in its simplicity and ease of implementation, as well as its strong asymptotic results regarding its convergence to the true data distribution. However, a major difficulty is the setting of the bandwidth, particularly in high dimensions and with limited amount of data. An approximate Bayesian method is proposed, based on the Expectation–Propagation algorithm with a likelihood obtained from a leave-one-out cross validation approach. The proposed method yields an iterative procedure to approximate the posterior distribution of the inverse bandwidth. The approximate posterior can be used to estimate the model evidence for selecting the structure of the bandwidth and approach online learning. Extensive experimental validation shows that the proposed method is competitive in terms of performance with state-of-the-art plug-in methods