On the consistency of Multithreshold Entropy Linear Classifier
Multithreshold Entropy Linear Classifier (MELC) is a recently proposed classifier
that employs information-theoretic concepts to build a multithreshold
maximum-margin model. In this paper we analyze its consistency over
multithreshold linear models and show that its objective function upper bounds
the number of misclassified points in a manner similar to the hinge loss in
support vector machines. For further confirmation we also conduct numerical
experiments on five datasets.
Comment: Presented at Theoretical Foundations of Machine Learning 2015
(http://tfml.gmum.net); final version published in Schedae Informaticae Journal.
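The form of the bound invoked above can be illustrated by the standard hinge-loss
inequality from support vector machines; this is the analogy named in the abstract,
not the MELC objective itself, which is built from information-theoretic quantities.
For a real-valued decision function f and a label y in {-1, +1},

    \mathbf{1}\{ y f(x) \le 0 \} \;\le\; \max\bigl(0,\ 1 - y f(x)\bigr),
    \qquad\text{hence}\qquad
    \sum_{i=1}^{n} \mathbf{1}\{ y_i f(x_i) \le 0 \} \;\le\; \sum_{i=1}^{n} \max\bigl(0,\ 1 - y_i f(x_i)\bigr),

so minimizing the hinge objective drives down an upper bound on the number of
misclassified training points; the paper establishes a bound of the same flavor
for the MELC objective over multithreshold linear models.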
Spatial aggregation of local likelihood estimates with applications to classification
This paper presents a new method for spatially adaptive local (constant)
likelihood estimation which applies to a broad class of nonparametric models,
including the Gaussian, Poisson and binary response models. The main idea of
the method is, given a sequence of local likelihood estimates (``weak''
estimates), to construct a new aggregated estimate whose pointwise risk is of
the order of the smallest risk among all ``weak'' estimates. We also propose a
new approach to selecting the parameters of the procedure by prescribing the
behavior of the resulting estimate in a simple parametric situation. We
establish a number of important theoretical results concerning
the optimality of the aggregated estimate. In particular, our ``oracle'' result
claims that its risk is, up to some logarithmic multiplier, equal to the
smallest risk for the given family of estimates. The performance of the
procedure is illustrated by application to the classification problem. A
numerical study demonstrates its reasonable performance in simulated and
real-life examples.
Comment: Published at http://dx.doi.org/10.1214/009053607000000271 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
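As an informal illustration of the aggregation idea described above, the following
is a minimal Python sketch for the Gaussian (local-mean) case: weak estimates
computed on growing windows are merged one at a time, and a new, less variable
estimate is only partially accepted when it deviates too much from the current
aggregate. The mixing rule, the test statistic and the critical values z below are
illustrative placeholders, not the calibrated choices proposed in the paper.

    import numpy as np

    def aggregate_local_means(y, window_sizes, crit_values):
        # Stagewise aggregation of local-mean ("weak") estimates at the central
        # point of y. Windows go from smallest (most variable) to largest;
        # crit_values are hypothetical critical thresholds.
        c = len(y) // 2
        agg = None
        for h, z in zip(window_sizes, crit_values):
            local = y[max(0, c - h):c + h + 1]
            weak = local.mean()                    # "weak" estimate on this window
            var = local.var(ddof=1) / len(local)   # rough variance of that estimate
            if agg is None:
                agg = weak
                continue
            # Gaussian likelihood-ratio-type statistic comparing the new weak
            # estimate with the current aggregate.
            t = (weak - agg) ** 2 / max(var, 1e-12)
            gamma = max(0.0, 1.0 - t / z)          # illustrative mixing kernel
            agg = gamma * weak + (1.0 - gamma) * agg
        return agg

    # Example: a piecewise-constant signal with a jump away from the center.
    rng = np.random.default_rng(0)
    y = np.where(np.arange(200) < 120, 0.0, 2.0) + rng.normal(size=200)
    print(aggregate_local_means(y, [5, 10, 20, 40, 80], [3.0, 3.5, 4.0, 4.5, 5.0]))

When the larger windows stay consistent with the smaller ones, the aggregate inherits
their lower variability; when they contradict, the earlier, more local estimate is kept.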
Sharp Oracle Inequalities for Aggregation of Affine Estimators
We consider the problem of combining a (possibly uncountably infinite) set of
affine estimators in a non-parametric regression model with heteroscedastic
Gaussian noise. Focusing on the exponentially weighted aggregate, we prove a
PAC-Bayesian type inequality that leads to sharp oracle inequalities in
discrete but also in continuous settings. The framework is general enough to
cover combinations of various procedures such as least squares regression,
kernel ridge regression, shrinking estimators and many other estimators used in
the literature on statistical inverse problems. As a consequence, we show that
the proposed aggregate provides an adaptive estimator in the exact minimax
sense without discretizing the range of tuning parameters or splitting the set
of observations. We also illustrate numerically the good performance achieved
by the exponentially weighted aggregate.
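A minimal numerical sketch of exponential weighting over a finite family of
candidate estimates is given below; the candidate fits, the risk estimates and the
temperature beta are illustrative, and the paper's continuous families,
PAC-Bayesian bounds and minimax statements are of course not reproduced by it.

    import numpy as np

    def exponentially_weighted_aggregate(estimates, risk_estimates, beta):
        # Combine candidate estimates (rows) with weights proportional to
        # exp(-risk / beta); beta plays the role of a temperature parameter.
        r = np.asarray(risk_estimates, dtype=float)
        w = np.exp(-(r - r.min()) / beta)          # shift by the minimum for stability
        w /= w.sum()
        return w @ np.asarray(estimates, dtype=float), w

    # Three hypothetical candidate fits of the same signal and their estimated risks.
    fits = [[0.9, 2.1, 2.9], [1.0, 2.0, 3.0], [1.4, 1.7, 3.3]]
    aggregate, weights = exponentially_weighted_aggregate(fits, [0.30, 0.05, 0.60], beta=0.1)

With a small beta the aggregate concentrates on the candidate with the smallest
estimated risk; with a large beta it approaches a plain average of the candidates.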
Parameter tuning in pointwise adaptation using a propagation approach
This paper discusses the problem of adaptive estimation of a univariate
object like the value of a regression function at a given point or a linear
functional in a linear inverse problem. We consider an adaptive procedure
originating from Lepski [Theory Probab. Appl. 35 (1990) 454--466] that selects
in a data-driven way one estimate out of a given class of estimates ordered by
their variability. A serious problem in applying this and similar procedures is
the choice of tuning parameters such as thresholds. Numerical results show
that the theoretically recommended proposals appear to be too conservative and
lead to a strong oversmoothing effect. A careful choice of the parameters of
the procedure is extremely important for achieving a reasonable quality of
estimation. The main contribution of this paper is a new approach for choosing
the parameters of the procedure by prescribing the behavior of the resulting
estimate in a simple parametric situation. We establish a non-asymptotic
"oracle" bound, which shows that the estimation risk is, up to
a logarithmic multiplier, equal to the risk of the "oracle" estimate that is
optimally selected from the given family. A numerical study demonstrates good
performance of the resulting procedure in a number of simulated examples.
Comment: Published at http://dx.doi.org/10.1214/08-AOS607 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
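To make the kind of procedure discussed above concrete, here is a minimal sketch of
a Lepski-type selection rule over estimates ordered from most to least variable; the
critical values z_m are precisely the tuning parameters whose calibration is the
subject of the paper, and the values used below are illustrative placeholders rather
than the propagation-calibrated ones.

    def lepski_select(estimates, std_devs, crit_values):
        # Keep accepting less variable estimates as long as the new estimate stays
        # within z_m * std_m of every earlier (more variable) estimate m; stop at
        # the first violation and return the last accepted index and estimate.
        selected = 0
        for k in range(1, len(estimates)):
            consistent = all(
                abs(estimates[k] - estimates[m]) <= crit_values[m] * std_devs[m]
                for m in range(k)
            )
            if not consistent:
                break
            selected = k
        return selected, estimates[selected]

    # Hypothetical estimates with decreasing standard deviations and common thresholds.
    idx, value = lepski_select([1.8, 1.6, 1.5, 0.4], [0.8, 0.4, 0.2, 0.1], [2.0] * 4)

Choosing the critical values too large makes the rule accept ever smoother estimates
and oversmooth, which is exactly the effect the propagation calibration is meant to control.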
Adaptive variance function estimation in heteroscedastic nonparametric regression
We consider a wavelet thresholding approach to adaptive variance function
estimation in heteroscedastic nonparametric regression. A data-driven estimator
is constructed by applying wavelet thresholding to the squared first-order
differences of the observations. We show that the variance function estimator
is nearly optimally adaptive to the smoothness of both the mean and variance
functions. The estimator is shown to achieve the optimal adaptive rate of
convergence under the pointwise squared error simultaneously over a range of
smoothness classes. The estimator also adaptively attains, within a logarithmic
factor, the minimax risk under the global mean integrated squared error over
a collection of spatially inhomogeneous function classes. Numerical
implementation and simulation results are also discussed.
Comment: Published at http://dx.doi.org/10.1214/07-AOS509 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
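A minimal sketch of a difference-based variance estimator of this flavor, assuming
the PyWavelets package, is given below; the squared first-order differences serve as
proxy observations for the variance function, while the wavelet, decomposition level
and universal-style threshold are illustrative choices rather than the calibrated
estimator analyzed in the paper.

    import numpy as np
    import pywt

    def variance_function_estimate(y, wavelet="db4", level=4):
        # Proxy observations for sigma^2: squared first-order differences, halved
        # so that their mean matches the local variance when the mean function is
        # smooth (E[(y_{i+1} - y_i)^2] is roughly 2 * sigma^2).
        d = 0.5 * (y[1:] - y[:-1]) ** 2
        coeffs = pywt.wavedec(d, wavelet, level=level)
        # Universal-style threshold estimated from the finest-scale coefficients.
        noise = np.median(np.abs(coeffs[-1])) / 0.6745
        thr = noise * np.sqrt(2.0 * np.log(len(d)))
        coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
        est = pywt.waverec(coeffs, wavelet)[:len(d)]
        return np.maximum(est, 0.0)                # variances are non-negative

    # Heteroscedastic example: the noise level grows linearly across the design.
    rng = np.random.default_rng(1)
    x = np.linspace(0.0, 1.0, 1024)
    y = np.sin(4.0 * np.pi * x) + (0.2 + 0.8 * x) * rng.normal(size=x.size)
    sigma2_hat = variance_function_estimate(y)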
- …