On-line predictive linear regression
We consider the on-line predictive version of the standard problem of linear
regression; the goal is to predict each consecutive response given the
corresponding explanatory variables and all the previous observations. We are
mainly interested in prediction intervals rather than point predictions. The
standard treatment of prediction intervals in linear regression analysis has
two drawbacks: (1) the classical prediction intervals guarantee that the
probability of error is equal to the nominal significance level epsilon, but
this property per se does not imply that the long-run frequency of error is
close to epsilon; (2) it is not suitable for prediction of complex systems as
it assumes that the number of observations exceeds the number of parameters. We
state a general result showing that in the on-line protocol the frequency of
error for the classical prediction intervals does equal the nominal
significance level, up to statistical fluctuations. We also describe
alternative regression models in which informative prediction intervals can be
found before the number of observations exceeds the number of parameters. One
of these models, which only assumes that the observations are independent and
identically distributed, is popular in machine learning but greatly underused
in the statistical theory of regression.
Comment: 34 pages; 6 figures; 1 table. arXiv admin note: substantial text overlap with arXiv:0906.312
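The on-line protocol this abstract describes can be illustrated with a minimal sketch: at each step, fit ordinary least squares to all past observations, issue a classical t-based prediction interval at significance level epsilon for the next response, and record whether it errs. The data-generating model, seed, and parameter values below are illustrative assumptions, not the paper's construction.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
eps = 0.2            # nominal significance level (assumed for illustration)
p, N = 3, 400        # number of parameters and of on-line rounds
beta = rng.normal(size=p)   # synthetic true coefficients

errors = 0
X, y = [], []
for t in range(N):
    x = rng.normal(size=p)
    resp = x @ beta + rng.normal()       # next response
    if t > p + 1:                        # enough past data to fit OLS
        A, b = np.array(X), np.array(y)
        n = len(b)
        bh, res, *_ = np.linalg.lstsq(A, b, rcond=None)
        s2 = res[0] / (n - p) if res.size else 0.0   # residual variance
        G = np.linalg.inv(A.T @ A)
        se = np.sqrt(s2 * (1 + x @ G @ x))           # predictive std. error
        half = stats.t.ppf(1 - eps / 2, n - p) * se  # interval half-width
        if abs(resp - x @ bh) > half:
            errors += 1
    X.append(x); y.append(resp)

freq = errors / (N - p - 2)   # long-run frequency of error
```

In line with the paper's claim, `freq` comes out close to the nominal level `eps`, up to statistical fluctuations.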
Logistic regression for simulating damage occurrence on a fruit grading line
Many factors influence the incidence of mechanical damage in fruit handled on a grading line. This makes it difficult to address damage estimation from an analytical point of view. During fruit transfer from one element of a grading line to another, damage occurs as a combined effect of machinery roughness and the intrinsic susceptibility of fruit. This paper describes a method to estimate bruise probability by means of logistic regression, using data yielded by specific laboratory tests. Model accuracy was measured via the statistical significance of its parameters and its classification ability. The prediction model was then linked to a simulation model through which impacts and load levels, similar to those of real grading lines, could be generated. The simulation output sample size was determined to yield reliable estimates. The process makes it possible to derive a suitable line design and the type of fruit that should be handled to maintain bruise levels within European Union (EU) standards. A real example with peaches was carried out with the aid of the software implementation SIMLIN®, developed by the authors and registered by Madrid Technical University. This kind of tool has been requested by inter-professional associations and grading line designers in recent years.
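The core of the approach, fitting a logistic model of bruise probability to impact data, can be sketched as follows. The data here are synthetic and the model form is a plain one-covariate logistic regression fitted by Newton-Raphson; none of this reproduces the SIMLIN® implementation or the paper's laboratory data.

```python
import numpy as np

rng = np.random.default_rng(1)
# synthetic lab-style data: impact energy -> bruise yes/no (illustrative units)
n = 500
energy = rng.uniform(0, 100, n)
true_p = 1 / (1 + np.exp(-(0.08 * energy - 4)))   # assumed "true" bruise model
bruise = (rng.random(n) < true_p).astype(float)

X = np.column_stack([np.ones(n), energy])         # intercept + energy
w = np.zeros(2)
for _ in range(50):                               # Newton-Raphson (IRLS)
    p = 1 / (1 + np.exp(-X @ w))
    W = p * (1 - p)
    grad = X.T @ (bruise - p)                     # score
    H = X.T @ (X * W[:, None])                    # observed information
    w += np.linalg.solve(H, grad)

def bruise_prob(e):
    """Estimated probability of bruising at impact energy e."""
    return 1 / (1 + np.exp(-(w[0] + w[1] * e)))
```

Linking such a fitted model to a simulator then amounts to drawing impact levels from the simulated line and averaging `bruise_prob` over them.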
On-line nonparametric regression to learn state-dependent disturbances
A combination of recursive least squares and weighted least squares is made which can adapt its structure such that a relation between input and output can be approximated, even when the structure of this relation is unknown beforehand.
This method can adapt its structure on-line while it preserves information offered by previous samples, making it applicable in a control setting. The method has been tested with computer-generated data, and it is used in a simulation to learn the non-linear state-dependent effects, in both cases with good results.
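The recursive-least-squares core of such a scheme can be sketched as below: a forgetting factor plays the role of the weighting, down-weighting old samples while the recursion preserves the information they carried. The class name, forgetting factor, and toy system are assumptions for illustration, not the paper's exact method.

```python
import numpy as np

class RLS:
    """Recursive least squares with exponential forgetting (sketch)."""
    def __init__(self, dim, lam=0.98, delta=100.0):
        self.w = np.zeros(dim)          # current parameter estimate
        self.P = delta * np.eye(dim)    # inverse correlation matrix
        self.lam = lam                  # forgetting factor (weights old samples down)

    def update(self, x, y):
        x = np.asarray(x, float)
        Px = self.P @ x
        k = Px / (self.lam + x @ Px)    # gain vector
        err = y - self.w @ x            # a-priori prediction error
        self.w += k * err
        self.P = (self.P - np.outer(k, Px)) / self.lam
        return err

rng = np.random.default_rng(2)
rls = RLS(2)
for _ in range(300):                    # toy linear relation with small noise
    x = rng.normal(size=2)
    y = 3.0 * x[0] - 1.5 * x[1] + 0.01 * rng.normal()
    rls.update(x, y)
```

On this toy system the estimate `rls.w` converges to the true coefficients (3.0, -1.5); the on-line update makes it usable inside a control loop.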
On-line support vector machines for function approximation
This paper describes an on-line method for building epsilon-insensitive support vector machines for regression as described in (Vapnik, 1995). The method is an extension of the method developed by (Cauwenberghs & Poggio, 2000) for building incremental support vector machines for classification. Machines obtained by using this approach are equivalent to the ones obtained by applying exact methods like quadratic programming, but they are obtained more quickly and allow the incremental addition of new points, removal of existing points and update of target values for existing data. This development opens the application of SVM regression to areas such as on-line prediction of temporal series or generalization of value functions in reinforcement learning.
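The exact incremental bookkeeping of Cauwenberghs & Poggio is too involved for a short sketch, but the flavor of on-line epsilon-insensitive kernel regression can be shown with a much simpler stochastic update: points whose error falls inside the epsilon-tube are ignored, and points outside it become support points. This is an illustrative approximation, not the paper's exact-QP-equivalent algorithm; all parameter values are assumed.

```python
import numpy as np

def rbf(a, b, gamma=5.0):
    """Gaussian (RBF) kernel between two points."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

eps_tube, eta = 0.05, 0.3          # tube half-width and learning rate (assumed)
support, alphas, errs = [], [], []

def predict(x):
    return sum(a * rbf(s, x) for s, a in zip(support, alphas))

rng = np.random.default_rng(3)
for _ in range(300):
    x = rng.uniform(-1, 1, size=1)
    y = np.sin(3 * x[0])           # toy target function
    e = y - predict(x)
    errs.append(abs(e))
    if abs(e) > eps_tube:          # outside the eps-tube: add a support point
        support.append(x)
        alphas.append(eta * e)
```

The exact incremental method additionally maintains the KKT partition of points (margin, error, and interior sets) so that adding or removing a point reproduces the batch QP solution; the sketch above only mimics the sparsity induced by the tube.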
On-line regression competitive with reproducing kernel Hilbert spaces
We consider the problem of on-line prediction of real-valued labels, assumed
bounded in absolute value by a known constant, of new objects from known
labeled objects. The prediction algorithm's performance is measured by the
squared deviation of the predictions from the actual labels. No stochastic
assumptions are made about the way the labels and objects are generated.
Instead, we are given a benchmark class of prediction rules some of which are
hoped to produce good predictions. We show that for a wide range of
infinite-dimensional benchmark classes one can construct a prediction algorithm
whose cumulative loss over the first N examples does not exceed the cumulative
loss of any prediction rule in the class plus O(sqrt(N)); the main differences
from the known results are that we do not impose any upper bound on the norm of
the considered prediction rules and that we achieve an optimal leading term in
the excess loss of our algorithm. If the benchmark class is "universal" (dense
in the class of continuous functions on each compact set), this provides an
on-line non-stochastic analogue of universally consistent prediction in
non-parametric statistics. We use two proof techniques: one is based on the
Aggregating Algorithm and the other on the recently developed method of
defensive forecasting.
Comment: 37 pages, 1 figure
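The setting can be illustrated with an on-line kernel ridge regression that clips its predictions to the known label bound, which is close in spirit to Aggregating-Algorithm-style regression over an RKHS. This sketch does not reproduce the paper's algorithm or its O(sqrt(N)) regret guarantee; the kernel, ridge parameter, and toy data are assumptions.

```python
import numpy as np

def kermat(A, B, gamma=2.0):
    """Pairwise RBF kernel matrix between rows of A and rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

Y, ridge = 1.0, 1.0                # known bound on |label| and regularization (assumed)
xs, ys, losses = [], [], []
rng = np.random.default_rng(4)
for t in range(200):
    x = rng.uniform(-1, 1, size=2)
    y = np.sin(2 * x[0]) * np.cos(x[1])        # toy target, |y| <= Y
    if xs:
        A = np.array(xs)
        K = kermat(A, A)
        k = kermat(A, x[None, :])[:, 0]
        pred = k @ np.linalg.solve(K + ridge * np.eye(len(xs)), np.array(ys))
        pred = float(np.clip(pred, -Y, Y))     # labels are bounded, so clip
    else:
        pred = 0.0                             # no data yet
    losses.append((y - pred) ** 2)
    xs.append(x); ys.append(y)
```

The cumulative squared loss of such a learner is what the paper compares against the best prediction rule in the benchmark RKHS class.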
