Nonlinear Boosting Projections for Ensemble Construction
In this paper we propose a novel approach for ensemble construction based on the use of nonlinear
projections to achieve both accuracy and diversity of individual classifiers. The proposed approach
combines the philosophy of boosting, putting more effort on difficult instances, with the basis of
the random subspace method. Our main contribution is that instead of using a random subspace,
we construct a projection taking into account the instances which have posed the most difficulties to
previous classifiers. In this way, consecutive nonlinear projections are created by a neural network
trained using only incorrectly classified instances. The feature subspace induced by the hidden layer
of this network is used as the input space to a new classifier. The method is compared with bagging
and boosting techniques, showing an improved performance on a large set of 44 problems from the
UCI Machine Learning Repository. An additional study showed that the proposed approach is less
sensitive to noise in the data than boosting methods.
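The construction loop described in this abstract can be sketched roughly as follows. This is only an illustrative skeleton: `build_ensemble`, `predict`, `fit_classifier` and `fit_projection` are hypothetical names, the projection learner (a neural network hidden layer in the paper) is passed in as a callable rather than implemented, and the choices of using the latest member to find hard instances and of majority voting are assumptions not stated in the abstract.

```python
from collections import Counter


def build_ensemble(X, y, fit_classifier, fit_projection, n_rounds=3):
    """Return a list of (projection, classifier) pairs.

    fit_classifier(Z, y) -> callable mapping one projected instance to a label.
    fit_projection(X_hard, y_hard) -> callable mapping one instance to its
    projected representation (hypothetical stand-in for the hidden layer).
    """
    identity = lambda x: x
    # First member is trained on the original input space.
    members = [(identity, fit_classifier(X, y))]
    for _ in range(n_rounds - 1):
        proj, clf = members[-1]
        # Instances the latest member gets wrong (assumed selection rule).
        hard = [i for i, (x, t) in enumerate(zip(X, y)) if clf(proj(x)) != t]
        if not hard:
            break  # nothing left to focus on
        new_proj = fit_projection([X[i] for i in hard], [y[i] for i in hard])
        # The new classifier sees all instances, but in the projected space.
        Z = [new_proj(x) for x in X]
        members.append((new_proj, fit_classifier(Z, y)))
    return members


def predict(members, x):
    """Unweighted majority vote over the members (an assumption)."""
    votes = Counter(clf(proj(x)) for proj, clf in members)
    return votes.most_common(1)[0][0]
```

Passing the learners in as callables keeps the sketch independent of any particular neural network or classifier library.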
Asymmetric bagging and random subspace for support vector machines-based relevance feedback in image retrieval
Relevance feedback schemes based on support vector machines (SVM) have been widely used in content-based image retrieval (CBIR). However, the performance of SVM-based relevance feedback is often poor when the number of labeled positive feedback samples is small. This is mainly due to three reasons: 1) an SVM classifier is unstable on a small-sized training set, 2) SVM's optimal hyperplane may be biased when the positive feedback samples are far fewer than the negative feedback samples, and 3) overfitting happens because the number of feature dimensions is much higher than the size of the training set. In this paper, we develop a mechanism to overcome these problems. To address the first two problems, we propose an asymmetric bagging-based SVM (AB-SVM). For the third problem, we combine the random subspace method and SVM for relevance feedback, named random subspace SVM (RS-SVM). Finally, by integrating AB-SVM and RS-SVM, an asymmetric bagging and random subspace SVM (ABRS-SVM) is built to solve these three problems and further improve the relevance feedback performance.
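A minimal sketch of the sampling side of such a scheme is given below. It assumes that "asymmetric bagging" means keeping every positive sample in each bag and bootstrapping an equal number of negatives, and that each ensemble member also receives its own random feature subset; the function names and parameters are hypothetical, and no SVM is trained here.

```python
import random


def asymmetric_bags(positives, negatives, n_bags, rng):
    """Each bag keeps all positives and bootstraps as many negatives.

    Assumption: bootstrapping is applied to the negative samples only.
    """
    return [
        positives + [rng.choice(negatives) for _ in positives]
        for _ in range(n_bags)
    ]


def random_subspaces(n_features, subspace_size, n_subspaces, rng):
    """Draw feature-index subsets without replacement, one per member."""
    return [
        sorted(rng.sample(range(n_features), subspace_size))
        for _ in range(n_subspaces)
    ]


def abrs_plan(positives, negatives, n_features, subspace_size, n_members, seed=0):
    """Pair one balanced bag with one feature subset per ensemble member."""
    rng = random.Random(seed)
    bags = asymmetric_bags(positives, negatives, n_members, rng)
    subspaces = random_subspaces(n_features, subspace_size, n_members, rng)
    return list(zip(bags, subspaces))
```

Each `(bag, subspace)` pair would then be used to train one base classifier on the selected samples restricted to the selected features, with the member outputs aggregated afterwards.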
CLEAR: Covariant LEAst-square Re-fitting with applications to image restoration
In this paper, we propose a new framework to remove parts of the systematic
errors affecting popular restoration algorithms, with a special focus on image
processing tasks. Generalizing ideas that emerged for ℓ1 regularization,
we develop an approach re-fitting the results of standard methods towards the
input data. Total variation regularizations and non-local means are special
cases of interest. We identify important covariant information that should be
preserved by the re-fitting method, and emphasize the importance of preserving
the Jacobian (w.r.t. the observed signal) of the original estimator. Then, we
provide an approach that has a "twicing" flavor and allows re-fitting the
restored signal by adding back a local affine transformation of the residual
term. We illustrate the benefits of our method on numerical simulations for
image restoration tasks.
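To make the "twicing" idea concrete, the sketch below implements classical twicing for a simple linear smoother: the signal is smoothed, and the smoothed residual is added back. It is not the CLEAR method itself. For a linear smoother the Jacobian with respect to the observed signal is the smoother itself, which is why applying the smoother to the residual is a natural analogue here. The binomial filter and the function names are assumptions chosen for illustration.

```python
def smooth(y):
    """Binomial [1/4, 1/2, 1/4] smoother with replicated end points."""
    n = len(y)
    out = []
    for i in range(n):
        left = y[max(i - 1, 0)]
        right = y[min(i + 1, n - 1)]
        out.append(0.25 * left + 0.5 * y[i] + 0.25 * right)
    return out


def twice(y):
    """Classical twicing: smooth(y) + smooth(y - smooth(y))."""
    first = smooth(y)
    residual = [yi - fi for yi, fi in zip(y, first)]
    correction = smooth(residual)
    return [fi + ci for fi, ci in zip(first, correction)]


def squared_error(a, b):
    """Sum of squared differences between two equal-length signals."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
```

On a single spike such as `[0, 0, 4, 0, 0]`, the twiced estimate recovers part of the amplitude lost by the first smoothing pass, so it lies closer to the observed data than the plain smoothed signal.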
Ensembles of probability estimation trees for customer churn prediction
Customer churn prediction is one of the most important elements of a company's Customer Relationship Management (CRM) strategy. In this study, two strategies are investigated to increase the lift performance of ensemble classification models, i.e. (i) using probability estimation trees (PETs) instead of standard decision trees as base classifiers; and (ii) implementing alternative fusion rules based on lift weights for the combination of ensemble members' outputs. Experiments are conducted for four popular ensemble strategies on five real-life churn data sets. In general, the results demonstrate how lift performance can be substantially improved by using alternative base classifiers and fusion rules. However, the effect varies for the different ensemble strategies. In particular, the results indicate an increase of lift performance of (i) Bagging by implementing C4.4 base classifiers, (ii) the Random Subspace Method (RSM) by using lift-weighted fusion rules, and (iii) AdaBoost by implementing both.
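As a rough illustration of the two quantities this abstract relies on, the sketch below computes a top-fraction lift and a Laplace-corrected leaf probability of the kind used by probability estimation trees. The function names, the default 10% cut-off and the binary 0/1 label encoding are assumptions; the lift-weighted fusion rules from the study are not reproduced.

```python
def top_decile_lift(scores, labels, fraction=0.1):
    """Churn rate among the top-scored customers divided by the overall rate.

    labels are 1 for churners and 0 otherwise (assumed encoding).
    Raises ZeroDivisionError if the data contain no churners.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    k = max(1, int(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:k]) / k
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate


def laplace_estimate(class_count, leaf_size, n_classes=2):
    """Laplace-corrected class probability at a tree leaf."""
    return (class_count + 1) / (leaf_size + n_classes)
```

The Laplace correction pulls estimates from small leaves towards the uniform prior, which tends to give a finer ranking of customers than raw leaf frequencies.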
Mixed-Integer Convex Nonlinear Optimization with Gradient-Boosted Trees Embedded
Decision trees usefully represent sparse, high-dimensional and noisy data.
Having learned a function from this data, we may want to thereafter integrate
the function into a larger decision-making problem, e.g., for picking the best
chemical process catalyst. We study a large-scale, industrially-relevant
mixed-integer nonlinear nonconvex optimization problem involving both
gradient-boosted trees and penalty functions mitigating risk. This
mixed-integer optimization problem with convex penalty terms broadly applies to
optimizing pre-trained regression tree models. Decision makers may wish to
optimize discrete models to repurpose legacy predictive models, or they may
wish to optimize a discrete model that represents a data set particularly well.
We develop several heuristic methods to find feasible solutions, and an exact,
branch-and-bound algorithm leveraging structural properties of the
gradient-boosted trees and penalty functions. We computationally test our
methods on a concrete mixture design instance and a chemical catalysis industrial
instance.
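The structure exploited by this kind of problem can be illustrated on a toy scale: a sum of regression trees is constant on each box cut out by the split thresholds, so within a box only the convex penalty varies. The sketch below enumerates all boxes exhaustively, which stands in for (and is far less scalable than) the branch-and-bound algorithm of the paper. The tuple-based tree encoding, the quadratic penalty `weight * ||x - center||^2`, and the treatment of every box as closed are all assumptions made for illustration.

```python
from itertools import product


def tree_value(tree, x):
    """Evaluate a tree; a node is (feature, threshold, left, right), a leaf a float."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if x[feature] <= threshold else right
    return tree


def collect_thresholds(tree, out):
    """Gather the split thresholds of a tree, grouped by feature index."""
    if isinstance(tree, tuple):
        feature, threshold, left, right = tree
        out.setdefault(feature, set()).add(threshold)
        collect_thresholds(left, out)
        collect_thresholds(right, out)
    return out


def minimize(trees, center, weight, bounds):
    """Minimize sum of trees + weight * ||x - center||^2 over a bounded box."""
    splits = {}
    for tree in trees:
        collect_thresholds(tree, splits)
    axes = []
    for f, (lo, hi) in enumerate(bounds):
        cuts = [lo] + sorted(c for c in splits.get(f, ()) if lo < c < hi) + [hi]
        axes.append(list(zip(cuts[:-1], cuts[1:])))
    best = None
    for cell in product(*axes):
        # Tree sum is constant inside the cell: evaluate it at the midpoint.
        mid = [(lo + hi) / 2 for lo, hi in cell]
        boosted = sum(tree_value(t, mid) for t in trees)
        # The separable quadratic penalty is minimized by clamping the center.
        x = [min(max(c, lo), hi) for c, (lo, hi) in zip(center, cell)]
        penalty = weight * sum((xi - ci) ** 2 for xi, ci in zip(x, center))
        if best is None or boosted + penalty < best[0]:
            best = (boosted + penalty, x)
    return best
```

The number of boxes grows exponentially with the number of features and splits, which is why an exact method needs bounding and pruning rather than plain enumeration.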