275 research outputs found
Learning Dynamic Feature Selection for Fast Sequential Prediction
We present paired learning and inference algorithms for significantly
reducing computation and increasing speed of the vector dot products in the
classifiers that are at the heart of many NLP components. This is accomplished
by partitioning the features into a sequence of templates which are ordered
such that high confidence can often be reached using only a small fraction of
all features. Parameter estimation is arranged to maximize accuracy and early
confidence in this sequence. Our approach is simpler and better suited to NLP
than other related cascade methods. We present experiments in left-to-right
part-of-speech tagging, named entity recognition, and transition-based
dependency parsing. On the typical benchmarking datasets we can preserve POS
tagging accuracy above 97% and parsing LAS above 88.5% both with over a
five-fold reduction in run-time, and NER F1 above 88 with more than a 2x increase
in speed.
Comment: Appears in the 53rd Annual Meeting of the Association for Computational Linguistics, Beijing, China, July 2015
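As a rough illustration of the inference scheme sketched in this abstract, the Python snippet below scores one instance template by template and stops as soon as the margin between the top two label scores clears a per-template threshold; the data layout, the variable names and the margin test are assumptions made for this sketch, not the authors' implementation.

# Minimal sketch of template-wise scoring with margin-based early stopping.
# All names and the thresholds are assumptions, not the authors' code.
import numpy as np

def predict_with_early_stop(template_feature_ids, weights, thresholds):
    """template_feature_ids: one array of active feature indices per template,
    ordered so that the most informative templates come first.
    weights: (num_features, num_labels) parameter matrix.
    thresholds: per-template margins; stop once the gap between the best and
    second-best label score exceeds the current threshold."""
    scores = np.zeros(weights.shape[1])
    for t, feat_ids in enumerate(template_feature_ids):
        scores += weights[feat_ids].sum(axis=0)      # this template's dot-product contribution
        top_two = np.partition(scores, -2)[-2:]      # two largest scores, largest last
        if top_two[1] - top_two[0] > thresholds[t]:  # confident: skip the remaining templates
            break
    return int(np.argmax(scores))

# Example: 3 templates, 10 features, 4 labels, decreasing thresholds.
rng = np.random.default_rng(0)
W = rng.standard_normal((10, 4))
x = [np.array([0, 3]), np.array([5]), np.array([7, 9])]
print(predict_with_early_stop(x, W, thresholds=[1.0, 0.5, 0.0]))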
Training for Fast Sequential Prediction Using Dynamic Feature Selection
We present paired learning and inference algorithms for significantly
reducing computation and increasing speed of the vector dot products in the
classifiers that are at the heart of many NLP components. This is accomplished
by partitioning the features into a sequence of templates which are ordered
such that high confidence can often be reached using only a small fraction of
all features. Parameter estimation is arranged to maximize accuracy and early
confidence in this sequence. We present experiments in left-to-right
part-of-speech tagging on WSJ, demonstrating that we can preserve accuracy
above 97% with over a five-fold reduction in run-time.
Comment: 5 pages, NIPS Modern ML + NLP Workshop 2014
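The workshop version stresses the training side, where parameters are estimated to encourage confident predictions from early templates. The snippet below is a deliberately simplified, perceptron-style stand-in for that idea; the update rule, learning rate and names are assumptions rather than the paper's actual procedure.

# Hedged sketch of prefix-wise training that rewards early confidence:
# plain perceptron-style updates applied at every template prefix.
import numpy as np

def train_prefixwise(data, num_features, num_labels, num_templates, epochs=5, lr=0.1):
    """data: list of (template_feature_ids, gold_label) pairs, where
    template_feature_ids holds one array of feature indices per template."""
    W = np.zeros((num_features, num_labels))
    for _ in range(epochs):
        for template_feature_ids, gold in data:
            active = np.empty(0, dtype=int)          # features seen so far
            for t in range(num_templates):
                active = np.concatenate([active, template_feature_ids[t]])
                scores = W[active].sum(axis=0)       # prediction from this prefix only
                pred = int(np.argmax(scores))
                if pred != gold:                     # wrong at this prefix: nudge the weights
                    W[active, gold] += lr
                    W[active, pred] -= lr
    return W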
Efficient least angle regression for identification of linear-in-the-parameters models
Least angle regression, as a promising model selection method, differentiates itself from conventional stepwise and stagewise methods in that it is neither too greedy nor too slow. It is closely related to L1-norm optimization, which achieves low prediction variance by sacrificing some model bias in order to enhance generalization capability. In this paper, we propose an efficient least angle regression algorithm for model selection for a large class of linear-in-the-parameters models, with the purpose of accelerating the model selection process. The entire algorithm works in a fully recursive manner: the correlations between model terms and residuals, the evolving directions and other pertinent variables are derived explicitly and updated successively at every subset selection step. The model coefficients are computed only when the algorithm finishes, so direct matrix inversions are avoided. A detailed computational complexity analysis indicates that the proposed algorithm is significantly more efficient than the original approach, in which the well-known Cholesky decomposition is used to solve least angle regression. Three artificial and real-world examples are employed to demonstrate the effectiveness, efficiency and numerical stability of the proposed algorithm.
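For orientation, the following snippet runs the standard least angle regression implementation from scikit-learn on a toy sparse linear-in-the-parameters problem; it illustrates the subset-selection behaviour of LARS, not the recursive, inversion-free variant proposed in the paper, and the toy data and five-term stopping point are assumptions.

# Reference example only: standard LARS from scikit-learn on synthetic data.
import numpy as np
from sklearn.linear_model import Lars

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 30))                 # candidate model terms
beta = np.zeros(30)
beta[[2, 7, 11]] = [1.5, -2.0, 0.8]                # only three terms are truly active
y = X @ beta + 0.1 * rng.standard_normal(200)

model = Lars(n_nonzero_coefs=5).fit(X, y)          # stop after five terms have entered
selected = np.flatnonzero(model.coef_)
print("selected terms:", selected)
print("their coefficients:", model.coef_[selected])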
Boosting with early stopping: Convergence and consistency
Boosting is one of the most significant advances in machine learning for
classification and regression. In its original and computationally flexible
version, boosting seeks to minimize empirically a loss function in a greedy
fashion. The resulting estimator takes an additive function form and is built
iteratively by applying a base estimator (or learner) to updated samples
depending on the previous iterations. An unusual regularization technique,
early stopping, is employed based on CV or a test set. This paper studies
numerical convergence, consistency and statistical rates of convergence of
boosting with early stopping, when it is carried out over the linear span of a
family of basis functions. For general loss functions, we prove the convergence
of boosting's greedy optimization to the infimum of the loss function over
the linear span. Using the numerical convergence result, we find early-stopping
strategies under which boosting is shown to be consistent based on i.i.d.
samples, and we obtain bounds on the rates of convergence for boosting
estimators. Simulation studies are also presented to illustrate the relevance
of our theoretical results for providing insights to practical aspects of
boosting. As a side product, these results also reveal the importance of
restricting the greedy search step-sizes, as known in practice through the work
of Friedman and others. Moreover, our results lead to a rigorous proof that, for
a linearly separable problem, AdaBoost with step size $\epsilon \to 0$ becomes an
$L^1$-margin maximizer when left to run to convergence.
Comment: Published at http://dx.doi.org/10.1214/009053605000000255 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
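One way to picture boosting over the linear span of a fixed basis with a restricted step size and early stopping is the small L2-boosting loop below; the particular step size, patience rule and basis matrices are assumptions chosen for illustration, not the paper's setting.

# Illustrative L2-boosting over a fixed dictionary of basis functions with a
# restricted (shrunken) step size and validation-based early stopping.
import numpy as np

def l2_boost(B_train, y_train, B_val, y_val, step=0.1, max_iter=2000, patience=20):
    """B_train, B_val: (n_samples, n_basis) matrices of basis-function values."""
    coef = np.zeros(B_train.shape[1])
    resid = y_train.astype(float).copy()
    best_val, best_coef, since_best = np.inf, coef.copy(), 0
    for _ in range(max_iter):
        corr = B_train.T @ resid                       # correlation with the residual
        j = int(np.argmax(np.abs(corr)))               # greedy choice of basis function
        gamma = corr[j] / (B_train[:, j] @ B_train[:, j])
        coef[j] += step * gamma                        # restricted step size
        resid -= step * gamma * B_train[:, j]
        val_loss = np.mean((y_val - B_val @ coef) ** 2)
        if val_loss < best_val:
            best_val, best_coef, since_best = val_loss, coef.copy(), 0
        else:
            since_best += 1
            if since_best >= patience:                 # early stopping on the validation set
                break
    return best_coef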
An adaptive multiclass nearest neighbor classifier
We consider a problem of multiclass classification, where the training sample
is generated from the model $\mathbb{P}(Y = m \mid X = x) = p_m(x)$, $1 \leq m \leq M$, and $p_1(x), \dots, p_M(x)$ are
unknown $\alpha$-Holder continuous functions. Given a test point $X$, our goal
is to predict its label. A widely used $k$-nearest-neighbors classifier
constructs estimates of $p_1(X), \dots, p_M(X)$ and uses a plug-in rule
for the prediction. However, it requires a proper choice of the smoothing
parameter $k$, which may become tricky in some situations. In our
solution, we fix several integers $n_1, \dots, n_K$, compute corresponding
$n_k$-nearest-neighbor estimates for each $m$ and each $n_k$ and apply an
aggregation procedure. We study an algorithm, which constructs a convex
combination of these estimates such that the aggregated estimate behaves
approximately as well as an oracle choice. We also provide a non-asymptotic
analysis of the procedure, prove its adaptation to the unknown smoothness
parameter $\alpha$ and to the margin and establish rates of convergence under
mild assumptions.
Comment: Accepted in ESAIM: Probability & Statistics. The original publication is available at www.esaim-ps.org
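A minimal sketch of the idea, assuming labels encoded as 0, ..., M-1: $k$-nearest-neighbor class-probability estimates are computed for several values of $k$ and combined with convex weights derived from held-out log-loss. The exponential-weight rule and the validation split are simple stand-ins for the aggregation procedure analyzed in the paper.

# Hedged sketch: aggregate several k-NN probability estimates with convex weights.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

def aggregated_knn_predict(X, y, X_test, ks=(4, 8, 16, 32, 64), temperature=1.0):
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.25, random_state=0)
    test_probs, val_losses = [], []
    for k in ks:
        clf = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr)
        p_val = clf.predict_proba(X_val)
        # held-out negative log-likelihood measures the quality of this k
        nll = -np.mean(np.log(p_val[np.arange(len(y_val)), y_val] + 1e-12))
        val_losses.append(nll)
        test_probs.append(clf.predict_proba(X_test))
    w = np.exp(-temperature * np.array(val_losses))
    w /= w.sum()                                        # convex combination weights
    p_agg = sum(wi * pi for wi, pi in zip(w, test_probs))
    return np.argmax(p_agg, axis=1)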
Gradient boosting models for photovoltaic power estimation under partial shading conditions
The energy yield estimation of a photovoltaic (PV) system operating under partially shaded conditions is a challenging task and a very active area of research. In this paper, we attack this problem with the aid of machine learning techniques. Using data simulated by the equivalent circuit of a PV string operating under partial shading, we train and evaluate three different gradient boosted regression tree models to predict the global maximum power point (MPP). Our results show that all three approaches improve upon the state-of-the-art closed-form estimates, in terms of both average and worst-case performance. Moreover, we show that even a small number of training examples is sufficient to achieve improved global MPP estimation. The methods proposed are fast to train and deploy and allow for further improvements in performance should more computational resources be available.
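A hedged sketch of the modelling step follows: a gradient boosted regression tree fitted to predict a global maximum power point from a few string-level features. The synthetic data, the feature set and the toy target are assumptions for illustration and do not reproduce the paper's equivalent-circuit simulation.

# Illustrative only: boosted regression trees on synthetic partial-shading data.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 2000
irradiance = rng.uniform(200, 1000, n)      # hypothetical unshaded irradiance (W/m^2)
shaded_frac = rng.uniform(0.0, 0.8, n)      # hypothetical shaded fraction of the string
cell_temp = rng.uniform(15, 60, n)          # hypothetical cell temperature (deg C)
X = np.column_stack([irradiance, shaded_frac, cell_temp])
y = (0.2 * irradiance * (1 - 0.6 * shaded_frac) - 0.5 * (cell_temp - 25)
     + rng.normal(0.0, 2.0, n))             # toy stand-in for the simulated global MPP

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
gbr = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05, max_depth=3)
gbr.fit(X_tr, y_tr)
print("held-out R^2:", round(gbr.score(X_te, y_te), 3))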
Tree Boosting Data Competitions with XGBoost
The objective of this Master's thesis is to provide an understanding of how to approach a supervised learning prediction problem and to illustrate it using a statistical/machine learning algorithm, Tree Boosting. A review of tree methodology is presented in order to understand its evolution, from Classification and Regression Trees, through Bagging and Random Forest, to present-day Tree Boosting. The methodology is explained following the XGBoost implementation, which has achieved state-of-the-art results in several data competitions. A framework for applied predictive modelling is explained with its key concepts: objective function, regularization term, overfitting, hyperparameter tuning, k-fold cross-validation and feature engineering. All these concepts are illustrated with a real dataset of videogame churn used in a datathon competition.
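The workflow the thesis walks through (XGBoost, k-fold cross-validation, hyperparameter tuning) can be sketched roughly as follows; the synthetic churn-like dataset, the grid values and the AUC scoring are assumptions, not the thesis's actual datathon setup.

# Hedged sketch: XGBoost with 5-fold CV and a small hyperparameter grid.
from xgboost import XGBClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Imbalanced binary problem as a stand-in for videogame churn.
X, y = make_classification(n_samples=3000, n_features=20, weights=[0.8, 0.2], random_state=0)

param_grid = {
    "max_depth": [3, 5],
    "learning_rate": [0.05, 0.1],
    "n_estimators": [200, 400],
}
search = GridSearchCV(
    estimator=XGBClassifier(subsample=0.8, colsample_bytree=0.8, eval_metric="logloss"),
    param_grid=param_grid,
    scoring="roc_auc",
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
search.fit(X, y)
print("best params:", search.best_params_)
print("cross-validated AUC:", round(search.best_score_, 3))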