Incremental Learning with Large Datasets
This dissertation presents a novel learning strategy based on geometric support vector machines to address the difficulties of processing immense data sets. Support vector machines find the hyperplane that maximizes the margin between two classes; because the decision boundary is represented by only a few training samples, they are a favorable choice for incremental learning. The dissertation proposes Geometric Incremental Support Vector Machines (GISVMs) to address both the efficiency and accuracy issues of handling massive data sets. In GISVM, the skin of the convex hulls is defined, and an efficient method is designed to find the best skin approximation from the available examples. The set of extreme points is found by recursively searching along the direction defined by a pair of known extreme points. By identifying the skin of the convex hulls, incremental learning employs a much smaller number of samples while achieving comparable or even better accuracy. When additional samples are provided, they are used together with the skin of the convex hull constructed from the previous dataset, so only a small number of instances is used in each incremental step of the training process. Experimental results on synthetic data sets, public benchmark data sets from UCI, and endoscopy videos show that GISVM produces satisfactory classifiers that closely model the underlying data distribution. GISVM improves sensitivity in the incremental steps, significantly reduces the demand for memory space, and demonstrates the ability to recover from temporary performance degradation.
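The recursive extreme-point search can be illustrated with a small, hedged 2D sketch (not the authors' GISVM code): given two known extreme points, the point farthest from the segment between them is itself extreme, and the search recurses on the two new segments until only interior points remain — the interior points are exactly what the "skin" discards.

```python
# Hedged quickhull-style sketch of recursive extreme-point search in 2D.
# Not the dissertation's algorithm; an illustration of the idea only.

def _farthest(points, a, b):
    # point with the largest positive cross product, i.e. farthest to the
    # left of the directed line a -> b (None if no point lies on that side)
    ax, ay = a
    bx, by = b
    best, best_d = None, 1e-12
    for px, py in points:
        d = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
        if d > best_d:
            best, best_d = (px, py), d
    return best

def _skin_side(points, a, b, out):
    p = _farthest(points, a, b)
    if p is None:
        return
    out.add(p)                       # p is a new extreme point
    _skin_side(points, a, p, out)    # recurse on the two new segments
    _skin_side(points, p, b, out)

def skin(points):
    pts = sorted(set(points))
    a, b = pts[0], pts[-1]           # leftmost / rightmost are always extreme
    out = {a, b}
    _skin_side(pts, a, b, out)       # points above the line a -> b
    _skin_side(pts, b, a, out)       # points below it
    return out

square = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3), (3, 1)]
print(sorted(skin(square)))          # interior points are discarded
```

Only the returned skin, plus newly arriving samples, would then feed the next incremental training step.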
Human activity recognition for physical rehabilitation
The recognition of human activity is a challenging topic for machine learning. We present an analysis of Support Vector Machines (SVM) and Random Forests (RF) in their ability to accurately classify Kinect kinematic activities. Twenty participants were captured with the Microsoft Kinect performing ten physical rehabilitation activities. We extracted the kinematic location, velocity and energy of the skeletal joints at each frame of the activity to form a feature vector. Principal Component Analysis (PCA) was applied as a pre-processing step to reduce dimensionality and identify significant features among activity classes. SVM and RF were then trained on the PCA feature space to assess classification performance, with an incremental increase in the dataset size. We analyse the classification accuracy, model training time and classification time quantitatively at each incremental increase. The experimental results demonstrate that RF outperformed SVM in classification rate for six of the ten activities. Although SVM has an advantage in training time, RF is better suited to real-time activity classification due to its low classification time and high classification accuracy when eight to ten participants are used in the training set. © 2013 IEEE.
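The pipeline described above can be sketched with scikit-learn (random stand-in features rather than the Kinect recordings; dimensions and parameters are illustrative assumptions):

```python
# Hedged sketch of the PCA -> classifier pipeline: synthetic data stands in
# for the per-frame joint location/velocity/energy features.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 60))        # stand-in kinematic feature vectors
y = rng.integers(0, 10, size=200)     # ten activity classes

for clf in (SVC(kernel="rbf"),
            RandomForestClassifier(n_estimators=100, random_state=0)):
    # PCA reduces dimensionality before the classifier sees the data
    model = make_pipeline(PCA(n_components=20), clf)
    model.fit(X, y)
    print(type(clf).__name__, "training accuracy:", model.score(X, y))
```

The paper's comparison additionally times training and classification at each dataset size, which would wrap the `fit` and `predict` calls in timers.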
Evolving class for SVM's incremental learning.
The good generalization performance of support vector machines (SVMs) has made them a popular tool in the artificial intelligence community. In this paper, we prove that SVM multi-class algorithms are not equivalent for all classification problems, and we present a new approach for incremental learning with SVMs that creates a rejection class. This is of interest for fault diagnosis, where fault classes usually evolve over time: some new samples may be rejected by all of the current classes, and such samples may correspond to a new fault (a new class) appearing after the first training step.
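The rejection-class idea can be illustrated with a hedged one-vs-rest sketch (synthetic data, not the authors' algorithm): a sample whose decision score is negative for every current class is flagged as a candidate member of a new, previously unseen class.

```python
# Hedged illustration of a rejection class on top of one-vs-rest SVMs.
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(1)
centers = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]   # three known fault classes
X = np.vstack([rng.normal(c, 0.3, size=(40, 2)) for c in centers])
y = np.repeat([0, 1, 2], 40)

ovr = OneVsRestClassifier(SVC(kernel="rbf", gamma=1.0)).fit(X, y)

def predict_with_reject(model, samples):
    scores = model.decision_function(samples)     # one score per class
    labels = scores.argmax(axis=1)
    labels[scores.max(axis=1) < 0] = -1           # rejected by every class
    return labels

# a sample near a known class keeps its label; a sample far from all
# training data is typically rejected (-1), i.e. a candidate new fault
print(predict_with_reject(ovr, np.array([[0.0, 0.0], [10.0, 10.0]])))
```

Rejected samples could then be accumulated and, once numerous enough, trained as a new class in the next incremental step.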
Short-term electricity load forecasting based on artificial intelligence methods
The topic of this dissertation is short-term load forecasting using artificial intelligence methods. Three new models based on least squares support vector machines for nonlinear regression are proposed.
The first proposed model forecasts in two stages. It uses an additional feature, the maximum daily load, which is not known a day ahead: a forecast of the maximum daily load is obtained in the first stage, and this forecast is then used in the second stage, where the hourly load is forecast. The second proposed model performs feature selection, using mutual information as the selection criterion, and tries to find an optimal feature set for a given problem. The third proposed model is based on an incremental update scheme: the initial training set is updated by adding new instances as soon as they become available and discarding the old ones, and the model is then retrained on the new training set. This approach follows the evolving nature of the load pattern, so the model's performance is preserved and improved.
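The incremental update scheme can be sketched as a sliding training window (synthetic series and hypothetical window/lag sizes; scikit-learn's SVR stands in for the least squares SVM used in the dissertation):

```python
# Hedged sketch of sliding-window retraining for a load-like series.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(2)
series = np.sin(np.linspace(0, 20, 300)) + 0.05 * rng.normal(size=300)

def make_xy(window):
    # predict the next value from the previous 4 observations (lag features)
    X = np.array([window[i:i + 4] for i in range(len(window) - 4)])
    y = window[4:]
    return X, y

train = list(series[:100])            # initial training set
preds, actual = [], []
for t in range(100, 150):
    model = SVR(kernel="rbf", C=10.0).fit(*make_xy(np.array(train)))
    preds.append(model.predict(series[t - 4:t][None, :])[0])
    actual.append(series[t])
    train.append(series[t])           # add the newest observation ...
    train.pop(0)                      # ... and drop the oldest (fixed window)

print("MAE:", np.mean(np.abs(np.array(preds) - np.array(actual))))
```

Retraining on the shifted window at every step is what lets the model track the evolving load pattern.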
For model evaluation, hourly loads were forecast for one year. Electrical consumption data for the City of Niš, which has about 260,000 inhabitants and an average daily demand of 182 MW, were used for testing. Double seasonal ARIMA and Holt-Winters, as representatives of classical models, and artificial neural networks, least squares support vector machines and relevance vector machines, as representatives of artificial intelligence models, were used for comparison. As measures of accuracy, the mean absolute percentage error, symmetrical mean absolute percentage error, root mean square error and absolute percentage error were used.
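The accuracy measures named above can be written as short functions (a standard formulation; the dissertation's exact normalization may differ slightly):

```python
# MAPE, sMAPE and RMSE in their common textbook forms.
import math

def mape(actual, forecast):
    # mean absolute percentage error, relative to the actual value
    return 100.0 / len(actual) * sum(abs(a - f) / abs(a)
                                     for a, f in zip(actual, forecast))

def smape(actual, forecast):
    # symmetrical MAPE: error relative to the mean of |actual| and |forecast|
    return 100.0 / len(actual) * sum(abs(a - f) / ((abs(a) + abs(f)) / 2)
                                     for a, f in zip(actual, forecast))

def rmse(actual, forecast):
    # root mean square error
    return math.sqrt(sum((a - f) ** 2
                         for a, f in zip(actual, forecast)) / len(actual))

load = [180.0, 175.0, 190.0]          # e.g. hourly loads in MW
pred = [178.0, 180.0, 188.0]
print(mape(load, pred), smape(load, pred), rmse(load, pred))
```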
The obtained results show that the best model is the one with the incremental update scheme, followed by the double seasonal ARIMA and artificial neural network models. The worst results were obtained by the relevance vector machine and double seasonal Holt-Winters models. It has been shown that the best model can be successfully applied to the short-term load forecasting problem.
LOSDD: Leave-Out Support Vector Data Description for Outlier Detection
Support Vector Machines have been successfully used for one-class classification (OCSVM, SVDD) when trained on clean data, but they perform much worse on dirty data: outliers present in the training data tend to become support vectors and are hence considered "normal". In this article, we improve the effectiveness of detecting outliers in dirty training data with a leave-out strategy: by temporarily omitting one candidate at a time, each point can be judged using the remaining data only. We show that this scores the outlierness of points more effectively than using the slack term of existing SVM-based approaches. Identified outliers can then be removed from the data, so that outliers hidden by other outliers can be identified, reducing the problem of masking. Naively, this approach would require training N individual SVMs (and retraining when iteratively removing the worst outliers one at a time), which is prohibitively expensive. We discuss how only support vectors need to be considered in each step and how, by reusing SVM parameters and weights, this incremental retraining can be accelerated substantially. By removing candidates in batches, we can further improve the processing time, although the approach obviously remains more costly than training a single SVM.
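The leave-out idea, in its naive retrain-N-times form without the speedups the article develops, can be sketched as follows (synthetic data; scikit-learn's OneClassSVM stands in for the paper's SVDD/OCSVM):

```python
# Hedged, naive sketch of leave-out outlier scoring: refit a one-class SVM
# without each candidate and score the held-out point on the result.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 0.5, size=(50, 2)),
               [[5.0, 5.0]]])          # 50 inliers plus 1 planted outlier

def leave_out_scores(X, nu=0.1):
    scores = np.empty(len(X))
    for i in range(len(X)):
        rest = np.delete(X, i, axis=0)            # omit candidate i
        model = OneClassSVM(kernel="rbf", nu=nu).fit(rest)
        # decision_function < 0 means outside the learned boundary
        scores[i] = model.decision_function(X[i:i + 1])[0]
    return scores

scores = leave_out_scores(X)
print("most outlying index:", scores.argmin())    # the planted point
```

This is N full refits; the article's contribution is showing that warm-starting from the previous support vectors and weights makes the incremental retraining far cheaper.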
On-line support vector machines for function approximation
This paper describes an on-line method for building epsilon-insensitive support vector machines for regression as described in (Vapnik, 1995). The method extends the approach developed by (Cauwenberghs & Poggio, 2000) for building incremental support vector machines for classification. Machines obtained with this method are equivalent to those obtained by exact methods such as quadratic programming, but they are obtained more quickly and allow the incremental addition of new points, removal of existing points and updating of target values for existing data. This development opens the application of SVM regression to areas such as on-line prediction of time series or generalization of value functions in reinforcement learning.
Weighted Incremental–Decremental Support Vector Machines for concept drift with shifting window
We study the problem of learning the data samples' distribution as it changes in time. This change, known as concept drift, complicates the task of training a model, as the predictions become less and less accurate. It is known that Support Vector Machines (SVMs) can learn from weighted input instances and that they can also be trained online (incremental–decremental learning). Combining these two SVM properties, the open problem is to define an online SVM concept drift model with a shifting weighted window; the classic SVM model would have to be retrained from scratch after each window shift. We introduce the Weighted Incremental–Decremental SVM (WIDSVM), a generalization of the incremental–decremental SVM for shifting windows. WIDSVM is capable of learning from data streams with concept drift, using the weighted shifting window technique. The soft-margin constrained optimization problem imposed on the shifting window is reduced to an incremental–decremental SVM, and at each window shift we determine the exact conditions for vector migration during the incremental–decremental process. Experiments on artificial and real-world concept drift datasets show that the classification accuracy of WIDSVM improves significantly compared to an SVM with no shifting window. The WIDSVM training phase is fast, since it does not retrain from scratch after each window shift.
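A hedged sketch of the weighted shifting-window idea (retraining from scratch at every shift, which is exactly the cost WIDSVM avoids; the drifting data, window size and halving weight scheme are illustrative assumptions):

```python
# Hedged sketch: weighted shifting window over a drifting stream, with
# newer samples weighted more heavily. Retrains from scratch per shift.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(4)

def drifting_batch(t, n=40):
    # both class means drift to the right over time: concept drift
    shift = 0.1 * t
    X0 = rng.normal([0 + shift, 0], 0.4, size=(n, 2))
    X1 = rng.normal([2 + shift, 0], 0.4, size=(n, 2))
    return np.vstack([X0, X1]), np.array([0] * n + [1] * n)

window_X, window_y = [], []
for t in range(5):
    X, y = drifting_batch(t)
    window_X.append(X)
    window_y.append(y)
    window_X, window_y = window_X[-3:], window_y[-3:]  # shifting window of 3 batches
    ages = np.concatenate([np.full(len(b), i) for i, b in enumerate(window_X)])
    weights = 0.5 ** (len(window_X) - 1 - ages)        # newest weight 1, older halved
    model = SVC(kernel="rbf").fit(np.vstack(window_X),
                                  np.concatenate(window_y),
                                  sample_weight=weights)

Xt, yt = drifting_batch(5)
print("accuracy on the next batch:", model.score(Xt, yt))
```

WIDSVM obtains the same kind of weighted-window model by migrating vectors incrementally and decrementally instead of refitting the SVM at every shift.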