Evolving Ensemble Fuzzy Classifier
The concept of ensemble learning offers a promising avenue in learning from
data streams under complex environments because it addresses the bias and
variance dilemma better than its single model counterpart and features a
reconfigurable structure, which is well suited to the given context. While
various extensions of ensemble learning for mining non-stationary data streams
can be found in the literature, most of them are crafted under a static base
classifier and revisit preceding samples in the sliding window for a
retraining step. This feature causes computationally prohibitive complexity and
is not flexible enough to cope with rapidly changing environments. Their
complexity is often demanding because they involve a large collection of
offline classifiers, owing to the absence of structural complexity reduction
mechanisms and the lack of an online feature selection mechanism. A novel evolving
ensemble classifier, namely Parsimonious Ensemble (pENsemble), is proposed in
this paper. pENsemble differs from existing architectures in that it
is built upon an evolving classifier from data streams, termed Parsimonious
Classifier (pClass). pENsemble is equipped with an ensemble pruning mechanism,
which estimates a localized generalization error of a base classifier. A
dynamic online feature selection scenario is integrated into the pENsemble.
This method allows for dynamic selection and deselection of input features on
the fly. pENsemble adopts a dynamic ensemble structure to output a final
classification decision and features a novel drift detection scenario to
grow the ensemble structure. The efficacy of pENsemble has been
demonstrated through rigorous numerical studies with dynamic and evolving data
streams, where it delivers the most encouraging performance in attaining a
tradeoff between accuracy and complexity.
Comment: this paper has been published by IEEE Transactions on Fuzzy System
Improving Optimization Bounds using Machine Learning: Decision Diagrams meet Deep Reinforcement Learning
Finding tight bounds on the optimal solution is a critical element of
practical solution methods for discrete optimization problems. In the last
decade, decision diagrams (DDs) have brought a new perspective on obtaining
upper and lower bounds that can be significantly better than classical bounding
mechanisms, such as linear relaxations. It is well known that the quality of
the bounds achieved through this flexible bounding method is highly reliant on
the ordering of variables chosen for building the diagram, and finding an
ordering that optimizes standard metrics is an NP-hard problem. In this paper,
we propose an innovative and generic approach based on deep reinforcement
learning for obtaining an ordering for tightening the bounds obtained with
relaxed and restricted DDs. We apply the approach to both the Maximum
Independent Set Problem and the Maximum Cut Problem. Experimental results on
synthetic instances show that the deep reinforcement learning approach, by
achieving tighter objective function bounds, generally outperforms ordering
methods commonly used in the literature when the distribution of instances is
known. To the best of the authors' knowledge, this is the first paper to apply
machine learning to directly improve relaxation bounds obtained by
general-purpose bounding mechanisms for combinatorial optimization problems.
Comment: Accepted and presented at AAAI'1
Basics of Feature Selection and Statistical Learning for High Energy Physics
This document introduces the basics of data preparation, feature selection and
statistical learning for high energy physics tasks. The emphasis is on feature
selection by principal component analysis, information gain and significance
measures for features. As examples of basic statistical learning algorithms,
the maximum a posteriori and maximum likelihood classifiers are shown.
Furthermore, a simple rule-based classification as a means for automated cut
finding is introduced. Finally, two toolboxes for the application of statistical
learning techniques are introduced.
Comment: 12 pages, 8 figures. Part of the proceedings of the Track
'Computational Intelligence for HEP Data Analysis' at iCSC 200
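Information gain, one of the feature-selection measures named in the abstract above, can be illustrated with a short calculation. The following is a minimal sketch and is not taken from the document itself: the function names, the toy signal/background labels and the two binary features are all made up for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Entropy of the labels within each feature value, weighted by group size.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy events: "s" = signal, "b" = background.
labels = ["s", "s", "s", "b", "b", "b"]
separating = [1, 1, 1, 0, 0, 0]  # made-up feature that splits the classes exactly
noisy = [0, 1, 0, 1, 0, 1]       # made-up feature that is nearly uninformative

ranking = sorted(
    {"separating": separating, "noisy": noisy}.items(),
    key=lambda item: information_gain(item[1], labels),
    reverse=True,
)
print([name for name, _ in ranking])
```

Ranking features by this score and keeping the top few is one common way to use the measure; continuous features would need to be binned first, which this sketch does not cover.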
Multi-learner based recursive supervised training
In this paper, we propose the Multi-Learner Based Recursive Supervised Training (MLRT) algorithm, which uses the existing framework of recursive task decomposition: training on the entire dataset, picking out the best learnt patterns, and then repeating the process with the remaining patterns. Instead of having a single learner classify all patterns during each recursion, an appropriate learner is chosen from a set of three learners, based on the subset of data being trained, thereby avoiding the time overhead associated with the genetic algorithm learner utilized in previous approaches. In this way MLRT seeks to identify the inherent characteristics of the dataset and utilize them to train on the data accurately and efficiently. Empirically, MLRT performs considerably well compared with RPHP and other systems on benchmark data, with an 11% improvement in accuracy on the SPAM dataset and comparable performance on the VOWEL and TWO-SPIRAL problems. In addition, for most datasets, the time taken by MLRT is considerably lower than that of the other systems with comparable accuracy. Two heuristic versions, MLRT-2 and MLRT-3, are also introduced to improve the efficiency of the system and to make it more scalable for future updates. The performance of these versions is similar to that of the original MLRT system.
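The recursive decomposition described in the abstract above (train, set aside the well-learnt patterns, repeat on the rest with whichever learner suits the remaining subset) can be sketched roughly as follows. This is not the authors' MLRT implementation. Two toy learners stand in for the set of three, the rule "pick the learner with the highest training accuracy on the remaining subset" is an assumption, and all names and data are hypothetical.

```python
from collections import Counter


def fit_majority(X, y):
    """Toy learner: always predicts the most frequent training label."""
    label = Counter(y).most_common(1)[0][0]
    return lambda x: label


def fit_centroid(X, y):
    """Toy learner: predicts the class whose mean point is nearest."""
    sums, counts = {}, Counter(y)
    for x, t in zip(X, y):
        acc = sums.setdefault(t, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
    centroids = {t: [s / counts[t] for s in acc] for t, acc in sums.items()}

    def predict(x):
        return min(
            centroids,
            key=lambda t: sum((a - b) ** 2 for a, b in zip(x, centroids[t])),
        )

    return predict


def recursive_train(X, y, learners, max_stages=5):
    """Repeatedly fit the best learner on the not-yet-learnt patterns."""
    stages, remaining = [], list(range(len(X)))
    for _ in range(max_stages):
        if not remaining:
            break
        Xr = [X[i] for i in remaining]
        yr = [y[i] for i in remaining]
        best = None
        for name, fit in learners.items():
            predict = fit(Xr, yr)
            correct = [i for i, x, t in zip(remaining, Xr, yr) if predict(x) == t]
            # Assumed selection rule: highest training accuracy on this subset.
            if best is None or len(correct) > len(best[2]):
                best = (name, predict, correct)
        name, predict, correct = best
        if not correct:
            break
        stages.append((name, predict, len(correct)))
        learnt = set(correct)
        remaining = [i for i in remaining if i not in learnt]
    return stages, remaining


# Hypothetical data: two clusters plus one point labelled against its cluster.
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (0.5, 0.5)]
y = [0, 0, 0, 1, 1, 1, 1]
learners = {"majority": fit_majority, "centroid": fit_centroid}
stages, unlearnt = recursive_train(X, y, learners)
print([(name, n) for name, _, n in stages], unlearnt)
```

The sketch covers only the training-time decomposition; how the per-stage learners are combined at prediction time is not described in the abstract and is left out here.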