24 research outputs found
Learning Opposites with Evolving Rules
The idea of opposition-based learning was introduced 10 years ago. Since then
a noteworthy group of researchers has used some notions of oppositeness to
improve existing optimization and learning algorithms. Among others,
evolutionary algorithms, reinforcement agents, and neural networks have been
reportedly extended into their opposition-based version to become faster and/or
more accurate. However, most works still use a simple notion of opposites,
namely linear (or type-I) opposition, which for each $x \in [a, b]$ assigns its
opposite as $\breve{x}_I = a + b - x$. This, of course, is a very naive estimate of
the actual or true (non-linear) opposite $\breve{x}_{II}$, which has been
called the type-II opposite in the literature. In the absence of any knowledge about a
function $y = f(x)$ that we need to approximate, there seems to be no
alternative to the naivety of type-I opposition if one intends to utilize
oppositional concepts. But the question is: if, according to all reports in the
literature, we can obtain some increase in accuracy and some time savings by
using the naive opposite estimate $\breve{x}_I$, what would we be able to
gain, in terms of even higher accuracy and further reduction in computational
complexity, if we generated and employed true opposites? This work
introduces an approach to approximate type-II opposites using evolving fuzzy
rules when we first perform opposition mining. We show with multiple examples
that learning true opposites is possible when we mine the opposites from the
training data to subsequently approximate the type-II opposite.
Comment: Accepted for publication in The 2015 IEEE International Conference on
Fuzzy Systems (FUZZ-IEEE 2015), August 2-5, 2015, Istanbul, Turkey
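The two notions of opposition in this abstract can be illustrated with a minimal sketch. It is not the authors' evolving-fuzzy-rule method: type-I opposition is the reflection a + b - x, while the type-II opposite is mined here by reflecting the output (y_min + y_max - y) and looking up the training sample whose output is closest to that value. The function names and the nearest-sample lookup are assumptions made for illustration only.

```python
def type1_opposite(x, a, b):
    """Linear (type-I) opposite of x on the interval [a, b]."""
    return a + b - x


def mine_type2_opposites(xs, ys):
    """Mine an approximate type-II opposite for every training sample.

    Assumption: the opposite of x_i is the training input whose output
    is closest to the reflected output y_min + y_max - y_i.
    """
    y_min, y_max = min(ys), max(ys)
    opposites = []
    for y in ys:
        y_opp = y_min + y_max - y
        # Index of the sample whose output is nearest to the reflected output.
        j = min(range(len(ys)), key=lambda k: abs(ys[k] - y_opp))
        opposites.append(xs[j])
    return opposites


if __name__ == "__main__":
    xs = [0.0, 0.25, 0.5, 0.75, 1.0]
    ys = [x ** 2 for x in xs]  # a simple non-linear function
    print([type1_opposite(x, 0.0, 1.0) for x in xs])
    print(mine_type2_opposites(xs, ys))
```

For a non-linear function such as y = x^2 the mined opposites differ from the type-I reflection, which is the gap the abstract refers to.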
New Structural Evolving Algorithms For Fuzzy Systems
Recently, the trade-off between accuracy and interpretability has been receiving more attention in the design of new fuzzy systems. In this thesis, three evolving fuzzy models, namely enhancement of fuzzy term identification (EFTI), structure identification method (SIM) and structural evolving approach (SEA), are proposed to find the best trade-off between accuracy and interpretability. EFTI, SIM and SEA are designed based on error-reducing methods. EFTI is developed for single-input single-output (SISO) problems (i.e. one dimension), while SIM and SEA are developed for multi-input single-output (MISO) problems (medium and high dimension). EFTI begins with a simple fuzzy structure composed of two fuzzy terms in the input space. It then continues evolving by identifying splitting points of the input space that are compatible with the consequent parameters. SIM and SEA, on the other hand, start with one fuzzy rule that has no fuzzy term in the input space, regardless of the input dimensionality. They then evolve through either a closure or a split process applied to the selected input attribute of the selected subregion: if the selected attribute has no fuzzy terms, closure is performed; otherwise a split is done. Evolving continues until a satisfactory accuracy is reached or the maximum number of subregions is reached. For SIM, a partitioning technique based on the similarity feature and a static partition-selection technique are developed. For SEA, a partitioning technique that splits the selected subregion into two subregions with maximum and minimum average error and a dynamic partition-selection technique are developed. Furthermore, a pruning technique based on the importance level of the fuzzy rules is proposed to shrink the rule base of SEA. Compared with SISO models on three datasets, EFTI produces the lowest RMSE with the lowest number of rules.
Compared with MISO models on nine benchmark datasets, SIM achieves the lowest RMSE with the smallest rule base. Similarly, compared with state-of-the-art MISO models on six benchmark datasets, SEA also produces the lowest RMSE with the smallest rule base. In conclusion, the results show that EFTI, SIM and SEA are able to produce a significant trade-off between accuracy and interpretability.
Evolving Large-Scale Data Stream Analytics based on Scalable PANFIS
Many distributed machine learning frameworks have recently been built to
speed up the large-scale data learning process. However, most distributed
machine learning algorithms in these frameworks still rely on an offline model,
which cannot cope with data streams. In fact, large-scale data are
mostly generated by non-stationary data streams whose patterns evolve
over time. To address this problem, we propose a novel Evolving Large-scale
Data Stream Analytics framework based on a Scalable Parsimonious Network based
on Fuzzy Inference System (Scalable PANFIS), where the PANFIS evolving
algorithm is distributed over the worker nodes in the cloud to learn
from large-scale data streams. The Scalable PANFIS framework incorporates an active
learning (AL) strategy and two model fusion methods. The AL accelerates the
distributed learning process to generate an initial evolving large-scale data
stream model (initial model), whereas the two model fusion methods aggregate an
initial model to generate the final model. The final model represents the
update of current large-scale data knowledge which can be used to infer future
data. The framework is validated through extensive experiments that measure the
accuracy and running time of four combinations of Scalable PANFIS and other
Spark-based built-in algorithms. The results indicate that Scalable PANFIS with
AL trains almost twice as fast as Scalable
PANFIS without AL. The results also show that the rule merging and voting
mechanisms yield similar accuracy in general among Scalable PANFIS algorithms
and that they are generally better than Spark-based algorithms. In terms of running
time, the Scalable PANFIS training time outperforms all Spark-based algorithms
when classifying numerous benchmark datasets.
Comment: 20 pages, 5 figures
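A voting-style model fusion can be sketched as follows. The abstract does not specify how its voting mechanism works, so plain majority voting over the labels predicted by the worker models is an assumption, and `majority_vote` is a hypothetical helper name.

```python
from collections import Counter


def majority_vote(predictions):
    """Fuse per-model label predictions by majority vote.

    predictions: list of label lists, one per worker model, all of the
    same length (one label per sample).
    """
    fused = []
    for labels in zip(*predictions):
        # most_common(1) returns the label with the highest count.
        fused.append(Counter(labels).most_common(1)[0][0])
    return fused


if __name__ == "__main__":
    worker_outputs = [
        [0, 1, 1],  # model on worker 1
        [0, 0, 1],  # model on worker 2
        [1, 0, 1],  # model on worker 3
    ]
    print(majority_vote(worker_outputs))
```

Using an odd number of models avoids ties in the binary case; with ties, this sketch simply returns the label that was seen first.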
An Incremental Construction of Deep Neuro Fuzzy System for Continual Learning of Non-stationary Data Streams
Existing FNNs are mostly developed under a shallow network configuration
with lower generalization power than deep structures. This paper
proposes a novel self-organizing deep FNN, namely DEVFNN. Fuzzy rules can be
automatically extracted from data streams or removed if they play a limited role
during their lifespan. The structure of the network can be deepened on demand
by stacking additional layers using a drift detection method which not only
detects the covariate drift, variations of input space, but also accurately
identifies the real drift, dynamic changes of both feature space and target
space. DEVFNN is developed under the stacked generalization principle via the
feature augmentation concept where a recently developed algorithm, namely
gClass, drives the hidden layer. It is equipped with an automatic feature
selection method which controls activation and deactivation of input attributes
to induce varying subsets of input features. A deep network simplification
procedure is put forward using the concept of hidden layer merging to prevent
uncontrollable growth of the input space dimensionality due to the nature of
the feature augmentation approach in building a deep network structure. DEVFNN
works in a sample-wise fashion and is compatible with data stream
applications. The efficacy of DEVFNN has been thoroughly evaluated using seven
datasets with non-stationary properties under the prequential test-then-train
protocol. It has been compared with four popular continual learning algorithms
and its shallow counterpart, and DEVFNN demonstrates improved
classification accuracy. Moreover, it is also shown that the concept drift
detection method is an effective tool to control the depth of network structure
while the hidden layer merging scenario is capable of simplifying the network
complexity of a deep network with negligible compromise of generalization
performance.
Comment: This paper has been published in IEEE Transactions on Fuzzy Systems
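The prequential test-then-train protocol mentioned in this abstract can be sketched as below: each incoming sample is first used to test the current model and only afterwards to update it. `MajorityClass` is a hypothetical stand-in learner chosen for brevity; it is not DEVFNN or any algorithm from the paper.

```python
from collections import Counter


class MajorityClass:
    """Toy online learner: always predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # Returns None until at least one label has been observed.
        return self.counts.most_common(1)[0][0] if self.counts else None

    def learn(self, x, y):
        self.counts[y] += 1


def prequential_accuracy(model, stream):
    """Test-then-train: predict each sample before learning from it."""
    correct = total = 0
    for x, y in stream:
        correct += int(model.predict(x) == y)  # test first
        model.learn(x, y)                      # then train
        total += 1
    return correct / total if total else 0.0


if __name__ == "__main__":
    stream = [(0, "a"), (1, "a"), (2, "b"), (3, "a")]
    print(prequential_accuracy(MajorityClass(), stream))
```

Because every sample is tested before it is learned, no separate hold-out set is needed, which suits sample-wise stream learners.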
Evolving Ensemble Fuzzy Classifier
The concept of ensemble learning offers a promising avenue in learning from
data streams under complex environments because it addresses the bias and
variance dilemma better than its single model counterpart and features a
reconfigurable structure, which is well suited to the given context. While
various extensions of ensemble learning for mining non-stationary data streams
can be found in the literature, most of them are built on a static base
classifier and revisit preceding samples in a sliding window for a
retraining step. This causes computationally prohibitive complexity and
is not flexible enough to cope with rapidly changing environments. Their
complexity is often demanding because they involve a large collection of
offline classifiers, owing to the absence of structural complexity reduction
mechanisms and the lack of an online feature selection mechanism. A novel evolving
ensemble classifier, namely the Parsimonious Ensemble (pENsemble), is proposed in
this paper. pENsemble differs from existing architectures in that it
is built upon an evolving classifier from data streams, termed the Parsimonious
Classifier (pClass). pENsemble is equipped with an ensemble pruning mechanism,
which estimates a localized generalization error of a base classifier. A
dynamic online feature selection scenario is integrated into the pENsemble.
This method allows for dynamic selection and deselection of input features on
the fly. pENsemble adopts a dynamic ensemble structure to output the final
classification decision and features a novel drift detection scenario to
grow the ensemble structure. The efficacy of pENsemble has been
demonstrated through rigorous numerical studies with dynamic and evolving data
streams, where it delivers the most encouraging performance in attaining a
trade-off between accuracy and complexity.
Comment: This paper has been published in IEEE Transactions on Fuzzy Systems
Discovering three-dimensional patterns in real-time from data streams: An online triclustering approach
Triclustering algorithms group sets of coordinates of 3-dimensional datasets. In this paper,
a new triclustering approach for data streams is introduced. It follows a streaming scheme
of learning in two steps: offline and online phases. First, the offline phase provides a summary
model with the components of the triclusters. Then, the second stage is the online
phase to deal with data in streaming. This online phase consists of using the summary
model obtained in the offline stage to update the triclusters as fast as possible with genetic
operators. Results using three types of synthetic datasets and a real-world environmental
sensor dataset are reported. The performance of the proposed triclustering streaming algorithm
is compared to a batch triclustering algorithm, showing an accurate performance
both in terms of quality and running times.
Ministerio de Ciencia, Innovación y Universidades TIN2017-88209-C
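A tricluster, as described in this abstract, can be represented as three index sets over a 3-dimensional dataset. The sketch below extracts the sub-block they select and scores it with its variance. The nested-list layout, the helper names and the use of variance as a homogeneity indicator are assumptions for illustration; the paper's own quality measure and genetic operators are not reproduced here.

```python
def extract_tricluster(data, rows, cols, times):
    """Return the sub-block of a 3-D nested list selected by three index sets."""
    return [[[data[i][j][k] for k in times] for j in cols] for i in rows]


def variance(block):
    """Population variance of all values in a 3-D block (lower = more homogeneous)."""
    values = [v for plane in block for line in plane for v in line]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


if __name__ == "__main__":
    # Toy 2 x 2 x 2 dataset with data[i][j][k] = 4*i + 2*j + k.
    data = [[[4 * i + 2 * j + k for k in range(2)] for j in range(2)] for i in range(2)]
    block = extract_tricluster(data, rows=[1], cols=[0, 1], times=[1])
    print(block, variance(block))
```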