8 research outputs found

    Evolving Large-Scale Data Stream Analytics based on Scalable PANFIS

    Many distributed machine learning frameworks have recently been built to speed up learning from large-scale data. However, most distributed machine learning algorithms used in these frameworks still rely on an offline model, which cannot cope with data streams. In fact, large-scale data are mostly generated by non-stationary data streams whose patterns evolve over time. To address this problem, we propose a novel Evolving Large-scale Data Stream Analytics framework based on a Scalable Parsimonious Network based on Fuzzy Inference System (Scalable PANFIS), where the PANFIS evolving algorithm is distributed over the worker nodes in the cloud to learn large-scale data streams. The Scalable PANFIS framework incorporates an active learning (AL) strategy and two model fusion methods. AL accelerates the distributed learning process that generates an initial evolving large-scale data stream model (initial model), whereas the two model fusion methods aggregate the initial model to generate the final model. The final model represents the update of current large-scale data knowledge, which can be used to infer future data. The framework is validated through extensive experiments that measure the accuracy and running time of four combinations of Scalable PANFIS and other built-in Spark-based algorithms. The results indicate that Scalable PANFIS with AL trains almost twice as fast as Scalable PANFIS without AL. The results also show that the rule merging and voting mechanisms yield similar accuracy in general among the Scalable PANFIS algorithms, and that they are generally better than the Spark-based algorithms. In terms of running time, Scalable PANFIS training outperforms all Spark-based algorithms when classifying numerous benchmark datasets. Comment: 20 pages, 5 figures
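
The voting-based fusion mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say how votes are counted or how ties are broken, so this assumes a plain majority vote over the labels predicted by each worker's model, with ties going to the earliest model in the list. The function name `majority_vote` is hypothetical and not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Fuse per-model label predictions by plain majority vote.

    predictions_per_model: list of lists; entry m holds the labels that
    worker model m predicted for the same sequence of samples.
    Assumption: ties are broken in favour of the earliest model.
    """
    fused = []
    # zip(*...) regroups the predictions sample by sample.
    for sample_preds in zip(*predictions_per_model):
        counts = Counter(sample_preds)
        top = max(counts.values())
        # First label (in model order) that reaches the top vote count.
        winner = next(label for label in sample_preds if counts[label] == top)
        fused.append(winner)
    return fused


if __name__ == "__main__":
    worker_outputs = [
        [0, 1, 1],  # model trained on worker 1
        [0, 0, 1],  # model trained on worker 2
        [1, 0, 1],  # model trained on worker 3
    ]
    print(majority_vote(worker_outputs))
```

Rule merging, the other fusion method named in the abstract, would combine the fuzzy rules themselves rather than their outputs and is not sketched here.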

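    Recommendations Based on Students' Pretest Scores Using Collaborative Filtering and Bayesian Ranking Methods (Rekomendasi Berdasarkan Nilai Pretest Mahasiswa Menggunakan Metode Collaborative Filtering dan Bayesian Ranking)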

    Abstract: Self-Regulated Learning (SRL) skills can be improved by improving students' cognitive and metacognitive abilities. To improve metacognitive abilities, metacognitive support needs to be included in the e-learning process. One example is assisting students by giving feedback once they have finished specific activities. The purpose of this study was to develop a pedagogical agent able to give students feedback, in particular recommendations on the order of lesson sub-materials. Recommendations were given by considering students' pretest scores (their prior knowledge). The recommendations were computed using Collaborative Filtering and Bayesian Ranking methods. The results show that, in MAP (Mean Average Precision) tests, the Item-based method obtained the highest MAP score, which was 1. The computation time of each method was measured to assess its runtime complexity. Bayesian Ranking had the shortest computation time at 0.002 seconds, followed by Item-based at 0.006 seconds and User-based at 0.226 seconds, while Hybrid had the longest computation time at 0.236 seconds. Keywords: self-regulated learning, metacognitive, metacognitive support, feedback, pretest (prior knowledge), Collaborative Filtering, Bayesian Ranking, Mean Average Precision, runtime complexity
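
For readers unfamiliar with the MAP metric reported above, the following is a minimal sketch of one common definition. It assumes binary relevance and normalises each average precision by the total number of relevant items; the study may have used a different variant. The function names are illustrative only.

```python
def average_precision(ranked_items, relevant_items):
    """Average precision of one ranked recommendation list.

    Assumption: binary relevance, normalised by the number of relevant items.
    """
    relevant = set(relevant_items)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_items, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    return precision_sum / len(relevant)


def mean_average_precision(rankings, relevants):
    """Mean of the average precision over all users (or queries)."""
    scores = [average_precision(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    rankings = [["a", "b", "c"], ["c", "a", "b"]]
    relevants = [{"a", "b"}, {"a", "b"}]
    print(mean_average_precision(rankings, relevants))
```

Under this definition a MAP of 1 means that, for every student, all relevant sub-materials were ranked ahead of every irrelevant one.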

    A randomized neural network for data streams

    © 2017 IEEE. Randomized neural network (RNN) is a highly feasible solution in the era of big data because it offers a simple and fast working principle in processing dynamic and evolving data streams. This paper proposes a novel RNN, namely recurrent type-2 random vector functional link network (RT2McRVFLN), which provides a highly scalable solution for data streams in a strictly online and integrated framework. It is built upon the psychologically inspired concept of metacognitive learning, which covers three basic components of human learning: what-to-learn, how-to-learn, and when-to-learn. The what-to-learn selects important samples on the fly with the use of online active learning scenario, which renders our algorithm an online semi-supervised algorithm. The how-to-learn process combines an open structure of evolving concept and a randomized learning algorithm of random vector functional link network (RVFLN). The efficacy of the RT2McRVFLN has been numerically validated through two real-world case studies and comparisons with its counterparts, which arrive at a conclusive finding that our algorithm delivers a tradeoff between accuracy and simplicity

    An Incremental Construction of Deep Neuro Fuzzy System for Continual Learning of Non-stationary Data Streams

    Existing fuzzy neural networks (FNNs) are mostly developed under a shallow network configuration, which has lower generalization power than deep structures. This paper proposes a novel self-organizing deep FNN, namely DEVFNN. Fuzzy rules can be automatically extracted from data streams, or removed if they play a limited role during their lifespan. The structure of the network can be deepened on demand by stacking additional layers using a drift detection method, which not only detects covariate drift (variations of the input space) but also accurately identifies real drift (dynamic changes of both the feature space and the target space). DEVFNN is developed under the stacked generalization principle via the feature augmentation concept, where a recently developed algorithm, namely gClass, drives the hidden layer. It is equipped with an automatic feature selection method that controls the activation and deactivation of input attributes to induce varying subsets of input features. A deep network simplification procedure based on hidden layer merging is put forward to prevent uncontrollable growth of the input space dimensionality, which would otherwise result from building a deep network structure through feature augmentation. DEVFNN works in a sample-wise fashion and is compatible with data stream applications. The efficacy of DEVFNN has been thoroughly evaluated using seven datasets with non-stationary properties under the prequential test-then-train protocol. It has been compared with four popular continual learning algorithms and with its shallow counterpart, and DEVFNN demonstrates improved classification accuracy. Moreover, it is shown that the concept drift detection method is an effective tool for controlling the depth of the network structure, while the hidden layer merging scenario is capable of simplifying a deep network with a negligible compromise in generalization performance. Comment: This paper has been published in IEEE Transactions on Fuzzy Systems
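
The prequential test-then-train protocol named in the abstract above evaluates a model on each incoming sample before that sample is used for training. The sketch below shows only the protocol, not DEVFNN itself: `MajorityClassLearner` is a hypothetical stand-in learner, and the `predict`/`learn` interface is an assumption made for illustration.

```python
from collections import Counter


class MajorityClassLearner:
    """Toy online learner (a stand-in, not DEVFNN): predicts the most
    frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # No prediction is possible before the first label has been seen.
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def learn(self, x, y):
        self.counts[y] += 1


def prequential_accuracy(model, stream):
    """Test-then-train: each sample is first predicted, then learned."""
    correct = 0
    seen = 0
    for x, y in stream:
        if model.predict(x) == y:  # test on the sample first
            correct += 1
        model.learn(x, y)          # then train on the same sample
        seen += 1
    return correct / seen if seen else 0.0


if __name__ == "__main__":
    stream = [(0, "a"), (1, "a"), (2, "b"), (3, "a")]
    print(prequential_accuracy(MajorityClassLearner(), stream))
```

Because every sample is tested before it is learned, the resulting accuracy reflects performance on unseen data throughout the stream, which is why the protocol suits non-stationary settings.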

    Fuzzy Transfer Learning Using an Infinite Gaussian Mixture Model and Active Learning

    © 2018 IEEE. Transfer learning is gaining considerable attention due to its ability to leverage previously acquired knowledge to assist in completing a prediction task in a related domain. Fuzzy transfer learning, which is based on fuzzy system (especially fuzzy rule-based models), has been developed because of its capability to deal with the uncertainty in transfer learning. However, two issues with fuzzy transfer learning have not yet been resolved: choosing an appropriate source domain and efficiently selecting labeled data for the target domain. This paper proposes an innovative method based on fuzzy rules that combines an infinite Gaussian mixture model (IGMM) with active learning to enhance the performance and generalizability of the constructed model. An IGMM is used to identify the data structures in the source and target domains providing a promising solution to the domain selection dilemma. Further, we exploit the interactive query strategy in active learning to correct imbalances in the knowledge to improve the generalizability of fuzzy learning models. Through experiments on synthetic datasets, we demonstrate the rationality of employing an IGMM and the effectiveness of applying an active learning technique. Additional experiments on real-world datasets further support the capabilities of the proposed method in practical situations

    Incremental learning algorithms and applications

    Incremental learning refers to learning from streaming data, which arrive over time, with limited memory resources and, ideally, without sacrificing model accuracy. This setting fits different application scenarios where lifelong learning is relevant, e.g. due to changing environments, and it offers an elegant scheme for big data processing by means of its sequential treatment. In this contribution, we formalise the concept of incremental learning, discuss particular challenges which arise in this setting, and give an overview of popular approaches, their theoretical foundations, and applications which have emerged in recent years

    An incremental meta-cognitive-based scaffolding fuzzy neural network

    The idea of meta-cognitive learning has enriched the landscape of evolving systems, because it emulates three fundamental aspects of human learning: what-to-learn; how-to-learn; and when-to-learn. However, existing meta-cognitive algorithms still exclude Scaffolding theory, which can realize a plug-and-play classifier. Consequently, these algorithms require laborious pre- and/or post-training processes to be carried out in addition to the main training process. This paper introduces a novel meta-cognitive algorithm termed GENERIC-Classifier (gClass), where the how-to-learn part constitutes a synergy of Scaffolding Theory - a tutoring theory that fosters the ability to sort out complex learning tasks, and Schema Theory - a learning theory of knowledge acquisition by humans. The what-to-learn aspect adopts an online active learning concept by virtue of an extended conflict and ignorance method, making gClass an incremental semi-supervised classifier, whereas the when-to-learn component makes use of the standard sample reserved strategy. A generalized version of the Takagi-Sugeno Kang (TSK) fuzzy system is devised to serve as the cognitive constituent. That is, the rule premise is underpinned by multivariate Gaussian functions, while the rule consequent employs a subset of the non-linear Chebyshev polynomial. Thorough empirical studies, confirmed by their corresponding statistical tests, have numerically validated the efficacy of gClass, which delivers better classification rates than state-of-the-art classifiers while having less complexity
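
As a reading aid for the rule structure described above, a multivariate Gaussian rule premise is commonly written as below. The symbols $c_i$ (rule centre) and $\Sigma_i$ (rule covariance matrix) and the factor $\tfrac{1}{2}$ are assumptions made for illustration; the abstract does not give the exact form used in gClass. The recurrence is the standard definition of the Chebyshev polynomials of the first kind, from which the abstract says the rule consequent draws a subset.

```latex
% Firing strength of rule i for input x (one common form; scaling is an assumption)
R_i(x) = \exp\!\left( -\tfrac{1}{2}\, (x - c_i)^{\top} \Sigma_i^{-1} (x - c_i) \right)

% Chebyshev polynomials of the first kind
T_0(x) = 1, \qquad T_1(x) = x, \qquad T_{n+1}(x) = 2x\,T_n(x) - T_{n-1}(x)
```

A non-diagonal $\Sigma_i$ lets a rule cover an ellipsoidal region rotated in any direction, which axis-parallel Gaussian premises cannot represent.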