A fuzzy approach for interpretation and application of ubiquitous data stream clustering

Author: Gaber M.
Horovitz O.
Krishnaswamy S.
Publication date: 07/10/2005

Portsmouth University Research Portal (Pure)

State-of-the-art in data stream mining

Author: Gaber M.
Gama J.
Publication date: 17/09/2007

Portsmouth University Research Portal (Pure)

Mining data streams using option trees (revised edition, 2004)

Author: Holmes Geoffrey
Kirkby Richard Brendon
Pfahringer Bernhard
Publication venue: Department of Computer Science, The University of Waikato
Publication date: 01/01/2004

The data stream model for data mining places harsh restrictions on a learning algorithm. A model must be induced following the briefest interrogation of the data, must use only available memory and must update itself over time within these constraints. Additionally, the model must be able to be used for data mining at any point in time. This paper describes a data stream classification algorithm using an ensemble of option trees. The ensemble of trees is induced by boosting and iteratively combined into a single interpretable model. The algorithm is evaluated using benchmark datasets for accuracy against state-of-the-art algorithms that make use of the entire dataset.

Research Commons@Waikato

Evolving Ensemble Fuzzy Classifier

Author: Lughofer Edwin
Pedrycz Witold
Pratama Mahardhika
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

The concept of ensemble learning offers a promising avenue in learning from data streams under complex environments because it addresses the bias and variance dilemma better than its single-model counterpart and features a reconfigurable structure, which is well suited to the given context. While various extensions of ensemble learning for mining non-stationary data streams can be found in the literature, most of them are crafted under a static base classifier and revisit preceding samples in the sliding window for a retraining step. This feature causes computationally prohibitive complexity and is not flexible enough to cope with rapidly changing environments. Their complexity is often demanding because they involve a large collection of offline classifiers, owing to the absence of structural complexity reduction mechanisms and the lack of an online feature selection mechanism. A novel evolving ensemble classifier, namely Parsimonious Ensemble (pENsemble), is proposed in this paper. pENsemble differs from existing architectures in that it is built upon an evolving classifier from data streams, termed Parsimonious Classifier (pClass). pENsemble is equipped with an ensemble pruning mechanism, which estimates a localized generalization error of a base classifier. A dynamic online feature selection scenario is integrated into pENsemble. This method allows for dynamic selection and deselection of input features on the fly. pENsemble adopts a dynamic ensemble structure to output a final classification decision, where it features a novel drift detection scenario to grow the ensemble structure. The efficacy of pENsemble has been demonstrated through rigorous numerical studies with dynamic and evolving data streams, where it delivers the most encouraging performance in attaining a tradeoff between accuracy and complexity.

Comment: this paper has been published in IEEE Transactions on Fuzzy Systems.

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)

Learning from Ontology Streams with Semantic Concept Drift

Author: Chen Huajun
Chen Jiaoyan
Lecue Freddy
Pan Jeff
Publication date: 24/04/2017

Data stream learning has been widely studied for extracting knowledge structures from continuous and rapid data records. In the semantic Web, data is interpreted in ontologies and its ordered sequence is represented as an ontology stream. Our work exploits the semantics of such streams to tackle the problem of concept drift, i.e., unexpected changes in data distribution that cause most models to become less accurate as time passes. To this end we revisited (i) semantic inference in the context of supervised stream learning, and (ii) models with semantic embeddings. The experiments show accurate prediction with data from Dublin and Beijing.

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

Next challenges for adaptive learning systems

Author: Bifet A.
Gaber M.
Gabrys B.
Gama J.
Minku L.
Musial K.
Zliobaite I.
Publication date: 01/01/2012

University of Birmingham Research Portal

Portsmouth University Research Portal (Pure)

Improving adaptation and interpretability of a short-term traffic forecasting system

Author: Casas Vilaró Jordi
Djukic Tamara
Gavaldà Mestre Ricard
Mena Yedra Rafael
Publication date: 01/01/2017

Traffic management is more important than ever, especially in overcrowded big cities with over-pollution problems and with new unprecedented mobility changes. In this scenario, road-traffic prediction plays a key role within Intelligent Transportation Systems, allowing traffic managers to anticipate and take the proper decisions. This paper aims to analyse the situation in a commercial real-time prediction system with its current problems and limitations. The analysis unveils the trade-off between simple parsimonious models and more complex models. Finally, we propose an enriched machine learning framework, Adarules, for real-time traffic prediction, treating the problem as continuously incoming data streams with all the problems that commonly occur in such a volatile scenario, namely changes in the network infrastructure and demand, new detection stations or failed ones, among others. The framework is also able to automatically infer the most relevant features for our end task, including the relationships within the road network. Although the proposed framework is intended to evolve and grow with new incoming big data, there is no limitation in starting to use it without any prior knowledge, as it can start learning the structure and parameters automatically from data. We test this predictive system in different real-work scenarios, and evaluate its performance integrating a multi-task learning paradigm for the sake of the traffic prediction task.

Peer Reviewed. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC