6,146 research outputs found
Recommended from our members
A Clustering System for Dynamic Data Streams Based on Metaheuristic Optimisation
open access articleThis article presents the Optimised Stream clustering algorithm (OpStream), a novel approach to cluster dynamic data streams. The proposed system displays desirable features, such as a low number of parameters and good scalability capabilities to both high-dimensional data and numbers of clusters in the dataset, and it is based on a hybrid structure using deterministic clustering methods and stochastic optimisation approaches to optimally centre the clusters. Similar to other state-of-the-art methods available in the literature, it uses “microclusters” and other established techniques, such as density based clustering. Unlike other methods, it makes use of metaheuristic optimisation to maximise performances during the initialisation phase, which precedes the classic online phase. Experimental results show that OpStream outperforms the state-of-the-art methods in several cases, and it is always competitive against other comparison algorithms regardless of the chosen optimisation method. Three variants of OpStream, each coming with a different optimisation algorithm, are presented in this study. A thorough sensitive analysis is performed by using the best variant to point out OpStream’s robustness to noise and resiliency to parameter changes
Ontology of core data mining entities
In this article, we present OntoDM-core, an ontology of core data mining
entities. OntoDM-core defines themost essential datamining entities in a three-layered
ontological structure comprising of a specification, an implementation and an application
layer. It provides a representational framework for the description of mining
structured data, and in addition provides taxonomies of datasets, data mining tasks,
generalizations, data mining algorithms and constraints, based on the type of data.
OntoDM-core is designed to support a wide range of applications/use cases, such as
semantic annotation of data mining algorithms, datasets and results; annotation of
QSAR studies in the context of drug discovery investigations; and disambiguation of
terms in text mining. The ontology has been thoroughly assessed following the practices
in ontology engineering, is fully interoperable with many domain resources and
is easy to extend
A customizable multi-agent system for distributed data mining
We present a general Multi-Agent System framework for
distributed data mining based on a Peer-to-Peer model. Agent
protocols are implemented through message-based asynchronous
communication. The framework adopts a dynamic load balancing
policy that is particularly suitable for irregular search algorithms. A modular design allows a separation of the general-purpose system protocols and software components from the specific data mining algorithm. The experimental evaluation has been carried out on a parallel frequent subgraph mining algorithm, which has shown good scalability performances
Evaluation of Machine Learning Algorithms for Intrusion Detection System
Intrusion detection system (IDS) is one of the implemented solutions against
harmful attacks. Furthermore, attackers always keep changing their tools and
techniques. However, implementing an accepted IDS system is also a challenging
task. In this paper, several experiments have been performed and evaluated to
assess various machine learning classifiers based on KDD intrusion dataset. It
succeeded to compute several performance metrics in order to evaluate the
selected classifiers. The focus was on false negative and false positive
performance metrics in order to enhance the detection rate of the intrusion
detection system. The implemented experiments demonstrated that the decision
table classifier achieved the lowest value of false negative while the random
forest classifier has achieved the highest average accuracy rate
- …