Parallel data mining-case study

Bagnall, T.; Bull, Larry; Pettipher, M.; Studley, Matthew; Tekiner, F.; Whittley, I.

Publication date: 1 January 2010

Abstract

The continuing rapid growth of data and knowledge in scientific domains has spurred huge interest in distributed and parallel data and text mining. This paper reports an investigation into porting a large-scale data mining application to a supercomputing environment. It explores some of the issues that may arise in porting and working with the C++/MPI implementation of the ensemble k-NN application on supercomputers, and evaluates the behaviour of MFS on several large data sets. The study aims to identify how the performance of the ensemble application depends on the nature of the algorithm used, on the characteristics of the datasets, and on the analysis to be performed. These findings can then be used to select the most appropriate algorithm for a given analysis and dataset, and to indicate the optimum number of processors to use.
