
3 research outputs found

A Clustering-Based Algorithm for Data Reduction

Author: Lee Shie-Jue
Ouyang Jeng
Yeh Chi-Yuan
Publication venue: IEEE SMC Hiroshima Chapter
Publication date: 01/11/2009

Finding an efficient data reduction method for large-scale problems is an imperative task. In this paper, we propose a similarity-based self-constructing fuzzy clustering algorithm to sample instances for the classification task. Instances that are similar to each other are grouped into the same cluster. Once all the instances have been fed in, a number of clusters have been formed automatically. The statistical mean of each cluster is then taken to represent all the instances covered by that cluster. This approach has two advantages. One is that it can be faster and uses less storage memory. The other is that the number of new representative instances need not be specified in advance by the user. Experiments on real-world datasets show that our method can run faster and obtain a better reduction rate than other methods.
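The abstract describes the procedure only at a high level, so the following is a minimal sketch of the general idea rather than the authors' algorithm. It assumes a Gaussian similarity with a single fixed width `sigma`, a user-chosen similarity threshold `rho`, and separate clusters for each class label; none of these details is stated in the abstract, and the function name `reduce_instances` is hypothetical.

```python
import math


def reduce_instances(instances, labels, rho=0.5, sigma=1.0):
    """One-pass, similarity-based clustering (illustrative sketch).

    Assumptions not taken from the abstract: Gaussian similarity with a
    fixed width `sigma`, a threshold `rho`, and per-class clusters.
    Returns the cluster means and their class labels.
    """
    clusters = []  # each cluster: {"label", "mean", "count"}
    for x, y in zip(instances, labels):
        best, best_sim = None, 0.0
        for c in clusters:
            if c["label"] != y:
                continue  # only compare against clusters of the same class
            d2 = sum((a - m) ** 2 for a, m in zip(x, c["mean"]))
            sim = math.exp(-d2 / (2 * sigma ** 2))
            if sim > best_sim:
                best, best_sim = c, sim
        if best is not None and best_sim >= rho:
            # Similar enough: absorb x and update the running mean.
            best["count"] += 1
            n = best["count"]
            best["mean"] = [m + (a - m) / n for a, m in zip(x, best["mean"])]
        else:
            # No sufficiently similar cluster: start a new one at x.
            clusters.append({"label": y, "mean": list(x), "count": 1})
    return [c["mean"] for c in clusters], [c["label"] for c in clusters]


if __name__ == "__main__":
    X = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
    y = [0, 0, 1, 1]
    means, mean_labels = reduce_instances(X, y)
    print(len(X), "instances reduced to", len(means), "representatives")
```

The number of representatives is not passed in; it falls out of the threshold, which mirrors the second advantage claimed in the abstract.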

Hiroshima University Institutional Repository

Okayama University Scientific Achievement Repository

A Study of Support Vectors on Model Independent Example Selection

Author: Huan Li
Kay Sung
Liu Kah
Nadeem Ahmed Syed
Syed Huan
Publication date: 01/01/1999

As databases for real-world problems increase in size, there is a need in many situations to select and keep relevant training data for efficient storage and processing reasons. Support vector machines (SVMs) reportedly exhibit certain desirable properties in selecting and preserving useful training data as support vectors. This paper attempts to quantify the extent to which SVM training behaves like a model independent example selection procedure. Using several common machine-learning training databases, we compare the prediction results obtained by different classifiers, trained with data selected by SVMs and by two other example selection methods (IB2 and random sampling). Some interesting observations are made with explanations. 1 Introduction Example-based learning is an attractive framework for extracting knowledge from empirical data, with the goal of generalizing well on new input patterns. Many real-world processes can be formulated as a 2-way classification task that may be solv..
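As a rough illustration of the two baseline selection methods named in the abstract, the sketch below implements IB2 in its usual form (keep an instance only if the instances stored so far misclassify it under 1-nearest-neighbour) alongside uniform random sampling. It is not taken from the paper: the function names, the squared Euclidean distance, and the fixed random seed are assumptions, and SVM-based selection is omitted because it would need an SVM library.

```python
import random


def nearest_label(x, stored):
    """Label of the stored instance closest to x (squared Euclidean distance)."""
    closest = min(stored, key=lambda s: sum((a - b) ** 2 for a, b in zip(x, s[0])))
    return closest[1]


def ib2_select(instances, labels):
    """IB2-style selection: store an instance only when the current
    store misclassifies it with a 1-nearest-neighbour rule."""
    stored = []
    for x, y in zip(instances, labels):
        if not stored or nearest_label(x, stored) != y:
            stored.append((x, y))
    return stored


def random_select(instances, labels, k, seed=0):
    """Baseline: k (instance, label) pairs drawn uniformly without replacement."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    return rng.sample(list(zip(instances, labels)), k)


if __name__ == "__main__":
    X = [(0.0,), (0.2,), (1.0,), (1.2,), (0.1,)]
    y = [0, 0, 1, 1, 0]
    print("IB2 kept:", ib2_select(X, y))
    print("Random kept:", random_select(X, y, k=2))
```

IB2's result depends on the order in which instances are presented, so a comparison like the one the abstract describes would normally fix or average over that order.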

CiteSeerX