16 research outputs found

    Improving adaptive bagging methods for evolving data streams

    We propose two improvements to bagging methods for evolving data streams. Two variants of bagging were recently proposed: ADWIN Bagging and Adaptive-Size Hoeffding Tree (ASHT) Bagging. ASHT Bagging uses trees of different sizes, and ADWIN Bagging uses ADWIN as a change detector to decide when to discard underperforming ensemble members. We improve ADWIN Bagging by using Hoeffding Adaptive Trees, which can adaptively learn from data streams that change over time. To speed up adaptation to change in ASHT Bagging, we add an error change detector to each classifier. We evaluate our improvements on synthetic and real-world datasets comprising up to ten million examples.
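The idea of attaching an error change detector to each ensemble member can be sketched as follows. This is a simplified illustration, not the paper's method: `SimpleErrorDetector` is a hypothetical stand-in for ADWIN, and the toy majority-class base learner replaces the Hoeffding trees. The Poisson(1) example weighting follows standard online bagging.

```python
import math
import random
from collections import deque

class SimpleErrorDetector:
    """Hypothetical stand-in for ADWIN: flags a change when the recent
    error rate exceeds the long-run error rate by a fixed margin."""
    def __init__(self, window=100, margin=0.15):
        self.recent = deque(maxlen=window)
        self.errors = 0
        self.seen = 0
        self.margin = margin

    def update(self, error):
        # error is 0 (correct) or 1 (misclassified)
        self.recent.append(error)
        self.errors += error
        self.seen += 1
        if self.seen < 2 * self.recent.maxlen:
            return False  # not enough history yet
        recent_rate = sum(self.recent) / len(self.recent)
        overall_rate = self.errors / self.seen
        return recent_rate > overall_rate + self.margin

class MajorityClassLearner:
    """Toy incremental base learner: predicts the most frequent label seen."""
    def __init__(self):
        self.counts = {}
    def learn(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1
    def predict(self, x):
        return max(self.counts, key=self.counts.get) if self.counts else 0

class DriftAwareBagging:
    """Online bagging with Poisson(1) example weighting, where each
    member carries its own detector and is reset when drift is flagged."""
    def __init__(self, n_members=5, seed=0):
        self.rng = random.Random(seed)
        self.members = [MajorityClassLearner() for _ in range(n_members)]
        self.detectors = [SimpleErrorDetector() for _ in range(n_members)]

    def _poisson1(self):
        # Knuth's algorithm for sampling a Poisson variate with mean 1
        limit, k, p = math.exp(-1.0), 0, 1.0
        while p > limit:
            k += 1
            p *= self.rng.random()
        return k - 1

    def learn(self, x, y):
        for i in range(len(self.members)):
            error = int(self.members[i].predict(x) != y)
            if self.detectors[i].update(error):
                # drift detected: replace the underperforming member
                self.members[i] = MajorityClassLearner()
                self.detectors[i] = SimpleErrorDetector()
            for _ in range(self._poisson1()):
                self.members[i].learn(x, y)

    def predict(self, x):
        votes = {}
        for m in self.members:
            v = m.predict(x)
            votes[v] = votes.get(v, 0) + 1
        return max(votes, key=votes.get)
```

On a stream whose label flips partway through, the per-member detectors fire shortly after the flip and the reset members quickly track the new concept, which is exactly the adaptation speed-up the paper targets.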

    Detecting change via competence model

    In real-world applications, the concepts of interest are more likely to change than to remain stable, a situation known as concept drift. This causes prediction problems for many learning algorithms, including case-based reasoning (CBR). When learning under concept drift, a critical issue is to determine when and how the concept changes. In this paper, we develop a competence-based empirical distance between case chunks and propose a change detection method based on it. As the main contribution of our work, the change detection method provides a way to measure the distribution change of cases over an infinite domain through finite samples, and it requires no prior knowledge about the case distribution, which makes it more practical in real-world applications. Moreover, unlike many other change detection methods, we not only detect changes in concepts but also quantify and describe them. © 2010 Springer-Verlag
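The core idea of comparing two case chunks through finite samples can be illustrated with a minimal sketch. This is an assumption-laden simplification, not the paper's competence model: it bins one-dimensional case values and compares the two empirical distributions by total-variation distance, while the paper builds a richer competence-based distance.

```python
from collections import Counter

def empirical_distance(chunk_a, chunk_b, n_bins=10, lo=0.0, hi=1.0):
    """Total-variation distance between binned empirical distributions
    of two case chunks (a simplified stand-in for the paper's
    competence-based distance). Values are assumed to lie in [lo, hi)."""
    def hist(chunk):
        counts = Counter(
            min(int((x - lo) / (hi - lo) * n_bins), n_bins - 1)
            for x in chunk
        )
        total = len(chunk)
        return [counts.get(b, 0) / total for b in range(n_bins)]
    ha, hb = hist(chunk_a), hist(chunk_b)
    return 0.5 * sum(abs(a - b) for a, b in zip(ha, hb))

def detect_change(chunk_a, chunk_b, threshold=0.3):
    # hypothetical threshold; the paper derives its decision differently
    return empirical_distance(chunk_a, chunk_b) > threshold
```

Two chunks drawn from disjoint halves of the unit interval give a distance near 1 and trigger detection, while two chunks from the same distribution give a distance near 0, matching the intuition of quantifying (not just flagging) the change.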

    Bagging with Adaptive Costs


    Clustering via Concave Minimization

    The problem of assigning m points in the n-dimensional real space R^n to k clusters is formulated as that of determining k centers in R^n such that the sum of distances from each point to its nearest center is minimized. If a polyhedral distance is used, the problem can be formulated as minimizing a piecewise-linear concave function on a polyhedral set, which is shown to be equivalent to a bilinear program: minimizing a bilinear function on a polyhedral set. A fast finite k-Median Algorithm, consisting of solving a few linear programs in closed form, leads to a stationary point of the bilinear program. Computational testing was carried out on a number of real-world databases. On the Wisconsin Diagnostic Breast Cancer (WDBC) database, the k-Median Algorithm's training set correctness was comparable to that of the k-Mean Algorithm, but its testing set correctness was better. Additionally, on the Wisconsin Prognostic Breast Cancer (WPBC) database, distinct and clinically important survival curves were extracted by the k-Median Algorithm, whereas the k-Mean Algorithm failed to obtain such distinct survival curves for the same database.
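With the L1 (city-block) polyhedral distance, the closed-form subproblems of the k-Median Algorithm reduce to coordinate-wise medians, so the flavor of the method can be sketched as a standard alternating k-median heuristic. This is an illustrative sketch, not the paper's bilinear-programming formulation; the `init` parameter and function names are assumptions for the example.

```python
import random
import statistics

def l1(p, q):
    # polyhedral (L1 / city-block) distance between two points
    return sum(abs(a - b) for a, b in zip(p, q))

def k_median(points, k, iters=100, init=None, seed=0):
    """Alternating k-median: assign each point to its nearest center
    under L1 distance, then move each center to the coordinate-wise
    median of its cluster (the closed-form L1 minimizer)."""
    centers = list(init) if init is not None else random.Random(seed).sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # assignment step: each point goes to its nearest center
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda j: l1(p, centers[j]))].append(p)
        # update step: coordinate-wise median minimizes total L1 distance
        new_centers = []
        for i, cl in enumerate(clusters):
            if cl:
                dim = len(points[0])
                new_centers.append(tuple(
                    statistics.median(p[d] for p in cl) for d in range(dim)
                ))
            else:
                new_centers.append(centers[i])  # keep an empty cluster's center
        if new_centers == centers:
            break  # stationary point reached
        centers = new_centers
    return centers, clusters
```

Because the update step uses medians rather than means, the resulting centers are robust to outliers, which is one intuition behind the k-Median Algorithm's better generalization reported on the WDBC database.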

    Accuracy Updated Ensemble for Data Streams with Concept Drift


    Tracking Recurrent Concepts Using Context


    Quick Adaptation to Changing Concepts by Sensitive Detection
