8 research outputs found

Gossip Learning with Linear Models on Fully Distributed Data

Author: Hegedüs István
Jelasity Márk
Ormándi Róbert
Publication venue: 'Wiley'
Publication date: 16/05/2012

Machine learning over fully distributed data poses an important problem in peer-to-peer (P2P) applications. In this model we have one data record at each network node, but without the possibility to move raw data due to privacy considerations. For example, user profiles, ratings, history, or sensor readings can represent this case. This problem is difficult because there is no possibility to learn local models and the system model offers almost no guarantees for reliability, yet the communication cost needs to be kept low. Here we propose gossip learning, a generic approach that is based on multiple models taking random walks over the network in parallel, while applying an online learning algorithm to improve themselves, and getting combined via ensemble learning methods. We present an instantiation of this approach for the case of classification with linear models. Our main contribution is an ensemble learning method which, through the continuous combination of the models in the network, implements a virtual weighted voting mechanism over an exponential number of models at practically no extra cost as compared to independent random walks. We prove the convergence of the method theoretically, and perform extensive experiments on benchmark datasets. Our experimental analysis demonstrates the performance and robustness of the proposed approach.

Comment: The paper was published in the journal Concurrency and Computation: Practice and Experience http://onlinelibrary.wiley.com/journal/10.1002/%28ISSN%291532-0634 (DOI: http://dx.doi.org/10.1002/cpe.2858). The modifications are based on the suggestions from the reviewer.
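The gossip learning idea in the abstract above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the synchronous rounds, the Pegasos-style hinge-loss update, the averaging merge, and all names (`gossip_learn`, `pegasos_update`, `merge`, `predict`, `lam`) are choices made here for illustration. Each node holds exactly one record, forwards its current model to a random peer, and the receiver merges the incoming model with the previously received one before updating it on its local record.

```python
import random

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def pegasos_update(w, t, x, y, lam=0.01):
    """One Pegasos-style online step of a linear model on a single record.
    The update rule and lam are assumptions made for this sketch."""
    t += 1
    eta = 1.0 / (lam * t)            # decaying learning rate
    margin = y * dot(w, x)           # margin under the model before the step
    w = [(1.0 - eta * lam) * wi for wi in w]
    if margin < 1.0:                 # record violates the margin: move toward it
        w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w, t

def merge(model_a, model_b):
    """Combine two models by averaging their weights (an assumed merge rule)."""
    (wa, ta), (wb, tb) = model_a, model_b
    return [(a + b) / 2.0 for a, b in zip(wa, wb)], max(ta, tb)

def gossip_learn(records, rounds=50, seed=0):
    """records: list of (features, label) pairs, one per node, label in {-1, +1}.
    Returns the model (weights, age) held by each node after the given rounds."""
    rng = random.Random(seed)
    n, dim = len(records), len(records[0][0])
    current = [([0.0] * dim, 0) for _ in range(n)]
    last_received = [([0.0] * dim, 0) for _ in range(n)]
    for _ in range(rounds):
        outgoing = list(current)             # models sent in this round
        for sender in range(n):
            receiver = rng.randrange(n - 1)  # uniform random peer, never self
            if receiver >= sender:
                receiver += 1
            incoming = outgoing[sender]
            w, t = merge(incoming, last_received[receiver])
            x, y = records[receiver]         # raw data never leaves the node
            current[receiver] = pegasos_update(w, t, x, y)
            last_received[receiver] = incoming
    return current

def predict(model, x):
    return 1 if dot(model[0], x) >= 0 else -1
```

Only model weights travel between nodes in this sketch; each record is read solely by the node that owns it, which mirrors the privacy constraint described in the abstract.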

arXiv.org e-Print Archive

Crossref

SZTE Publicatio Repozitórium - SZTE - Repository of Publications

Classification in P2P Networks by Bagging Cascade RSVMs

Author: ANG Hock Hee
DATTA Anwitaman
GOPALKRISHNAN Vivekanand
HOI Steven C. H.
NG Wee Keong
Publication venue: 'VLDB Endowment'
Publication date: 01/08/2008

Institutional Knowledge at Singapore Management University

Communication-efficient Classification in P2P Networks

Author: ANG Hock Hee
Gopalkrishnan Vivekanand
HOI Steven C. H.
NG Wee Keong
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/09/2009

Institutional Knowledge at Singapore Management University

Classification in P2P Networks with Cascade Support Vector Machines

Author: ANG Hock Hee
Gopalkrishnan Vivekanand
HOI Steven C. H.
NG Wee-Keong
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/11/2013

Institutional Knowledge at Singapore Management University

Seventh Biennial Report : June 2003 - March 2005

Author
Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/2005

MPG.PuRe

Eighth Biennial Report : April 2005 – March 2007

Author
Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/2007

MPG.PuRe

Automatic document organization in a p2p environment

Author: A. Strehl
D. Wolpert
D.D. Lewis
H. Kargupta
I.S. Dhillon
J. Hartigan
J. Platt
K. Gorunova
L. Breiman
R. Baeza-Yates
S. Chakrabarti
Publication venue: Springer
Publication date: 01/01/2006

Abstract. This paper describes an efficient method to construct reliable machine learning applications in peer-to-peer (P2P) networks by building ensemble-based meta methods. We consider this problem in the context of distributed Web exploration applications like focused crawling. Typical applications are user-specific classification of retrieved Web contents into personalized topic hierarchies as well as automatic refinements of such taxonomies using unsupervised machine learning methods (e.g. clustering). Our approach is to combine models from multiple peers and to construct an advanced decision model that takes the generalization performance of multiple ‘local’ peer models into account. In addition, meta algorithms can be applied in a restrictive manner, i.e. by leaving out some ‘uncertain’ documents. The results of our systematic evaluation show the viability of the proposed approach.
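The restrictive combination of peer models mentioned in the abstract above can be sketched as a weighted vote that abstains on uncertain documents. The abstract does not state the combination rule, so everything here is an assumption for illustration: the function name `restrictive_vote`, the use of each peer's estimated accuracy as its weight, and the abstention `threshold`.

```python
def restrictive_vote(scores, weights, threshold=0.0):
    """Combine per-peer decisions for one document.

    scores:    signed confidence of each peer model, e.g. in [-1, 1]
    weights:   assumed estimate of each peer's generalization performance
    threshold: abstain when the combined score is this close to zero

    Returns +1 or -1, or None when the document is left out as uncertain.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    combined = sum(w * s for w, s in zip(weights, scores)) / total
    if abs(combined) <= threshold:
        return None              # restrictive mode: skip uncertain documents
    return 1 if combined > 0 else -1
```

For instance, two equally weighted peers that disagree yield a combined score of zero, so the document is skipped instead of being assigned to a class.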

CiteSeerX

Crossref

MPG.PuRe