1 research outputs found
Revisiting Large Scale Distributed Machine Learning
Nowadays, with the widespread of smartphones and other portable gadgets
equipped with a variety of sensors, data is ubiquitous available and the focus
of machine learning has shifted from being able to infer from small training
samples to dealing with large scale high-dimensional data. In domains such as
personal healthcare applications, which motivates this survey, distributed
machine learning is a promising line of research, both for scaling up learning
algorithms, but mostly for dealing with data which is inherently produced at
different locations. This report offers a thorough overview of and
state-of-the-art algorithms for distributed machine learning, for both
supervised and unsupervised learning, ranging from simple linear logistic
regression to graphical models and clustering. We propose future directions for
most categories, specific to the potential personal healthcare applications.
With this in mind, the report focuses on how security and low communication
overhead can be assured in the specific case of a strictly client-server
architectural model. As particular directions we provides an exhaustive
presentation of an empirical clustering algorithm, k-windows, and proposed an
asynchronous distributed machine learning algorithm that would scale well and
also would be computationally cheap and easy to implement