    Revisiting Large Scale Distributed Machine Learning

    Nowadays, with the widespread adoption of smartphones and other portable gadgets equipped with a variety of sensors, data is ubiquitously available, and the focus of machine learning has shifted from inferring from small training samples to dealing with large-scale, high-dimensional data. In domains such as personal healthcare applications, which motivate this survey, distributed machine learning is a promising line of research, both for scaling up learning algorithms and, above all, for dealing with data that is inherently produced at different locations. This report offers a thorough overview of state-of-the-art algorithms for distributed machine learning, for both supervised and unsupervised learning, ranging from simple linear logistic regression to graphical models and clustering. We propose future directions for most categories, specific to potential personal healthcare applications. With this in mind, the report focuses on how security and low communication overhead can be assured in the specific case of a strictly client-server architectural model. As particular directions, we provide an exhaustive presentation of an empirical clustering algorithm, k-windows, and propose an asynchronous distributed machine learning algorithm that would scale well and would also be computationally cheap and easy to implement.