Search CORE

3,162 research outputs found

GraphLab: A New Framework for Parallel Machine Learning

Author: Bickson Danny
Gonzalez Joseph
Guestrin Carlos
Hellerstein Joseph M.
Kyrola Aapo
Low Yucheng
Publication venue
Publication date: 01/01/2010
Field of study

Designing and implementing efficient, provably correct parallel machine learning (ML) algorithms is challenging. Existing high-level parallel abstractions like MapReduce are insufficiently expressive while low-level tools like MPI and Pthreads leave ML experts repeatedly solving the same design challenges. By targeting common patterns in ML, we developed GraphLab, which improves upon abstractions like MapReduce by compactly expressing asynchronous iterative algorithms with sparse computational dependencies while ensuring data consistency and achieving a high degree of parallel performance. We demonstrate the expressiveness of the GraphLab framework by designing and implementing parallel versions of belief propagation, Gibbs sampling, Co-EM, Lasso and Compressed Sensing. We show that using GraphLab we can achieve excellent parallel performance on large scale real-world problems

arXiv.org e-Print Archive

CiteSeerX

Recommended from our members

Accelerating Iterative Computations for Large-Scale Data Processing

Author: Yin Jiangtao
Publication venue: ScholarWorks@UMass Amherst
Publication date: 10/11/2016
Field of study

Recent advances in sensing, storage, and networking technologies are creating massive amounts of data at an unprecedented scale and pace. Large-scale data processing is commonly leveraged to make sense of these data, which will enable companies, governments, and organizations, to make better decisions and bring convenience to our daily life. However, the massive amount of data involved makes it challenging to perform data processing in a timely manner. On the one hand, huge volumes of data might not even fit into the disk of a single machine. On the other hand, data mining and machine learning algorithms, which are usually involved in large-scale data processing, typically require time-consuming iterative computations. Therefore, it is imperative to efficiently perform iterative computations on large computer clusters or cloud using highly-parallel and shared-nothing distributed systems. This research aims to explore new forms of iterative computations that reduce unnecessary computations so as to accelerate large-scale data processing in a distributed environment. We propose the iterative computation transformation for well-known data mining and machine learning algorithms, such as expectation-maximization, nonnegative matrix factorization, belief propagation, and graph algorithms (e.g., PageRank). These algorithms have been used in a wide range of application domains. First, we show how to accelerate expectation-maximization algorithms with frequent updates in a distributed environment. Then, we illustrate the way of efficiently scaling distributed nonnegative matrix factorization with block-wise updates. Next, our approach of scaling distributed belief propagation with prioritized block updates is presented. Last, we illustrate how to efficiently perform distributed incremental computation on evolving graphs. We will elaborate how to implement these transformed iterative computations on existing distributed programming models such as the MapReduce-based model, as well as develop new scalable and efficient distributed programming models and frameworks when necessary. The goal of these supporting distributed frameworks is to lift the burden of the programmers in specifying transformation of iterative computations and communication mechanisms, and automatically optimize the execution of the computation. Our techniques are evaluated extensively to demonstrate their efficiency. While the techniques we propose are in the context of specific algorithms, they address the challenges commonly faced in many other algorithms

ScholarWorks@UMass Amherst

A systematic review on multi-criteria group decision-making methods based on weights: analysis and classification scheme

Author: Boix Cots David
Pardo Bosch Francesc
Pujadas Álvarez Pablo
Publication venue: Elsevier
Publication date: 01/08/2023
Field of study

Interest in group decision-making (GDM) has been increasing prominently over the last decade. Access to global databases, sophisticated sensors which can obtain multiple inputs or complex problems requiring opinions from several experts have driven interest in data aggregation. Consequently, the field has been widely studied from several viewpoints and multiple approaches have been proposed. Nevertheless, there is a lack of general framework. Moreover, this problem is exacerbated in the case of experts’ weighting methods, one of the most widely-used techniques to deal with multiple source aggregation. This lack of general classification scheme, or a guide to assist expert knowledge, leads to ambiguity or misreading for readers, who may be overwhelmed by the large amount of unclassified information currently available. To invert this situation, a general GDM framework is presented which divides and classifies all data aggregation techniques, focusing on and expanding the classification of experts’ weighting methods in terms of analysis type by carrying out an in-depth literature review. Results are not only classified but analysed and discussed regarding multiple characteristics, such as MCDMs in which they are applied, type of data used, ideal solutions considered or when they are applied. Furthermore, general requirements supplement this analysis such as initial influence, or component division considerations. As a result, this paper provides not only a general classification scheme and a detailed analysis of experts’ weighting methods but also a road map for researchers working on GDM topics or a guide for experts who use these methods. Furthermore, six significant contributions for future research pathways are provided in the conclusions.The first author acknowledges support from the Spanish Ministry of Universities [grant number FPU18/01471]. The second and third author wish to recognize their support from the Serra Hunter program. Finally, this work was supported by the Catalan agency AGAUR through its research group support program (2017SGR00227). This research is part of the R&D project IAQ4EDU, reference no. PID2020-117366RB-I00, funded by MCIN/AEI/10.13039/ 501100011033.Peer ReviewedPostprint (published version

UPCommons. Portal del coneixement obert de la UPC

Reinforcement Learning: A Survey

Author: Kaelbling L. P.
Littman M. L.
Moore A. W.
Publication venue
Publication date: 01/01/1996
Field of study

This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word ``reinforcement.'' The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.Comment: See http://www.jair.org/ for any accompanying file

arXiv.org e-Print Archive

CiteSeerX

A Bayesian Hyperprior Approach for Joint Image Denoising and Interpolation, with an Application to HDR Imaging

Author: Aguerrebere Cecilia
Almansa Andrés
Delon Julie
Gousseau Yann
Musé Pablo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017
Field of study

Recently, impressive denoising results have been achieved by Bayesian approaches which assume Gaussian models for the image patches. This improvement in performance can be attributed to the use of per-patch models. Unfortunately such an approach is particularly unstable for most inverse problems beyond denoising. In this work, we propose the use of a hyperprior to model image patches, in order to stabilize the estimation procedure. There are two main advantages to the proposed restoration scheme: Firstly it is adapted to diagonal degradation matrices, and in particular to missing data problems (e.g. inpainting of missing pixels or zooming). Secondly it can deal with signal dependent noise models, particularly suited to digital cameras. As such, the scheme is especially adapted to computational photography. In order to illustrate this point, we provide an application to high dynamic range imaging from a single image taken with a modified sensor, which shows the effectiveness of the proposed scheme.Comment: Some figures are reduced to comply with arxiv's size constraints. Full size images are available as HAL technical report hal-01107519v5, IEEE Transactions on Computational Imaging, 201

arXiv.org e-Print Archive

HAL Descartes