GraphLab: A New Framework for Parallel Machine Learning
Designing and implementing efficient, provably correct parallel machine
learning (ML) algorithms is challenging. Existing high-level parallel
abstractions like MapReduce are insufficiently expressive while low-level tools
like MPI and Pthreads leave ML experts repeatedly solving the same design
challenges. By targeting common patterns in ML, we developed GraphLab, which
improves upon abstractions like MapReduce by compactly expressing asynchronous
iterative algorithms with sparse computational dependencies while ensuring data
consistency and achieving a high degree of parallel performance. We demonstrate
the expressiveness of the GraphLab framework by designing and implementing
parallel versions of belief propagation, Gibbs sampling, Co-EM, Lasso and
Compressed Sensing. We show that using GraphLab we can achieve excellent
parallel performance on large-scale real-world problems.
Accelerating Iterative Computations for Large-Scale Data Processing
Recent advances in sensing, storage, and networking technologies are creating massive amounts of data at an unprecedented scale and pace. Large-scale data processing is commonly used to make sense of these data, enabling companies, governments, and organizations to make better decisions and bringing convenience to daily life. However, the massive amount of data involved makes it challenging to perform data processing in a timely manner. On the one hand, huge volumes of data might not even fit on the disk of a single machine. On the other hand, data mining and machine learning algorithms, which are usually involved in large-scale data processing, typically require time-consuming iterative computations. Therefore, it is imperative to perform iterative computations efficiently on large computer clusters or the cloud using highly parallel, shared-nothing distributed systems.
This research aims to explore new forms of iterative computations that reduce unnecessary computations so as to accelerate large-scale data processing in a distributed environment. We propose the iterative computation transformation for well-known data mining and machine learning algorithms, such as expectation-maximization, nonnegative matrix factorization, belief propagation, and graph algorithms (e.g., PageRank). These algorithms have been used in a wide range of application domains. First, we show how to accelerate expectation-maximization algorithms with frequent updates in a distributed environment. Then, we illustrate the way of efficiently scaling distributed nonnegative matrix factorization with block-wise updates. Next, our approach of scaling distributed belief propagation with prioritized block updates is presented. Last, we illustrate how to efficiently perform distributed incremental computation on evolving graphs.
We will elaborate on how to implement these transformed iterative computations on existing distributed programming models such as the MapReduce-based model, and on how to develop new scalable and efficient distributed programming models and frameworks when necessary. The goal of these supporting distributed frameworks is to lift from programmers the burden of specifying the transformation of iterative computations and the communication mechanisms, and to optimize the execution of the computation automatically. Our techniques are evaluated extensively to demonstrate their efficiency. While the techniques we propose are in the context of specific algorithms, they address the challenges commonly faced in many other algorithms.
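As a rough illustration of what "reducing unnecessary computations" with prioritized updates can look like, the sketch below runs a residual-driven PageRank in which only vertices with a non-negligible pending update are processed, largest pending update first. It is a generic textbook-style sketch, not the method of this work: the function name `prioritized_pagerank`, the dictionary graph format, and the `damping` and `tol` parameters are all assumptions made for the example, and it runs on a single machine rather than a cluster.

```python
import heapq


def prioritized_pagerank(graph, damping=0.85, tol=1e-8):
    """Residual-driven PageRank sketch (hypothetical helper, single machine).

    graph maps each node to a list of its out-neighbours; every node must
    appear as a key. Mass sent to nodes without out-edges is simply dropped.
    """
    n = len(graph)
    rank = {v: 0.0 for v in graph}
    # Each node starts with the teleportation mass as a pending update.
    residual = {v: (1.0 - damping) / n for v in graph}
    # Max-heap on residual (negated keys); stale entries are skipped lazily.
    heap = [(-r, v) for v, r in residual.items()]
    heapq.heapify(heap)

    while heap:
        neg_r, v = heapq.heappop(heap)
        r = residual[v]
        if r < tol or -neg_r != r:
            continue  # negligible update, or an outdated heap entry
        # Commit the pending update and forward it to the out-neighbours.
        rank[v] += r
        residual[v] = 0.0
        out = graph[v]
        for u in out:
            residual[u] += damping * r / len(out)
            if residual[u] >= tol:
                heapq.heappush(heap, (-residual[u], u))
    return rank


if __name__ == "__main__":
    cycle = {"a": ["b"], "b": ["c"], "c": ["a"]}
    print(prioritized_pagerank(cycle))
```

Vertices whose pending update falls below `tol` are never revisited, which is the sense in which work is skipped compared with sweeping over every vertex in every iteration.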
A systematic review on multi-criteria group decision-making methods based on weights: analysis and classification scheme
Interest in group decision-making (GDM) has been increasing prominently over the last decade. Access to global databases, sophisticated sensors that can obtain multiple inputs, and complex problems requiring opinions from several experts have driven interest in data aggregation. Consequently, the field has been widely studied from several viewpoints and multiple approaches have been proposed. Nevertheless, there is a lack of a general framework. Moreover, this problem is exacerbated in the case of experts’ weighting methods, one of the most widely used techniques to deal with multiple source aggregation. This lack of a general classification scheme, or of a guide to assist expert knowledge, leads to ambiguity or misreading for readers, who may be overwhelmed by the large amount of unclassified information currently available. To reverse this situation, a general GDM framework is presented which divides and classifies all data aggregation techniques, focusing on and expanding the classification of experts’ weighting methods in terms of analysis type by carrying out an in-depth literature review. Results are not only classified but also analysed and discussed regarding multiple characteristics, such as the MCDMs in which they are applied, the type of data used, the ideal solutions considered, or when they are applied. Furthermore, general requirements supplement this analysis, such as initial influence or component division considerations. As a result, this paper provides not only a general classification scheme and a detailed analysis of experts’ weighting methods but also a road map for researchers working on GDM topics and a guide for experts who use these methods. Furthermore, six significant contributions for future research pathways are provided in the conclusions.
The first author acknowledges support from the Spanish Ministry of Universities [grant number FPU18/01471]. The second and third authors acknowledge support from the Serra Hunter program. Finally, this work was supported by the Catalan agency AGAUR through its research group support program (2017SGR00227). This research is part of the R&D project IAQ4EDU, reference no. PID2020-117366RB-I00, funded by MCIN/AEI/10.13039/501100011033.
Peer reviewed. Postprint (published version).
Reinforcement Learning: A Survey
This paper surveys the field of reinforcement learning from a
computer-science perspective. It is written to be accessible to researchers
familiar with machine learning. Both the historical basis of the field and a
broad selection of current work are summarized. Reinforcement learning is the
problem faced by an agent that learns behavior through trial-and-error
interactions with a dynamic environment. The work described here has a
resemblance to work in psychology, but differs considerably in the details and
in the use of the word ``reinforcement.'' The paper discusses central issues of
reinforcement learning, including trading off exploration and exploitation,
establishing the foundations of the field via Markov decision theory, learning
from delayed reinforcement, constructing empirical models to accelerate
learning, making use of generalization and hierarchy, and coping with hidden
state. It concludes with a survey of some implemented systems and an assessment
of the practical utility of current methods for reinforcement learning.
Comment: See http://www.jair.org/ for any accompanying file
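The exploration/exploitation trade-off mentioned in the abstract can be made concrete with a small example. The sketch below is a standard epsilon-greedy strategy on a Bernoulli multi-armed bandit; it is offered as a generic illustration and is not taken from the survey. The function name `epsilon_greedy_bandit` and the parameters `arm_probs`, `steps`, `epsilon`, and `seed` are assumptions made for the example.

```python
import random


def epsilon_greedy_bandit(arm_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy sketch on a Bernoulli bandit (hypothetical helper).

    arm_probs[i] is the hidden success probability of arm i. Returns the
    estimated value of each arm, the pull counts, and the total reward.
    """
    rng = random.Random(seed)
    counts = [0] * len(arm_probs)
    values = [0.0] * len(arm_probs)
    total_reward = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(len(arm_probs))
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(len(arm_probs)), key=lambda a: values[a])
        reward = 1 if rng.random() < arm_probs[arm] else 0
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward
    return values, counts, total_reward


if __name__ == "__main__":
    values, counts, total = epsilon_greedy_bandit([0.2, 0.8])
    print(values, counts, total)
```

A larger `epsilon` spends more pulls gathering information about every arm, while a smaller one commits earlier to whichever arm currently looks best.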
A Bayesian Hyperprior Approach for Joint Image Denoising and Interpolation, with an Application to HDR Imaging
Recently, impressive denoising results have been achieved by Bayesian
approaches which assume Gaussian models for the image patches. This improvement
in performance can be attributed to the use of per-patch models. Unfortunately,
such an approach is particularly unstable for most inverse problems beyond
denoising. In this work, we propose the use of a hyperprior to model image
patches, in order to stabilize the estimation procedure. There are two main
advantages to the proposed restoration scheme. First, it is adapted to
diagonal degradation matrices, and in particular to missing data problems (e.g.
inpainting of missing pixels or zooming). Second, it can deal with
signal-dependent noise models, particularly suited to digital cameras. As such, the
scheme is especially adapted to computational photography. In order to
illustrate this point, we provide an application to high dynamic range imaging
from a single image taken with a modified sensor, which shows the effectiveness
of the proposed scheme.
Comment: Some figures are reduced to comply with arXiv's size constraints.
Full size images are available as HAL technical report hal-01107519v5, IEEE
Transactions on Computational Imaging, 201