109 research outputs found
Parallelization of genetic algorithms using Hadoop Map/Reduce
In this paper we present a parallel implementation of a genetic algorithm using the map/reduce programming paradigm. The Hadoop implementation of the map/reduce library is used for this purpose. We compare our implementation with the one presented in [1] on the One Max (bit counting) problem. The comparison criteria are fitness convergence, quality of the final solution, algorithm scalability, and cloud resource utilization. Our parallelization model shows better performance and fitness convergence than the model presented in [1], but it yields a lower-quality solution because of the species problem.
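As a rough illustration of how a genetic algorithm for One Max can be split into map and reduce steps, here is a minimal single-process sketch. It is not the implementation from the paper and does not use Hadoop: the function names, the tournament selection, the uniform crossover, the elitism and all parameter values are assumptions chosen for brevity.

```python
import random

def one_max(bits):
    """One Max fitness: the number of 1 bits in the individual."""
    return sum(bits)

def map_phase(population):
    """Map step: score each individual independently (this part parallelizes)."""
    return [(one_max(ind), ind) for ind in population]

def reduce_phase(scored, rng, mutation_rate=0.01):
    """Reduce step (assumed design): selection, crossover and mutation."""
    def pick():
        # Binary tournament: sample two scored individuals, keep the fitter one.
        a, b = rng.sample(scored, 2)
        return max(a, b, key=lambda pair: pair[0])[1]

    best = max(scored, key=lambda pair: pair[0])[1]
    children = [best[:]]  # elitism: the best individual survives unchanged
    while len(children) < len(scored):
        p1, p2 = pick(), pick()
        # Uniform crossover: each gene comes from one of the two parents.
        child = [rng.choice(genes) for genes in zip(p1, p2)]
        # Bit-flip mutation with a small per-gene probability.
        child = [1 - g if rng.random() < mutation_rate else g for g in child]
        children.append(child)
    return children

def run(generations=50, pop_size=40, length=32, seed=0):
    """Run the GA and return the best fitness seen in each generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scored = map_phase(population)
        history.append(max(fitness for fitness, _ in scored))
        population = reduce_phase(scored, rng)
    return history
```

Because of the elitism step, the best fitness returned by `run()` never decreases from one generation to the next, which makes fitness convergence easy to plot.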
Dynamic scheduling of transfers in MapReduce under bandwidth constraints
Many scientific fields now face a deluge of data. One approach proposed for processing such volumes is the MapReduce programming paradigm introduced by Google. This very simple execution scheme consists of two phases, map and reduce, separated by a phase of massive data exchange between the machines running the application. In this article, we propose a linear system that defines a partitioning of the data to be processed, together with a dynamic transfer-scheduling algorithm, in order to optimize this intermediate phase. We compare this approach with one based on a linear program and static, phase-based scheduling. Our experiments show that our approach produces more compact schedules in a much shorter time.
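To make the map, exchange and reduce phases concrete, the following toy sketch runs a word count in a single process and reports how many intermediate records each reducer would receive during the exchange phase. It does not implement the partitioning or scheduling approach of the article: the CRC32-based partitioner and the names `partition` and `map_reduce` are assumptions made for illustration.

```python
import zlib
from collections import defaultdict

def partition(key, n_reducers):
    """Assign a key to a reducer with a deterministic hash (assumed scheme)."""
    return zlib.crc32(key.encode("utf-8")) % n_reducers

def map_reduce(splits, n_reducers):
    """Word count over input splits; returns (counts, records per reducer)."""
    # Map phase: each split independently emits (word, 1) pairs.
    map_outputs = [[(word, 1) for word in split.split()] for split in splits]

    # Exchange phase: every pair is routed to the reducer that owns its key.
    inboxes = defaultdict(list)
    for pairs in map_outputs:
        for key, value in pairs:
            inboxes[partition(key, n_reducers)].append((key, value))

    # Reduce phase: each reducer sums the values of the keys it received.
    counts = {}
    for pairs in inboxes.values():
        for key, value in pairs:
            counts[key] = counts.get(key, 0) + value

    volumes = {reducer: len(pairs) for reducer, pairs in inboxes.items()}
    return counts, volumes

counts, volumes = map_reduce(["a b a", "c a b"], n_reducers=2)
```

The `volumes` dictionary is the quantity a transfer scheduler would care about: an uneven partition concentrates traffic on a few reducers.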
The Nornir run-time system for parallel programs using Kahn process networks on multi-core machines – A flexible alternative to MapReduce
Even though shared-memory concurrency is a paradigm frequently used for developing parallel applications on small- and medium-sized machines, experience has shown that it is hard to use. This is largely caused by synchronization primitives which are low-level, inherently non-deterministic, and, consequently, non-intuitive to use. In this paper, we present the Nornir run-time system. Nornir is comparable to well-known frameworks such as MapReduce and Dryad that are recognized for their efficiency and simplicity. Unlike these frameworks, Nornir also supports process structures containing branches and cycles. Nornir is based on the formalism of Kahn process networks, which is a shared-nothing, message-passing model of concurrency. We deem this model a simple and deterministic alternative to shared-memory concurrency. Experiments with real and synthetic benchmarks on up to 8 CPUs show that performance in most cases scales almost linearly with the number of CPUs, when not limited by data dependencies. We also show that the modeling flexibility allows Nornir to outperform its MapReduce counterparts using well-known benchmarks.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited
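The Kahn process network model mentioned in the abstract above can be sketched with threads and unbounded FIFO queues. This is only an illustration of the formalism, not Nornir's API: the process functions, the `DONE` sentinel and the network shape (one source feeding two branches that are merged by an adder) are all assumptions.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of a stream (assumed convention)

def source(values, *outs):
    """Write every value to each output channel, then close them."""
    for v in values:
        for out in outs:
            out.put(v)
    for out in outs:
        out.put(DONE)

def transform(fn, inp, out):
    """Apply fn to each token; reads block until a token is available."""
    while True:
        item = inp.get()
        if item is DONE:
            out.put(DONE)
            return
        out.put(fn(item))

def add(left, right, out):
    """Read one token from each input, in a fixed order, and emit the sum."""
    while True:
        a, b = left.get(), right.get()
        if a is DONE or b is DONE:
            out.put(DONE)
            return
        out.put(a + b)

def run_network(values):
    """Compute 2*x + x*x for each x via two parallel branches."""
    a, b, c, d, result = (queue.Queue() for _ in range(5))
    processes = [
        threading.Thread(target=source, args=(values, a, b)),
        threading.Thread(target=transform, args=(lambda x: 2 * x, a, c)),
        threading.Thread(target=transform, args=(lambda x: x * x, b, d)),
        threading.Thread(target=add, args=(c, d, result)),
    ]
    for p in processes:
        p.start()
    collected = []
    while True:
        item = result.get()
        if item is DONE:
            break
        collected.append(item)
    for p in processes:
        p.join()
    return collected
```

Each process only communicates through blocking reads on its own channels, so the result is the same regardless of how the threads are scheduled; `run_network([1, 2, 3])` returns `[3, 8, 15]`.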
Heuristics for periodical batch job scheduling in a MapReduce computing framework
Task scheduling has a significant impact on the performance of the MapReduce computing framework. In this paper, a scheduling problem of periodical batch jobs with makespan minimization is considered. The problem is modeled as a general two-stage hybrid flow shop scheduling problem with schedule-dependent setup times. The new model incorporates the data locality of tasks and is formulated as an integer program. Three heuristics are developed to solve the problem and an improvement policy based on data locality is presented to enhance the methods. A lower bound of the makespan is derived. 150 instances are randomly generated from data distributions drawn from a real cluster. The parameters involved in the methods are set according to different cluster setups. The proposed heuristics are compared over different numbers of jobs and cluster setups. Computational results show that the performance of the methods is highly dependent on both the number of jobs and the cluster setups. The proposed improvement policy is effective and the impact of the input data distribution on the policy is analyzed and tested.
This work is supported by the National Natural Science Foundation of China (No. 61272377) and the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20120092110027). Ruben Ruiz is partially supported by the Spanish Ministry of Economy and Competitiveness, under the project "RESULT - Realistic Extended Scheduling Using Light Techniques" (No. DPI2012-36243-C02-01) partially financed with FEDER funds.
Xiaoping Li; Tianze Jiang; Ruiz García, R. (2016). Heuristics for periodical batch job scheduling in a MapReduce computing framework. Information Sciences. 326:119-133. https://doi.org/10.1016/j.ins.2015.07.040
- …