1,432 research outputs found

    Multi-Objective Big Data Optimization with jMetal and Spark

    Get PDF
    Big Data Optimization is the term used to refer to optimization problems which have to manage very large amounts of data. In this paper, we focus on the parallelization of metaheuristics with the Apache Spark cluster computing system for solving multi-objective Big Data Optimization problems. Our purpose is to study the influence of accessing data stored in the Hadoop File System (HDFS) in each evaluation step of a metaheuristic and to provide a software tool to solve these kinds of problems. This tool combines the jMetal multi-objective optimization framework with Apache Spark. We have carried out experiments to measure the performance of the proposed parallel infrastructure in an environment based on virtual machines in a local cluster comprising up to 100 cores. We obtained interesting results for computational e ort and propose guidelines to face multi-objective Big Data Optimization problems.Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech

    Hybrid ant colony system and genetic algorithm approach for scheduling of jobs in computational grid

    Get PDF
    Metaheuristic algorithms have been used to solve scheduling problems in grid computing.However, stand-alone metaheuristic algorithms do not always show good performance in every problem instance. This study proposes a high level hybrid approach between ant colony system and genetic algorithm for job scheduling in grid computing.The proposed approach is based on a high level hybridization.The proposed hybrid approach is evaluated using the static benchmark problems known as ETC matrix.Experimental results show that the proposed hybridization between the two algorithms outperforms the stand-alone algorithms in terms of best and average makespan values

    Energy-aware scheduling in heterogeneous computing systems

    Get PDF
    In the last decade, the grid computing systems emerged as useful provider of the computing power required for solving complex problems. The classic formulation of the scheduling problem in heterogeneous computing systems is NP-hard, thus approximation techniques are required for solving real-world scenarios of this problem. This thesis tackles the problem of scheduling tasks in a heterogeneous computing environment in reduced execution times, considering the schedule length and the total energy consumption as the optimization objectives. An efficient multithreading local search algorithm for solving the multi-objective scheduling problem in heterogeneous computing systems, named MEMLS, is presented. The proposed method follows a fully multi-objective approach, applying a Pareto-based dominance search that is executed in parallel by using several threads. The experimental analysis demonstrates that the new multithreading algorithm outperforms a set of fast and accurate two-phase deterministic heuristics based on the traditional MinMin. The new ME-MLS method is able to achieve significant improvements in both makespan and energy consumption objectives in reduced execution times for a large set of testbed instances, while exhibiting very good scalability. The ME-MLS was evaluated solving instances comprised of up to 2048 tasks and 64 machines. In order to scale the dimension of the problem instances even further and tackle large-sized problem instances, the Graphical Processing Unit (GPU) architecture is considered. This line of future work has been initially tackled with the gPALS: a hybrid CPU/GPU local search algorithm for efficiently tackling a single-objective heterogeneous computing scheduling problem. The gPALS shows very promising results, being able to tackle instances of up to 32768 tasks and 1024 machines in reasonable execution times.En la última década, los sistemas de computación grid se han convertido en útiles proveedores de la capacidad de cálculo necesaria para la resolución de problemas complejos. En su formulación clásica, el problema de la planificación de tareas en sistemas heterogéneos es un problema NP difícil, por lo que se requieren técnicas de resolución aproximadas para atacar instancias de tamaño realista de este problema. Esta tesis aborda el problema de la planificación de tareas en sistemas heterogéneos, considerando el largo de la planificación y el consumo energético como objetivos a optimizar. Para la resolución de este problema se propone un algoritmo de búsqueda local eficiente y multihilo. El método propuesto se trata de un enfoque plenamente multiobjetivo que consiste en la aplicación de una búsqueda basada en dominancia de Pareto que se ejecuta en paralelo mediante el uso de varios hilos de ejecución. El análisis experimental demuestra que el algoritmo multithilado propuesto supera a un conjunto de heurísticas deterministas rápidas y e caces basadas en el algoritmo MinMin tradicional. El nuevo método, ME-MLS, es capaz de lograr mejoras significativas tanto en el largo de la planificación y como en consumo energético, en tiempos de ejecución reducidos para un gran número de casos de prueba, mientras que exhibe una escalabilidad muy promisoria. El ME-MLS fue evaluado abordando instancias de hasta 2048 tareas y 64 máquinas. Con el n de aumentar la dimensión de las instancias abordadas y hacer frente a instancias de gran tamaño, se consideró la utilización de la arquitectura provista por las unidades de procesamiento gráfico (GPU). Esta línea de trabajo futuro ha sido abordada inicialmente con el algoritmo gPALS: un algoritmo híbrido CPU/GPU de búsqueda local para la planificación de tareas en en sistemas heterogéneos considerando el largo de la planificación como único objetivo. La evaluación del algoritmo gPALS ha mostrado resultados muy prometedores, siendo capaz de abordar instancias de hasta 32768 tareas y 1024 máquinas en tiempos de ejecución razonables

    Parallel Algorithm for Reduction of Data Processing Time in Big Data

    Get PDF
    Technological advances have allowed to collect and store large volumes of data over the years. Besides, it is significant that today's applications have high performance and can analyze these large datasets effectively. Today, it remains a challenge for data mining to make its algorithms and applications equally efficient in the need of increasing data size and dimensionality [1]. To achieve this goal, many applications rely on parallelism, because it is an area that allows the reduction of cost depending on the execution time of the algorithms because it takes advantage of the characteristics of current computer architectures to run several processes concurrently [2]. This paper proposes a parallel version of the FuzzyPred algorithm based on the amount of data that can be processed within each of the processing threads, synchronously and independently

    A Unified Model for Evolutionary Multiobjective Optimization and its Implementation in a General Purpose Software Framework: ParadisEO-MOEO

    Get PDF
    This paper gives a concise overview of evolutionary algorithms for multiobjective optimization. A substantial number of evolutionary computation methods for multiobjective problem solving has been proposed so far, and an attempt of unifying existing approaches is here presented. Based on a fine-grained decomposition and following the main issues of fitness assignment, diversity preservation and elitism, a conceptual global model is proposed and is validated by regarding a number of state-of-the-art algorithms as simple variants of the same structure. The presented model is then incorporated into a general-purpose software framework dedicated to the design and the implementation of evolutionary multiobjective optimization techniques: ParadisEO-MOEO. This package has proven its validity and flexibility by enabling the resolution of many real-world and hard multiobjective optimization problems
    • …
    corecore