Enhancing in-memory Efficiency for MapReduce-based Data Processing
This is a post-peer-review, pre-copyedit version of an article published in Journal of Parallel and Distributed Computing. The final authenticated version is available online at: https://doi.org/10.1016/j.jpdc.2018.04.001 [Abstract] As the memory capacity of computational systems increases, the in-memory data management of Big Data processing frameworks becomes more crucial for performance. This paper analyzes and improves the memory efficiency of Flame-MR, a framework that accelerates Hadoop applications, providing valuable insight into the impact of memory management on performance. By optimizing memory allocation, the garbage collection overheads and execution times have been reduced by up to 85% and 44%, respectively, on a multi-core cluster. Moreover, different data buffer implementations are evaluated, showing that off-heap buffers achieve better results overall. Memory resources are also leveraged by caching intermediate results, improving iterative applications by up to 26%. The memory-enhanced version of Flame-MR has been compared with Hadoop and Spark on the Amazon EC2 cloud platform. The experimental results have shown significant performance benefits, reducing Hadoop execution times by up to 65% while providing very competitive results compared to Spark. Ministerio de Economía, Industria y Competitividad; TIN2016-75845-P, AEI/FEDER/EU. Ministerio de Educación; FPU14/0280
Improving MPI Application Communication Time with an Introspection Monitoring Library
As the IPDPS in-person meeting was cancelled, PDSEC will be held online. In this paper we describe how to improve the communication time of MPI parallel applications using a library that monitors MPI applications and allows for introspection (the program itself can query the state of the monitoring system). Based on previous work, this library is able to see how collective communications are decomposed into point-to-point messages. It also features monitoring sessions that allow suspending and restarting the monitoring, limiting it to specific portions of the code. Experiments show that the monitoring overhead is very small and that the proposed features allow for dynamic and efficient rank reordering, halving the communication time of some parts of the program.
Improving MPI Application Communication Time with an Introspection Monitoring Library
In this report we describe how to improve the communication time of MPI parallel applications using a library that monitors MPI applications and allows for introspection (the program itself can query the state of the monitoring system). Based on previous work, this library is able to see how collective communications are decomposed into point-to-point messages. It also features monitoring sessions that allow suspending and restarting the monitoring, limiting it to specific portions of the code. Experiments show that the monitoring overhead is very small and that the proposed features allow for dynamic and efficient rank reordering, halving the communication time of some parts of the program.
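The abstract does not say how the rank reordering is computed, so the following is only a toy sketch of the general idea: given a matrix of bytes exchanged between ranks (the kind of data a monitoring layer could expose), place the most heavily communicating ranks on the same node. The function names, the two-ranks-per-node assumption and the greedy pairing heuristic are all hypothetical and are not taken from the library described above.

```python
def greedy_pairing(traffic):
    """Group ranks two per node, pairing the heaviest communicators first.

    traffic[i][j] is the number of bytes rank i sent to rank j
    (hypothetical input format, assumed for this sketch).
    """
    n = len(traffic)
    # All unordered rank pairs, sorted by total exchanged volume (heaviest first).
    pairs = sorted(
        ((traffic[i][j] + traffic[j][i], i, j)
         for i in range(n) for j in range(i + 1, n)),
        reverse=True,
    )
    placed = set()
    nodes = []
    for _volume, i, j in pairs:
        if i not in placed and j not in placed:
            nodes.append((i, j))
            placed.update((i, j))
    # With an odd number of ranks, one rank is left without a partner.
    leftover = tuple(r for r in range(n) if r not in placed)
    if leftover:
        nodes.append(leftover)
    return nodes


def inter_node_volume(traffic, nodes):
    """Total bytes that cross node boundaries under a given placement."""
    node_of = {rank: k for k, group in enumerate(nodes) for rank in group}
    n = len(traffic)
    return sum(
        traffic[i][j]
        for i in range(n) for j in range(n)
        if node_of[i] != node_of[j]
    )


if __name__ == "__main__":
    # Ranks 0 and 2 talk a lot, as do ranks 1 and 3.
    traffic = [
        [0, 1, 9, 1],
        [1, 0, 1, 8],
        [9, 1, 0, 1],
        [1, 8, 1, 0],
    ]
    default = [(0, 1), (2, 3)]
    reordered = greedy_pairing(traffic)
    print(reordered, inter_node_volume(traffic, default),
          inter_node_volume(traffic, reordered))
```

A real reordering would also have to account for the network topology and for more than two ranks per node; the sketch only illustrates why a measured communication matrix is useful input for placement.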
Dynamic Monitoring of MPI Communications at Runtime: Scientific User Guide and Technical Documentation
Understanding application communication patterns has become increasingly relevant as the complexity and diversity of the underlying hardware, along with elaborate network topologies, make the implementation of portable and efficient algorithms more challenging. Equipped with knowledge of the communication patterns, external tools can predict and improve the performance of applications, either by modifying the process placement or by changing the communication infrastructure parameters to refine the match between the application requirements and the message passing library capabilities. This report presents the design and evaluation of a communication monitoring infrastructure developed in the Open MPI software stack and able to expose a dynamically configurable level of detail about the application communication patterns. It is accompanied by user documentation and a technical report on the implementation details.
Evaluation and optimization of Big Data Processing on High Performance Computing Systems
Official Doctoral Programme in Information Technology Research (Programa Oficial de Doutoramento en Investigación en Tecnoloxías da Información). 524V01
[Abstract]
Nowadays, Big Data technologies are used by many organizations to extract
valuable information from large-scale datasets. As the size of these datasets increases,
meeting the huge performance requirements of data processing applications
becomes more challenging. This Thesis focuses on evaluating and optimizing these
applications by proposing two new tools, namely BDEv and Flame-MR. On the one
hand, BDEv allows users to thoroughly assess the behavior of widespread Big Data processing
frameworks such as Hadoop, Spark and Flink. It manages the configuration
and deployment of the frameworks, generating the input datasets and launching the
workloads specified by the user. During each workload, it automatically extracts
several evaluation metrics that include performance, resource utilization, energy efficiency
and microarchitectural behavior. On the other hand, Flame-MR optimizes
the performance of existing Hadoop MapReduce applications. Its overall design is
based on an event-driven architecture that improves the efficiency of the system
resources by pipelining data movements and computation. Moreover, it avoids redundant
memory copies present in Hadoop, while also using efficient sort and merge
algorithms for data processing. Flame-MR replaces the underlying MapReduce data
processing engine in a transparent way, and thus the source code of existing applications
does not need to be modified. The performance benefits provided by Flame-MR
have been thoroughly evaluated on cluster and cloud systems by using both
standard benchmarks and real-world applications, showing reductions in execution
time that range from 40% to 90%. This Thesis provides Big Data users with powerful
tools to analyze and understand the behavior of data processing frameworks and
reduce the execution time of the applications without requiring expert knowledge.
Teaching and Research Project, Original Research Work and Defense Presentation, prepared by Germán Moltó to compete for the position of Full Professor (Catedrático de Universidad), competition 082/22, position 6708, area of Computer Science and Artificial Intelligence
This document contains the teaching and research project of the candidate Germán Moltó Martínez, submitted as a requirement for the competition for access to positions in the University Teaching Corps (Cuerpos Docentes Universitarios). Specifically, the document focuses on the competition for position 6708 of Full Professor (Catedrático de Universidad) in the area of Computer Science in the Departamento de Sistemas Informáticos y Computación of the Universitat Politécnica de València. The position is assigned to the Escola Técnica Superior d'Enginyeria Informàtica and its profile comprises the courses "Infraestructuras de Cloud Público" (Public Cloud Infrastructures) and "Estructuras de Datos y Algoritmos" (Data Structures and Algorithms). It also includes the candidate's academic, teaching and research record, as well as the presentation used during the defense. Germán Moltó Martínez (2022). Proyecto Docente e Investigador, Trabajo Original de Investigación y Presentación de la Defensa, preparado por Germán Moltó para concursar a la plaza de Catedrático de Universidad, concurso 082/22, plaza 6708, área de Ciencia de la Computación e Inteligencia Artificial. http://hdl.handle.net/10251/18903
Asynchronous epidemic algorithms for consistency in large-scale systems
Achieving and detecting a globally consistent state is essential to many services in large-scale
and extreme-scale distributed systems, especially when the desired consistent state is critical
for service operation. Centralised and deterministic approaches to synchronisation and
distributed consistency are neither scalable nor fault-tolerant. Alternatively, epidemic-based
paradigms are decentralised computations based on randomised communications. They are
scalable, resilient, fault-tolerant, and converge to the desired target in logarithmic time with
respect to system size. Thus, many distributed services have adopted epidemic protocols
to achieve consensus and a consistent state, mainly due to scalability concerns. The
convergence of epidemic protocols is stochastically guaranteed. However, the detection of
convergence is probabilistic and non-explicit. In a real-world environment, systems are
unreliable, and epidemic protocols may fail to converge to the desired state. Thus, achieving
convergence by itself does not ensure a system-wide consistent state under dynamic
conditions.
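As a rough illustration of the epidemic data aggregation mentioned above, the following sketch simulates classic push-pull gossip averaging on a reliable, fully connected system. It is not the PTA, PTP, ECP or REAP protocol from the thesis; the function names, the round structure and the use of the max-min spread as a convergence indicator are assumptions made for this example only.

```python
import random


def push_pull_average(values, rounds, seed=0):
    """Simulate push-pull gossip averaging.

    In each round, every node contacts one uniformly random peer and both
    replace their local value with the mean of the two. The global sum is
    preserved, so all values drift toward the global average.
    """
    if len(values) < 2:
        raise ValueError("need at least two nodes")
    rng = random.Random(seed)
    state = list(values)
    n = len(state)
    for _ in range(rounds):
        for i in range(n):
            # Pick a random peer j different from i.
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1
            mean = (state[i] + state[j]) / 2.0
            state[i] = state[j] = mean
    return state


def spread(state):
    """Gap between the largest and smallest local estimate."""
    return max(state) - min(state)


if __name__ == "__main__":
    initial = [float(i) for i in range(64)]
    for rounds in (0, 5, 10, 20):
        final = push_pull_average(initial, rounds, seed=1)
        print(rounds, spread(final))
```

In this simulation a global observer can compute the spread directly. No single node in a real deployment has that view, which is why the abstract stresses explicit, decentralised detection of convergence.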
The research work presented in this thesis introduces the Phase Transition Algorithm
(PTA) to achieve a distributed consistent state based on the explicit detection of convergence.
Each phase in PTA is a decentralised decision-making process that implements epidemic data
aggregation, in which the detection of convergence implies achieving a global agreement. The
phases in PTA can be cascaded to achieve higher certainty as desired. Following the PTA,
two epidemic protocols, namely PTP and ECP, are proposed to achieve consensus, i.e. for
consistency in data dissemination and data aggregation. The protocols are examined
through simulations, and experimental results have validated the protocols' ability to achieve
and explicitly detect consensus among system nodes.
The research work has also studied epidemic data aggregation under node churn and
network failures, and the analysis has identified three phases of the aggregation process.
The investigations have shown that node churn affects each phase differently. The phase
that is critical for the aggregation process has been studied further, which led to the proposal of
new robust data aggregation protocols, REAP and REAP+. Each protocol has a different
decentralised replication method, and both implement distributed failure detection and
instantaneous mass restoration mechanisms. Simulations have validated the protocols, and
results have shown the protocols' ability to converge, detect convergence, and produce competitive
accuracy under various levels of node churn.
Furthermore, distributed consistency in continuous systems is addressed in the research.
The work has proposed a novel continuous epidemic protocol with an adaptive restart
mechanism. The protocol restarts either upon the detection of system convergence or upon
the detection of divergence. The protocol also introduces a seed selection method for
the peak data distribution in decentralised approaches, which was previously a challenge requiring
single-point initialisation and a leader-election step. The simulations validated the performance
of the algorithm under static and dynamic conditions and showed that convergence and
divergence detection accuracy can be tuned as desired.
Finally, the research work shows that combining and integrating the proposed protocols
enables extreme-scale distributed systems to achieve and detect globally consistent states even
under realistic and dynamic conditions.