Scalable Observation, Analysis, and Tuning for Parallel Portability in HPC
It is desirable for general productivity that high-performance computing applications be portable to new architectures, or optimized for new workflows and input types, without costly code interventions or algorithmic rewrites. Parallel portability programming models offer the potential for high performance and productivity; however, they come with a multitude of runtime parameters that can have a significant impact on execution performance. Selecting the optimal set of parameters, so that HPC applications perform well in different system environments and on different input data sets, is not trivial. This dissertation maps out a vision for addressing this parallel portability challenge and then demonstrates this plan through an effective combination of observability, analysis, and in situ machine learning techniques. A platform for general-purpose observation in HPC contexts is investigated, along with support for its use in human-in-the-loop performance understanding and analysis. The dissertation culminates in a demonstration of lessons learned in order to provide automated tuning of HPC applications that utilize parallel portability frameworks.
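As a rough illustration of the parameter-selection problem described above, here is a minimal Python sketch, not the dissertation's in situ machine learning approach: it times a stand-in kernel over a small grid of hypothetical runtime parameters (num_threads, tile_size) and keeps the fastest configuration. All names, the search space, and the kernel itself are illustrative.

```python
# Hypothetical sketch: pick runtime parameters for a "portable" kernel by
# timing a small set of candidate configurations and keeping the fastest one.
import itertools
import time

def run_kernel(num_threads, tile_size, n=1_000_000):
    """Crude stand-in for a parallel-portability kernel; replace with a real launch."""
    chunk = max(1, n // (num_threads * tile_size))  # arbitrary config-dependent cost
    return sum(i % tile_size for i in range(chunk))

def autotune(candidates, repeats=3):
    """Return the configuration with the lowest median wall-clock time."""
    best_cfg, best_time = None, float("inf")
    for num_threads, tile_size in candidates:
        times = []
        for _ in range(repeats):
            t0 = time.perf_counter()
            run_kernel(num_threads, tile_size)
            times.append(time.perf_counter() - t0)
        median = sorted(times)[len(times) // 2]
        if median < best_time:
            best_cfg, best_time = (num_threads, tile_size), median
    return best_cfg, best_time

if __name__ == "__main__":
    space = list(itertools.product([1, 2, 4, 8], [16, 32, 64, 128]))
    cfg, t = autotune(space)
    print(f"best (threads, tile) = {cfg}, median time = {t:.4f}s")
```

A real tuner would replace run_kernel with an actual kernel launch and would have to cope with measurement noise and much larger search spaces, which is where the learning-based approach described in the abstract comes in.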
Performance Observability and Monitoring of High Performance Computing with Microservices
Traditionally, High Performance Computing (HPC) software has been built and deployed as bulk-synchronous, parallel executables based on the message-passing interface (MPI) programming model. The rise of data-oriented computing paradigms and an explosion in the variety of applications that need to be supported on HPC platforms have forced a rethink of the appropriate programming and execution models to integrate this new functionality. In situ workflows mark a paradigm shift in HPC software development methodologies, enabling a range of new applications, from user-level data services to machine learning (ML) workflows that run alongside traditional scientific simulations. By tracing the evolution of HPC software development over the past 30 years, this dissertation identifies the key elements and trends responsible for the emergence of coupled, distributed, in situ workflows.
This dissertation focuses on coupled in situ workflows involving composable, high-performance microservices. After outlining the motivation for enabling performance observability of these services and why existing HPC performance tools and techniques cannot be applied in this context, the dissertation proposes a solution in which a set of techniques gathers, analyzes, and orients performance data from different sources to generate observability. By leveraging microservice components originally designed to build high-performance data services, the dissertation demonstrates their broader applicability for building and deploying performance monitoring and visualization as services within an in situ workflow.
The results from this dissertation suggest that: (1) integration of performance data from different sources is vital to understanding the performance of service components, (2) in situ (online) analysis of this performance data is needed to enable the adaptivity of distributed components and to manage monitoring data volume, (3) statistical modeling combined with performance observations can help generate better service configurations, and (4) services are a promising architectural choice for deploying in situ performance monitoring and visualization functionality. This dissertation includes previously published and co-authored material as well as unpublished co-authored material.
Navigating Diverse Datasets in the Face of Uncertainty
When exploring big volumes of data, one of the challenging aspects is their diversity of origin. Multiple files that have not yet been ingested into a database system may contain information of interest to a researcher, who must curate, understand, and sieve their content before being able to extract knowledge.
Performance is one of the greatest difficulties in exploring these datasets. On the one hand, examining non-indexed, unprocessed files can be inefficient. On the other hand, any processing done before the data are understood introduces latency and potentially unnecessary work if the chosen schema matches the data poorly. We have surveyed the state of the art and, fortunately, there exist multiple proposed solutions for handling data in situ with good performance.
Another major difficulty is matching files from multiple origins, since their schema and layout may not be compatible or properly documented. Most surveyed solutions overlook this problem, especially for numeric, uncertain data, as is typical in fields like astronomy.
The main objective of our research is to assist data scientists during the exploration of unprocessed, numerical, raw data distributed across multiple files, based solely on its intrinsic distribution.
In this thesis, we first introduce the concept of Equally-Distributed Dependencies (EDDs), which provides the foundation for matching this kind of dataset. We propose PresQ, a novel algorithm that finds quasi-cliques in hypergraphs based on their expected statistical properties. The probabilistic approach of PresQ can be successfully exploited to mine EDDs between diverse datasets when the underlying populations can be assumed to be the same.
Finally, we propose a two-sample statistical test based on Self-Organizing Maps (SOM). This method can outperform, in terms of power, other classifier-based two-sample tests, and is in some cases comparable to kernel-based methods, with the advantage of being interpretable.
Both PresQ and the SOM-based statistical test can provide insights that drive serendipitous discoveries.
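To make the idea of an SOM-based two-sample test concrete, here is a hedged Python sketch, not necessarily the thesis's exact construction: train a SOM on the pooled samples (using the third-party minisom package), histogram each sample over the SOM cells, and compare the two hit histograms with a chi-squared test. A more careful version would calibrate the null distribution, for example via permutations.

```python
# Sketch of a SOM-based two-sample test: do x and y come from the same distribution?
import numpy as np
from minisom import MiniSom
from scipy.stats import chi2_contingency

def som_two_sample_test(x, y, som_shape=(6, 6), iters=2000, seed=0):
    pooled = np.vstack([x, y])
    som = MiniSom(som_shape[0], som_shape[1], pooled.shape[1],
                  sigma=1.0, learning_rate=0.5, random_seed=seed)
    som.train_random(pooled, iters)

    def hits(sample):
        # Count how many points of the sample fall into each SOM cell.
        counts = np.zeros(som_shape, dtype=int)
        for row in sample:
            counts[som.winner(row)] += 1
        return counts.ravel()

    hx, hy = hits(x), hits(y)
    keep = (hx + hy) > 0                       # drop cells with no hits at all
    table = np.vstack([hx[keep], hy[keep]])
    chi2, p_value, dof, _ = chi2_contingency(table)
    return chi2, p_value

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(som_two_sample_test(rng.normal(size=(500, 3)), rng.normal(size=(500, 3))))
    print(som_two_sample_test(rng.normal(size=(500, 3)),
                              rng.normal(0.5, 1.0, size=(500, 3))))
```

The trained map also lends itself to the interpretability claimed above: the per-cell contributions to the chi-squared statistic indicate where in feature space the two samples diverge.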
In Situ Visualization of Performance Data in Parallel CFD Applications
This thesis summarizes the work of the author on visualization of performance data in parallel Computational Fluid Dynamics (CFD) simulations.
Current performance analysis tools are unable to show their data on top of complex simulation geometries (e.g. an aircraft engine). In CFD simulations, however, performance is expected to be affected by the computations being carried out, which in turn are tightly related to the underlying computational grid.
It is therefore imperative that performance data be visualized on top of the same computational geometry from which it originates. However, performance tools have no native knowledge of the simulation's underlying mesh. This gap can be filled by merging the branches of HPC performance analysis and in situ visualization of CFD simulation data, which is done by integrating existing, well-established, state-of-the-art tools from each field.
To this end, an extension for the open-source performance tool Score-P was designed and developed. It intercepts an arbitrary number of manually selected code regions (mostly functions) and sends their respective measurements (number of executions and cumulative time spent) to the visualization software ParaView, through its in situ library Catalyst, as if they were any other flow-related variable. The tool was subsequently extended with the capacity to also show communication data (messages sent between MPI ranks) on top of the CFD mesh. Testing and evaluation are done with two industry-grade codes: Rolls-Royce's CFD code, Hydra, and Onera, DLR, and Airbus' CFD code, CODA.
It was also noticed that current performance tools have limited capacity for displaying their data on top of three-dimensional, framed (i.e. time-stepped) representations of the cluster's topology. In parallel, so that the approach is not limited to codes that already have an in situ adapter, it was extended to take the performance data and display it, also for codes without in situ support, on a three-dimensional, framed representation of the hardware resources used by the simulation. Testing is done with the Multi-Grid and Block Tri-diagonal NAS Parallel Benchmarks (NPB), as well as with Hydra and CODA again. The benchmarks are used to explain how the new visualizations work, while real performance analyses are done with the industry-grade CFD codes.
The proposed solution is able to provide concrete performance insights that would not have been reached with current performance tools and that motivated beneficial changes to the respective source codes in practice. Finally, its overhead is discussed and shown to be suitable for use with CFD codes. The dissertation provides a valuable addition to the state of the art of highly parallel CFD performance analysis and serves as a basis for the suggested further research directions.
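The kind of measurement described above can be pictured with a small, framework-free Python sketch (the actual work extends Score-P and ships the data to ParaView through Catalyst): per-rank counters of the number of executions and the cumulative time of selected regions, which a simulation could then replicate over the mesh cells it owns, like any other flow variable. The RegionTimer class and region names here are illustrative.

```python
# Minimal per-rank region instrumentation: call counts and cumulative time.
import time
from collections import defaultdict
from contextlib import contextmanager

class RegionTimer:
    def __init__(self):
        self.calls = defaultdict(int)      # region name -> number of executions
        self.seconds = defaultdict(float)  # region name -> cumulative time spent

    @contextmanager
    def region(self, name):
        t0 = time.perf_counter()
        try:
            yield
        finally:
            self.calls[name] += 1
            self.seconds[name] += time.perf_counter() - t0

    def as_cell_value(self, name):
        """Scalar to replicate over this rank's cells, like any flow-related variable."""
        return self.seconds[name]

if __name__ == "__main__":
    timer = RegionTimer()
    for _ in range(10):                        # stand-in for time steps
        with timer.region("compute_flux"):
            sum(i * i for i in range(50_000))  # stand-in for the real computation
    print(dict(timer.calls), {k: round(v, 4) for k, v in timer.seconds.items()})
```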
Land Use Identification of the Metropolitan Area of Guadalajara Using Bicycle Data: An Unsupervised Classification Approach
The following work proposes different ways to address a current problem: conducting research on land use, mapping, and human behavior by evaluating movement through geo-referenced data sources, with the additional goal of classifying different sections and the relationships between them. The data source used was MiBici, a bicycle-sharing platform in the city of Guadalajara, Jalisco, which publishes every month a consolidated file of the trips made during that month; access to this information is completely free. The methodologies used were agile for project planning and KNN, Decision Trees, and KMeans for clustering the zones; the programming language used was Python. In addition, an implementation proposal using the Amazon Web Services platform is included, with the aim of offering a solution that is "simpler" to implement while delivering the same value as one built only with free resources. The process was divided primarily into three parts. The first consisted of cleaning and understanding the data, applying the machine learning algorithms Decision Tree and KNN. In the second stage, based on an evaluation of the previous results, the data were modified by adding new fields to improve the outcome, and KMeans was applied to create groups. As a final step, a pipeline was built that started with cleaning the raw data using AWS tools and ended with the interpretation of the final results. The results obtained were very encouraging, since the groups found were clearly delineated and, when cross-checked against the zones related to the nodes, a strong relationship emerged. Without a doubt, there is still much work to be done in this line of research.
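As a hedged illustration of the clustering stage described above (the thesis's actual features, preprocessing, and data differ), the following Python sketch aggregates MiBici-style trip records into per-station usage features and groups the stations with KMeans; the synthetic data and column names are placeholders.

```python
# Cluster stations by usage profile from (synthetic) bike-share trip records.
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for one month of trips (real MiBici exports have other columns).
rng = np.random.default_rng(0)
n_trips = 5000
trips = pd.DataFrame({
    "origin_station": rng.integers(1, 41, n_trips),
    "destination_station": rng.integers(1, 41, n_trips),
    "hour": rng.integers(0, 24, n_trips),
})

# Per-station usage profile: trips started, trips ended, share of morning starts.
starts = trips.groupby("origin_station").size().rename("n_starts")
ends = trips.groupby("destination_station").size().rename("n_ends")
morning = (trips[trips["hour"].between(6, 10)]
           .groupby("origin_station").size().rename("n_morning_starts"))

features = pd.concat([starts, ends, morning], axis=1).fillna(0)
features["morning_share"] = features["n_morning_starts"] / features["n_starts"].clip(lower=1)

# Standardize the features and split the stations into a handful of usage groups.
X = StandardScaler().fit_transform(features[["n_starts", "n_ends", "morning_share"]])
features["cluster"] = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
print(features.groupby("cluster")[["n_starts", "n_ends", "morning_share"]].mean())
```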
Scalability in the Presence of Variability
Supercomputers are used to solve some of the world's most computationally demanding problems. Exascale systems, to be comprised of over one million cores and capable of 10^18 floating point operations per second, will probably exist by the early 2020s, and will provide unprecedented computational power for parallel computing workloads. Unfortunately, while these machines hold tremendous promise and opportunity for applications in High Performance Computing (HPC), graph processing, and machine learning, it will be a major challenge to fully realize their potential, because to do so requires balanced execution across the entire system and its millions of processing elements. When different processors take different amounts of time to perform the same amount of work, performance imbalance arises, large portions of the system sit idle, and time and energy are wasted. Larger systems incorporate more processors and thus greater opportunity for imbalance to arise, as well as larger performance/energy penalties when it does. This phenomenon is referred to as performance variability and is the focus of this dissertation.
In this dissertation, we explain how to design system software to mitigate variability on large-scale parallel machines. Our approaches span (1) the design, implementation, and evaluation of a new high performance operating system to reduce some classes of performance variability, (2) a new performance evaluation framework to holistically characterize key features of variability on new and emerging architectures, and (3) a distributed modeling framework that derives predictions of how and where imbalance is manifesting in order to drive reactive operations such as load balancing and speed scaling. Collectively, these efforts provide a holistic set of tools to promote scalability through the mitigation of variability.
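To make the notion of imbalance concrete, here is a minimal, generic Python sketch (not the dissertation's framework) that quantifies load imbalance from the per-rank timings of one bulk-synchronous step; with a barrier at the end of the step, every rank effectively waits for the slowest one.

```python
# Quantify load imbalance from per-rank timings of one bulk-synchronous step.
def imbalance_stats(step_times):
    """step_times: seconds spent in the step by each rank."""
    n = len(step_times)
    t_max = max(step_times)
    t_mean = sum(step_times) / n
    return {
        "imbalance_factor": t_max / t_mean,          # 1.0 means perfectly balanced
        "wasted_core_seconds": n * t_max - sum(step_times),
        "idle_fraction": 1.0 - t_mean / t_max,       # average fraction of time spent waiting
    }

if __name__ == "__main__":
    # Example: 8 ranks, one straggler.
    print(imbalance_stats([1.00, 1.02, 0.98, 1.01, 1.00, 0.99, 1.03, 1.60]))
```

Metrics like these, tracked over time and across nodes, are the kind of signal a reactive system could use to decide when load balancing or speed scaling is worthwhile.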
High-Fidelity Provenance: Exploring the Intersection of Provenance and Security
In the past 25 years, the World Wide Web has disrupted the way news is disseminated and consumed. However, the euphoria for the democratization of news publishing was soon followed by scepticism, as a new phenomenon emerged: fake news. With no gatekeepers to vouch for it, the veracity of the information served over the World Wide Web became a major public concern. The Reuters Digital News Report 2020 cites that in at least half of the EU member countries, 50% or more of the population is concerned about online fake news. To help address the problem of trust in information communicated over the World Wide Web, it has been proposed to also make available the provenance metadata of the information. Similar to artwork provenance, this would include a detailed record of how the information was created, updated, and propagated to produce the result we read, as well as which agents, human or software, were involved in the process. However, keeping track of provenance information is a non-trivial task. Current approaches are often of limited scope and may require modifying existing applications to also generate provenance information along with their regular output. This thesis explores how provenance can be automatically tracked in an application-agnostic manner, without having to modify the individual applications. We frame provenance capture as a data flow analysis problem and explore the use of dynamic taint analysis in this context. Our work shows that this approach improves on the quality of provenance captured compared to traditional approaches, yielding what we term high-fidelity provenance. We explore the performance cost of this approach and use deterministic record and replay to bring it down to a more practical level. Furthermore, we create and present the tooling necessary for expanding the use of deterministic record and replay for provenance analysis. The thesis concludes with an application of high-fidelity provenance as a tool for state-of-the-art offensive security analysis, based on the intuition that software too can be misguided by "fake news". This demonstrates that the potential uses of high-fidelity provenance for security extend beyond traditional forensic analysis.
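To illustrate how provenance capture can be framed as a data flow analysis problem, here is a toy, language-level Python sketch (the thesis tracks unmodified applications at a much lower level): values carry origin labels that propagate through operations, so each result ends up annotated with the set of inputs that influenced it, in the spirit of dynamic taint analysis.

```python
# Toy taint tracking: values carry provenance labels that propagate through arithmetic.
class Tainted:
    def __init__(self, value, labels):
        self.value = value
        self.labels = frozenset(labels)   # which inputs influenced this value

    def __add__(self, other):
        v, l = _unwrap(other)
        return Tainted(self.value + v, self.labels | l)

    def __mul__(self, other):
        v, l = _unwrap(other)
        return Tainted(self.value * v, self.labels | l)

    def __repr__(self):
        return f"{self.value!r} <- {sorted(self.labels)}"

def _unwrap(x):
    """Return (value, labels) whether or not x is tainted."""
    return (x.value, x.labels) if isinstance(x, Tainted) else (x, frozenset())

if __name__ == "__main__":
    a = Tainted(3, {"input:sensors.csv"})
    b = Tainted(4, {"input:config.json"})
    c = a * b + 10    # result influenced by both inputs
    print(c)          # 22 <- ['input:config.json', 'input:sensors.csv']
```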
- …