Polyhedral Model in ROSE
The Polyhedral Model has been an academic topic since the early eighties. It has primarily been used for systolic architecture generation and loop transformations. Over the last ten years, interest in this model has increased for several reasons: computing power has grown (the Polyhedral Model is compute intensive), classical heuristic loop-transformation methods have reached their limits, architectural behaviour has become unpredictable because of hardware complexity, and new hardware (such as graphics accelerators) is opening new horizons. This report presents the implementation of Farkas' Algorithm in ROSE, a research compiler. A new way to apply this algorithm has been developed; this variation of the original algorithm eliminates all of the hidden variables that Farkas' Algorithm introduces. The algorithm can also be applied in parallel to exploit multicore processors, alleviating the cost of the modification.
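As background, the report's starting point is the affine form of Farkas' lemma, which is standard in polyhedral scheduling; the statement below uses our own notation and sketches only the idea, not the report's formulation:

    % Affine form of Farkas' lemma (standard statement; notation ours):
    % an affine function \phi is nonnegative over the polyhedron
    % P = { x | A x + b >= 0 } if and only if there exist multipliers
    % \lambda_0 >= 0 and \lambda >= 0 such that
    \[
      \phi(x) \;\equiv\; \lambda_0 + \lambda^{\top} \, (A x + b).
    \]

The multipliers \lambda_0 and \lambda are the hidden variables that the report's variation of the algorithm eliminates before the scheduling constraints are solved.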
Structured Thoughts Automaton: First Formalized Execution Model for Auto-Regressive Language Models
In recent months, Language Models (LMs) have become a part of daily
discourse, with focus on OpenAI and the potential of Artificial General
Intelligence (AGI). Furthermore, the leak of LLaMA's weights to the public
has led to an influx of innovations demonstrating the impressive capabilities
of generative LMs. While we believe that AGI is still a distant goal, we
recognize the potential of LMs in solving tasks such as searching complex
documents, compiling reports with basic analysis, and providing assistance in
problem-solving. In this paper, we propose formalizing the execution model of
language models. We investigate current execution models, find that this
formalism has received little attention, and present our contribution: the
first formalized execution model for LMs. We introduce a new algorithm for
sampling the predictions of LMs, which we use to build a reliable and
inspectable execution model. We introduce a low-level language to write
"cognitive program" for this execution model. We hope to shed light on the need
for execution models for LMs and encourage further research in this area.Comment: Submitted to CGO-2
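To make the idea of a reliable, inspectable sampling step concrete, here is a minimal Python sketch of a constrained, trace-recording decoding loop. It is illustrative only: the interface (next_token_scores, Step, the allowed/stop sets) is hypothetical and is not the paper's actual algorithm or language.

    from dataclasses import dataclass

    @dataclass
    class Step:
        token: str
        alternatives: list   # tokens that were admissible at this step

    def constrained_sample(next_token_scores, allowed, stop, max_steps=64):
        """Greedy decoding restricted to an allowed vocabulary subset.

        next_token_scores: callable mapping a token prefix (list of str)
        to a dict {token: score}; it stands in for a real LM. Each step
        records the admissible alternatives so the full trace can be
        inspected afterwards.
        """
        prefix, trace = [], []
        for _ in range(max_steps):
            scores = next_token_scores(prefix)
            admissible = {t: s for t, s in scores.items() if t in allowed}
            if not admissible:
                break
            token = max(admissible, key=admissible.get)
            trace.append(Step(token, sorted(admissible)))
            prefix.append(token)
            if token in stop:
                break
        return prefix, trace

Restricting each step to an explicitly allowed set and recording the trace is one simple way to make a generative LM's behaviour reproducible and auditable.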
Making Machine Learning Datasets and Models FAIR for HPC: A Methodology and Case Study
The FAIR Guiding Principles aim to improve the findability, accessibility,
interoperability, and reusability of digital content by making it both human-
and machine-actionable. However, these principles have not yet been broadly
adopted in the domain of machine learning-based program analyses and
optimizations for High-Performance Computing (HPC). In this paper, we design a
methodology to make HPC datasets and machine learning models FAIR after
investigating existing FAIRness assessment and improvement techniques. Our
methodology includes a comprehensive, quantitative assessment of selected data,
followed by concrete, actionable suggestions to improve FAIRness with respect
to common issues related to persistent identifiers, rich metadata descriptions,
and license and provenance information. Moreover, we select a representative
training dataset to evaluate our methodology. The experiment shows the
methodology can effectively improve the dataset's and model's FAIRness from an
initial score of 19.1% to a final score of 83.0%.
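As a rough illustration of what a checklist-style, quantitative FAIRness assessment can look like, here is a small Python sketch. The individual checks and weights are hypothetical placeholders, not the paper's actual rubric or scores.

    # Hypothetical weighted checklist; the paper's rubric differs.
    CHECKS = {
        "has_persistent_identifier": 0.25,  # e.g. a DOI for the artifact
        "has_rich_metadata":         0.25,  # machine-readable description
        "has_license":               0.25,  # explicit, standard license
        "has_provenance":            0.25,  # how the artifact was produced
    }

    def fairness_score(artifact: dict) -> float:
        """Return a FAIRness score in [0, 100] for boolean attributes."""
        earned = sum(w for check, w in CHECKS.items() if artifact.get(check))
        return 100.0 * earned / sum(CHECKS.values())

    # An artifact with an identifier and a license but no metadata or
    # provenance scores 50.0; fixing the remaining issues raises the score,
    # mirroring the paper's before/after improvement.
    print(fairness_score({"has_persistent_identifier": True,
                          "has_license": True}))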
Data Race Detection Using Large Language Models
Large language models (LLMs) are demonstrating significant promise as an
alternate strategy to facilitate analyses and optimizations of high-performance
computing programs, circumventing the need for resource-intensive manual tool
creation. In this paper, we explore a novel LLM-based data race detection
approach combining prompt engineering and fine-tuning techniques. We create
a dedicated dataset named DRB-ML, which is derived from DataRaceBench, with
fine-grained labels showing the presence of data race pairs and their associated
variables, line numbers, and read/write information. DRB-ML is then used to
evaluate representative LLMs and fine-tune open-source ones. Our experiment
shows that LLMs can be a viable approach to data race detection. However, they
still cannot compete with traditional data race detection tools when we need
detailed information about the variable pairs causing data races.
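For concreteness, the sketch below shows the kind of fine-grained record and detection prompt the approach implies. The field names and the example label are our guesses at the schema described above, not DRB-ML's actual format.

    # Hypothetical DRB-ML-style record: race presence, variable pair,
    # line numbers, and read/write accesses (field names are guesses).
    record = {
        "source_file": "DRB001-antidep1-orig-yes.c",  # DataRaceBench-style name
        "has_data_race": True,
        "race_pairs": [
            {"var": "a", "lines": (52, 53), "accesses": ("write", "read")},
        ],
    }

    def build_prompt(code: str) -> str:
        """Compose a prompt asking an LLM for a structured race verdict."""
        return (
            "Does the following OpenMP code contain a data race? "
            "Answer yes or no; if yes, list the variable pairs with their "
            "line numbers and read/write accesses.\n\n" + code
        )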
Towards Seamless Management of AI Models in High-Performance Computing
With the increasing prevalence of artificial intelligence (AI) in diverse
science and engineering communities, AI models are emerging on an unprecedented
scale across various domains. However, given the complexity and diversity of the
software and hardware environments, reusing AI artifacts (models and datasets)
is extremely challenging, especially with AI-driven science applications.
Building an ecosystem to run and reuse AI applications/datasets at scale
efficiently becomes increasingly essential for diverse science and engineering
and high-performance computing (HPC) communities. In this paper, we present an
HPC-AI ecosystem, HPCFair, which enables the Findable, Accessible,
Interoperable, and Reusable (FAIR) principles. HPCFair supports the collection
of AI models/datasets, allowing users to download/upload AI artifacts
with authentication. Most importantly, our proposed framework provides
user-friendly APIs for users to easily run inference jobs and customize AI
artifacts to their tasks as needed. Our results show that, with the HPCFair API, users can easily apply AI artifacts to their tasks with minimal effort, irrespective of their technical expertise in AI.
Comment: Accepted at the 2nd Annual AAAI Workshop on AI to Accelerate Science and Engineering (AI2ASE)
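The sketch below illustrates the kind of authenticated upload/download workflow the framework targets. The class and method names are hypothetical stand-ins, not the actual HPCFair API.

    class HPCFairClient:
        """Hypothetical client mirroring the workflow described above."""

        def __init__(self, token: str):
            self.token = token     # authentication for upload/download
            self.artifacts = {}    # stand-in for the remote registry

        def upload(self, name: str, artifact: bytes) -> None:
            self.artifacts[name] = artifact

        def download(self, name: str) -> bytes:
            return self.artifacts[name]

    # Ideally, a user without ML expertise only sees calls like these:
    client = HPCFairClient(token="...")            # placeholder credential
    client.upload("example-model", b"model-bytes") # hypothetical artifact
    weights = client.download("example-model")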
HPC-GPT: Integrating Large Language Model for High-Performance Computing
Large Language Models (LLMs), including the LLaMA model, have exhibited their
efficacy across various general-domain natural language processing (NLP) tasks.
However, their performance in high-performance computing (HPC) domain tasks has
been less than optimal due to the specialized expertise required to interpret
the model responses. In response to this challenge, we propose HPC-GPT, a novel
LLaMA-based model that has undergone supervised fine-tuning on generated QA
(question-answer) instances for the HPC domain. To evaluate its effectiveness,
we concentrate on two HPC tasks: managing AI models and datasets for HPC, and
data race detection. By employing HPC-GPT, we demonstrate comparable
performance with existing methods on both tasks, exemplifying its excellence in
HPC-related scenarios. Our experiments on open-source benchmarks yield
extensive results, underscoring HPC-GPT's potential to bridge the performance
gap between LLMs and HPC-specific tasks. With HPC-GPT, we aim to pave the way
for LLMs to excel in HPC domains, simplifying the utilization of language
models in complex computing applications.
Comment: 9 pages
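To illustrate what supervised fine-tuning on generated QA instances involves, here is a minimal Python sketch that renders QA pairs into instruction-style training text. The example question and the template are generic placeholders, not HPC-GPT's actual data or format.

    # Hypothetical generated QA instance for the HPC domain.
    qa_instances = [
        {
            "question": "Which OpenMP clause gives each thread its own copy "
                        "of a loop variable?",
            "answer": "The private clause creates a per-thread copy.",
        },
    ]

    TEMPLATE = "### Instruction:\n{question}\n\n### Response:\n{answer}"

    def to_training_text(instance: dict) -> str:
        """Render one QA instance as a supervised fine-tuning example."""
        return TEMPLATE.format(**instance)

    corpus = [to_training_text(qa) for qa in qa_instances]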
Application of deep-learning to compiler-based graphs
Cavazos, John
Graph-structured data is used in many domains to represent complex objects, such as the molecular structure of chemicals or interactions between members of a social network. However, extracting meaningful information from these graphs is a difficult task, which is often undertaken on a case-by-case basis. Devising automated methods to mine information from graphs has become increasingly important as the use of graphs becomes more prevalent. Techniques have been developed that adapt algorithms, such as support vector machines, to extract information from graphs with minimal preprocessing. Unfortunately, none of these techniques permit the use of deep neural networks (DNNs) to learn from graphs. Given the potential of DNNs to learn from large amounts of data, this has become an important area of interest. Recently, a technique based on graph spectral analysis was proposed to characterize graphs in a way that allows them to be used as input by DNNs.

We used this technique to apply DNNs to two different systems problems: 1) classifying malicious applications based on graph-structured representations of executable code, and 2) developing prediction models that assist in iterative compilation to optimize and parallelize scientific code. Our results on malicious application classification show that graph-based characterizations increase the ability of DNNs to distinguish malware from different families. We performed a detailed evaluation of deep learning applied to state-of-the-art and graph-based malware characterizations. The graph-based characterizations are obtained by reverse engineering potentially malicious applications. For performance prediction, the graphs represent versions of optimized code. We use machine learning to rank these versions and inform an iterative compilation process. The models are trained using only five percent of the search space.

Our work shows that graph-structured data can be used to build powerful deep learning models. The techniques developed for this dissertation show great potential in a diverse pair of systems.
University of Delaware, Department of Computer and Information Sciences, Ph.D.
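As a sketch of the spectral characterization idea, the Python below turns a graph's adjacency matrix into a fixed-length vector of Laplacian eigenvalues that a DNN can consume. The normalization and padding choices are ours, not the dissertation's.

    import numpy as np

    def spectral_features(adj: np.ndarray, k: int = 16) -> np.ndarray:
        """Return the k smallest Laplacian eigenvalues, zero-padded to k."""
        degrees = adj.sum(axis=1)
        laplacian = np.diag(degrees) - adj
        eigvals = np.sort(np.linalg.eigvalsh(laplacian))[:k]
        return np.pad(eigvals, (0, max(0, k - eigvals.size)))

    # Example: a 4-node path graph becomes a length-16 feature vector.
    A = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
    features = spectral_features(A)   # fixed-size input for a DNN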
LM4HPC: Towards Effective Language Model Application in High-Performance Computing
In recent years, language models (LMs), such as GPT-4, have been widely used
in multiple domains, including natural language processing, visualization, and
so on. However, applying them for analyzing and optimizing high-performance
computing (HPC) software is still challenging due to the lack of HPC-specific
support. In this paper, we design the LM4HPC framework to facilitate the
research and development of HPC software analyses and optimizations using LMs.
Tailored for supporting HPC datasets, AI models, and pipelines, our framework
is built on top of a range of components from different levels of the machine
learning software stack, with Hugging Face-compatible APIs. Using three
representative tasks, we evaluated the prototype of our framework. The results
show that LM4HPC can help users quickly evaluate a set of state-of-the-art
models and generate insightful leaderboards.
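The sketch below suggests what a Hugging Face-style, task-oriented entry point for HPC analyses could look like. The function name, task identifier, and model name are hypothetical, not LM4HPC's actual API.

    def hpc_pipeline(task: str, model: str):
        """Stand-in factory echoing transformers.pipeline(task, model=...)."""
        def run(code: str) -> dict:
            # A real implementation would tokenize the code, query the
            # model, and post-process its output into a task-specific verdict.
            return {"task": task, "model": model, "input_chars": len(code)}
        return run

    detector = hpc_pipeline("code-similarity", model="example/hpc-code-lm")
    result = detector("for (i = 0; i < n; i++) a[i] = b[i] + c[i];")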