
    High Performance Regional Ocean Modeling with GPU Acceleration

    The Regional Ocean Modeling System (ROMS) is an open-source, free-surface, primitive-equation ocean model used by the scientific community for a diverse range of applications [1]. ROMS employs sophisticated numerical techniques, including a split-explicit time-stepping scheme that treats the fast barotropic (2D) and slow baroclinic (3D) modes separately for improved efficiency [2]. ROMS also contains a suite of data assimilation tools that allow the user to improve the accuracy of a simulation by incorporating observational data. These tools are based on four-dimensional variational methods [3], which generate reliable results but require considerably more computational resources than runs without data assimilation. The implementation of ROMS supports two parallel computing models: a distributed-memory model that uses the Message Passing Interface (MPI), and a shared-memory model that uses OpenMP. Prior research has shown that portions of ROMS can also be executed on a General-Purpose Graphics Processing Unit (GPGPU) to take advantage of the massively parallel architecture available on those systems [4]. This paper presents a comparison between two forms of parallelism: NVIDIA Kepler K20X GPUs were used to measure GPU parallelism via CUDA, while an Intel Xeon E5-2650 was used for shared-memory parallelism via OpenMP. The implementation is benchmarked under idealized marine conditions. Our experiments show that OpenMP was the fastest, followed closely by CUDA, while the serial version was considerably slower.
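    A minimal sketch of the split-explicit idea, using a toy 1D linear shallow-water system in Python: the fast free-surface (barotropic) mode is sub-cycled with a short CFL-limited step inside each long step, while a slow field advances only once per outer step. The grid size, time steps, and forcing below are illustrative assumptions, not values from ROMS or the paper.

```python
import numpy as np

# Toy 1D linear shallow water: d(eta)/dt = -H du/dx, du/dt = -g d(eta)/dx.
# The fast gravity-wave mode is sub-cycled with dt_fast inside each slow step
# dt_slow, mimicking the split-explicit barotropic/baroclinic treatment.

nx = 200                              # grid points (illustrative)
dx = 1000.0                           # grid spacing [m]
g, H = 9.81, 100.0                    # gravity, depth
c = np.sqrt(g * H)                    # fast gravity-wave speed, ~31 m/s
dt_fast = 0.5 * dx / c                # CFL-limited barotropic step
nsub = 20                             # barotropic sub-steps per slow step
dt_slow = nsub * dt_fast              # long baroclinic-like step

eta = np.exp(-(((np.arange(nx) - nx / 2) * dx / 1e4) ** 2))  # initial bump
u = np.zeros(nx)
tracer = np.zeros(nx)                 # stand-in for a slow 3D field

def ddx(f):
    """Centered difference with periodic boundaries."""
    return (np.roll(f, -1) - np.roll(f, 1)) / (2 * dx)

for step in range(100):               # slow (baroclinic-like) loop
    for _ in range(nsub):             # fast (barotropic) sub-cycling
        u = u - dt_fast * g * ddx(eta)       # forward-backward update: u first,
        eta = eta - dt_fast * H * ddx(u)     # then eta from the updated u
    tracer += dt_slow * 0.1 * eta     # slow physics: one long step, toy forcing

print(f"max |eta| after 100 slow steps: {np.abs(eta).max():.3f}")
```

    The saving is that only the cheap 2D update runs at the short step dictated by the fast gravity waves; the expensive 3D fields take the long step, which is the efficiency argument the abstract cites [2].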

    Towards Ontology-Based Program Analysis

    Program analysis is fundamental for program optimization, debugging, and many other tasks, but developing program analyses has been a challenging and error-prone process for general users. Declarative program analysis has shown promise to dramatically improve productivity in the development of program analyses. Current declarative program analysis is, however, subject to some major limitations: it offers weak support for cooperation among analysis tools and for guiding program optimizations, and it often requires much effort for repeated program preprocessing. In this work, we advocate the integration of ontology into declarative program analysis. As a way to standardize the definitions of concepts in a domain and the representation of knowledge in that domain, ontology offers a promising way to address the limitations of current declarative program analysis. We develop a prototype framework named PATO for conducting program analysis upon an ontology-based program representation. Experiments on six program analyses confirm the potential of ontology for complementing existing declarative program analysis: PATO supports multiple analyses without separate program preprocessing, promotes cooperative liveness analysis between two compilers, and effectively guides a data placement optimization for Graphics Processing Units (GPUs).
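    Since the abstract singles out liveness analysis, here is a generic textbook sketch of backward-dataflow liveness over a small control-flow graph in Python. It is not PATO or its ontology-based representation; the block names and use/def sets are invented for illustration.

```python
# Backward may-analysis: iterate live-in/live-out sets to a fixed point.
# Each block maps to (uses, defs, successors); all names are hypothetical.
cfg = {
    "entry": ({"a"},      {"b"}, ["loop"]),
    "loop":  ({"b", "c"}, {"c"}, ["loop", "exit"]),
    "exit":  ({"c"},      set(), []),
}

def liveness(cfg):
    live_in = {b: set() for b in cfg}
    live_out = {b: set() for b in cfg}
    changed = True
    while changed:
        changed = False
        for block, (uses, defs, succs) in cfg.items():
            out = set().union(*(live_in[s] for s in succs)) if succs else set()
            inn = uses | (out - defs)       # live-in = use + (live-out - def)
            if out != live_out[block] or inn != live_in[block]:
                live_out[block], live_in[block] = out, inn
                changed = True
    return live_in, live_out

live_in, live_out = liveness(cfg)
for block in cfg:
    print(block, "in:", sorted(live_in[block]), "out:", sorted(live_out[block]))
```

    The cooperative scenario in the abstract presumably lets two compilers exchange facts like these through the shared ontology rather than each recomputing them from its own preprocessing.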

    21st Century Simulation: Exploiting High Performance Computing and Data Analysis

    This paper identifies, defines, and analyzes the limitations imposed on Modeling and Simulation by outmoded paradigms in computer utilization and data analysis. The authors then discuss two emerging capabilities to overcome these limitations: High Performance Parallel Computing and Advanced Data Analysis. First, parallel computing, in supercomputers and Linux clusters, has proven effective by providing users an advantage in computing power; this has been characterized as a ten-year lead over the use of single-processor computers. Second, advanced data analysis techniques are both necessitated and enabled by this leap in computing power. JFCOM's JESPP project is one of the few simulation initiatives to effectively embrace these concepts. The challenges facing the defense analyst today have grown to include the need to consider operations among non-combatant populations, to focus on impacts to civilian infrastructure, to differentiate combatants from non-combatants, and to understand non-linear, asymmetric warfare. These requirements stretch both current computational techniques and data analysis methodologies. In this paper, documented examples and potential solutions are advanced, and the authors discuss paths to successful implementation based on their experience. Reviewed technologies include parallel computing, cluster computing, grid computing, data logging, operations research, database advances, data mining, evolutionary computing, genetic algorithms, and Monte Carlo sensitivity analyses. The modeling and simulation community has significant potential to provide more opportunities for training and analysis. Simulations must include increasingly sophisticated environments, better emulations of foes, and more realistic civilian populations. Overcoming the implementation challenges will produce dramatically better insights for trainees and analysts. High Performance Parallel Computing and Advanced Data Analysis promise increased understanding of future vulnerabilities to help avoid unneeded mission failures and unacceptable personnel losses. The authors set forth road maps for rapid prototyping and adoption of advanced capabilities, and discuss the beneficial impact of embracing these technologies, as well as the risk mitigation required to ensure success.

    The Landscape and Challenges of HPC Research and LLMs

    Recently, language models (LMs), especially large language models (LLMs), have revolutionized the field of deep learning. Both encoder-decoder models and prompt-based techniques have shown immense potential for natural language processing and code-based tasks. Over the past several years, many research labs and institutions have invested heavily in high-performance computing, approaching or breaching exascale performance levels. In this paper, we posit that adapting and utilizing such language-model-based techniques for tasks in high-performance computing (HPC) would be highly beneficial. This study presents our reasoning behind this position and highlights how existing ideas can be improved and adapted for HPC tasks.

    Phase-coherent lightwave communications with frequency combs

    Fiber-optic networks are a crucial telecommunication infrastructure in society. Wavelength division multiplexing allows parallel data streams to be transmitted across the fiber bandwidth, and coherent detection enables the use of sophisticated modulation formats and electronic compensation of signal impairments. In the future, optical frequency combs may replace the multiple lasers currently used for the different wavelength channels. We demonstrate two novel signal processing schemes that take advantage of the broadband phase coherence of optical frequency combs. This approach allows for more efficient estimation and compensation of optical phase noise in coherent communication systems, which can significantly simplify the signal processing or increase the transmission performance. With further advances in space division multiplexing and chip-scale frequency comb sources, these findings pave the way for compact, energy-efficient optical transceivers.
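    A minimal NumPy sketch of what broadband phase coherence buys: several channels share one laser's phase-noise walk, a Viterbi-Viterbi-style fourth-power estimator tracks the phase on a single pilot channel, and that one estimate derotates all the other channels. The symbol counts, noise levels, and estimator choice are illustrative assumptions, not the paper's actual schemes.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sym, n_ch = 4096, 5

# Common laser phase noise (random walk) shared by every comb line.
phase = np.cumsum(rng.normal(0.0, 0.02, n_sym))

# QPSK symbols per channel, rotated by the shared phase, plus additive noise.
bits = rng.integers(0, 4, (n_ch, n_sym))
tx = np.exp(1j * (np.pi / 4 + np.pi / 2 * bits))
noise = 0.05 * (rng.normal(size=(n_ch, n_sym))
                + 1j * rng.normal(size=(n_ch, n_sym)))
rx = tx * np.exp(1j * phase) + noise

# Fourth-power (Viterbi-Viterbi-style) estimate on the pilot channel only:
# raising QPSK to the 4th power strips the modulation, leaving exp(4j*phase).
window = 64
smoothed = np.convolve(rx[0] ** 4, np.ones(window) / window, mode="same")
est = np.unwrap(np.angle(smoothed)) / 4 - np.pi / 4

# Reuse the pilot's estimate to derotate every other channel.
rx_corrected = rx * np.exp(-1j * est)

# Residual phase jitter on the non-pilot channels (up to a fixed pi/2 ambiguity).
err = np.angle(rx_corrected[1:] * np.conj(tx[1:]))
print(f"residual phase std across slave channels: {err.std():.3f} rad")
```

    Tracking the phase once and applying it everywhere is what lets a comb-based receiver amortize phase-tracking DSP that a bank of independent lasers would force every channel to repeat.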

    High Performance with Prescriptive Optimization and Debugging


    The parallel event loop model and runtime: a parallel programming model and runtime system for safe event-based parallel programming

    Recent trends in programming models for server-side development have shown an increasing popularity of event-based single-threaded programming models based on the combination of dynamic languages such as JavaScript and event-based runtime systems for asynchronous I/O management such as Node.js. Reasons for the success of such models are the simplicity of the single-threaded event-based programming model as well as the growing popularity of the Cloud as a deployment platform for Web applications. Unfortunately, this popularity comes at the price of performance and scalability: single-threaded event-based models present limitations when parallel processing is needed, and traditional approaches to concurrency such as threads and locks do not combine well with event-based systems. This dissertation proposes a programming model and a runtime system that overcome these limitations by giving single-threaded event-based applications support for speculative parallel execution. The model, called the Parallel Event Loop, aims to bring parallel execution to the domain of single-threaded event-based programming without relaxing the main characteristics of the single-threaded model, thereby providing developers with the impression of a safe, single-threaded runtime. Rather than supporting only pure single-threaded programming, however, the parallel event loop can also be used to derive safe, high-level parallel programming models characterized by strong compatibility with single-threaded runtimes. We describe three distinct implementations of speculative runtimes enabling the parallel execution of event-based applications. The first is a pessimistic runtime system that uses locks to implement speculative parallelization; the second and third are optimistic runtimes based on software transactional memory. Each implementation supports the parallelization of applications written in an asynchronous single-threaded programming style, and each enables such applications to benefit from parallel execution.
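    As a toy illustration of the pessimistic, lock-based flavor, the Python sketch below submits event handlers to a thread pool while each handler declares the shared keys it touches and runs under those locks: handlers on disjoint state run in parallel, handlers on the same state serialize. This is an invented reduction of the idea, not the dissertation's runtime, which additionally has to preserve the observable execution order of the sequential event loop.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

state = {"hits": 0, "log": []}
locks = {key: threading.Lock() for key in state}   # one lock per shared key

def submit(pool, handler, keys):
    """Run handler on the pool, holding the locks for the keys it declares."""
    def guarded():
        for key in sorted(keys):          # fixed acquisition order: no deadlock
            locks[key].acquire()
        try:
            handler()
        finally:
            for key in sorted(keys, reverse=True):
                locks[key].release()
    return pool.submit(guarded)

def count_hit():
    state["hits"] += 1                    # safe: runs under the "hits" lock

def log_request():
    state["log"].append("request")        # safe: runs under the "log" lock

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [submit(pool, count_hit, {"hits"}) for _ in range(1000)]
    futures += [submit(pool, log_request, {"log"}) for _ in range(1000)]
    for f in futures:
        f.result()                        # propagate any handler exceptions

print(state["hits"], len(state["log"]))   # 1000 1000, race-free
```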

    Loop Parallelization using Dynamic Commutativity Analysis

