1,544 research outputs found
High Performance Regional Ocean Modeling with GPU Acceleration
The Regional Ocean Modeling System (ROMS) is an open-source, free-surface, primitive equation ocean model used by the scientific community for a diverse range of applications [1]. ROMS employs sophisticated numerical techniques, including a split-explicit time-stepping scheme that treats the fast barotropic (2D) and slow baroclinic (3D) modes separately for improved efficiency [2]. ROMS also contains a suite of data assimilation tools that allow the user to improve the accuracy of a simulation by incorporating observational data. These tools are based on four dimensional variational methods [3], which generate reliable results, but require more computational resources than without any assimilation of data. The implementation of ROMS supports two parallel computing models; a distributed memory model that utilizes Message Passing Interface (MPI), and a shared memory model that utilizes OpenMP. Prior research has shown that portions of ROMS can also be executed on a General Purpose Graphics Processing Unit (GPGPU) to take advantage of the massively parallel architecture available on those systems [4]. This paper presents a comparison between two forms of parallelism. NVIDIA Kepler K20X GPUs were used for performance measurement of GPU parallelism using CUDA while an Intel Xeon E5-2650 was used for shared memory parallelism using OpenMP. The implementation is benchmarked using idealistic marine conditions. Our experiments show that OpenMP was the fastest, followed closely by CUDA, while the normal serial version was considerably slower
Towards Ontology-Based Program Analysis
Program analysis is fundamental for program optimizations, debugging,
and many other tasks. But developing program analyses has been a
challenging and error-prone process for general users. Declarative
program analysis has shown the promise to dramatically improve the
productivity in the development of program analyses. Current
declarative program analysis is however subject to some major
limitations in supporting cooperations among analysis tools, guiding
program optimizations, and often requires much effort for repeated
program preprocessing.
In this work, we advocate the integration of ontology into declarative
program analysis. As a way to standardize the definitions of concepts
in a domain and the representation of the knowledge in the domain,
ontology offers a promising way to address the limitations of current
declarative program analysis. We develop a prototype framework named
PATO for conducting program analysis upon ontology-based program
representation. Experiments on six program analyses confirm the
potential of ontology for complementing existing declarative program
analysis. It supports multiple analyses without separate program
preprocessing, promotes cooperative Liveness analysis between two
compilers, and effectively guides a data placement optimization for
Graphic Processing Units (GPU)
21st Century Simulation: Exploiting High Performance Computing and Data Analysis
This paper identifies, defines, and analyzes the limitations imposed on Modeling and Simulation by outmoded
paradigms in computer utilization and data analysis. The authors then discuss two emerging capabilities to
overcome these limitations: High Performance Parallel Computing and Advanced Data Analysis. First, parallel
computing, in supercomputers and Linux clusters, has proven effective by providing users an advantage in
computing power. This has been characterized as a ten-year lead over the use of single-processor computers.
Second, advanced data analysis techniques are both necessitated and enabled by this leap in computing power.
JFCOM's JESPP project is one of the few simulation initiatives to effectively embrace these concepts. The
challenges facing the defense analyst today have grown to include the need to consider operations among non-combatant
populations, to focus on impacts to civilian infrastructure, to differentiate combatants from non-combatants,
and to understand non-linear, asymmetric warfare. These requirements stretch both current
computational techniques and data analysis methodologies. In this paper, documented examples and potential
solutions will be advanced. The authors discuss the paths to successful implementation based on their experience.
Reviewed technologies include parallel computing, cluster computing, grid computing, data logging, OpsResearch,
database advances, data mining, evolutionary computing, genetic algorithms, and Monte Carlo sensitivity analyses.
The modeling and simulation community has significant potential to provide more opportunities for training and
analysis. Simulations must include increasingly sophisticated environments, better emulations of foes, and more
realistic civilian populations. Overcoming the implementation challenges will produce dramatically better insights,
for trainees and analysts. High Performance Parallel Computing and Advanced Data Analysis promise increased
understanding of future vulnerabilities to help avoid unneeded mission failures and unacceptable personnel losses.
The authors set forth road maps for rapid prototyping and adoption of advanced capabilities. They discuss the
beneficial impact of embracing these technologies, as well as risk mitigation required to ensure success
The Landscape and Challenges of HPC Research and LLMs
Recently, language models (LMs), especially large language models (LLMs),
have revolutionized the field of deep learning. Both encoder-decoder models and
prompt-based techniques have shown immense potential for natural language
processing and code-based tasks. Over the past several years, many research
labs and institutions have invested heavily in high-performance computing,
approaching or breaching exascale performance levels. In this paper, we posit
that adapting and utilizing such language model-based techniques for tasks in
high-performance computing (HPC) would be very beneficial. This study presents
our reasoning behind the aforementioned position and highlights how existing
ideas can be improved and adapted for HPC tasks
Phase-coherent lightwave communications with frequency combs
Fiber-optical networks are a crucial telecommunication infrastructure in
society. Wavelength division multiplexing allows for transmitting parallel data
streams over the fiber bandwidth, and coherent detection enables the use of
sophisticated modulation formats and electronic compensation of signal
impairments. In the future, optical frequency combs may replace multiple lasers
used for the different wavelength channels. We demonstrate two novel signal
processing schemes that take advantage of the broadband phase coherence of
optical frequency combs. This approach allows for a more efficient estimation
and compensation of optical phase noise in coherent communication systems,
which can significantly simplify the signal processing or increase the
transmission performance. With further advances in space division multiplexing
and chip-scale frequency comb sources, these findings pave the way for compact
energy-efficient optical transceivers.Comment: 17 pages, 9 figure
The parallel event loop model and runtime: a parallel programming model and runtime system for safe event-based parallel programming
Recent trends in programming models for server-side development have shown an increasing popularity of event-based single- threaded programming models based on the combination of dynamic languages such as JavaScript and event-based runtime systems for asynchronous I/O management such as Node.JS. Reasons for the success of such models are the simplicity of the single-threaded event-based programming model as well as the growing popularity of the Cloud as a deployment platform for Web applications. Unfortunately, the popularity of single-threaded models comes at the price of performance and scalability, as single-threaded event-based models present limitations when parallel processing is needed, and traditional approaches to concurrency such as threads and locks don't play well with event-based systems. This dissertation proposes a programming model and a runtime system to overcome such limitations by enabling single-threaded event-based applications with support for speculative parallel execution. The model, called Parallel Event Loop, has the goal of bringing parallel execution to the domain of single-threaded event-based programming without relaxing the main characteristics of the single-threaded model, and therefore providing developers with the impression of a safe, single-threaded, runtime. Rather than supporting only pure single-threaded programming, however, the parallel event loop can also be used to derive safe, high-level, parallel programming models characterized by a strong compatibility with single-threaded runtimes. We describe three distinct implementations of speculative runtimes enabling the parallel execution of event-based applications. The first implementation we describe is a pessimistic runtime system based on locks to implement speculative parallelization. The second and the third implementations are based on two distinct optimistic runtimes using software transactional memory. Each of the implementations supports the parallelization of applications written using an asynchronous single-threaded programming style, and each of them enables applications to benefit from parallel execution
The parallel event loop model and runtime: a parallel programming model and runtime system for safe event-based parallel programming
Recent trends in programming models for server-side development have shown an increasing popularity of event-based single- threaded programming models based on the combination of dynamic languages such as JavaScript and event-based runtime systems for asynchronous I/O management such as Node.JS. Reasons for the success of such models are the simplicity of the single-threaded event-based programming model as well as the growing popularity of the Cloud as a deployment platform for Web applications. Unfortunately, the popularity of single-threaded models comes at the price of performance and scalability, as single-threaded event-based models present limitations when parallel processing is needed, and traditional approaches to concurrency such as threads and locks don't play well with event-based systems. This dissertation proposes a programming model and a runtime system to overcome such limitations by enabling single-threaded event-based applications with support for speculative parallel execution. The model, called Parallel Event Loop, has the goal of bringing parallel execution to the domain of single-threaded event-based programming without relaxing the main characteristics of the single-threaded model, and therefore providing developers with the impression of a safe, single-threaded, runtime. Rather than supporting only pure single-threaded programming, however, the parallel event loop can also be used to derive safe, high-level, parallel programming models characterized by a strong compatibility with single-threaded runtimes. We describe three distinct implementations of speculative runtimes enabling the parallel execution of event-based applications. The first implementation we describe is a pessimistic runtime system based on locks to implement speculative parallelization. The second and the third implementations are based on two distinct optimistic runtimes using software transactional memory. Each of the implementations supports the parallelization of applications written using an asynchronous single-threaded programming style, and each of them enables applications to benefit from parallel execution
- …