16,716 research outputs found
Programming MPSoC platforms: Road works ahead
This paper summarizes a special session on multicore/multi-processor system-on-chip (MPSoC) programming challenges. The current trend towards MPSoC platforms in most computing domains does not only mean a radical change in computer architecture. Even more important from a SW developer´s viewpoint, at the same time the classical sequential von Neumann programming model needs to be overcome. Efficient utilization of the MPSoC HW resources demands for radically new models and corresponding SW development tools, capable of exploiting the available parallelism and guaranteeing bug-free parallel SW. While several standards are established in the high-performance computing domain (e.g. OpenMP), it is clear that more innovations are required for successful\ud
deployment of heterogeneous embedded MPSoC. On the other hand, at least for coming years, the freedom for disruptive programming technologies is limited by the huge amount of certified sequential code that demands for a more pragmatic, gradual tool and code replacement strategy
The Scalability-Efficiency/Maintainability-Portability Trade-off in Simulation Software Engineering: Examples and a Preliminary Systematic Literature Review
Large-scale simulations play a central role in science and the industry.
Several challenges occur when building simulation software, because simulations
require complex software developed in a dynamic construction process. That is
why simulation software engineering (SSE) is emerging lately as a research
focus. The dichotomous trade-off between scalability and efficiency (SE) on the
one hand and maintainability and portability (MP) on the other hand is one of
the core challenges. We report on the SE/MP trade-off in the context of an
ongoing systematic literature review (SLR). After characterizing the issue of
the SE/MP trade-off using two examples from our own research, we (1) review the
33 identified articles that assess the trade-off, (2) summarize the proposed
solutions for the trade-off, and (3) discuss the findings for SSE and future
work. Overall, we see evidence for the SE/MP trade-off and first solution
approaches. However, a strong empirical foundation has yet to be established;
general quantitative metrics and methods supporting software developers in
addressing the trade-off have to be developed. We foresee considerable future
work in SSE across scientific communities.Comment: 9 pages, 2 figures. Accepted for presentation at the Fourth
International Workshop on Software Engineering for High Performance Computing
in Computational Science and Engineering (SEHPCCSE 2016
Redesigning OP2 Compiler to Use HPX Runtime Asynchronous Techniques
Maximizing parallelism level in applications can be achieved by minimizing
overheads due to load imbalances and waiting time due to memory latencies.
Compiler optimization is one of the most effective solutions to tackle this
problem. The compiler is able to detect the data dependencies in an application
and is able to analyze the specific sections of code for parallelization
potential. However, all of these techniques provided with a compiler are
usually applied at compile time, so they rely on static analysis, which is
insufficient for achieving maximum parallelism and producing desired
application scalability. One solution to address this challenge is the use of
runtime methods. This strategy can be implemented by delaying certain amount of
code analysis to be done at runtime. In this research, we improve the parallel
application performance generated by the OP2 compiler by leveraging HPX, a C++
runtime system, to provide runtime optimizations. These optimizations include
asynchronous tasking, loop interleaving, dynamic chunk sizing, and data
prefetching. The results of the research were evaluated using an Airfoil
application which showed a 40-50% improvement in parallel performance.Comment: 18th IEEE International Workshop on Parallel and Distributed
Scientific and Engineering Computing (PDSEC 2017
Scalable software architecture for on-line multi-camera video processing
In this paper we present a scalable software architecture for on-line multi-camera video processing, that guarantees a good trade off between computational power, scalability and flexibility. The software system is modular and its main blocks are the Processing Units (PUs), and the Central Unit. The Central Unit works as a supervisor of the running PUs and each PU manages the acquisition phase and the processing phase. Furthermore, an approach to easily parallelize the desired processing application has been presented. In this paper, as case study, we apply the proposed software architecture to a multi-camera system in order to efficiently manage multiple 2D object detection modules in a real-time scenario. System performance has been evaluated under different load conditions such as number of cameras and image sizes. The results show that the software architecture scales well with the number of camera and can easily works with different image formats respecting the real time constraints. Moreover, the parallelization approach can be used in order to speed up the processing tasks with a low level of overhea
Tackling Exascale Software Challenges in Molecular Dynamics Simulations with GROMACS
GROMACS is a widely used package for biomolecular simulation, and over the
last two decades it has evolved from small-scale efficiency to advanced
heterogeneous acceleration and multi-level parallelism targeting some of the
largest supercomputers in the world. Here, we describe some of the ways we have
been able to realize this through the use of parallelization on all levels,
combined with a constant focus on absolute performance. Release 4.6 of GROMACS
uses SIMD acceleration on a wide range of architectures, GPU offloading
acceleration, and both OpenMP and MPI parallelism within and between nodes,
respectively. The recent work on acceleration made it necessary to revisit the
fundamental algorithms of molecular simulation, including the concept of
neighborsearching, and we discuss the present and future challenges we see for
exascale simulation - in particular a very fine-grained task parallelism. We
also discuss the software management, code peer review and continuous
integration testing required for a project of this complexity.Comment: EASC 2014 conference proceedin
- …