20,573 research outputs found
A portable platform for accelerated PIC codes and its application to GPUs using OpenACC
We present a portable platform, called PIC_ENGINE, for accelerating
Particle-In-Cell (PIC) codes on heterogeneous many-core architectures such as
Graphic Processing Units (GPUs). The aim of this development is efficient
simulations on future exascale systems by allowing different parallelization
strategies depending on the application problem and the specific architecture.
To this end, this platform contains the basic steps of the PIC algorithm and
has been designed as a test bed for different algorithmic options and data
structures. Among the architectures that this engine can explore, particular
attention is given here to systems equipped with GPUs. The study demonstrates
that our portable PIC implementation based on the OpenACC programming model can
achieve performance closely matching theoretical predictions. Using the Cray
XC30 system, Piz Daint, at the Swiss National Supercomputing Centre (CSCS), we
show that PIC_ENGINE running on an NVIDIA Kepler K20X GPU can outperform the
one on an Intel Sandybridge 8-core CPU by a factor of 3.4
Massively Parallel Sort-Merge Joins in Main Memory Multi-Core Database Systems
Two emerging hardware trends will dominate the database system technology in
the near future: increasing main memory capacities of several TB per server and
massively parallel multi-core processing. Many algorithmic and control
techniques in current database technology were devised for disk-based systems
where I/O dominated the performance. In this work we take a new look at the
well-known sort-merge join which, so far, has not been in the focus of research
in scalable massively parallel multi-core data processing as it was deemed
inferior to hash joins. We devise a suite of new massively parallel sort-merge
(MPSM) join algorithms that are based on partial partition-based sorting.
Contrary to classical sort-merge joins, our MPSM algorithms do not rely on a
hard to parallelize final merge step to create one complete sort order. Rather
they work on the independently created runs in parallel. This way our MPSM
algorithms are NUMA-affine as all the sorting is carried out on local memory
partitions. An extensive experimental evaluation on a modern 32-core machine
with one TB of main memory proves the competitive performance of MPSM on large
main memory databases with billions of objects. It scales (almost) linearly in
the number of employed cores and clearly outperforms competing hash join
proposals - in particular it outperforms the "cutting-edge" Vectorwise parallel
query engine by a factor of four.Comment: VLDB201
SlowFuzz: Automated Domain-Independent Detection of Algorithmic Complexity Vulnerabilities
Algorithmic complexity vulnerabilities occur when the worst-case time/space
complexity of an application is significantly higher than the respective
average case for particular user-controlled inputs. When such conditions are
met, an attacker can launch Denial-of-Service attacks against a vulnerable
application by providing inputs that trigger the worst-case behavior. Such
attacks have been known to have serious effects on production systems, take
down entire websites, or lead to bypasses of Web Application Firewalls.
Unfortunately, existing detection mechanisms for algorithmic complexity
vulnerabilities are domain-specific and often require significant manual
effort. In this paper, we design, implement, and evaluate SlowFuzz, a
domain-independent framework for automatically finding algorithmic complexity
vulnerabilities. SlowFuzz automatically finds inputs that trigger worst-case
algorithmic behavior in the tested binary. SlowFuzz uses resource-usage-guided
evolutionary search techniques to automatically find inputs that maximize
computational resource utilization for a given application.Comment: ACM CCS '17, October 30-November 3, 2017, Dallas, TX, US
Synthesis of Parametric Programs using Genetic Programming and Model Checking
Formal methods apply algorithms based on mathematical principles to enhance
the reliability of systems. It would only be natural to try to progress from
verification, model checking or testing a system against its formal
specification into constructing it automatically. Classical algorithmic
synthesis theory provides interesting algorithms but also alarming high
complexity and undecidability results. The use of genetic programming, in
combination with model checking and testing, provides a powerful heuristic to
synthesize programs. The method is not completely automatic, as it is fine
tuned by a user that sets up the specification and parameters. It also does not
guarantee to always succeed and converge towards a solution that satisfies all
the required properties. However, we applied it successfully on quite
nontrivial examples and managed to find solutions to hard programming
challenges, as well as to improve and to correct code. We describe here several
versions of our method for synthesizing sequential and concurrent systems.Comment: In Proceedings INFINITY 2013, arXiv:1402.661
- …