33,624 research outputs found
Mixing multi-core CPUs and GPUs for scientific simulation software
Recent technological and economic developments have led to widespread availability of
multi-core CPUs and specialist accelerator processors such as graphical processing units
(GPUs). The accelerated computational performance possible from these devices can be very
high for some applications paradigms. Software languages and systems such as NVIDIA's
CUDA and Khronos consortium's open compute language (OpenCL) support a number of
individual parallel application programming paradigms. To scale up the performance of some
complex systems simulations, a hybrid of multi-core CPUs for coarse-grained parallelism and
very many core GPUs for data parallelism is necessary. We describe our use of hybrid applica-
tions using threading approaches and multi-core CPUs to control independent GPU devices.
We present speed-up data and discuss multi-threading software issues for the applications
level programmer and o er some suggested areas for language development and integration
between coarse-grained and ne-grained multi-thread systems. We discuss results from three
common simulation algorithmic areas including: partial di erential equations; graph cluster
metric calculations and random number generation. We report on programming experiences
and selected performance for these algorithms on: single and multiple GPUs; multi-core CPUs;
a CellBE; and using OpenCL. We discuss programmer usability issues and the outlook and
trends in multi-core programming for scienti c applications developers
Uncovering Bugs in Distributed Storage Systems during Testing (not in Production!)
Testing distributed systems is challenging due to multiple sources of nondeterminism. Conventional testing techniques, such as unit, integration and stress testing, are ineffective in preventing serious but subtle bugs from reaching production. Formal techniques, such as TLA+, can only verify high-level specifications of systems at the level of logic-based models, and fall short of checking the actual executable code. In this paper, we present a new methodology for testing distributed systems. Our approach applies advanced systematic testing techniques to thoroughly check that the executable code adheres to its high-level specifications, which significantly improves coverage of important system behaviors. Our methodology has been applied to three distributed storage systems in the Microsoft Azure cloud computing platform. In the process, numerous bugs were identified, reproduced, confirmed and fixed. These bugs required a subtle combination of concurrency and failures, making them extremely difficult to find with conventional testing techniques. An important advantage of our approach is that a bug is uncovered in a small setting and witnessed by a full system trace, which dramatically increases the productivity of debugging
Kolmogorov Complexity in perspective. Part II: Classification, Information Processing and Duality
We survey diverse approaches to the notion of information: from Shannon
entropy to Kolmogorov complexity. Two of the main applications of Kolmogorov
complexity are presented: randomness and classification. The survey is divided
in two parts published in a same volume. Part II is dedicated to the relation
between logic and information system, within the scope of Kolmogorov
algorithmic information theory. We present a recent application of Kolmogorov
complexity: classification using compression, an idea with provocative
implementation by authors such as Bennett, Vitanyi and Cilibrasi. This stresses
how Kolmogorov complexity, besides being a foundation to randomness, is also
related to classification. Another approach to classification is also
considered: the so-called "Google classification". It uses another original and
attractive idea which is connected to the classification using compression and
to Kolmogorov complexity from a conceptual point of view. We present and unify
these different approaches to classification in terms of Bottom-Up versus
Top-Down operational modes, of which we point the fundamental principles and
the underlying duality. We look at the way these two dual modes are used in
different approaches to information system, particularly the relational model
for database introduced by Codd in the 70's. This allows to point out diverse
forms of a fundamental duality. These operational modes are also reinterpreted
in the context of the comprehension schema of axiomatic set theory ZF. This
leads us to develop how Kolmogorov's complexity is linked to intensionality,
abstraction, classification and information system.Comment: 43 page
Applying Prolog to Develop Distributed Systems
Development of distributed systems is a difficult task. Declarative
programming techniques hold a promising potential for effectively supporting
programmer in this challenge. While Datalog-based languages have been actively
explored for programming distributed systems, Prolog received relatively little
attention in this application area so far. In this paper we present a
Prolog-based programming system, called DAHL, for the declarative development
of distributed systems. DAHL extends Prolog with an event-driven control
mechanism and built-in networking procedures. Our experimental evaluation using
a distributed hash-table data structure, a protocol for achieving Byzantine
fault tolerance, and a distributed software model checker - all implemented in
DAHL - indicates the viability of the approach
Using shared-data localization to reduce the cost of inspector-execution in unified-parallel-C programs
Programs written in the Unified Parallel C (UPC) language can access any location of the entire local and remote address space via read/write operations. However, UPC programs that contain fine-grained shared accesses can exhibit performance degradation. One solution is to use the inspector-executor technique to coalesce fine-grained shared accesses to larger remote access operations. A straightforward implementation of the inspector executor transformation results in excessive instrumentation that hinders performance.; This paper addresses this issue and introduces various techniques that aim at reducing the generated instrumentation code: a shared-data localization transformation based on Constant-Stride Linear Memory Descriptors (CSLMADs) [S. Aarseth, Gravitational N-Body Simulations: Tools and Algorithms, Cambridge Monographs on Mathematical Physics, Cambridge University Press, 2003.], the inlining of data locality checks and the usage of an index vector to aggregate the data. Finally, the paper introduces a lightweight loop code motion transformation to privatize shared scalars that were propagated through the loop body.; A performance evaluation, using up to 2048 cores of a POWER 775, explores the impact of each optimization and characterizes the overheads of UPC programs. It also shows that the presented optimizations increase performance of UPC programs up to 1.8 x their UPC hand-optimized counterpart for applications with regular accesses and up to 6.3 x for applications with irregular accesses.Peer ReviewedPostprint (author's final draft
Security Through Amnesia: A Software-Based Solution to the Cold Boot Attack on Disk Encryption
Disk encryption has become an important security measure for a multitude of
clients, including governments, corporations, activists, security-conscious
professionals, and privacy-conscious individuals. Unfortunately, recent
research has discovered an effective side channel attack against any disk
mounted by a running machine\cite{princetonattack}. This attack, known as the
cold boot attack, is effective against any mounted volume using
state-of-the-art disk encryption, is relatively simple to perform for an
attacker with even rudimentary technical knowledge and training, and is
applicable to exactly the scenario against which disk encryption is primarily
supposed to defend: an adversary with physical access. To our knowledge, no
effective software-based countermeasure to this attack supporting multiple
encryption keys has yet been articulated in the literature. Moreover, since no
proposed solution has been implemented in publicly available software, all
general-purpose machines using disk encryption remain vulnerable. We present
Loop-Amnesia, a kernel-based disk encryption mechanism implementing a novel
technique to eliminate vulnerability to the cold boot attack. We offer
theoretical justification of Loop-Amnesia's invulnerability to the attack,
verify that our implementation is not vulnerable in practice, and present
measurements showing our impact on I/O accesses to the encrypted disk is
limited to a slowdown of approximately 2x. Loop-Amnesia is written for x86-64,
but our technique is applicable to other register-based architectures. We base
our work on loop-AES, a state-of-the-art open source disk encryption package
for Linux.Comment: 13 pages, 4 figure
- …