When One Pipeline Is Not Enough
Pipelines that operate on buffers often work well to mitigate the high latency inherent in interprocessor communication and in accessing data on disk. Running a single pipeline on each node works well when each pipeline stage consumes and produces data at the same rate. If a stage might consume data faster or slower than it produces data, a single pipeline becomes unwieldy. We describe how we have extended the FG programming environment to support multiple pipelines in two forms. When a node might send and receive data at different rates during interprocessor communication, we use disjoint pipelines that send and receive on each node. When a node consumes and produces data from different streams on the node, we use multiple pipelines that intersect at a particular stage. Experimental results for two out-of-core sorting algorithms (one based on columnsort, the other a distribution-based sort) demonstrate the value of multiple pipelines.
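The core difficulty the abstract describes can be sketched in a few lines. The following is a minimal, hypothetical illustration (not FG's actual API): a three-stage pipeline connected by bounded queues, whose middle stage produces buffers at a different rate than it consumes them, which is exactly the case a single linear pipeline handles poorly.

```python
import threading
import queue

def run_pipeline(blocks):
    """Toy pipeline: read -> split -> write, connected by bounded queues."""
    q1, q2, out = queue.Queue(maxsize=2), queue.Queue(maxsize=2), []

    def read_stage():                 # stand-in for a disk-read stage
        for b in blocks:
            q1.put(b)
        q1.put(None)                  # sentinel: end of stream

    def split_stage():                # consumes 1 buffer, emits 2: rate mismatch
        while (b := q1.get()) is not None:
            half = len(b) // 2
            q2.put(b[:half])
            q2.put(b[half:])
        q2.put(None)

    def write_stage():                # stand-in for a disk-write stage
        while (b := q2.get()) is not None:
            out.append(b)

    threads = [threading.Thread(target=t)
               for t in (read_stage, split_stage, write_stage)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```

Because the stages run in separate threads and communicate only through bounded queues, each stage can proceed as soon as a buffer is available, regardless of the rate mismatch; intersecting or disjoint pipelines extend this picture with additional queues and stages.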
Tackling Latency Using FG
Applications that operate on datasets which are too big to fit in main memory, known in the literature as external-memory or out-of-core applications, store their data on one or more disks. Several of these applications make multiple passes over the data, where each pass reads data from disk, operates on it, and writes data back to disk. Compared with an in-memory operation, a disk-I/O operation takes orders of magnitude (approx. 100,000 times) longer; that is, disk-I/O is a high-latency operation. Out-of-core algorithms often run on a distributed-memory cluster to take advantage of a cluster's computing power, memory, disk space, and bandwidth. By doing so, however, they introduce another high-latency operation: interprocessor communication. Efficient implementations of these algorithms access data in blocks to amortize the cost of a single data transfer over the disk or the network, and they introduce asynchrony to overlap high-latency operations and computations. FG, short for Asynchronous Buffered Computation Design and Engineering Framework Generator, is a programming framework that helps to mitigate latency in out-of-core programs that run on distributed-memory clusters. An FG program is composed of a pipeline of stages operating on buffers. FG runs the stages asynchronously so that stages performing high-latency operations can overlap their work with other stages. FG supplies the code to create a pipeline, synchronize the stages, and manage data buffers; the user provides a straightforward function, containing only synchronous calls, for each stage. In this thesis, we use FG to tackle latency and exploit the available parallelism in out-of-core and distributed-memory programs.
We show how FG helps us design out-of-core programs and think about parallel computing in general using three instances: an out-of-core, distribution-based sorting program; an implementation of external-memory suffix arrays; and a scientific-computing application called the fast Gauss transform. FG's interaction with these real-world programs is symbiotic: FG enables efficient implementations of these programs, and the design of the first two of these programs pointed us toward further extensions for FG. Today's era of multicore machines compels us to harness all opportunities for parallelism that are available in a program, and so in the latter two applications, we combine FG's multithreading capabilities with the routines that OpenMP offers for in-core parallelism. In the fast Gauss transform application, we use this strategy to realize an up to 20-fold performance improvement compared with an alternate fast Gauss transform implementation. In addition, we use our experience with designing programs in FG to provide some suggestions for the next version of FG.
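The overlap that the abstract credits FG with automating can be illustrated with a small sketch (an assumed block-reading setup, not FG itself): a reader thread fetches the next block while the main thread computes on the current one, so read latency hides behind computation.

```python
import threading
import queue
import time

def process(blocks, compute, read_delay=0.0):
    """Compute on each block while the next block is being 'read'."""
    ready = queue.Queue(maxsize=1)    # at most one prefetched buffer in flight

    def reader():
        for b in blocks:
            time.sleep(read_delay)    # stand-in for disk-read latency
            ready.put(b)
        ready.put(None)               # sentinel: end of stream

    threading.Thread(target=reader, daemon=True).start()

    results = []
    while (b := ready.get()) is not None:
        results.append(compute(b))    # overlaps with the reader's next fetch
    return results
```

With a nonzero `read_delay` and a nontrivial `compute`, total time approaches max(read time, compute time) rather than their sum; FG generalizes this pattern to whole pipelines of stages.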
Toward improved construction health, safety, and ergonomics in South Africa: A working model for use by architectural designers
The construction industry produces a high rate of accidents. Despite evidence that up to 50% of accidents can be avoided through mitigation of hazards and risks in the design phase of construction projects, architectural designers do not adequately engage in designing for construction health, safety, and ergonomics. The article reports on the development of an architectural design-oriented model toward a reduction of construction hazards and risks. The research intertwined a range of secondary data with four provisional studies undertaken in the Eastern Cape Province, considered representative of South Africa, and involved quantitative and qualitative methodologies directed at architectural designers registered with the South African Council for the Architectural Profession (SACAP). These served to provide local insight and a line of structured questioning for use in the main study, which was positioned in the action research (AR) paradigm and used focus-group (FG) methodology to gather extensive qualitative data from SACAP-registered participants. Synthesis of the FG data with literature and the provisional studies gave rise to a provisional model comprising six main model components and a range of subcomponents. The provisional model was validated and refined. The evolved model includes a core model embedded in a greater process model, and implementation and use of the core model relies on appropriate knowledge of architectural designers. It is recommended that tertiary architectural education institutions and those involved in architectural CPD programmes take "upstream design ownership" and use the model as a basis for designing and implementing appropriate education and training programmes.
Identifying Security-Critical Cyber-Physical Components in Industrial Control Systems
In recent years, Industrial Control Systems (ICS) have become an appealing
target for cyber attacks, having massive destructive consequences. Security
metrics are therefore essential to assess their security posture. In this
paper, we present a novel ICS security metric based on AND/OR graphs that
represent cyber-physical dependencies among network components. Our metric is
able to efficiently identify sets of critical cyber-physical components, with
minimal cost for an attacker, such that if compromised, the system would enter
into a non-operational state. We address this problem by efficiently
transforming the input AND/OR graph-based model into a weighted logical formula
that is then used to build and solve a Weighted Partial MAX-SAT problem. Our
tool, META4ICS, leverages state-of-the-art techniques from the field of logical
satisfiability optimisation in order to achieve efficient computation times.
Our experimental results indicate that the proposed security metric can
efficiently scale to networks with thousands of nodes and be computed in
seconds. In addition, we present a case study where we have used our system to
analyse the security posture of a realistic water transport network. We discuss
our findings on the plant as well as further security applications of our
metric.
Comment: Keywords: security metrics, industrial control systems, cyber-physical systems, AND-OR graphs, MAX-SAT resolution
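The metric the abstract describes can be made concrete on a toy example. The sketch below uses a hypothetical AND/OR dependency graph and attacker costs (not META4ICS's input format); instead of a MAX-SAT solver, it brute-forces the minimum-cost set of components whose compromise renders the system non-operational, which is feasible only at toy scale but shows what the solver computes.

```python
from itertools import combinations

# Hypothetical toy AND/OR graph: the plant is operational iff the root
# node 'plant' evaluates to True given which components are still up.
graph = {
    "plant":   ("AND",  ["pump", "control"]),
    "pump":    ("OR",   ["pump1", "pump2"]),   # redundant pumps
    "control": ("AND",  ["plc"]),
    "pump1":   ("ATOM", []),
    "pump2":   ("ATOM", []),
    "plc":     ("ATOM", []),
}
cost = {"pump1": 3, "pump2": 3, "plc": 5}      # attacker cost per component

def operational(down):
    """Evaluate the AND/OR graph with the components in `down` compromised."""
    def ev(node):
        kind, kids = graph[node]
        if kind == "ATOM":
            return node not in down
        vals = [ev(k) for k in kids]
        return all(vals) if kind == "AND" else any(vals)
    return ev("plant")

def min_critical_set():
    """Cheapest component set whose loss makes the plant non-operational."""
    atoms = sorted(cost)
    best = None
    for r in range(1, len(atoms) + 1):
        for s in combinations(atoms, r):
            if not operational(set(s)):
                c = sum(cost[a] for a in s)
                if best is None or c < best[0]:
                    best = (c, set(s))
    return best
```

Here compromising both pumps costs 6, but the single PLC (cost 5) already takes the plant down, so the metric is 5; a Weighted Partial MAX-SAT encoding finds the same answer without enumerating subsets, which is what lets the approach scale to thousands of nodes.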
PyCUDA and PyOpenCL: A Scripting-Based Approach to GPU Run-Time Code Generation
High-performance computing has recently seen a surge of interest in
heterogeneous systems, with an emphasis on modern Graphics Processing Units
(GPUs). These devices offer tremendous potential for performance and efficiency
in important large-scale applications of computational science. However,
exploiting this potential can be challenging, as one must adapt to the
specialized and rapidly evolving computing environment currently exhibited by
GPUs. One way of addressing this challenge is to embrace better techniques and
develop tools tailored to their needs. This article presents one simple
technique, GPU run-time code generation (RTCG), along with PyCUDA and PyOpenCL,
two open-source toolkits that support this technique.
In introducing PyCUDA and PyOpenCL, this article proposes the combination of
a dynamic, high-level scripting language with the massive performance of a GPU
as a compelling two-tiered computing platform, potentially offering significant
performance and productivity advantages over conventional single-tier, static
systems. The concept of RTCG is simple and easily implemented using existing,
robust infrastructure. Nonetheless it is powerful enough to support (and
encourage) the creation of custom application-specific tools by its users. The
premise of the paper is illustrated by a wide range of examples where the
technique has been applied with considerable success.
Comment: Submitted to Parallel Computing, Elsevier
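The essence of run-time code generation is that the kernel source is an ordinary Python string, so the program can specialize it before compilation. The sketch below is a minimal, hypothetical illustration in PyCUDA's style: generating the source requires only Python, while the commented-out compile step would require PyCUDA and a CUDA GPU.

```python
# Template for a CUDA kernel; the operation and block size are filled in
# at run time, the hallmark of RTCG.
KERNEL_TEMPLATE = """
__global__ void apply(float *a)
{
  int i = threadIdx.x + blockIdx.x * %(block_size)d;
  a[i] = %(expr)s;
}
"""

def generate_kernel(expr, block_size):
    """Specialize the kernel template for a given expression and block size."""
    return KERNEL_TEMPLATE % {"expr": expr, "block_size": block_size}

source = generate_kernel("a[i] * a[i]", 128)

# With PyCUDA installed and a GPU present, the generated source would be
# compiled and fetched at run time roughly like this:
#   import pycuda.autoinit
#   from pycuda.compiler import SourceModule
#   apply = SourceModule(source).get_function("apply")
```

Because generation happens at run time, the same program can emit many variants (different expressions, block sizes, or unrolling factors), benchmark them, and keep the fastest, which is the auto-tuning workflow the article advocates.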
- …