
    When One Pipeline Is Not Enough

    Pipelines that operate on buffers often work well to mitigate the high latency inherent in interprocessor communication and in accessing data on disk. Running a single pipeline on each node works well when each pipeline stage consumes and produces data at the same rate. If a stage might consume data faster or slower than it produces data, however, a single pipeline becomes unwieldy. We describe how we have extended the FG programming environment to support multiple pipelines in two forms. When a node might send and receive data at different rates during interprocessor communication, we use disjoint pipelines that send and receive on each node. When a node consumes and produces data from different streams, we use multiple pipelines that intersect at a particular stage. Experimental results for two out-of-core sorting algorithms, one based on columnsort and the other a distribution-based sort, demonstrate the value of multiple pipelines.
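
    A minimal Python sketch of the disjoint-pipelines idea (FG itself is a C++ framework; every name below is ours, not FG's API): each pipeline is a chain of stage threads connected by bounded buffer queues, and a node runs its send pipeline and receive pipeline independently so the two sides can progress at different rates.

        import threading
        import queue

        def pipeline(stages, inbox):
            # Chain stage functions with bounded queues; one thread per stage.
            for fn in stages:
                outbox = queue.Queue(maxsize=4)

                def worker(fn=fn, src=inbox, dst=outbox):
                    while (buf := src.get()) is not None:
                        dst.put(fn(buf))
                    dst.put(None)  # propagate end-of-stream downstream
                threading.Thread(target=worker).start()
                inbox = outbox
            return inbox

        # Disjoint pipelines: sending and receiving share no stage, so a slow
        # receiver cannot stall the sender beyond its own queue's bound.
        send_q = queue.Queue(maxsize=4)
        recv_q = queue.Queue(maxsize=4)
        send_out = pipeline([lambda buf: ("sent", buf)], send_q)
        recv_out = pipeline([lambda buf: ("received", buf)], recv_q)

        for i in range(3):
            send_q.put([i])
            recv_q.put([10 * i])
        send_q.put(None)
        recv_q.put(None)
        while (item := send_out.get()) is not None:
            print(*item)
        while (item := recv_out.get()) is not None:
            print(*item)

    An intersecting pair of pipelines would instead feed two such chains into the inbox of one shared stage.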

    Tackling Latency Using FG

    Applications that operate on datasets too big to fit in main memory, known in the literature as external-memory or out-of-core applications, store their data on one or more disks. Several of these applications make multiple passes over the data, where each pass reads data from disk, operates on it, and writes data back to disk. Compared with an in-memory operation, a disk-I/O operation takes orders of magnitude (approximately 100,000 times) longer; that is, disk I/O is a high-latency operation. Out-of-core algorithms often run on a distributed-memory cluster to take advantage of a cluster's computing power, memory, disk space, and bandwidth. By doing so, however, they introduce another high-latency operation: interprocessor communication. Efficient implementations of these algorithms access data in blocks to amortize the cost of a single data transfer over the disk or the network, and they introduce asynchrony to overlap high-latency operations and computations. FG, short for Asynchronous Buffered Computation Design and Engineering Framework Generator, is a programming framework that helps to mitigate latency in out-of-core programs that run on distributed-memory clusters. An FG program is composed of a pipeline of stages operating on buffers. FG runs the stages asynchronously so that stages performing high-latency operations can overlap their work with other stages. FG supplies the code to create a pipeline, synchronize the stages, and manage data buffers; the user provides a straightforward function, containing only synchronous calls, for each stage. In this thesis, we use FG to tackle latency and exploit the available parallelism in out-of-core and distributed-memory programs. We show how FG helps us design out-of-core programs and think about parallel computing in general using three instances: an out-of-core, distribution-based sorting program; an implementation of external-memory suffix arrays; and a scientific-computing application called the fast Gauss transform. FG's interaction with these real-world programs is symbiotic: FG enables efficient implementations of these programs, and the design of the first two of them pointed us toward further extensions for FG. Today's era of multicore machines compels us to harness all opportunities for parallelism that are available in a program, and so in the latter two applications, we combine FG's multithreading capabilities with the routines that OpenMP offers for in-core parallelism. In the fast Gauss transform application, we use this strategy to realize an up to 20-fold performance improvement compared with an alternate fast Gauss transform implementation. In addition, we use our experience with designing programs in FG to provide some suggestions for the next version of FG.
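
    FG's pipeline machinery is not reproduced here, but the overlap it automates can be shown in a few lines. The sketch below (function names are ours, not FG's) double-buffers one pass over a file: while block i is being processed and written, a worker thread prefetches block i+1, so the high-latency read overlaps the computation.

        from concurrent.futures import ThreadPoolExecutor

        def read_block(f, block_size):
            return f.read(block_size)

        def one_pass(src_path, dst_path, process, block_size=1 << 20):
            with open(src_path, "rb") as src, open(dst_path, "wb") as dst, \
                 ThreadPoolExecutor(max_workers=1) as pool:
                pending = pool.submit(read_block, src, block_size)
                while True:
                    block = pending.result()  # wait for the prefetched block
                    if not block:
                        break
                    # Prefetch the next block before processing this one.
                    pending = pool.submit(read_block, src, block_size)
                    dst.write(process(block))

        # Example: one pass that upper-cases each block.
        # one_pass("input.dat", "output.dat", bytes.upper)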

    Toward improved construction health, safety, and ergonomics in South Africa: A working model for use by architectural designers

    The construction industry has a high accident rate. Despite evidence that up to 50% of accidents can be avoided by mitigating hazards and risks in the design phase of construction projects, architectural designers do not adequately engage in designing for construction health, safety, and ergonomics. The article reports on the development of an architectural design-oriented model aimed at reducing construction hazards and risks. The research intertwined a range of secondary data with four provisional studies undertaken in the Eastern Cape Province, considered representative of South Africa, and involved quantitative and qualitative methodologies directed at architectural designers registered with the South African Council for the Architectural Profession (SACAP). These served to provide local insight and a line of structured questioning for use in the main study, which was positioned in the action research (AR) paradigm and used focus-group (FG) methodology to elicit extensive qualitative data from SACAP-registered participants. Synthesis of the FG data with the literature and the provisional studies gave rise to a provisional model comprising six main components and a range of subcomponents. The provisional model was validated and refined. The evolved model includes a core model embedded in a greater process model, and implementation and use of the core model rely on appropriate knowledge among architectural designers. It is recommended that tertiary architectural education institutions and those involved in architectural CPD programmes take ‘upstream design ownership’ and use the model as a basis for designing and implementing appropriate education and training programmes.

    Identifying Security-Critical Cyber-Physical Components in Industrial Control Systems

    In recent years, Industrial Control Systems (ICS) have become an appealing target for cyber attacks with massive destructive consequences. Security metrics are therefore essential to assess their security posture. In this paper, we present a novel ICS security metric based on AND/OR graphs that represent cyber-physical dependencies among network components. Our metric efficiently identifies sets of critical cyber-physical components, of minimal cost to an attacker, such that if they were compromised, the system would enter a non-operational state. We address this problem by transforming the input AND/OR graph-based model into a weighted logical formula that is then used to build and solve a Weighted Partial MAX-SAT problem. Our tool, META4ICS, leverages state-of-the-art techniques from the field of logical satisfiability optimisation to achieve efficient computation times. Our experimental results indicate that the proposed security metric can efficiently scale to networks with thousands of nodes and be computed in seconds. In addition, we present a case study in which we used our system to analyse the security posture of a realistic water transport network. We discuss our findings on the plant as well as further security applications of our metric.
    Comment: Keywords: security metrics, industrial control systems, cyber-physical systems, AND-OR graphs, MAX-SAT resolution
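
    META4ICS's own encoding is not given in the abstract, but the Weighted Partial MAX-SAT formulation can be sketched with PySAT's RC2 solver. The toy plant, variable names, and costs below are our assumptions: hard clauses propagate operability through the AND/OR dependencies and force the target non-operational, while soft clauses charge the attacker each component's compromise cost, so the solver returns a minimal-cost critical set.

        from pysat.formula import WCNF
        from pysat.examples.rc2 import RC2

        # Variable i means "component i is operational".
        hw1, hw2, plc, sensor, pump1, pump2, tank = range(1, 8)

        wcnf = WCNF()

        # AND node: pump_i is operational if its hardware AND the PLC AND
        # the sensor are all operational.
        for hw, pump in ((hw1, pump1), (hw2, pump2)):
            wcnf.append([-hw, -plc, -sensor, pump])

        # OR node: the tank is operational if either pump is.
        wcnf.append([-pump1, tank])
        wcnf.append([-pump2, tank])

        # Hard attack goal: the tank ends up non-operational.
        wcnf.append([-tank])

        # Soft clauses: each leaf stays operational unless the attacker pays
        # its compromise cost; RC2 minimises the total weight it falsifies.
        for leaf, cost in ((hw1, 3), (hw2, 3), (plc, 5), (sensor, 1)):
            wcnf.append([leaf], weight=cost)

        rc2 = RC2(wcnf)
        model = rc2.compute()
        names = {hw1: "hw1", hw2: "hw2", plc: "plc", sensor: "sensor"}
        critical = [names[v] for v in names if -v in model]
        print("minimal-cost critical set:", critical, "cost:", rc2.cost)

    On this toy instance the cheapest cut is the shared sensor (cost 1), since losing it disables both pumps and hence the tank.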

    PyCUDA and PyOpenCL: A Scripting-Based Approach to GPU Run-Time Code Generation

    High-performance computing has recently seen a surge of interest in heterogeneous systems, with an emphasis on modern Graphics Processing Units (GPUs). These devices offer tremendous potential for performance and efficiency in important large-scale applications of computational science. However, exploiting this potential can be challenging, as one must adapt to the specialized and rapidly evolving computing environment currently exhibited by GPUs. One way of addressing this challenge is to embrace better techniques and develop tools tailored to their needs. This article presents one simple technique, GPU run-time code generation (RTCG), along with PyCUDA and PyOpenCL, two open-source toolkits that support this technique. In introducing PyCUDA and PyOpenCL, this article proposes the combination of a dynamic, high-level scripting language with the massive performance of a GPU as a compelling two-tiered computing platform, potentially offering significant performance and productivity advantages over conventional single-tier, static systems. The concept of RTCG is simple and easily implemented using existing, robust infrastructure. Nonetheless, it is powerful enough to support (and encourage) the creation of custom application-specific tools by its users. The premise of the paper is illustrated by a wide range of examples in which the technique has been applied with considerable success.
    Comment: Submitted to Parallel Computing, Elsevier
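
    As a concrete illustration of RTCG in the sense the article describes, here is a minimal PyCUDA sketch modelled on its documented usage (the kernel and the make_kernel helper are ours): the scalar operator is chosen at run time, spliced into the CUDA source as a string, and compiled on the fly with SourceModule.

        import numpy as np
        import pycuda.autoinit  # creates a CUDA context on import
        import pycuda.driver as drv
        from pycuda.compiler import SourceModule

        def make_kernel(op):
            # Run-time code generation: splice the operator into the CUDA
            # source, then compile the resulting kernel on the fly.
            source = """
            __global__ void apply(float *out, const float *a, const float *b)
            {
                int i = threadIdx.x + blockIdx.x * blockDim.x;
                out[i] = a[i] %s b[i];
            }
            """ % op
            return SourceModule(source).get_function("apply")

        a = np.random.randn(256).astype(np.float32)
        b = np.random.randn(256).astype(np.float32)
        out = np.empty_like(a)

        add = make_kernel("+")  # e.g., chosen from user input or a config
        add(drv.Out(out), drv.In(a), drv.In(b), block=(256, 1, 1), grid=(1, 1))
        assert np.allclose(out, a + b)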
