16,979 research outputs found
Dynamic resource allocation in a hierarchical multiprocessor system: A preliminary study
An integrated system approach to dynamic resource allocation is proposed. Some of the problems in dynamic resource allocation and the relationship of these problems to system structures are examined. A general dynamic resource allocation scheme is presented. A hierarchial system architecture which dynamically maps between processor structure and programs at multiple levels of instantiations is described. Simulation experiments were conducted to study dynamic resource allocation on the proposed system. Preliminary evaluation based on simple dynamic resource allocation algorithms indicates that with the proposed system approach, the complexity of dynamic resource management could be significantly reduced while achieving reasonable effective dynamic resource allocation
A Taxonomy of Workflow Management Systems for Grid Computing
With the advent of Grid and application technologies, scientists and
engineers are building more and more complex applications to manage and process
large data sets, and execute scientific experiments on distributed resources.
Such application scenarios require means for composing and executing complex
workflows. Therefore, many efforts have been made towards the development of
workflow management systems for Grid computing. In this paper, we propose a
taxonomy that characterizes and classifies various approaches for building and
executing workflows on Grids. We also survey several representative Grid
workflow systems developed by various projects world-wide to demonstrate the
comprehensiveness of the taxonomy. The taxonomy not only highlights the design
and engineering similarities and differences of state-of-the-art in Grid
workflow systems, but also identifies the areas that need further research.Comment: 29 pages, 15 figure
Magic-State Functional Units: Mapping and Scheduling Multi-Level Distillation Circuits for Fault-Tolerant Quantum Architectures
Quantum computers have recently made great strides and are on a long-term
path towards useful fault-tolerant computation. A dominant overhead in
fault-tolerant quantum computation is the production of high-fidelity encoded
qubits, called magic states, which enable reliable error-corrected computation.
We present the first detailed designs of hardware functional units that
implement space-time optimized magic-state factories for surface code
error-corrected machines. Interactions among distant qubits require surface
code braids (physical pathways on chip) which must be routed. Magic-state
factories are circuits comprised of a complex set of braids that is more
difficult to route than quantum circuits considered in previous work [1]. This
paper explores the impact of scheduling techniques, such as gate reordering and
qubit renaming, and we propose two novel mapping techniques: braid repulsion
and dipole moment braid rotation. We combine these techniques with graph
partitioning and community detection algorithms, and further introduce a
stitching algorithm for mapping subgraphs onto a physical machine. Our results
show a factor of 5.64 reduction in space-time volume compared to the best-known
previous designs for magic-state factories.Comment: 13 pages, 10 figure
Geometry-Oblivious FMM for Compressing Dense SPD Matrices
We present GOFMM (geometry-oblivious FMM), a novel method that creates a
hierarchical low-rank approximation, "compression," of an arbitrary dense
symmetric positive definite (SPD) matrix. For many applications, GOFMM enables
an approximate matrix-vector multiplication in or even time,
where is the matrix size. Compression requires storage and work.
In general, our scheme belongs to the family of hierarchical matrix
approximation methods. In particular, it generalizes the fast multipole method
(FMM) to a purely algebraic setting by only requiring the ability to sample
matrix entries. Neither geometric information (i.e., point coordinates) nor
knowledge of how the matrix entries have been generated is required, thus the
term "geometry-oblivious." Also, we introduce a shared-memory parallel scheme
for hierarchical matrix computations that reduces synchronization barriers. We
present results on the Intel Knights Landing and Haswell architectures, and on
the NVIDIA Pascal architecture for a variety of matrices.Comment: 13 pages, accepted by SC'1
Deterministic Consistency: A Programming Model for Shared Memory Parallelism
The difficulty of developing reliable parallel software is generating
interest in deterministic environments, where a given program and input can
yield only one possible result. Languages or type systems can enforce
determinism in new code, and runtime systems can impose synthetic schedules on
legacy parallel code. To parallelize existing serial code, however, we would
like a programming model that is naturally deterministic without language
restrictions or artificial scheduling. We propose "deterministic consistency",
a parallel programming model as easy to understand as the "parallel assignment"
construct in sequential languages such as Perl and JavaScript, where concurrent
threads always read their inputs before writing shared outputs. DC supports
common data- and task-parallel synchronization abstractions such as fork/join
and barriers, as well as non-hierarchical structures such as producer/consumer
pipelines and futures. A preliminary prototype suggests that software-only
implementations of DC can run applications written for popular parallel
environments such as OpenMP with low (<10%) overhead for some applications.Comment: 7 pages, 3 figure
Recommended from our members
Computing infrastructure issues in distributed communications systems : a survey of operating system transport system architectures
The performance of distributed applications (such as file transfer, remote login, tele-conferencing, full-motion video, and scientific visualization) is influenced by several factors that interact in complex ways. In particular, application performance is significantly affected both by communication infrastructure factors and computing infrastructure factors. Several communication infrastructure factors include channel speed, bit-error rate, and congestion at intermediate switching nodes. Computing infrastructure factors include (among other things) both protocol processing activities (such as connection management, flow control, error detection, and retransmission) and general operating system factors (such as memory latency, CPU speed, interrupt and context switching overhead, process architecture, and message buffering). Due to a several orders of magnitude increase in network channel speed and an increase in application diversity, performance bottlenecks are shifting from the network factors to the transport system factors.This paper defines an abstraction called an "Operating System Transport System Architecture" (OSTSA) that is used to classify the major components and services in the computing infrastructure. End-to-end network protocols such as TCP, TP4, VMTP, XTP, and Delta-t typically run on general-purpose computers, where they utilize various operating system resources such as processors, virtual memory, and network controllers. The OSTSA provides services that integrate these resources to support distributed applications running on local and wide area networks.A taxonomy is presented to evaluate OSTSAs in terms of their support for protocol processing activities. We use this taxonomy to compare and contrast five general-purpose commercial and experimental operating systems including System V UNIX, BSD UNIX, the x-kernel, Choices, and Xinu
Modeling, Analysis, and Hard Real-time Scheduling of Adaptive Streaming Applications
In real-time systems, the application's behavior has to be predictable at
compile-time to guarantee timing constraints. However, modern streaming
applications which exhibit adaptive behavior due to mode switching at run-time,
may degrade system predictability due to unknown behavior of the application
during mode transitions. Therefore, proper temporal analysis during mode
transitions is imperative to preserve system predictability. To this end, in
this paper, we initially introduce Mode Aware Data Flow (MADF) which is our new
predictable Model of Computation (MoC) to efficiently capture the behavior of
adaptive streaming applications. Then, as an important part of the operational
semantics of MADF, we propose the Maximum-Overlap Offset (MOO) which is our
novel protocol for mode transitions. The main advantage of this transition
protocol is that, in contrast to self-timed transition protocols, it avoids
timing interference between modes upon mode transitions. As a result, any mode
transition can be analyzed independently from the mode transitions that
occurred in the past. Based on this transition protocol, we propose a hard
real-time analysis as well to guarantee timing constraints by avoiding
processor overloading during mode transitions. Therefore, using this protocol,
we can derive a lower bound and an upper bound on the earliest starting time of
the tasks in the new mode during mode transitions in such a way that hard
real-time constraints are respected.Comment: Accepted for presentation at EMSOFT 2018 and for publication in IEEE
Transactions on Computer-Aided Design of Integrated Circuits and Systems
(TCAD) as part of the ESWEEK-TCAD special issu
- …