977 research outputs found

Mentat: An object-oriented macro data flow system

Author: Grimshaw Andrew S.
Liu Jane W. S.

Mentat, an object-oriented macro data flow system designed to facilitate parallelism in distributed systems, is presented. The macro data flow model is a model of computation similar to the data flow model with two principal differences: the computational complexity of the actors is much greater than in traditional data flow systems, and there are persistent actors that maintain state information between executions. Mentat is a system that combines the object-oriented programming paradigm and the macro data flow model of computation. Mentat programs use a dynamic structure called a future list to represent the future of computations

NASA Technical Reports Server

Solving the Ghost-Gluon System of Yang-Mills Theory on GPUs

Author: Gundolf Haase
Markus Hopfer
Reinhard Alkofer
Publication venue: 'Elsevier BV'
Publication date: 18/12/2012

We solve the ghost-gluon system of Yang-Mills theory using Graphics Processing Units (GPUs). Working in Landau gauge, we use the Dyson-Schwinger formalism for the mathematical description, as this approach is well-suited to directly benefit from the computing power of the GPUs. With the help of a Chebyshev expansion for the dressing functions and a subsequent application of a Newton-Raphson method, the non-linear system of coupled integral equations is linearized. The resulting Newton matrix is generated in parallel using OpenMPI and CUDA(TM). Our results show that it is possible to cut down the run time by two orders of magnitude as compared to a sequential version of the code. This makes the proposed techniques well-suited for Dyson-Schwinger calculations on more complicated systems where the Yang-Mills sector of QCD serves as a starting point. In addition, the computation of Schwinger functions using GPU devices is studied.

Comment: 19 pages, 7 figures, additional figure added, dependence on block-size is investigated in more detail, version accepted by CP

arXiv.org e-Print Archive

Crossref

Packing Sporadic Real-Time Tasks on Identical Multiprocessor Systems

Author: Bansal Nikhil
Chakraborty Samarjit
Chen Jian-Jia
von der Brüggen Georg
Publication date: 01/01/2018

In real-time systems, in addition to functional correctness, recurrent tasks must fulfill timing constraints to ensure the correct behavior of the system. Partitioned scheduling is widely used in real-time systems, i.e., the tasks are statically assigned onto processors while ensuring that all timing constraints are met. The decision version of the problem, which is to check whether the deadline constraints of tasks can be satisfied on a given number of identical processors, has been known to be $\mathcal{NP}$-complete in the strong sense. Several studies on this problem are based on approximations involving resource augmentation, i.e., speeding up individual processors. This paper studies another type of resource augmentation by allocating additional processors, a topic that has not been explored until recently. We provide polynomial-time algorithms and analysis, in which the approximation factors are dependent upon the input instances. Specifically, the factors are related to the maximum ratio of the period to the relative deadline of a task in the given task set. We also show that these algorithms unfortunately cannot achieve a constant approximation factor for general cases. Furthermore, we prove that the problem does not admit any asymptotic polynomial-time approximation scheme (APTAS) unless $\mathcal{P}=\mathcal{NP}$ when the task set has constrained deadlines, i.e., the relative deadline of a task is no more than the period of the task.

Comment: Accepted and to appear in ISAAC 2018, Yi-Lan, Taiwa
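The partitioning decision problem described in this abstract can be illustrated with a minimal sketch. This is not the algorithm from the paper: it is a plain first-fit heuristic combined with the well-known density test, under which a set of sporadic tasks is schedulable by EDF on one processor if the sum of C / min(D, T) does not exceed 1. The names `Task` and `first_fit_partition` are hypothetical and introduced only for this illustration.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Task:
    wcet: int      # worst-case execution time C
    deadline: int  # relative deadline D (constrained deadline: D <= T)
    period: int    # minimum inter-arrival time T

    @property
    def density(self) -> Fraction:
        # C / min(D, T); exact fractions avoid rounding errors at the bound of 1
        return Fraction(self.wcet, min(self.deadline, self.period))


def first_fit_partition(tasks, num_processors):
    """Assign tasks to processors with first-fit; return None if it fails.

    A processor accepts a task only while its total density stays <= 1,
    which is a sufficient (not necessary) condition for EDF schedulability.
    """
    loads = [Fraction(0)] * num_processors
    bins = [[] for _ in range(num_processors)]
    # Illustrative heuristic choice: place the densest tasks first.
    for task in sorted(tasks, key=lambda t: t.density, reverse=True):
        for p in range(num_processors):
            if loads[p] + task.density <= 1:
                loads[p] += task.density
                bins[p].append(task)
                break
        else:
            return None  # no processor could take this task
    return bins


tasks = [Task(2, 4, 8), Task(1, 2, 4), Task(3, 6, 6)]  # each has density 1/2
one_cpu = first_fit_partition(tasks, 1)
two_cpus = first_fit_partition(tasks, 2)
```

Because the density test is only sufficient, a `None` result means this heuristic failed, not that the task set is infeasible on the given number of processors. Adding processors until the heuristic succeeds corresponds loosely to the resource augmentation by additional processors that the abstract mentions.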

arXiv.org e-Print Archive

Repository TU/e

CWI's Institutional Repository

Pure OAI Repository

Performance Analysis of a Novel GPU Computation-to-core Mapping Scheme for Robust Facet Image Modeling

Author: Cao Yong
Park Seung In
Quek Francis
Watson Layne T.
Publication date: 01/01/2012

Though the GPGPU concept is well-known in image processing, much more work remains to be done to fully exploit GPUs as an alternative computation engine. This paper investigates the computation-to-core mapping strategies to probe the efficiency and scalability of the robust facet image modeling algorithm on GPUs. Our fine-grained computation-to-core mapping scheme shows a significant performance gain over the standard pixel-wise mapping scheme. With in-depth performance comparisons across the two different mapping schemes, we analyze the impact of the level of parallelism on the GPU computation and suggest two principles for optimizing future image processing applications on the GPU platform

Computer Science Technical Reports @Virginia Tech

08071 Abstracts Collection -- Scheduling

Author: Liu Jane W. S.
Pruhs Kirk
Publication venue: Dagstuhl Seminar Proceedings. 08071 - Scheduling
Publication date: 01/01/2008

From 10.02. to 15.02., the Dagstuhl Seminar 08071 "Scheduling" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available

Dagstuhl Research Online Publication Server

Design of testbed and emulation tools

Author: Flynn M. J.
Lundstrom S. F.

The research summarized was concerned with the design of testbed and emulation tools suitable to assist in projecting, with reasonable accuracy, the expected performance of highly concurrent computing systems on large, complete applications. Such testbed and emulation tools are intended for the eventual use of those exploring new concurrent system architectures and organizations, either as users or as designers of such systems. While a range of alternatives was considered, a software-based set of hierarchical tools was chosen to provide maximum flexibility, to ease the move to new computers as technology improves, and to take advantage of the inherent reliability and availability of commercially available computing systems

NASA Technical Reports Server

Beyond Dataflow

Author: Borut Robič
Jurij Šilc
Theo Ungerer
Publication venue: 'University of Zagreb - University Computing Centre'
Publication date: 01/01/2000

This paper presents some recent advanced dataflow architectures. While the dataflow concept offers the potential of high performance, the performance of an actual dataflow implementation can be restricted by a limited number of functional units, limited memory bandwidth, and the need to associatively match pending operations with available functional units. Since the early 1970s, there have been significant developments in both fundamental research and practical realizations of dataflow models of computation. In particular, there has been active research and development in multithreaded architectures that evolved from the dataflow model. Some other techniques for combining control-flow and dataflow have also emerged, such as coarse-grain dataflow, dataflow with complex machine operations, RISC dataflow, and micro dataflow. These developments have also had a certain impact on the conception of high-performance superscalar processors in the “post-RISC” era

OPUS Augsburg

Crossref

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

On the implementation of real-time slot-based task-splitting scheduling algorithms for multiprocessor systems

Author: Andersson Björn
Bletsas Konstantinos
Sousa Paulo Baltarejo
Tovar Eduardo
Publication venue: IPP-Hurray Group
Publication date: 01/01/2011

In this paper we discuss challenges and design principles of an implementation of slot-based task-splitting algorithms in Linux kernel version 2.6.34. We show that this kernel version provides the features required for implementing such scheduling algorithms. We show that the real behavior of the scheduling algorithm is very close to the theoretical one. We run and discuss experiments on 4-core and 24-core machines

Repositório Científico do Instituto Politécnico do Porto