Performance Analysis of a Novel GPU Computation-to-core Mapping Scheme for Robust Facet Image Modeling
Though the GPGPU concept is well known in image processing, much work remains to fully exploit GPUs as an alternative computation engine. This paper investigates computation-to-core mapping strategies to probe the efficiency and scalability of the robust facet image modeling algorithm on GPUs. Our fine-grained computation-to-core mapping scheme shows a significant performance gain over the standard pixel-wise mapping scheme. Through in-depth performance comparisons across the two mapping schemes, we analyze the impact of the level of parallelism on GPU computation and suggest two principles for optimizing future image processing applications on the GPU platform.
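The abstract contrasts the two mapping granularities without spelling out the kernel structure, so the toy Python sketch below is only a hedged illustration of what "level of parallelism" means here, assuming a hypothetical K x K facet window: pixel-wise mapping assigns one thread per output pixel, while a fine-grained scheme assigns one thread per (pixel, window element) and combines partial results with a reduction.

```python
H, W, K = 512, 512, 5  # hypothetical image size and K x K facet window

# Pixel-wise mapping: one "thread" per output pixel; each thread loops
# over its whole K*K window, so parallelism is limited to H*W threads.
pixel_threads = H * W

# Fine-grained mapping: one "thread" per (pixel, window element); the
# per-window partial results are then combined by a reduction step,
# exposing K*K times more parallelism at the cost of that reduction.
fine_threads = H * W * K * K

print(pixel_threads, "coarse threads vs", fine_threads, "fine threads")
```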
Packing Sporadic Real-Time Tasks on Identical Multiprocessor Systems
In real-time systems, in addition to functional correctness, recurrent tasks must fulfill timing constraints to ensure the correct behavior of the
system. Partitioned scheduling is widely used in real-time systems, i.e., the
tasks are statically assigned onto processors while ensuring that all timing
constraints are met. The decision version of the problem, which is to check
whether the deadline constraints of tasks can be satisfied on a given number of
identical processors, is known to be NP-complete in the strong sense.
Several studies on this problem are based on approximations involving resource
augmentation, i.e., speeding up individual processors. This paper studies
another type of resource augmentation by allocating additional processors, a
topic that has not been explored until recently. We provide polynomial-time
algorithms and analysis, in which the approximation factors are dependent upon
the input instances. Specifically, the factors are related to the maximum ratio
of the period to the relative deadline of a task in the given task set. We also
show that these algorithms unfortunately cannot achieve a constant
approximation factor for general cases. Furthermore, we prove that the problem
does not admit any asymptotic polynomial-time approximation scheme (APTAS)
unless P = NP when the task set has constrained deadlines, i.e., the relative deadline of a task is no more than the period of the task.
Comment: Accepted and to appear in ISAAC 2018, Yi-Lan, Taiwan.
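For readers unfamiliar with the setting, the following Python sketch illustrates partitioned scheduling with a first-fit assignment. It is not the paper's algorithm: the schedulability check uses a simple sufficient density test (total C/min(D, T) per processor at most 1) instead of the paper's demand-bound analysis, and all names (Task, first_fit_partition) are illustrative.

```python
from dataclasses import dataclass

@dataclass
class Task:
    wcet: float      # C: worst-case execution time
    deadline: float  # D: relative deadline
    period: float    # T: minimum inter-arrival time

def density(task: Task) -> float:
    # Sufficient uniprocessor EDF test: a task set fits on one
    # processor if the total density C / min(D, T) is at most 1.
    return task.wcet / min(task.deadline, task.period)

def first_fit_partition(tasks: list[Task], m: int) -> list[list[Task]] | None:
    """Statically assign tasks to m identical processors (first-fit).

    Returns the partition, or None if some task fits on no processor
    under the density test -- the point where one would either reject
    the instance or augment resources by allocating more processors.
    """
    processors: list[list[Task]] = [[] for _ in range(m)]
    load = [0.0] * m
    # Sorting by decreasing density is a common packing heuristic.
    for task in sorted(tasks, key=density, reverse=True):
        for p in range(m):
            if load[p] + density(task) <= 1.0:
                processors[p].append(task)
                load[p] += density(task)
                break
        else:
            return None  # infeasible under this (sufficient) test
    return processors

# Example with constrained deadlines (D <= T), as in the APTAS result.
tasks = [Task(2, 5, 10), Task(3, 6, 6), Task(1, 4, 8), Task(4, 9, 12)]
print(first_fit_partition(tasks, m=2))
```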
A bibliography on parallel and vector numerical algorithms
This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming languages, and other topics of interest to scientific computing. Certain conference proceedings and anthologies which have been published in book form are also listed.
Parallel processing and expert systems
Whether it be monitoring the thermal subsystem of Space Station Freedom, or controlling the navigation of the autonomous rover on Mars, NASA missions in the 1990s cannot enjoy an increased level of autonomy without the efficient implementation of expert systems. Merely increasing the computational speed of uniprocessors may not be able to guarantee that real-time demands are met for larger systems. Speedup via parallel processing must be pursued alongside the optimization of sequential implementations. Prototypes of parallel expert systems have been built at universities and industrial laboratories in the U.S. and Japan. The state-of-the-art research in progress related to parallel execution of expert systems is surveyed. The survey discusses multiprocessors for expert systems, parallel languages for symbolic computations, and mapping expert systems to multiprocessors. Results to date indicate that the parallelism achieved for these systems is small. The main reasons are (1) the body of knowledge applicable in any given situation and the amount of computation executed by each rule firing are small, (2) dividing the problem-solving process into relatively independent partitions is difficult, and (3) implementation decisions that enable expert systems to be incrementally refined hamper compile-time optimization. In order to obtain greater speedups, data parallelism and application parallelism must be exploited.
A Survey of Techniques For Improving Energy Efficiency in Embedded Computing Systems
Recent technological advances have greatly improved the performance and
features of embedded systems. With the number of mobile devices alone now approaching the population of Earth, embedded systems have become truly ubiquitous. These trends, however, have also made the task of managing their power consumption extremely challenging. In recent years, several techniques have been proposed to address this issue. In this paper, we survey the techniques for managing the power consumption of embedded systems. We discuss the need for power management and provide a classification of the techniques along several important parameters to highlight their similarities and differences. This paper is intended to help researchers and application developers gain insight into the working of power management techniques and design even more efficient high-performance embedded systems of tomorrow.
Techniques for the realization of ultrareliable spaceborne computers: interim scientific report
Error-free ultrareliable spaceborne computer
On the periodic behavior of real-time schedulers on identical multiprocessor platforms
This paper proposes a general periodicity result concerning any deterministic and memoryless scheduling algorithm (including non-work-conserving algorithms), for any context, on identical multiprocessor platforms. By context we mean the hardware architecture (uniprocessor, multicore) as well as task constraints such as critical sections, precedence constraints, self-suspension, etc. Since the result is based only on the releases and deadlines, it is independent of any other parameter. Note that we do not claim that the given interval is minimal, but it is an upper bound on any cycle of any feasible schedule provided by any deterministic and memoryless scheduler.
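As a hedged illustration of the underlying intuition: for the special case of synchronously released periodic tasks, a deterministic and memoryless scheduler sees the same ready-state once the release pattern repeats, so the hyperperiod (the lcm of the periods) bounds the cycle length. The paper's interval for general contexts (offsets, sporadic behavior, extra constraints) is not this simple formula.

```python
from math import lcm

def hyperperiod(periods: list[int]) -> int:
    # lcm of all periods: for synchronously released periodic tasks,
    # the release pattern (and hence the decisions of a deterministic,
    # memoryless scheduler) repeats with this period.
    return lcm(*periods)

print(hyperperiod([4, 6, 10]))  # -> 60, a candidate cycle length
```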
Very Large-Scale Neighborhoods with Performance Guarantees for Minimizing Makespan on Parallel Machines
We study the problem of minimizing the makespan on m parallel machines. We introduce a very large-scale neighborhood of exponential size (in the number of machines) that is based on a matching in a complete graph. The idea is to partition the jobs assigned to the same machine into two sets. This partitioning is done for every machine with some chosen rule, yielding 2m parts. A new assignment is obtained by assigning exactly two parts to every machine. The neighborhood N_split consists of all possible rearrangements of the parts over the machines. The best assignment in N_split can be calculated in time O(m log m) by determining the perfect matching with minimum maximal edge weight in an improvement graph, where the vertices correspond to parts and the edge weights correspond to the sums of the processing times of the jobs belonging to the parts. Additionally, we examine local optima in this neighborhood and in combinations with other neighborhoods. We derive performance guarantees for these local optima.
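The min-max matching step admits a simple realization here because every edge weight is the sum of its two endpoint loads: after sorting the part loads, pairing the largest with the smallest minimizes the maximum pair sum, which matches the O(m log m) claim. The Python sketch below, with an illustrative splitting rule (the paper allows any chosen rule), walks through one N_split best-move evaluation; the function names and example instance are assumptions.

```python
def split_parts(machines: list[list[int]]) -> list[int]:
    """Split each machine's jobs into two parts (an LPT-style rule,
    chosen only for illustration) and return the 2m part loads."""
    parts = []
    for jobs in machines:
        a, b = 0, 0
        for p in sorted(jobs, reverse=True):
            # Always add the next-largest job to the lighter part.
            if a <= b:
                a += p
            else:
                b += p
        parts.extend([a, b])
    return parts

def best_split_makespan(part_loads: list[int]) -> int:
    """Makespan of the best assignment in N_split: pair the 2m parts,
    two per machine. Since an edge's weight is just the sum of its two
    part loads, pairing the largest with the smallest (after sorting)
    gives the perfect matching with minimum maximal edge weight."""
    loads = sorted(part_loads)
    return max(loads[i] + loads[-1 - i] for i in range(len(loads) // 2))

machines = [[8, 7], [2, 1], [6, 3]]  # current loads 15, 3, 9 -> makespan 15
parts = split_parts(machines)
print("parts:", parts, "best makespan in N_split:", best_split_makespan(parts))
# -> parts [8, 7, 2, 1, 6, 3] re-pair to makespan 9
```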