
Constructs and evaluation strategies for intelligent speculative parallelism - armageddon revisited

Author: Guzman Adolfo
Hermenegildo Manuel V.
Publication venue: Facultad de Informática (UPM)
Publication date: 01/02/1988

This report addresses speculative parallelism (the assignment of spare processing resources to tasks which are not known to be strictly required for the successful completion of a computation) at the user and application level. At this level, the execution of a program is seen as a (dynamic) tree or, in general, a graph. A solution to a problem is a traversal of this graph from the initial state to a node known to be the answer. Speculative parallelism then represents the assignment of resources to multiple branches of this graph even if they are not positively known to be on the path to a solution. In highly non-deterministic programs the branching factor can be very high, and a naive assignment will very soon use up all the resources. This report presents work assignment strategies other than the usual depth-first and breadth-first; instead, best-first strategies are used. Since their definition is application-dependent, the application language contains primitives that allow the user (or application programmer) to a) indicate when intelligent OR-parallelism should be used; b) provide the functions that define "best"; and c) indicate when to use them. An abstract architecture enables those primitives to perform the search in a "speculative" way, using several processors, synchronizing them, killing the siblings of the path leading to the answer, etc. The user is freed from worrying about these interactions. Several search strategies are proposed and their implementation issues are addressed. "Armageddon," a global pruning method, is introduced, together with both a software and a hardware implementation for it. The concepts presented are applicable to areas of Artificial Intelligence such as extensive expert systems, planning, game playing, and in general to large search problems. The proposed strategies, although showing promise, have not been evaluated by simulation or experimentation.
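As a rough illustration of the best-first idea described in this abstract, the sketch below expands the most promising frontier nodes in small batches, with the batch size standing in for the number of spare processors. It is a sequential toy under stated assumptions, not the report's primitives or architecture: the names `best_first_search`, `score`, `expand`, and `width` are hypothetical, and discarding the frontier once an answer is found only loosely mirrors the pruning of sibling branches.

```python
import heapq

def best_first_search(start, is_goal, expand, score, width=2):
    """Best-first search that expands up to `width` nodes per round.

    `score` is the user-supplied definition of "best" (lower is better);
    `width` is a hypothetical stand-in for the number of spare processors.
    Returns the path to the first goal found, or None.
    """
    counter = 0  # tie-breaker so heap entries never compare nodes directly
    frontier = [(score(start), counter, start, [start])]
    seen = {start}
    while frontier:
        # Speculatively take the `width` most promising nodes.
        batch = [heapq.heappop(frontier)
                 for _ in range(min(width, len(frontier)))]
        for _, _, node, path in batch:
            if is_goal(node):
                # Everything still on the frontier is abandoned here.
                return path
            for child in expand(node):
                if child not in seen:
                    seen.add(child)
                    counter += 1
                    heapq.heappush(
                        frontier,
                        (score(child), counter, child, path + [child]))
    return None

# Toy problem (an assumption for illustration): reach 10 from 1
# using the moves n + 1 and n * 2, never exceeding 20.
target = 10
path = best_first_search(
    start=1,
    is_goal=lambda n: n == target,
    expand=lambda n: [c for c in (n + 1, n * 2) if c <= 20],
    score=lambda n: abs(target - n),
)
print(path)
```

The scoring function is the only application-specific part; swapping it changes which branches receive resources first without touching the search loop.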

Archivo Digital UPM

Access to vectors in multi-module memories

Author: Ayguadé Parra Eduard
Peiron Guàrdia Montse
Valero Cortés Mateo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1994

The poor bandwidth obtained from memory when conflicts arise in the modules or in the interconnection network degrades the performance of computers. Address transformation schemes, such as interleaving, skewing and linear transformations, have been proposed to achieve conflict-free access for streams with constant stride. However, this is achieved only for some strides. In this paper, we summarize a mechanism that requests the elements out of order, which achieves conflict-free access for a larger number of strides. We study the cases of a single vector processor and of a vector multiprocessor system. For the latter case, we propose a synchronous mode of accessing memory that can be applied in SIMD machines or in MIMD systems with decoupled access and execution. Peer reviewed. Postprint (published version).
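To make the phrase "only for some strides" concrete, the sketch below models the baseline case of in-order access with plain low-order interleaving, where element address `a` maps to module `a mod M`. It assumes a matched memory in which a module stays busy for `M` cycles, so a stream is conflict-free when every `M` consecutive requests hit distinct modules. The function names are hypothetical, and this is not the out-of-order mechanism the paper proposes.

```python
from math import gcd

def modules_touched(stride, num_modules, length, base=0):
    """Module index hit by each element of a constant-stride stream,
    assuming low-order interleaving (module = address mod num_modules)."""
    return [(base + i * stride) % num_modules for i in range(length)]

def is_conflict_free(stride, num_modules):
    """In-order access visits num_modules distinct modules in a row
    exactly when the stride and the module count are coprime."""
    return gcd(stride, num_modules) == 1

M = 8
for s in (1, 3, 4, 8):
    hits = modules_touched(s, M, M)
    print(f"stride {s}: modules {hits}, "
          f"distinct {len(set(hits))}, conflict-free {is_conflict_free(s, M)}")
```

With 8 modules, odd strides spread over all modules while stride 4 alternates between two of them, which is the kind of stride the address transformation and reordering schemes aim to recover.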

UPCommons. Portal del coneixement obert de la UPC

The Parallel Complexity of Growth Models

Author: A. Gibbons
D. Richardson
D. Stauffer
D. Willinson
F. E. Fich
H. Kesten
J. Krug
J. M. Kim
J. Machta
J. T. Chayes
Jonathan Machta
L.-H. Tang
M. Eden
M. J. Vold
M. Kardar
M. Plischke
M. Pokorny
M. R. Garey
P. Freche
P. Meakin
R. Chandler
Raymond Greenlaw
S. Roux
T. Vicsek
Y.-C. Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/1994

This paper investigates the parallel complexity of several non-equilibrium growth models. Invasion percolation, Eden growth, ballistic deposition and solid-on-solid growth are all seemingly highly sequential processes that yield self-similar or self-affine random clusters. Nonetheless, we present fast parallel randomized algorithms for generating these clusters. The running times of the algorithms scale as $O(\log^2 N)$, where $N$ is the system size, and the number of processors required scales as a polynomial in $N$. The algorithms are based on fast parallel procedures for finding minimum weight paths; they illuminate the close connection between growth models and self-avoiding paths in random environments. In addition to their potential practical value, our algorithms serve to classify these growth models as less complex than other growth models, such as diffusion-limited aggregation, for which fast parallel algorithms probably do not exist. Comment: 20 pages, latex, submitted to J. Stat. Phys., UNH-TR94-0

arXiv.org e-Print Archive

CiteSeerX

Crossref

Conflict-free strides for vectors in matched memories

Author: Ayguadé Parra Eduard
Lang Tomas
Llaberia Griñó José M.
Navarro Guerrero Juan José
Peiron Guàrdia Montse
Valero Cortés Mateo
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 01/01/1991

Address transformation schemes, such as skewing and linear transformations, have been proposed to achieve conflict-free access to one family of strides in vector processors with matched memories. The paper extends these schemes to achieve conflict-free access for several families. The basic idea is to perform an out-of-order access to vectors of fixed length, equal to that of the vector registers of the processor. The hardware required is similar to that for in-order access. Peer reviewed. Postprint (author's final draft).

CiteSeerX

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

(Tissue) P Systems with Vesicles of Multisets

Author: Alhazov Artiom
Freund Rudolf
Ivanov Sergiu
Verlan Sergey
Publication venue: 'Open Publishing Association'
Publication date: 21/08/2017

We consider tissue P systems working on vesicles of multisets with the very simple operations of insertion, deletion, and substitution of single objects. With the whole multiset being enclosed in a vesicle, sending it to a target cell can be indicated in those simple rules working on the multiset. As derivation modes we consider the sequential mode, where exactly one rule is applied in a derivation step, and the set maximal mode, where in each derivation step a non-extendable set of rules is applied. With the set maximal mode, computational completeness can already be obtained with tissue P systems having a tree structure, whereas tissue P systems even with an arbitrary communication structure are not computationally complete when working in the sequential mode. Adding polarizations (-1, 0, 1 are sufficient) allows for obtaining computational completeness even for tissue P systems working in the sequential mode. Comment: In Proceedings AFL 2017, arXiv:1708.0622
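The three single-object operations mentioned in this abstract can be pictured with a small multiset sketch. The encoding below is an assumption made for illustration only: a vesicle is a `Counter`, a rule is a hypothetical tuple `(kind, old, new, target)`, and one call models one sequential-mode step. It does not reproduce the paper's formal definitions, derivation modes, or polarizations.

```python
from collections import Counter

def apply_rule(vesicle, rule):
    """Apply one insertion, deletion, or substitution to a vesicle.

    rule = (kind, old, new, target), a hypothetical encoding:
      ("insert", None, "c", 2)      adds one "c"
      ("delete", "a", None, 2)      removes one "a"
      ("substitute", "a", "c", 2)   replaces one "a" by one "c"
    Returns (new_vesicle, target_cell), or None if the rule is not
    applicable because the required object is absent.
    """
    kind, old, new, target = rule
    result = Counter(vesicle)  # copy, so the input stays unchanged
    if kind in ("delete", "substitute"):
        if result[old] == 0:
            return None
        result[old] -= 1
        if result[old] == 0:
            del result[old]
    if kind in ("insert", "substitute"):
        result[new] += 1
    return result, target

vesicle = Counter("aab")
step = apply_rule(vesicle, ("substitute", "a", "c", 2))
print(step)
```

Returning the target cell alongside the updated multiset reflects the idea that the whole vesicle moves as a unit once a rule has been applied.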

arXiv.org e-Print Archive

HAL - UPEC / UPEM

(Tissue) P Systems with Vesicles of Multisets

Author: Blaise Weber (4265473)
Charlie Hodgman (3508529)
Franziska Turck (215939)
Huub Hoefsloot (2573806)
Jan-Jaap Wesselink (4265470)
Johan Zicola (4265467)
Jonathan Gent (4166353)
Maike Stam (36554)
Nathan Springer (56696)
Rurika Oka (4265464)
Sarah Anderson (661782)
Publication date: 01/01/2017

arXiv.org e-Print Archive

Crossref

HAL - UPEC / UPEM

FigShare

Microgrid - The microthreaded many-core architecture

Author: Uddin Irfan
Publication date: 21/09/2013

Traditional processors use the von Neumann execution model, while some processors in the past have used the dataflow execution model. Combinations of the von Neumann and dataflow models have also been tried, and the resulting model is referred to as the hybrid dataflow execution model. We describe a hybrid dataflow model known as microthreading. It provides constructs for the creation, synchronization and communication of threads in an intermediate language. The microthreading model is an abstract programming and machine model for many-core architectures. A particular instance of this model is named the microthreaded architecture, or the Microgrid. This architecture implements all the concurrency constructs of the microthreading model in hardware, and the management of these constructs is also done in hardware. Comment: 30 pages, 16 figures

arXiv.org e-Print Archive

CiteSeerX

Transparent code authentication at the processor level

Author: A.O. Durahim
Aoki
B. Sunar
Bellare
Black
Boneh
Brassard
Carter
Chevallier-Mames
Choukri
Clarke
E. Savaş
Gaj
Gassend
Gaubatz
Hodjat
Hopkins
Joye
Kaps
Krawczyk
Lee
Lim
McCune
Ö. Kocabaş
Reyhani-Masoleh
Satoh
Suh
Sunar
T.B. Pedersen
Yan
Yang
Zhang
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date: 01/01/2009

The authors present a lightweight authentication mechanism that verifies the authenticity of code and thereby addresses the virus and malicious code problems at the hardware level, eliminating the need for trusted extensions in the operating system. The proposed technique tightly integrates the authentication mechanism into the processor core. The authentication latency is hidden behind the memory access latency, thereby allowing seamless on-the-fly authentication of instructions. In addition, the proposed authentication method supports seamless encryption of code (and static data). Consequently, while assuring software users of the authenticity of programs executing on their hardware, the proposed technique also protects the software manufacturers’ intellectual property through encryption. The performance analysis shows that, under mild assumptions, the presented technique introduces negligible overhead even for moderate cache sizes.

Crossref

Sabanci University Research Database