Computational Aspects of Asynchronous CA

Author: Chandesris Jérôme
Dennunzio Alberto
Formenti Enrico
Manzoni Luca
Publication date: 30/04/2011

This work studies some aspects of the computational power of fully asynchronous cellular automata (ACA). We deal with some notions of simulation between ACA and Turing machines (TM). In particular, we characterize the updating sequences, specifying which are "universal", i.e., which allow a (specific family of) ACA to simulate any TM on any input. We also consider the computational cost of such simulations.

arXiv.org e-Print Archive

CiteSeerX
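
To make the role of the updating sequence in the abstract above concrete, here is a minimal sketch of a fully asynchronous one-dimensional CA in which exactly one cell is updated per step. The function names, the periodic boundary, and the choice of elementary rule 170 are assumptions made for illustration; they are not taken from the paper.

```python
from typing import Callable, List, Sequence

Rule = Callable[[int, int, int], int]


def rule_table(number: int) -> Rule:
    """Build an elementary (radius-1, binary) local rule from its Wolfram number."""
    def rule(left: int, centre: int, right: int) -> int:
        return (number >> (left << 2 | centre << 1 | right)) & 1
    return rule


def aca_run(config: Sequence[int], rule: Rule, update_sequence: Sequence[int]) -> List[int]:
    """Fully asynchronous run: each step updates only the cell named by the sequence."""
    cells = list(config)
    n = len(cells)
    for i in update_sequence:
        # Neighbours are read from the current, partially updated configuration.
        cells[i] = rule(cells[(i - 1) % n], cells[i], cells[(i + 1) % n])
    return cells


shift = rule_table(170)          # rule 170: a cell copies its right neighbour
start = [0, 0, 0, 1]

left_to_right = aca_run(start, shift, [0, 1, 2, 3])   # -> [0, 0, 1, 0]
right_to_left = aca_run(start, shift, [3, 2, 1, 0])   # -> [0, 0, 0, 0]
```

The same rule and the same initial configuration give different results under the two sequences, which is why the choice of updating sequence matters for what an ACA can simulate.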

Intrinsic universality and the computational power of self-assembly

Author: Woods Damien
Publication venue: 'Open Publishing Association'
Publication date: 01/09/2013

This short survey of recent work in tile self-assembly discusses the use of simulation to classify and separate the computational and expressive power of self-assembly models. The journey begins with the result that there is a single universal tile set that, with proper initialization and scaling, simulates any tile assembly system. This universal tile set exhibits something stronger than Turing universality: it captures the geometry and dynamics of any simulated system. From there we find that there is no such tile set in the noncooperative, or temperature 1, model, proving it weaker than the full tile assembly model. In the two-handed or hierarchical model, where large assemblies can bind together in one step, we encounter an infinite set of infinite hierarchies, each with strictly increasing simulation power. Towards the end of our trip, we find one tile to rule them all: a single rotatable, flippable polygonal tile that can simulate any tile assembly system. It seems this could be the beginning of a much longer journey, so directions for future work are suggested.
Comment: In Proceedings MCU 2013, arXiv:1309.104

arXiv.org e-Print Archive

Directory of Open Access Journals

Hybrid-parallel sparse matrix-vector multiplication with explicit communication overlap on current multicore-based systems

Author: Barrett R.
Georg Hager
Gerald Schubert
Gerhard Wellein
Holger Fehske
Stüben K.
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 29/06/2011

We evaluate optimized parallel sparse matrix-vector operations for several representative application areas on widespread multicore-based cluster configurations. First, the single-socket baseline performance is analyzed and modeled with respect to basic architectural properties of standard multicore chips. Beyond the single node, the performance of parallel sparse matrix-vector operations is often limited by communication overhead. Starting from the observation that nonblocking MPI is not able to hide communication cost using standard MPI implementations, we demonstrate that explicit overlap of communication and computation can be achieved by using a dedicated communication thread, which may run on a virtual core. Moreover, we identify performance benefits of hybrid MPI/OpenMP programming due to improved load balancing even without explicit communication overlap. We compare performance results for pure MPI, the widely used "vector-like" hybrid programming strategies, and explicit overlap on a modern multicore-based cluster and a Cray XE6 system.
Comment: 16 pages, 10 figures

arXiv.org e-Print Archive

Crossref
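
The explicit-overlap idea in the abstract above can be sketched as follows: the rows owned by one process are split into a part that needs only local vector entries and a part that needs remote ("halo") entries, and a separate thread obtains the halo while the local part is computed. This is an illustrative sketch only. It uses Python's `threading` module instead of MPI and OpenMP, the `fetch_halo` function is a hypothetical stand-in for a real receive, and the small matrix is made up; Python's global interpreter lock also means no real speedup should be expected here.

```python
import threading
from typing import List, Optional, Sequence


def spmv_csr(row_ptr: Sequence[int], col_idx: Sequence[int], values: Sequence[float],
             x: Sequence[float], y: Optional[List[float]] = None) -> List[float]:
    """Accumulate y += A @ x for a matrix A stored in CSR format."""
    n_rows = len(row_ptr) - 1
    if y is None:
        y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] += acc
    return y


def fetch_halo(remote_values: Sequence[float], halo: List[float]) -> None:
    """Hypothetical stand-in for receiving remote vector entries from other processes."""
    halo.extend(remote_values)


# One process owns 3 rows. Columns 0..2 refer to its local vector entries;
# the remote part indexes into a halo buffer of 2 received entries.
local_part = ([0, 2, 3, 5], [0, 1, 1, 0, 2], [4.0, 1.0, 3.0, 2.0, 5.0])
remote_part = ([0, 1, 1, 2], [0, 1], [1.0, 2.0])
x_local = [1.0, 2.0, 3.0]
halo: List[float] = []

# Dedicated communication thread runs while the local part is computed.
comm = threading.Thread(target=fetch_halo, args=([10.0, 20.0], halo))
comm.start()
y = spmv_csr(*local_part, x_local)      # local contribution: [6.0, 6.0, 17.0]
comm.join()                             # halo must be complete before it is used
y = spmv_csr(*remote_part, halo, y)     # add remote contribution: [16.0, 6.0, 57.0]
```

Splitting the matrix this way is what makes the overlap possible: the local product does not depend on the data still in flight.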

Parallel sparse matrix-vector multiplication as a test case for hybrid MPI+OpenMP programming

Author: Fehske Holger
Hager Georg
Schubert Gerald
Wellein Gerhard
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 30/12/2010

We evaluate optimized parallel sparse matrix-vector operations for two representative application areas on widespread multicore-based cluster configurations. First, the single-socket baseline performance is analyzed and modeled with respect to basic architectural properties of standard multicore chips. Going beyond the single node, parallel sparse matrix-vector operations often suffer from an unfavorable communication-to-computation ratio. Starting from the observation that nonblocking MPI is not able to hide communication cost using standard MPI implementations, we demonstrate that explicit overlap of communication and computation can be achieved by using a dedicated communication thread, which may run on a virtual core. We compare our approach to pure MPI and the widely used "vector-like" hybrid programming strategy.
Comment: 12 pages, 6 figures

arXiv.org e-Print Archive

Crossref

Improving the scalability of parallel N-body applications with an event driven constraint based execution model

Author: Aarseth SJ
Alfieri RA
Bonachea D
Chandra R
Dekate C
El-Ghazawi T
Hewitt C
Kale L
Message Passing Interface Forum
O’Shea BW
Salmon JK
Singh JP
Publication venue: 'SAGE Publications'
Publication date: 23/09/2011

The scalability and efficiency of graph applications are significantly constrained by conventional systems and their supporting programming models. Technology trends like multicore, manycore, and heterogeneous system architectures are introducing further challenges and possibilities for emerging application domains such as graph applications. This paper explores the space of effective parallel execution of ephemeral graphs that are dynamically generated using the Barnes-Hut algorithm to exemplify dynamic workloads. The workloads are expressed using the semantics of an Exascale computing execution model called ParalleX. For comparison, results using conventional execution model semantics are also presented. Using the advanced semantics for Exascale computing, we find improved load balancing at runtime and automatic discovery of parallelism, both of which improve efficiency.
Comment: 11 figures

arXiv.org e-Print Archive

Crossref

Discrete and fuzzy dynamical genetic programming in the XCSF learning classifier system

Author: B Mesot
C Van den Broeck
CA Reiter
E Di Paulo
HP Schwefel
J Di
JE Moody
JL Elman
L Bull
L Glass
Larry Bull
M Sipper
MC Su
N Lemke
PL Lanzi
Richard J. Preen
SW Wilson
T Werner
TE Ingerson
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

A number of representation schemes have been presented for use within learning classifier systems, ranging from binary encodings to neural networks. This paper presents results from an investigation into using discrete and fuzzy dynamical system representations within the XCSF learning classifier system. In particular, asynchronous random Boolean networks are used to represent the traditional condition-action production system rules in the discrete case, and asynchronous fuzzy logic networks in the continuous-valued case. We show that self-adaptive, open-ended evolution can be used to design an ensemble of such dynamical systems within XCSF to solve a number of well-known test problems.

arXiv.org e-Print Archive

Crossref

UWE Bristol Research Repository
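
As a rough illustration of the asynchronous random Boolean networks mentioned in the abstract above, the sketch below builds a random network and updates one randomly chosen node per step. It shows only the update scheme. How the paper encodes XCSF rules in such networks is not described in this listing, and all names and parameters here (`make_rbn`, `async_step`, 5 nodes, 2 inputs per node) are assumptions.

```python
import random
from typing import List, Tuple


def make_rbn(n: int, k: int, rng: random.Random) -> Tuple[List[List[int]], List[List[int]]]:
    """Random Boolean network: each node gets k random inputs and a random truth table."""
    inputs = [[rng.randrange(n) for _ in range(k)] for _ in range(n)]
    tables = [[rng.randrange(2) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables


def async_step(state: List[int], inputs: List[List[int]],
               tables: List[List[int]], rng: random.Random) -> int:
    """Update a single randomly chosen node in place and return its index."""
    node = rng.randrange(len(state))
    index = 0
    for src in inputs[node]:
        # Pack the current input states into a truth-table index.
        index = (index << 1) | state[src]
    state[node] = tables[node][index]
    return node


rng = random.Random(0)                      # fixed seed for repeatability
inputs, tables = make_rbn(5, 2, rng)
state = [rng.randrange(2) for _ in range(5)]
before = list(state)
changed = async_step(state, inputs, tables, rng)
```

Under this scheme every node other than `changed` keeps its previous value, in contrast to a synchronous network where all nodes are updated at once.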

Efficient Parallel Algorithm for Statistical Ion Track Simulations in Crystalline Materials

Author: Beardmore
Beardmore
Biersack
Brandt
Byoungseon Jeon
Cai
Cai
Elteckov
Firsov
Gropp
Kang
Kishinevskii
Niels Grønbech-Jensen
Robinson
Sillanpää
Tasch
Van Brutzel
Verlet
Ziegler
Ziegler
Publication venue: 'Elsevier BV'
Publication date: 26/10/2008

We present an efficient parallel algorithm for statistical Molecular Dynamics simulations of ion tracks in solids. The method is based on the Rare Event Enhanced Domain following Molecular Dynamics (REED-MD) algorithm, which has been successfully applied to studies of, e.g., ion implantation into crystalline semiconductor wafers. We discuss strategies for parallelizing the method and settle on a host-client polling scheme in which multiple asynchronous processors continuously feed the host, which in turn distributes the resulting feedback information to the clients. This real-time feedback consists of, e.g., cumulative damage information or statistics updates necessary for the cloning in the rare event algorithm. Finally, we demonstrate the algorithm for radiation effects in a nuclear oxide fuel and show that the balanced parallel approach achieves high parallel efficiency in multiple-processor configurations.
Comment: 17 pages, seven figures, four tables

arXiv.org e-Print Archive

Crossref