The future of computing beyond Moore's Law
Moore's Law is a techno-economic model that has enabled the information technology industry to double the performance and functionality of digital electronics roughly every 2 years within a fixed cost, power and area. Advances in silicon lithography have enabled this exponential miniaturization of electronics but, as transistors reach atomic scale and fabrication costs continue to rise, the classical technological driver that has underpinned Moore's Law for 50 years is failing and is anticipated to flatten by 2025. This article provides an updated view of what a post-exascale system will look like and the challenges ahead, based on our most recent understanding of technology roadmaps. It also discusses the tapering of historical improvements and how it affects the options available for continued scaling of successors to the first exascale machine. Lastly, this article covers the many different opportunities and strategies available to continue computing performance improvements in the absence of historical technology drivers. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'.
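The "doubling every 2 years" model the abstract describes can be sketched as a simple exponential; the baseline figures below (the Intel 4004's roughly 2,300 transistors in 1971) are illustrative, not taken from the article.

```python
def moores_law_projection(base_count: float, base_year: int, year: int,
                          doubling_period: float = 2.0) -> float:
    """Transistor count predicted by a fixed doubling period (Moore's Law)."""
    return base_count * 2 ** ((year - base_year) / doubling_period)

# Illustrative baseline: ~2,300 transistors in 1971.
# Twenty years later the model predicts a 2**10 = 1024x increase.
print(moores_law_projection(2300, 1971, 1991))  # ~2.36 million
```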
Performance Analysis and Evaluation of Dynamic Loop Scheduling Techniques in a Competitive Runtime Environment for Distributed Memory Architectures
Parallel computing offers immense potential to solve large, complex scientific problems. Load imbalance is a major impediment to obtaining high performance on a parallel system. One principal form of parallelism found in scientific applications is data parallelism; loops without dependencies are data parallel. During the execution of large parallel loops, computational requirements vary due to problem, algorithmic and systemic characteristics. These factors lead to load imbalance, which in turn degrades the performance of an application. Over the years, a number of dynamic loop scheduling techniques have been proposed to address one or more of these factors. However, no single strategy works well across different problem domains and system characteristics. Moreover, load balancing at runtime is complicated by its need for dynamic data redistribution. Therefore, there is a distinct need to integrate the dynamic loop scheduling techniques into a single package and provide them as an application programming interface (API) to the application developer. In recent years, a number of dynamic loop scheduling techniques have been integrated into compiler technologies for shared memory environments; however, no such integrated approach exists for distributed memory applications. The purpose of this thesis is to present the design, implementation and effectiveness of an integrated approach: the dynamic loop scheduling techniques are integrated into a runtime system for distributed memory architectures. For this purpose, we choose the newly developed parallel runtime environment for multicomputer architecture (PREMA) with its main components: the data movement and control substrate (DMCS) and the mobile object layer (MOL). This runtime system has been demonstrated to be one of the most competitive runtime systems for distributed memory architectures.
The significance of this work is that the proposed API will enhance the performance of parallel applications by reducing the load imbalance among processors caused by a wide range of factors, and will reduce the software development cost required for load balancing. With the integration of the scheduling capabilities into the runtime system, its applicability has been expanded. The performance of the API has been evaluated qualitatively and quantitatively, and its overhead has been studied analytically and measured experimentally. Three parallel benchmarks covering scientific applications of general interest (N-body simulations, an automatic quadrature routine and an unstructured grid heat solver) were considered for experimentation purposes. Based on the experiments conducted, a cost improvement of up to 76% over the straightforward parallel benchmark was obtained. For certain application characteristics, the overhead of the runtime system was found to be within 10% of the underlying messaging layer. These results demonstrate that, in large scientific applications, it is possible and desirable to combine the rich functionality of a runtime system with the advantages of scheduling techniques to achieve high performance.
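To make the idea of dynamic loop scheduling concrete, here is a minimal sketch of one classic chunk-size rule, guided self-scheduling, in which each idle worker grabs a chunk proportional to the remaining iterations. This is an illustrative stand-in, not the PREMA/DMCS API or the thesis's actual scheduler.

```python
def guided_chunks(n_iters: int, n_procs: int, min_chunk: int = 1):
    """Guided self-scheduling: each chunk is ~remaining/P, shrinking over time.
    Large early chunks keep overhead low; small late chunks smooth imbalance."""
    remaining = n_iters
    while remaining > 0:
        chunk = max(min_chunk, remaining // n_procs)
        chunk = min(chunk, remaining)
        yield chunk
        remaining -= chunk

# For 100 iterations on 4 processors, the first chunks are 25, 18, 14, 10, ...
chunks = list(guided_chunks(100, 4))
print(chunks)
```

Idle workers would pop chunks from this sequence at runtime; the decreasing sizes are what lets the schedule absorb per-iteration variability near the end of the loop.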
Mask aligner for ultrahigh vacuum with capacitive distance control
We present a mask aligner driven by three piezo motors which guides and aligns a SiN shadow mask under capacitive control towards a sample surface. The three capacitors for read-out are located at the backside of the thin mask such that the mask can be placed at micrometre distance from the sample surface while keeping it parallel to the surface. Samples and masks can be exchanged in situ, and the mask can additionally be displaced parallel to the surface. We demonstrate an edge sharpness of the deposited structures below 100 nm, which is likely limited by the diffusion of the deposited Au on Si(111).
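The capacitive distance control rests on the parallel-plate relation C = ε₀·A/d, so a measured capacitance maps directly to a mask-sample gap. The sketch below illustrates that conversion; the plate area and capacitance values are assumed for illustration, not taken from the apparatus described above.

```python
EPS0 = 8.854e-12  # vacuum permittivity, F/m

def gap_from_capacitance(capacitance_F: float, area_m2: float) -> float:
    """Gap d = eps0 * A / C for an ideal parallel-plate capacitor."""
    return EPS0 * area_m2 / capacitance_F

# Illustrative: a 1 mm^2 electrode reading ~8.85 pF implies a ~1 um gap.
d = gap_from_capacitance(8.854e-12, 1e-6)
print(d)  # ~1e-6 m
```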
Evolving Gene Regulatory Networks with Mobile DNA Mechanisms
This paper uses a recently presented abstract, tuneable Boolean regulatory
network model extended to consider aspects of mobile DNA, such as transposons.
The significant role of mobile DNA in the evolution of natural systems is
becoming increasingly clear. This paper shows how dynamically controlling
network node connectivity and function via transposon-inspired mechanisms can
be selected for in computational intelligence tasks to give improved
performance. The designs of dynamical networks intended for implementation
within the slime mould Physarum polycephalum and for the distributed control of
a smart surface are considered.
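A minimal random Boolean network sketch in the spirit of the tuneable model described above, plus a toy "transposition" move that rewires one node's input as a stand-in for the mobile-DNA mechanisms. All structural details here (node count, connectivity, update rule) are illustrative assumptions, not the paper's actual model.

```python
import random

def step(state, inputs, tables):
    """Synchronous update: each node applies its Boolean table to its inputs."""
    return tuple(tables[i][tuple(state[j] for j in inputs[i])]
                 for i in range(len(state)))

def transpose(inputs, n_nodes, rng):
    """Toy mobile-DNA move: rewire one randomly chosen input connection."""
    node = rng.randrange(n_nodes)
    k = rng.randrange(len(inputs[node]))
    inputs[node] = inputs[node][:k] + (rng.randrange(n_nodes),) + inputs[node][k + 1:]

rng = random.Random(0)
n, k = 5, 2  # 5 nodes, 2 inputs each
inputs = [tuple(rng.randrange(n) for _ in range(k)) for _ in range(n)]
tables = [{(a, b): rng.randrange(2) for a in (0, 1) for b in (0, 1)}
          for _ in range(n)]

state = (1, 0, 1, 1, 0)
state = step(state, inputs, tables)  # one synchronous network update
transpose(inputs, n, rng)            # mutate the wiring, then keep iterating
```

In an evolutionary setting, such transposition moves would be applied between fitness evaluations, letting selection act on connectivity changes as well as on the Boolean functions themselves.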
When the path is never shortest: a reality check on shortest path biocomputation
Shortest path problems are a touchstone for evaluating the computing
performance and functional range of novel computing substrates. Much has been
published in recent years regarding the use of biocomputers to solve minimal
path problems such as route optimisation and labyrinth navigation, but their
outputs are typically difficult to reproduce and somewhat abstract in nature,
suggesting that both experimental design and analysis in the field require
standardising. This chapter details laboratory experimental data which probe
the path finding process in two single-celled protistic model organisms,
Physarum polycephalum and Paramecium caudatum, comprising a shortest path
problem and labyrinth navigation, respectively. The results presented
illustrate several of the key difficulties that are encountered in categorising
biological behaviours in the language of computing, including biological
variability, non-halting operations and adverse reactions to experimental
stimuli. It is concluded that neither organism examined is able to efficiently
or reproducibly solve shortest path problems in the specific experimental
conditions that were tested. Data presented are contextualised with biological
theory and design principles for maximising the usefulness of experimental
biocomputer prototypes. To appear in: Adamatzky, A. (Ed.), Shortest path
solvers. From software to wetware. Springer, 201
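For contrast with the biological solvers discussed above, the conventional electronic baseline for the same touchstone problem is Dijkstra's algorithm, which is deterministic, halting, and reproducible. A minimal sketch on an illustrative weighted graph (the graph itself is invented for the example):

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source in a dict-of-dicts weighted graph."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

maze = {"A": {"B": 2, "C": 5}, "B": {"C": 1, "D": 4}, "C": {"D": 1}, "D": {}}
print(dijkstra(maze, "A"))  # {'A': 0, 'B': 2, 'C': 3, 'D': 4}
```

The properties the chapter finds lacking in the protists (halting, reproducibility across runs) are exactly what this algorithmic baseline guarantees.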
Slime mould computes planar shapes
Computing a polygon defining a set of planar points is a classical problem of
modern computational geometry. In laboratory experiments we demonstrate that a
concave hull, a connected alpha-shape without holes, of a finite planar set is
approximated by slime mould Physarum polycephalum. We represent planar points
with sources of long-distance attractants and short-distance repellents and
inoculate a piece of plasmodium outside the data set. The plasmodium moves
towards the data and envelops it by pronounced protoplasmic tubes
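The concave hull (alpha-shape) approximated by the slime mould has a classical convex counterpart that is easy to compute exactly; a standard monotone-chain convex hull sketch makes the comparison concrete. The point set below is illustrative, not experimental data.

```python
def cross(o, a, b):
    """2D cross product of vectors OA and OB; > 0 means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:  # build lower hull left to right
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):  # build upper hull right to left
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]  # endpoints shared, drop duplicates

pts = [(0, 0), (2, 0), (1, 1), (2, 2), (0, 2), (1, 3)]
print(convex_hull(pts))  # interior point (1, 1) is excluded
```

Unlike the convex hull, the concave (alpha) hull the plasmodium traces can follow indentations in the point set, which is why it is the harder, more interesting target for a biological computer.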