Search CORE

26,456 research outputs found

GRAPHICAL CONFIGURATION PROGRAMMING - THE STRUCTURAL DESCRIPTION, CONSTRUCTION AND EVOLUTION OF SOFTWARE SYSTEMS USING GRAPHICS

Author: KRAMER J
MAGEE J
NG K
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1989
Field of study

Published versio

CiteSeerX

Crossref

Spiral - Imperial College Digital Repository

Intelligent systems in manufacturing: current developments and future prospects

Author: Kobbacy KAH
Meziane F
Proudlove N
Vadera S
Publication venue: 'Emerald'
Publication date: 01/01/2000
Field of study

Global competition and rapidly changing customer requirements are demanding increasing changes in manufacturing environments. Enterprises are required to constantly redesign their products and continuously reconfigure their manufacturing systems. Traditional approaches to manufacturing systems do not fully satisfy this new situation. Many authors have proposed that artificial intelligence will bring the flexibility and efficiency needed by manufacturing systems. This paper is a review of artificial intelligence techniques used in manufacturing systems. The paper first defines the components of a simplified intelligent manufacturing systems (IMS), the different Artificial Intelligence (AI) techniques to be considered and then shows how these AI techniques are used for the components of IMS

University of Salford Institutional Repository

Crossref

The University of Manchester - Institutional Repository

A Compiler and Runtime Infrastructure for Automatic Program Distribution

Author: Chu Matt
Diaconescu Roxana E.
Mouri Zachary
Wang Lei
Publication venue
Publication date: 01/04/2005
Field of study

This paper presents the design and the implementation of a compiler and runtime infrastructure for automatic program distribution. We are building a research infrastructure that enables experimentation with various program partitioning and mapping strategies and the study of automatic distribution's effect on resource consumption (e.g., CPU, memory, communication). Since many optimization techniques are faced with conflicting optimization targets (e.g., memory and communication), we believe that it is important to be able to study their interaction. We present a set of techniques that enable flexible resource modeling and program distribution. These are: dependence analysis, weighted graph partitioning, code and communication generation, and profiling. We have developed these ideas in the context of the Java language. We present in detail the design and implementation of each of the techniques as part of our compiler and runtime infrastructure. Then, we evaluate our design and present preliminary experimental data for each component, as well as for the entire system

Caltech Authors

Using shared-data localization to reduce the cost of inspector-execution in unified-parallel-C programs

Author: Alvanos Michail
Amaral José Nelson
Farreras Esclusa Montserrat
Martorell Bofill Xavier
Tiotto Ettore
Publication venue: 'Elsevier BV'
Publication date: 01/01/2016
Field of study

Programs written in the Unified Parallel C (UPC) language can access any location of the entire local and remote address space via read/write operations. However, UPC programs that contain fine-grained shared accesses can exhibit performance degradation. One solution is to use the inspector-executor technique to coalesce fine-grained shared accesses to larger remote access operations. A straightforward implementation of the inspector executor transformation results in excessive instrumentation that hinders performance.; This paper addresses this issue and introduces various techniques that aim at reducing the generated instrumentation code: a shared-data localization transformation based on Constant-Stride Linear Memory Descriptors (CSLMADs) [S. Aarseth, Gravitational N-Body Simulations: Tools and Algorithms, Cambridge Monographs on Mathematical Physics, Cambridge University Press, 2003.], the inlining of data locality checks and the usage of an index vector to aggregate the data. Finally, the paper introduces a lightweight loop code motion transformation to privatize shared scalars that were propagated through the loop body.; A performance evaluation, using up to 2048 cores of a POWER 775, explores the impact of each optimization and characterizes the overheads of UPC programs. It also shows that the presented optimizations increase performance of UPC programs up to 1.8 x their UPC hand-optimized counterpart for applications with regular accesses and up to 6.3 x for applications with irregular accesses.Peer ReviewedPostprint (author's final draft

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Loo.py: transformation-based code generation for GPUs and CPUs

Author: Asanovic K.
Ellson J.
Rubinsteyn A.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 29/05/2014
Field of study

Today's highly heterogeneous computing landscape places a burden on programmers wanting to achieve high performance on a reasonably broad cross-section of machines. To do so, computations need to be expressed in many different but mathematically equivalent ways, with, in the worst case, one variant per target machine. Loo.py, a programming system embedded in Python, meets this challenge by defining a data model for array-style computations and a library of transformations that operate on this model. Offering transformations such as loop tiling, vectorization, storage management, unrolling, instruction-level parallelism, change of data layout, and many more, it provides a convenient way to capture, parametrize, and re-unify the growth among code variants. Optional, deep integration with numpy and PyOpenCL provides a convenient computing environment where the transition from prototype to high-performance implementation can occur in a gradual, machine-assisted form

arXiv.org e-Print Archive

Crossref

Scalable data abstractions for distributed parallel computations

Author: Hanlon James
Hollis Simon J.
May David
Publication venue
Publication date: 03/10/2012
Field of study

The ability to express a program as a hierarchical composition of parts is an essential tool in managing the complexity of software and a key abstraction this provides is to separate the representation of data from the computation. Many current parallel programming models use a shared memory model to provide data abstraction but this doesn't scale well with large numbers of cores due to non-determinism and access latency. This paper proposes a simple programming model that allows scalable parallel programs to be expressed with distributed representations of data and it provides the programmer with the flexibility to employ shared or distributed styles of data-parallelism where applicable. It is capable of an efficient implementation, and with the provision of a small set of primitive capabilities in the hardware, it can be compiled to operate directly on the hardware, in the same way stack-based allocation operates for subroutines in sequential machines

arXiv.org e-Print Archive

Explore Bristol Research