Process-Oriented Parallel Programming with an Application to Data-Intensive Computing
We introduce process-oriented programming as a natural extension of
object-oriented programming for parallel computing. It is based on the
observation that every class of an object-oriented language can be instantiated
as a process, accessible via a remote pointer. The introduction of process
pointers requires no syntax extension, identifies processes with programming
objects, and enables processes to exchange information simply by executing
remote methods. Process-oriented programming is a high-level language
alternative to multithreading, MPI and many other languages, environments and
tools currently used for parallel computations. It implements natural
object-based parallelism using only minimal syntax extension of existing
languages, such as C++ and Python, and therefore has the potential to lead to
widespread adoption of parallel programming. We implemented a prototype system
for running processes using C++ with MPI and used it to compute a large
three-dimensional Fourier transform on a computer cluster built of commodity
hardware components. The three-dimensional Fourier transform is a prototypical
data-intensive application with a complex data-access pattern. The
process-oriented code is only a few hundred lines long, and attains very high
data throughput by achieving massive parallelism and maximizing hardware
utilization.
Comment: 20 pages, 1 figure
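
The abstract's central mechanism, calling a method on an object that lives in another process, can be sketched in plain C++ with MPI. This is a minimal illustration under stated assumptions, not the paper's actual system: the Counter class and the rank/tag layout below are invented for the example.

```cpp
// Hypothetical sketch: forwarding one method call to an object hosted
// on another MPI rank. In the paper's model this plumbing would sit
// behind a "process pointer" with no visible sends or receives.
#include <mpi.h>
#include <cstdio>

// The class whose instances can live on remote ranks.
struct Counter {
    int value = 0;
    int add(int x) { value += x; return value; }
};

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 1) {
        // Rank 1 hosts the Counter "process": serve one remote add() call.
        Counter c;
        int arg;
        MPI_Recv(&arg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        int result = c.add(arg);
        MPI_Send(&result, 1, MPI_INT, 0, 1, MPI_COMM_WORLD);
    } else if (rank == 0) {
        // Rank 0 acts through the remote object: conceptually c->add(5),
        // but the work happens on rank 1.
        int arg = 5, result;
        MPI_Send(&arg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        MPI_Recv(&result, 1, MPI_INT, 1, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        std::printf("remote add(5) -> %d\n", result);
    }
    MPI_Finalize();
    return 0;
}
```

Run with two ranks (e.g. `mpirun -np 2`). The point of the paper's abstraction is that the send/receive pairing above disappears behind an ordinary-looking method call.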
Array operators using multiple dispatch: a design methodology for array implementations in dynamic languages
Arrays are such a rich and fundamental data type that they tend to be built
into a language, either in the compiler or in a large low-level library.
Defining this functionality at the user level instead provides greater
flexibility for application domains not envisioned by the language designer.
Only a few languages, such as C++ and Haskell, provide the necessary power to
define n-dimensional arrays, but these systems rely on compile-time
abstraction, sacrificing some flexibility. In contrast, dynamic languages make
it straightforward for the user to define any behavior they might want, but at
the possible expense of performance.
As part of the Julia language project, we have developed an approach that
yields a novel trade-off between flexibility and compile-time analysis. The
core abstraction we use is multiple dispatch. We have come to believe that
while multiple dispatch has not been especially popular in most kinds of
programming, technical computing is its killer application. By expressing key
functions such as array indexing using multi-method signatures, a surprising
range of behaviors can be obtained, in a way that is both relatively easy to
write and amenable to compiler analysis. The compact factoring of concerns
provided by these methods makes it easier for user-defined types to behave
consistently with types in the standard library.
Comment: 6 pages, 2 figures, workshop paper for the ARRAY '14 workshop, June 11, 2014, Edinburgh, United Kingdom
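
A rough C++ analogue of the paper's core idea, assuming nothing beyond the abstract: array indexing expressed as a family of methods keyed on both the array type and the index type. C++ overloading resolves the choice at compile time, whereas Julia's multiple dispatch makes the analogous choice on runtime types; the Dense and Range names are illustrative only.

```cpp
// Illustrative sketch (not the paper's Julia code): getindex as a set of
// methods selected by the (array type, index type) pair.
#include <vector>
#include <cstdio>

struct Dense { std::vector<double> data; };
struct Range { int lo, hi; };   // half-open interval [lo, hi)

// Method on (Dense, int): scalar indexing.
double getindex(const Dense& a, int i) { return a.data[i]; }

// Method on (Dense, Range): slice indexing, factored through the scalar method.
std::vector<double> getindex(const Dense& a, Range r) {
    std::vector<double> out;
    for (int i = r.lo; i < r.hi; ++i) out.push_back(getindex(a, i));
    return out;
}

int main() {
    Dense a{{1.0, 2.0, 3.0, 4.0}};
    std::printf("%g\n", getindex(a, 2));        // selects (Dense, int)
    for (double x : getindex(a, Range{1, 3}))   // selects (Dense, Range)
        std::printf("%g ", x);
    std::printf("\n");
    return 0;
}
```

The factoring the abstract describes is visible even here: the slice method is built from the scalar method, so a user-defined array type that supplies the scalar case inherits consistent slicing behavior.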
Handling Parallelism in a Concurrency Model
Programming models for concurrency are optimized for dealing with
nondeterminism, for example to handle asynchronously arriving events. To shield
the developer from data race errors effectively, such models may prevent shared
access to data altogether. However, this restriction also makes them unsuitable
for applications that require data parallelism. We present a library-based
approach for permitting parallel access to arrays while preserving the safety
guarantees of the original model. When applied to SCOOP, an object-oriented
concurrency model, the approach exhibits a negligible performance overhead
compared to ordinary threaded implementations of two parallel benchmark
programs.
Comment: MUSEPAT 201
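
The safety argument can be sketched independently of SCOOP itself. A minimal C++ illustration, assuming only the idea implied by the abstract, that parallel array access stays race-free when each worker is confined to a disjoint slice (the actual library API is not shown there):

```cpp
// Sketch: hand each worker a disjoint half-open slice [begin, end) of the
// array, so no two threads ever touch the same element and no race exists.
#include <thread>
#include <vector>
#include <cstdio>

int main() {
    std::vector<int> data(8, 1);
    const int nworkers = 2;
    std::vector<std::thread> workers;

    for (int w = 0; w < nworkers; ++w) {
        // Slices are disjoint by construction; concurrent writes cannot race.
        size_t begin = w * data.size() / nworkers;
        size_t end   = (w + 1) * data.size() / nworkers;
        workers.emplace_back([&data, begin, end] {
            for (size_t i = begin; i < end; ++i) data[i] *= 2;
        });
    }
    for (auto& t : workers) t.join();

    for (int x : data) std::printf("%d ", x);
    std::printf("\n");
    return 0;
}
```

A library-based approach like the paper's would enforce this slicing discipline in the API rather than leaving it to convention, preserving the concurrency model's guarantees.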
CREOLE: a Universal Language for Creating, Requesting, Updating and Deleting Resources
In the context of Service-Oriented Computing, applications can be developed
following the REST (Representational State Transfer) architectural style. This
style corresponds to a resource-oriented model, where resources are manipulated
via CRUD (Create, Request, Update, Delete) interfaces. The diversity of CRUD
languages due to the absence of a standard leads to composition problems
related to adaptation, integration and coordination of services. To overcome
these problems, we propose a pivot architecture built around a universal
language to manipulate resources, called CREOLE, a CRUD Language for Resource
Edition. In this architecture, scripts written in existing CRUD languages, like
SQL, are compiled into CREOLE and then executed over different CRUD interfaces.
After stating the requirements for a universal language for manipulating
resources, we formally describe the language and informally motivate its
definition with respect to the requirements. We then concretely show how the
architecture solves adaptation, integration and coordination problems in the
case of photo management in Flickr and Picasa, two well-known service-oriented
applications. Finally, we propose a roadmap for future work.
Comment: In Proceedings FOCLASA 2010, arXiv:1007.499
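
A hypothetical sketch of the pivot-architecture shape described above, in C++ rather than CREOLE itself (whose concrete syntax the abstract does not give): one universal CRUD interface, with per-service adapters behind it so a single script can drive different backends.

```cpp
// Sketch of the pivot idea: compiled scripts target one CRUD interface;
// adapters (e.g. one wrapping Flickr's API) implement it per service.
#include <map>
#include <string>
#include <cstdio>

// The universal CRUD interface targeted by compiled scripts.
struct CrudInterface {
    virtual void create(const std::string& id, const std::string& v) = 0;
    virtual std::string request(const std::string& id) = 0;
    virtual void update(const std::string& id, const std::string& v) = 0;
    virtual void remove(const std::string& id) = 0;  // "delete" is reserved in C++
    virtual ~CrudInterface() = default;
};

// An in-memory stand-in for a real service adapter.
struct MemoryStore : CrudInterface {
    std::map<std::string, std::string> items;
    void create(const std::string& id, const std::string& v) override { items[id] = v; }
    std::string request(const std::string& id) override { return items.at(id); }
    void update(const std::string& id, const std::string& v) override { items.at(id) = v; }
    void remove(const std::string& id) override { items.erase(id); }
};

int main() {
    MemoryStore photos;
    photos.create("p1", "sunset.jpg");
    photos.update("p1", "sunset_v2.jpg");
    std::printf("%s\n", photos.request("p1").c_str());
    photos.remove("p1");
    return 0;
}
```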
A service oriented architecture for engineering design
Decision making in engineering design can be effectively addressed by using genetic algorithms to solve multi-objective problems. These multi-objective genetic algorithms (MOGAs) are well suited to implementation in a Service Oriented Architecture. Often the evaluation process of the MOGA is compute-intensive, owing to the use of a complex computer model to represent the real-world system. The emerging paradigm of Grid Computing offers a potential solution to the compute-intensive nature of this objective function evaluation, by allowing access to large amounts of compute resources in a distributed manner. This paper presents a grid-enabled framework for multi-objective optimisation using genetic algorithms (MOGA-G) to aid decision making in engineering design.
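
The compute-intensive step the framework offloads to the Grid can be sketched with ordinary C++ tasks standing in for remote resources; the two toy objectives below are invented for illustration and are not the paper's engineering model.

```cpp
// Sketch of the exploited pattern: MOGA objective evaluations are
// independent per individual, so they can be farmed out in parallel --
// here to local async tasks, on the Grid to remote compute resources.
#include <future>
#include <vector>
#include <cmath>
#include <cstdio>

// Stand-in for an expensive simulation-based evaluation.
std::vector<double> evaluate(double x) {
    return { x * x, std::pow(x - 2.0, 2.0) };  // two competing objectives
}

int main() {
    std::vector<double> population = {0.0, 0.5, 1.0, 1.5, 2.0};
    std::vector<std::future<std::vector<double>>> jobs;

    // Launch one evaluation per individual; each could run anywhere.
    for (double x : population)
        jobs.push_back(std::async(std::launch::async, evaluate, x));

    for (size_t i = 0; i < jobs.size(); ++i) {
        auto f = jobs[i].get();
        std::printf("x=%.1f  f1=%.2f  f2=%.2f\n", population[i], f[0], f[1]);
    }
    return 0;
}
```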
Comprehensive Evaluation of OpenCL-based Convolutional Neural Network Accelerators in Xilinx and Altera FPGAs
Deep learning has significantly advanced the state of the art in artificial intelligence, gaining wide popularity from both industry and academia. Special interest surrounds Convolutional Neural Networks (CNNs), which take inspiration from the hierarchical structure of the visual cortex to form deep layers of convolutional operations, along with fully connected classifiers. Hardware implementations of these deep CNN architectures are challenged by memory bottlenecks: the many convolutional and fully connected layers demand a large amount of communication for parallel computation. Multi-core CPU-based solutions have proved inadequate for this problem due to the memory wall and low parallelism. Many-core GPU architectures show superior performance, but they consume high power and also face memory constraints due to inconsistencies between cache and main memory. FPGA design solutions are also actively being explored; they allow the memory hierarchy to be implemented using embedded BlockRAM, which boosts the parallel use of shared memory elements between multiple processing units while avoiding data replication and inconsistency. This makes FPGAs potentially powerful solutions for real-time classification with CNNs. Both Altera and Xilinx have adopted the OpenCL co-design framework from GPUs as a pseudo-automatic development solution for FPGA designs. In this paper, a comprehensive evaluation and comparison of the Altera and Xilinx OpenCL frameworks for a 5-layer deep CNN is presented. Hardware resources, temporal performance, and the OpenCL architecture for CNNs are discussed. Xilinx demonstrates faster synthesis, better FPGA resource utilization, and more compact boards. Altera provides multi-platform tools, a mature design community, and better execution times.
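
To make the memory-access argument concrete, here is the direct convolution underlying each CNN layer, written as a plain C++ sketch with illustrative sizes (the paper's 5-layer network and its OpenCL kernels are not reproduced in the abstract). In the OpenCL designs the paper compares, each output element below would typically map to one work-item executing the inner two loops.

```cpp
// Direct 2D convolution (valid padding). The sliding-window access
// pattern here is what FPGA BlockRAM hierarchies try to keep on-chip.
#include <vector>
#include <cstdio>

int main() {
    const int N = 6, K = 3, M = N - K + 1;  // input, kernel, output sizes
    std::vector<float> in(N * N, 1.0f), w(K * K, 0.5f), out(M * M, 0.0f);

    for (int y = 0; y < M; ++y)
        for (int x = 0; x < M; ++x) {
            float acc = 0.0f;
            for (int ky = 0; ky < K; ++ky)
                for (int kx = 0; kx < K; ++kx)
                    acc += in[(y + ky) * N + (x + kx)] * w[ky * K + kx];
            out[y * M + x] = acc;
        }

    std::printf("out[0]=%g (expect %g)\n", out[0], 0.5f * K * K);
    return 0;
}
```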
