504 research outputs found
Programming SMP clusters: node-level object groups and their use in a framework for Nbody applications
Ankara : The Department of Computer Engineering and Information Science and the Institute of Engineering and Sciences of Bilkent Univ., 1999.Thesis (Master's) -- Bilkent University, 1999.Includes bibliographical references leaves 64-66Cengiz, Ä°lkerM.S
A versatile programming model for dynamic task scheduling on cluster computers
This dissertation studies the development of application programs for parallel and distributed computer systems, especially PC clusters. A methodology is proposed to increase the efficiency of code development, the productivity of programmers and enhance performance of executing the developed programs on PC clusters while facilitating improvement of scalability and code portability of these programs. A new programming model, named the Super-Programming Model (SPM), is created. Programs are developed assuming an instruction set architecture comprised of SuperInstructions (SIs). SPM models the target system as a large Virtual Machine (VM); VM contains functional units which are underlain with sub-computer systems and SIs are implemented with codes. When these functional units execute SIs, their codes will run on member computers to perform the corresponding operations.
This approach resembles the process of designing instruction sets for microprocessors but the VM employs much coarser instructions and data structures. SIs use Super-Data Blocks (SDBs) as their operands. Each SI is assigned to a single member computer and is indivisible (i.e., its implementation is not interrupted for I/O). SIs have predictable execution times because SDB sizes are limited by predefined thresholds. These qualities of SIs help dynamic load balancing. Employing software to implement instructions makes this approach more flexible. The developed programs fit to architectures of cluster systems better. SPM provides mechanisms, such as dynamic load balancing, to assure the efficient execution of programs. The vast majority of current programming models lack such mechanisms for distributed environments that suffer from long communication latencies. Since SPM employs coarse-grain tasks, the overall management overhead is small. SDB access can often overlap the execution of other SIs; a cache system further decreases average memory latencies. Since all SDBs are virtual entities, with the runtime system support, they can be accessed in parallel and efficiently minimizes additional constraints to parallelism from underlying computer systems.
In this research, a reference implementation of VM has been developed. A performance estimation model is developed that takes these features into account. Finally, the definition of scalability for parallel/distributed processing is refined to represent a multi-dimensional entity. Sample cases are analyzed
RELEASE: A High-level Paradigm for Reliable Large-scale Server Software
Erlang is a functional language with a much-emulated model for building reliable distributed systems. This paper outlines the RELEASE project, and describes the progress in the first six months. The project aim is to scale the Erlang’s radical concurrency-oriented programming paradigm to build reliable general-purpose software, such as server-based systems, on massively parallel machines. Currently Erlang has inherently scalable computation and reliability models, but in practice scalability is constrained by aspects of the language and virtual machine. We are working at three levels to address these challenges: evolving the Erlang virtual machine so that it can work effectively on large scale multicore systems; evolving the language to Scalable Distributed (SD) Erlang; developing a scalable Erlang infrastructure to integrate multiple, heterogeneous clusters. We are also developing state of the art tools that allow programmers to understand the behaviour of massively parallel SD Erlang programs. We will demonstrate the effectiveness of the RELEASE approach using demonstrators and two large case studies on a Blue Gene
Vcluster: A Portable Virtual Computing Library For Cluster Computing
Message passing has been the dominant parallel programming model in cluster computing, and libraries like Message Passing Interface (MPI) and Portable Virtual Machine (PVM) have proven their novelty and efficiency through numerous applications in diverse areas. However, as clusters of Symmetric Multi-Processor (SMP) and heterogeneous machines become popular, conventional message passing models must be adapted accordingly to support this new kind of clusters efficiently. In addition, Java programming language, with its features like object oriented architecture, platform independent bytecode, and native support for multithreading, makes it an alternative language for cluster computing. This research presents a new parallel programming model and a library called VCluster that implements this model on top of a Java Virtual Machine (JVM). The programming model is based on virtual migrating threads to support clusters of heterogeneous SMP machines efficiently. VCluster is implemented in 100% Java, utilizing the portability of Java to address the problems of heterogeneous machines. VCluster virtualizes computational and communication resources such as threads, computation states, and communication channels across multiple separate JVMs, which makes a mobile thread possible. Equipped with virtual migrating thread, it is feasible to balance the load of computing resources dynamically. Several large scale parallel applications have been developed using VCluster to compare the performance and usage of VCluster with other libraries. The results of the experiments show that VCluster makes it easier to develop multithreading parallel applications compared to conventional libraries like MPI. At the same time, the performance of VCluster is comparable to MPICH, a widely used MPI library, combined with popular threading libraries like POSIX Thread and OpenMP. In the next phase of our work, we implemented thread group and thread migration to demonstrate the feasibility of dynamic load balancing in VCluster. We carried out experiments to show that the load can be dynamically balanced in VCluster, resulting in a better performance. Thread group also makes it possible to implement collective communication functions between threads, which have been proved to be useful in process based libraries
RELEASE: A High-level Paradigm for Reliable Large-scale Server Software
Erlang is a functional language with a much-emulated model for building reliable distributed systems. This paper outlines the RELEASE project, and describes the progress in the rst six months. The project aim is to scale the Erlang's radical concurrency-oriented programming paradigm to build reliable general-purpose software, such as server-based systems, on massively parallel machines. Currently Erlang has inherently scalable computation and reliability models, but in practice scalability is constrained by aspects of the language and virtual machine. We are working at three levels to address these challenges: evolving the Erlang virtual machine so that it can work effectively on large scale multicore systems; evolving the language to Scalable Distributed (SD) Erlang; developing a scalable Erlang infrastructure to integrate multiple, heterogeneous clusters. We are also developing state of the art tools that allow programmers to understand the behaviour of massively parallel SD Erlang programs. We will demonstrate the e ectiveness of the RELEASE approach using demonstrators and two large case studies on a Blue Gene
Architecture aware parallel programming in Glasgow parallel Haskell (GPH)
General purpose computing architectures are evolving quickly to become manycore
and hierarchical: i.e. a core can communicate more quickly locally than
globally. To be effective on such architectures, programming models must be
aware of the communications hierarchy. This thesis investigates a programming
model that aims to share the responsibility of task placement, load balance, thread
creation, and synchronisation between the application developer and the runtime
system.
The main contribution of this thesis is the development of four new architectureaware
constructs for Glasgow parallel Haskell that exploit information about task
size and aim to reduce communication for small tasks, preserve data locality, or to
distribute large units of work. We define a semantics for the constructs that specifies the sets of PEs that each construct identifies, and we check four properties
of the semantics using QuickCheck.
We report a preliminary investigation of architecture aware programming
models that abstract over the new constructs. In particular, we propose architecture
aware evaluation strategies and skeletons. We investigate three common
paradigms, such as data parallelism, divide-and-conquer and nested parallelism,
on hierarchical architectures with up to 224 cores. The results show that the
architecture-aware programming model consistently delivers better speedup and
scalability than existing constructs, together with a dramatic reduction in the
execution time variability.
We present a comparison of functional multicore technologies and it reports
some of the first ever multicore results for the Feedback Directed Implicit Parallelism
(FDIP) and the semi-explicit parallelism (GpH and Eden) languages. The
comparison reflects the growing maturity of the field by systematically evaluating
four parallel Haskell implementations on a common multicore architecture.
The comparison contrasts the programming effort each language requires with
the parallel performance delivered.
We investigate the minimum thread granularity required to achieve satisfactory
performance for three implementations parallel functional language on a
multicore platform. The results show that GHC-GUM requires a larger thread
granularity than Eden and GHC-SMP. The thread granularity rises as the number
of cores rises
Simulation of networks of spiking neurons: A review of tools and strategies
We review different aspects of the simulation of spiking neural networks. We
start by reviewing the different types of simulation strategies and algorithms
that are currently implemented. We next review the precision of those
simulation strategies, in particular in cases where plasticity depends on the
exact timing of the spikes. We overview different simulators and simulation
environments presently available (restricted to those freely available, open
source and documented). For each simulation tool, its advantages and pitfalls
are reviewed, with an aim to allow the reader to identify which simulator is
appropriate for a given task. Finally, we provide a series of benchmark
simulations of different types of networks of spiking neurons, including
Hodgkin-Huxley type, integrate-and-fire models, interacting with current-based
or conductance-based synapses, using clock-driven or event-driven integration
strategies. The same set of models are implemented on the different simulators,
and the codes are made available. The ultimate goal of this review is to
provide a resource to facilitate identifying the appropriate integration
strategy and simulation tool to use for a given modeling problem related to
spiking neural networks.Comment: 49 pages, 24 figures, 1 table; review article, Journal of
Computational Neuroscience, in press (2007
Parallel Processes in HPX: Designing an Infrastructure for Adaptive Resource Management
Advancement in cutting edge technologies have enabled better energy efficiency as well as scaling computational power for the latest High Performance Computing(HPC) systems. However, complexity, due to hybrid architectures as well as emerging classes of applications, have shown poor computational scalability using conventional execution models. Thus alternative means of computation, that addresses the bottlenecks in computation, is warranted. More precisely, dynamic adaptive resource management feature, both from systems as well as application\u27s perspective, is essential for better computational scalability and efficiency. This research presents and expands the notion of Parallel Processes as a placeholder for procedure definitions, targeted at one or more synchronous domains, meta data for computation and resource management as well as infrastructure for dynamic policy deployment. In addition to this, the research presents additional guidelines for a framework for resource management in HPX runtime system. Further, this research also lists design principles for scalability of Active Global Address Space (AGAS), a necessary feature for Parallel Processes. Also, to verify the usefulness of Parallel Processes, a preliminary performance evaluation of different task scheduling policies is carried out using two different applications. The applications used are: Unbalanced Tree Search, a reference dynamic graph application, implemented by this research in HPX and MiniGhost, a reference stencil based application using bulk synchronous parallel model. The results show that different scheduling policies provide better performance for different classes of applications; and for the same application class, in certain instances, one policy fared better than the others, while vice versa in other instances, hence supporting the hypothesis of the need of dynamic adaptive resource management infrastructure, for deploying different policies and task granularities, for scalable distributed computing
- …