Search CORE

207 research outputs found

Directed taskgraph scheduling using simulated annealing

Author: D'Hollander Erik
Devis Yves
Publication venue: 'Informa UK Limited'
Publication date: 01/01/1991
Field of study

Highly parallel computation

Author: Denning Peter J.
Tichy Walter F.
Publication venue
Publication date
Field of study

Highly parallel computing architectures are the only means to achieve the computation rates demanded by advanced scientific problems. A decade of research has demonstrated the feasibility of such machines and current research focuses on which architectures designated as multiple instruction multiple datastream (MIMD) and single instruction multiple datastream (SIMD) have produced the best results to date; neither shows a decisive advantage for most near-homogeneous scientific problems. For scientific problems with many dissimilar parts, more speculative architectures such as neural networks or data flow may be needed

NASA Technical Reports Server

Submicron Systems Architecture Project : Semiannual Technical Report

Author: Martin Alain J.
Seitz Charles L.
Van de Snepscheut Jan L. A.
Publication venue: 'California Institute of Technology Library'
Publication date: 01/01/1992
Field of study

The Mosaic C is an experimental fine-grain multicomputer based on single-chip nodes. The Mosaic C chip includes 64KB of fast dynamic RAM, processor, packet interface, ROM for bootstrap and self-test, and a two-dimensional selftimed router. The chip architecture provides low-overhead and low-latency handling of message packets, and high memory and network bandwidth. Sixty-four Mosaic chips are packaged by tape-automated bonding (TAB) in an 8 x 8 array on circuit boards that can, in turn, be arrayed in two dimensions to build arbitrarily large machines. These 8 x 8 boards are now in prototype production under a subcontract with Hewlett-Packard. We are planning to construct a 16K-node Mosaic C system from 256 of these boards. The suite of Mosaic C hardware also includes host-interface boards and high-speed communication cables. The hardware developments and activities of the past eight months are described in section 2.1. The programming system that we are developing for the Mosaic C is based on the same message-passing, reactive-process, computational model that we have used with earlier multicomputers, but the model is implemented for the Mosaic in a way that supports finegrain concurrency. A process executes only in response to receiving a message, and may in execution send messages, create new processes, and modify its persistent variables before it either exits or becomes dormant in preparation for receiving another message. These computations are expressed in an object-oriented programming notation, a derivative of C++ called C+-. The computational model and the C+- programming notation are described in section 2.2. The Mosaic C runtime system, which is written in C+-, provides automatic process placement and highly distributed management of system resources. The Mosaic C runtime system is described in section 2.3

Caltech Authors

A genetic approach using direct representation of solution for the parallel task scheduling problem

Author: Esquivel Susana Cecilia
Gallard Raúl Hector
Gatica Claudia R.
Publication venue
Publication date: 01/10/2001
Field of study

In scheduling, a set of machines in parallel is a setting that is important, from both the theoretical and practical points of view. From the theoretical viewpoint, it is a generalization of the single machine scheduling problem. From the practical point of view the occurrence of resources in parallel is common in real-world. When machines are computers, a parallel program can be conceived as a set of parallel components (tasks) which can be executed according to some precedence relationship. In this case efficient scheduling of tasks permits to take full advantage of the computational power provided by a multiprocessor or a multicomputer system. This kind of planning involves the assignment of partially ordered tasks onto the system architecture processing components. This paper shows the problem of allocating a number of non-identical tasks in a multiprocessor or multicomputer system. The model assumes that the system consists of a number of identical processors and only one task may execute on a processor at a time. All schedules and tasks are non-preemptive. The well-known Graham’s list scheduling algorithm (LSA) is contrasted with an evolutionary approach using a direct representation of solutions.Eje: Computación evolutivaRed de Universidades con Carreras en Informática (RedUNCI

An Evolutionary Approach to Load Balancing Parallel Computations

Author: Fox Geoffrey C.
Mansouri N.
Publication venue: SURFACE at Syracuse University
Publication date: 01/04/1991
Field of study

We present a new approach to balancing the workload in a multicomputer when the problem is decomposed into subproblems mapped to the processors. It is based on a hybrid genetic algorithm. A number of design choices for genetic algorithms are combined in order to ameliorate the problem of premature convergence that is often encountered in the implementation of classical genetic algorithms. The algorithm is hybridized by including a hill climbing procedure which significantly improves the efficiency of the evolution. Moreover, it makes use of problem specific information to evade some computational costs and to reinforce favorable aspects of the genetic search at some appropriate points. The experimental results show that the hybrid genetic algorithm can find solutions within 3% of the optimum in a reasonable time. They also suggest that this approach is not biased towards particular problem structures

Syracuse University Research Facility and Collaborative Environment

A Distributed Discrete-Time Neural Network Architecture for Pattern Allocation and Control

Author: Chronopoulos A.T.
Sarangapani Jagannathan
Publication venue: Scholars\u27 Mine
Publication date: 01/01/2002
Field of study

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Parallel mapping and circuit partitioning heuristics based on mean field annealing

Author: Bultan Tevfik
Publication venue: Bilkent University
Publication date: 01/01/1992
Field of study

Ankara : Department of Computer Engineering and Information Science and the Institute of Engineering and Science of Bilkent University, 1992.Thesis (Master's) -- Bilkent University, 1992.Includes bibliographical references.Moan Field Annealinp; (MFA) aJgoritlim, receñí,ly proposc'd for solving com binatorial optimization problems, combines the characteristics of nenral networks and simulated annealing. In this thesis, MFA is formulated for tlie mapping i)roblcm and the circuit partitioning problem. EHicient implementation schemes, which decrease the complexity of the proposed algorithms by asymptotical factors, are also given. Perlormances of the proposed MFA algorithms are evaluated in comparison with two well-known heuristics: simulated annealing and Kernighan-Lin. Results of the experiments indicate that MFA can be used as an alternative heuristic for the mapping problem and the circuit partitioning problem. Inherent parallelism of the MFA is exploited by designing efficient parallel algorithms for the proposed MFA heuristics. Parallel MFA algorithms proposed for solving the circuit partitioning problem are implemented on an iPS(J/2’ hypercube multicompute.r. Experimental results show that the proposed heuristics can be efficiently parallelized, which is crucial for algorithms that solve such computationally hard problems.Bultan, TevfikM.S

Bilkent University Institutional Repository

Efficient Mapping of Neural Network Models on a Class of Parallel Architectures.

Author: Arad Behnam Seyed
Publication venue: LSU Digital Commons
Publication date: 01/01/1997
Field of study

This dissertation develops a formal and systematic methodology for efficient mapping of several contemporary artificial neural network (ANN) models on k-ary n-cube parallel architectures (KNC\u27s). We apply the general mapping to several important ANN models including feedforward ANN\u27s trained with backpropagation algorithm, radial basis function networks, cascade correlation learning, and adaptive resonance theory networks. Our approach utilizes a parallel task graph representing concurrent operations of the ANN model during training. The mapping of the ANN is performed in two steps. First, the parallel task graph of the ANN is mapped to a virtual KNC of compatible dimensionality. This involves decomposing each operation into its atomic tasks. Second, the dimensionality of the virtual KNC architecture is recursively reduced through a sequence of transformations until a desired metric is optimized. We refer to this process as folding the virtual architecture. The optimization criteria we consider in this dissertation are defined in terms of the iteration time of the algorithm on the folded architecture. If necessary, the mapping scheme may utilize a subset of the processors of a given KNC architecture if it results in the most efficient simulation. A unique feature of our mapping is that it systematically selects an appropriate degree of parallelism leading to a highly efficient realization of the ANN model on KNC architectures. A novel feature of our work is its ability to efficiently map unit-allocating ANN\u27s. These networks possess a dynamic structure which grows during training. We present a highly efficient scheme for simulating such networks on existing KNC parallel architectures. We assume an upper bound on size of the neural network We perform the folding such that the iteration time of the largest network is minimized. We show that our mapping leads to near-optimal simulation of smaller instances of the neural network. In addition, based on our mapping no data migration or task rescheduling is needed as the size of network grows

Louisiana State University

Parallel Genetic Algorithms with Application to Load Balancing for Parallel Computing

Author: Fox Geoffrey C.
Mansouri N.
Publication venue: SURFACE at Syracuse University
Publication date: 01/09/1991
Field of study

A new coarse grain parallel genetic algorithm (PGA) and a new implementation of a data-parallel GA are presented in this paper. They are based on models of natural evolution in which the population is formed of discontinuous or continuous subpopulations. In addition to simulating natural evolution, the intrinsic parallelism in the two PGA\u27s minimizes the possibility of premature convergence that the implementation of classic GA\u27s often encounters. Intrinsic parallelism also allows the evolution of fit genotypes in a smaller number of generations in the PGA\u27s than in sequential GA\u27s, leading to superlinear speed-ups. The PGA\u27s have been implemented on a hypercube and a Connection Machine, and their operation is demonstrated by applying them to the load balancing problem in parallel computing. The PGA\u27s have found near-optimal solutions which are comparable to the solutions of a simulated annealing algorithm and are better than those produced by a sequential GA and by other load balancing methods. On one hand, The PGA\u27s accentuate the advantage of parallel computers for simulating natural evolution. On the other hand, they represent new techniques for load balancing parallel computations

Syracuse University Research Facility and Collaborative Environment