There has been much research interest in efficient implementations of the Committed
Choice Non-Deterministic (CCND) logic languages on parallel computers. To take
full advantage of the speed gains of parallel computers, methods need to be found
to automatically distribute goals over the machine processors, ideally with as little
involvement from the user as possible.In this thesis we explore some automatic goal distribution strategies for the execu¬
tion of the CCND languages on commercially available distributed memory parallel
computers.There are two facets to the goal distribution strategies we have chosen to explore:DEMAND DRIVEN: An idle processor requests work from other processors. We describe
two strategies in this class: one in which an idle processor asks only neighbouring
processors for spare work, the nearest-neighbour strategy; and one where an idle
processor may ask any other processor in the machine for spare work, the allprocessors strategy.WEIGHTS: Using a program analysis technique devised by Tick, weights are attached to
goals; the weights can be used to order the goals so that they can be executed
and distributed out in weighted order, possibly increasing performance.We describe a framework in which to implement and analyse goal distribution strategies, and then go on to describe experiments with demand driven strategies, both with
and without weights. The experiments were made using two of our own implementations of Flat Guarded Horn Clauses — an interpreter and a WAM-like system —
executing on a MEIKO T800 Transputer Array configured in a 2-D mesh topology.Analysis of the results show that the all-processors strategies are promising (AP-NW),
adding weights had little positive effect on performance, and that nearest-neighbours
strategies can reduce performance due to bad load balancing.We also describe some preliminary experiments for a variant of the AP-NW strategy:
goals which suspend on one variable are sent to the processor that controls that variable,
the processes-to-data strategy. And we briefly look at some preliminary results of
executing programs on large numbers of processors (> 30)