This paper addresses optimizing the execution of range queries over
multi-dimensional datasets on distributed memory parallel machines within
the Active Data Repository (ADR) framework. ADR is an infrastructure that
integrates storage, retrieval and processing of large multi-dimensional
datasets on distributed memory parallel architectures with multiple disks
attached to each node. We describe three potential strategies for
executing such queries efficiently; the strategies employ different tiling
and workload partitioning approaches. We evaluate the scalability of these
strategies for different application scenarios, varying both the number of
processors and the input dataset size, on a 128-processor IBM SP
multicomputer.
Also cross-referenced as UMIACS-TR-99-2