9 research outputs found

    A Parallel, Multithreaded Decision Tree Builder

    No full text
    Parallelization has become a popular mechanism to speed up data classification tasks that deal with large amounts of data. This paper describes a high-level, fine-grained parallel formulation of a decision tree-based classifier for memory-resident datasets on SMPs. We exploit two levels of divide-and-conquer parallelism in the tree builder: at the outer level across the tree nodes, and at the inner level within each tree node. Lightweight Pthreads are used to express this highly irregular and dynamic parallelism in a natural manner. The task of scheduling the threads and balancing the load is left to a space-efficient Pthreads scheduler. Experimental results on large datasets indicate that the space and time performance of the tree builder scales well with both the data size and the number of processors. This research is supported by ARPA Contract No. DABT63-96-C-0071. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes, notwithstanding any copyright notation thereon.
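    The outer level of divide-and-conquer parallelism described above maps naturally onto Pthreads. The following is a minimal sketch of that outer level only (one lightweight thread per child tree node); the node type, the depth-based stopping test, and the spawn_child() helper are illustrative stand-ins rather than details from the paper, and the inner within-node parallelism is omitted.

        /* Sketch: one Pthread per child node in a divide-and-conquer
           tree build. Compile with: cc sketch.c -lpthread */
        #include <pthread.h>
        #include <stdio.h>
        #include <stdlib.h>

        typedef struct node {
            int depth;
            struct node *left, *right;
        } node_t;

        static node_t *spawn_child(int depth) {
            node_t *n = malloc(sizeof *n);
            n->depth = depth;
            n->left = n->right = NULL;
            return n;
        }

        /* Outer-level parallelism: build each subtree in its own
           lightweight thread; the Pthreads scheduler balances this
           irregular, dynamic load. */
        static void *build_node(void *arg) {
            node_t *n = arg;
            if (n->depth >= 3)          /* stand-in for a purity/size test */
                return NULL;
            pthread_t lt, rt;
            n->left  = spawn_child(n->depth + 1);
            n->right = spawn_child(n->depth + 1);
            pthread_create(&lt, NULL, build_node, n->left);
            pthread_create(&rt, NULL, build_node, n->right);
            pthread_join(lt, NULL);
            pthread_join(rt, NULL);
            return NULL;
        }

        int main(void) {
            node_t *root = spawn_child(0);
            build_node(root);
            puts("tree built");
            return 0;
        }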

    Scheduling Threads for Low Space Requirement and Good Locality

    No full text
    The running time and memory requirement of a parallel program with dynamic, lightweight threads depend heavily on the underlying thread scheduler. In this paper, we present a simple, asynchronous, space-efficient scheduling algorithm for shared memory machines that combines the low scheduling overheads and good locality of work stealing with the low space requirements of depth-first schedulers. For a nested-parallel program with depth D and serial space requirement S_1, we show that the expected space requirement is S_1 + O(K · p · D) on p processors. Here, K is a user-adjustable runtime parameter, which provides a tradeoff between running time and space requirement. Our algorithm achieves good locality and low scheduling overheads by automatically increasing the granularity of the work scheduled on each processor. We have implemented the new scheduling algorithm in the context of a native, user-level implementation of POSIX standard threads, or Pthreads, and evaluated its performance…
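    Read concretely, the bound above trades space against time through K. A brief restatement, with illustrative numbers that are assumed rather than taken from the paper:

        % Expected-space bound from the abstract.
        % S_1: serial space, D: depth, p: processors,
        % K: user-adjustable memory quantum per thread.
        \[
          \mathbb{E}[\text{space on } p \text{ processors}] \;=\; S_1 + O(K \cdot p \cdot D)
        \]
        % Illustrative numbers (assumed): S_1 = 100 MB, K = 1 MB, p = 8,
        % D = 50 makes the additive term O(400 MB); halving K halves it,
        % at the cost of more frequent preemption and hence higher
        % scheduling overhead.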

    Pthreads for Dynamic Parallelism

    No full text
    Expressing a large number of lightweight, parallel threads in a shared address space significantly eases the task of writing a parallel program. Threads can be dynamically created to execute individual parallel tasks; the implementation schedules these threads onto the processors and effectively balances the load. However, unless the thread scheduler is designed carefully, such a parallel program may suffer poor space and time performance. In this paper, we evaluate the performance of a native, lightweight POSIX threads (Pthreads) library on a shared memory machine using a set of parallel benchmarks that dynamically create a large number of threads. By studying the performance of one of the benchmarks, matrix multiply, we show how simple, yet provably good modifications to the library can result in significantly improved space and time performance. With the modified Pthreads library, each of the parallel benchmarks performs as well as its coarse-grained, hand-partitioned counterpart…
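    The style of benchmark the abstract evaluates can be suggested with a short sketch: a fine-grained matrix multiply that creates one Pthread per output row and leaves scheduling and load balance entirely to the library. The size N and the row_worker() helper are illustrative choices, not details from the paper.

        /* Fine-grained dynamic parallelism: one thread per row of C = A*B. */
        #include <pthread.h>
        #include <stdio.h>

        #define N 64
        static double A[N][N], B[N][N], C[N][N];

        static void *row_worker(void *arg) {
            int i = (int)(long)arg;            /* row computed by this thread */
            for (int j = 0; j < N; j++) {
                double sum = 0.0;
                for (int k = 0; k < N; k++)
                    sum += A[i][k] * B[k][j];
                C[i][j] = sum;
            }
            return NULL;
        }

        int main(void) {
            pthread_t tid[N];
            for (int i = 0; i < N; i++) { A[i][i] = 1.0; B[i][i] = 2.0; }
            for (int i = 0; i < N; i++)        /* dynamically create N threads */
                pthread_create(&tid[i], NULL, row_worker, (void *)(long)i);
            for (int i = 0; i < N; i++)
                pthread_join(tid[i], NULL);
            printf("C[0][0] = %f\n", C[0][0]); /* 2.0 for these diagonals */
            return 0;
        }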

    A framework for space and time efficient scheduling of parallelism

    No full text
    Many of today’s high-level parallel languages support dynamic, fine-grained parallelism. These languages allow the user to expose all the parallelism in the program, which is typically of a much higher degree than the number of processors. Hence an efficient scheduling algorithm is required to assign computations to processors at runtime. Besides having low overheads and good load balancing, it is important for the scheduling algorithm to minimize the space usage of the parallel program. In this paper, we first present a general framework to model non-preemptive parallel computations based on task graphs, in which schedules of the graphs represent executions of the computations. We then prove bounds on the space and time requirements of certain classes of schedules that can be generated by an offline scheduler. Next, we present an online scheduling algorithm that is provably space-efficient and time-efficient for multithreaded computations with nested parallelism. If a serial execution requires S_1 units of memory for a computation of depth D and work W, our algorithm results in an execution on p processors that requires S_1 + O(D · p · log p) units of memory, and O(W/p + D · log p) time, including scheduling overheads. Finally, we demonstrate that our scheduling algorithm is efficient in practice. We have implemented a runtime system that uses our algorithm to schedule parallel threads. The results of executing parallel programs on this system show that our scheduling algorithm significantly reduces memory usage compared to previous techniques, without compromising performance.
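    The work-depth accounting behind these bounds can be made concrete with a toy calculation; the numbers below are assumed for illustration, not taken from the paper.

        % Bounds as stated in the abstract, with W = work, D = depth,
        % p = processors, S_1 = serial space.
        \[
          T_p = O\!\left(\tfrac{W}{p} + D \log p\right), \qquad
          S_p = S_1 + O(D \cdot p \cdot \log p)
        \]
        % Example (assumed): W = 10^9 operations, D = 10^3, p = 64.
        % Then W/p ~ 1.6 x 10^7 dominates D log p ~ 6 x 10^3, so the
        % schedule stays work-bound and speedup remains near-linear.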

    Space-Efficient Scheduling of Nested Parallelism

    No full text
    This article presents an online scheduling algorithm that is provably space-efficient and time-efficient for nested-parallel languages. For a computation with depth D and serial space requirement S_1, the algorithm generates a schedule that requires at most S_1 + O(K · D · p) space (including scheduler space) on p processors. Here, K is a user-adjustable runtime parameter specifying the net amount of memory that a thread may allocate before it is preempted by the scheduler. Adjusting the value of K provides a trade-off between the running time and the memory requirement of a parallel computation. To allow the scheduler to scale with the number of processors, we also parallelize the scheduler and analyze the space and time bounds of the computation to include scheduling costs. In addition to showing that the scheduling algorithm is space- and time-efficient in theory, we demonstrate that it is effective in practice. We have implemented a runtime system that uses our algorithm to schedule lightweight parallel threads. The results of executing parallel programs on this system show that our scheduling algorithm significantly reduces memory usage compared to previous techniques, without compromising performance.
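    The role of the parameter K can be illustrated with a toy allocation wrapper: a thread tracks how much memory it has allocated and yields once it exceeds K, standing in for preemption by the scheduler. The quantum_alloc() helper and the 1 MB value of K are assumptions made for this sketch, not the paper's runtime.

        /* Toy model of the K memory quantum; single-threaded for brevity. */
        #include <sched.h>
        #include <stdio.h>
        #include <stdlib.h>

        #define K (1 << 20)            /* 1 MB quantum (illustrative) */
        static size_t allocated;       /* per-thread in a real runtime */

        static void *quantum_alloc(size_t n) {
            allocated += n;
            if (allocated >= K) {      /* quantum exhausted: */
                allocated = 0;
                sched_yield();         /* stand-in for scheduler preemption */
            }
            return malloc(n);
        }

        int main(void) {
            for (int i = 0; i < 4096; i++)
                free(quantum_alloc(1024));  /* yields roughly every K bytes */
            puts("done");
            return 0;
        }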

    Space-efficient scheduling of parallelism with synchronization variables

    No full text
    Recent work on scheduling algorithms has resulted in provable bounds on the space taken by parallel computations in relation to the space taken by sequential computations. The results for online versions of these algorithms, however, have been limited to computations in which threads can only synchronize with ancestor or sibling threads. Such computations do not include languages with futures or user-specified synchronization constraints. Here we extend the results to languages with synchronization variables. Such languages include languages with futures, such as Multilisp and Cool, as well as other languages such as ID. The main result is an online scheduling algorithm which, given a computation with w work (total operations), σ synchronizations, d depth (critical path) and s_1 sequential space, will run in O(w/p + …
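    A synchronization variable of the kind the abstract refers to is, in essence, a write-once cell that any thread may block on until it is written, which is how a future resolves. The sync_var type below is an illustrative Pthreads sketch, not an artifact of the paper.

        /* Write-once synchronization variable: one put, many blocking gets. */
        #include <pthread.h>
        #include <stdio.h>

        typedef struct {
            pthread_mutex_t mu;
            pthread_cond_t  cv;
            int ready;
            int value;
        } sync_var;

        static void sync_put(sync_var *s, int v) {   /* single write */
            pthread_mutex_lock(&s->mu);
            s->value = v;
            s->ready = 1;
            pthread_cond_broadcast(&s->cv);
            pthread_mutex_unlock(&s->mu);
        }

        static int sync_get(sync_var *s) {           /* block until written */
            pthread_mutex_lock(&s->mu);
            while (!s->ready)
                pthread_cond_wait(&s->cv, &s->mu);
            int v = s->value;
            pthread_mutex_unlock(&s->mu);
            return v;
        }

        static void *producer(void *arg) {
            sync_put(arg, 42);
            return NULL;
        }

        int main(void) {
            sync_var s = { PTHREAD_MUTEX_INITIALIZER,
                           PTHREAD_COND_INITIALIZER, 0, 0 };
            pthread_t t;
            pthread_create(&t, NULL, producer, &s);
            printf("future value: %d\n", sync_get(&s));
            pthread_join(t, NULL);
            return 0;
        }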