4,335 research outputs found

    Climbing depth-bounded adjacent discrepancy search for solving hybrid flow shop scheduling problems with multiprocessor tasks

    Full text link
    This paper considers multiprocessor task scheduling in a multistage hybrid flow-shop environment. The problem even in its simplest form is NP-hard in the strong sense. The great deal of interest for this problem, besides its theoretical complexity, is animated by needs of various manufacturing and computing systems. We propose a new approach based on limited discrepancy search to solve the problem. Our method is tested with reference to a proposed lower bound as well as the best-known solutions in literature. Computational results show that the developed approach is efficient in particular for large-size problems

    Survivable algorithms and redundancy management in NASA's distributed computing systems

    Get PDF
    The design of survivable algorithms requires a solid foundation for executing them. While hardware techniques for fault-tolerant computing are relatively well understood, fault-tolerant operating systems, as well as fault-tolerant applications (survivable algorithms), are, by contrast, little understood, and much more work in this field is required. We outline some of our work that contributes to the foundation of ultrareliable operating systems and fault-tolerant algorithm design. We introduce our consensus-based framework for fault-tolerant system design. This is followed by a description of a hierarchical partitioning method for efficient consensus. A scheduler for redundancy management is introduced, and application-specific fault tolerance is described. We give an overview of our hybrid algorithm technique, which is an alternative to the formal approach given

    A parallel implementation of a multisensor feature-based range-estimation method

    Get PDF
    There are many proposed vision based methods to perform obstacle detection and avoidance for autonomous or semi-autonomous vehicles. All methods, however, will require very high processing rates to achieve real time performance. A system capable of supporting autonomous helicopter navigation will need to extract obstacle information from imagery at rates varying from ten frames per second to thirty or more frames per second depending on the vehicle speed. Such a system will need to sustain billions of operations per second. To reach such high processing rates using current technology, a parallel implementation of the obstacle detection/ranging method is required. This paper describes an efficient and flexible parallel implementation of a multisensor feature-based range-estimation algorithm, targeted for helicopter flight, realized on both a distributed-memory and shared-memory parallel computer

    Performance Analysis of a Novel GPU Computation-to-core Mapping Scheme for Robust Facet Image Modeling

    Get PDF
    Though the GPGPU concept is well-known in image processing, much more work remains to be done to fully exploit GPUs as an alternative computation engine. This paper investigates the computation-to-core mapping strategies to probe the efficiency and scalability of the robust facet image modeling algorithm on GPUs. Our fine-grained computation-to-core mapping scheme shows a significant performance gain over the standard pixel-wise mapping scheme. With in-depth performance comparisons across the two different mapping schemes, we analyze the impact of the level of parallelism on the GPU computation and suggest two principles for optimizing future image processing applications on the GPU platform

    ILP-based approaches to partitioning recurrent workloads upon heterogeneous multiprocessors

    Get PDF
    The problem of partitioning systems of independent constrained-deadline sporadic tasks upon heterogeneous multiprocessor platforms is considered. Several different integer linear program (ILP) formulations of this problem, offering different tradeoffs between effectiveness (as quantified by speedup bound) and running time efficiency, are presented

    Parallel processing and expert systems

    Get PDF
    Whether it be monitoring the thermal subsystem of Space Station Freedom, or controlling the navigation of the autonomous rover on Mars, NASA missions in the 1990s cannot enjoy an increased level of autonomy without the efficient implementation of expert systems. Merely increasing the computational speed of uniprocessors may not be able to guarantee that real-time demands are met for larger systems. Speedup via parallel processing must be pursued alongside the optimization of sequential implementations. Prototypes of parallel expert systems have been built at universities and industrial laboratories in the U.S. and Japan. The state-of-the-art research in progress related to parallel execution of expert systems is surveyed. The survey discusses multiprocessors for expert systems, parallel languages for symbolic computations, and mapping expert systems to multiprocessors. Results to date indicate that the parallelism achieved for these systems is small. The main reasons are (1) the body of knowledge applicable in any given situation and the amount of computation executed by each rule firing are small, (2) dividing the problem solving process into relatively independent partitions is difficult, and (3) implementation decisions that enable expert systems to be incrementally refined hamper compile-time optimization. In order to obtain greater speedups, data parallelism and application parallelism must be exploited

    Efficient Scheduling Algorithms for Robot Inverse Dynamics Computation on a Multiprocessor System

    Get PDF
    Robot manipulators are highly nonlinear systems and their motion control requires the computation of generalized forces/torques to drive all the joint motors at an adequate rate. This paper presents efficient scheduling algorithms for computing the robot inverse dynamics on a multiprocessor system. The problem of scheduling the inverse dynamics computation consisting of m computational modules to be executed on a multiprocessor system consisting of p identical homogeneous processors to achieve a minimum-scheduled length is known to be NP-complete. In order to achieve the minimum computation time, the Newton-Euler equations of motion are expressed in the homogeneous linear recurrence form which results in achieving maximum parallelism. To speed up the searching for a solution, a heuristic search algorithm called Dynamical Highest Level First/Most Immediate Successors First (DHLF /MISF) is first proposed to find a fast but suboptimal schedule. For an optimal schedule, the minimum-scheduled-length problem can be solved by a state- space search method — the A* algorithm coupled with an efficient heuristic function derived from the Fernandez and Bussell bound. The state-space search method is a classical minimum cost graph search algorithm, which is guaranteed to find the optimal solution if the evaluation function is properly defined. An objective function is defined in terms of the task execution time and the optimization of the objective function is based on the minimax of the execution time. The proposed optimization algorithm solves the minimum-scheduled-length problem in pseudo-polynominal time and can be used to solve various large-scale problems in a reasonable time. An illustrative example of computing the inverse dynamics of an n-link manipulator based on the Newton-Euler dynamic equations is performed to show the effectiveness of the A algorithm and the heuristic algorithm DHLF /MISF

    A communication-ordered task graph allocation algorithm

    Get PDF
    technical reportThe inherently asynchronous nature of the data flow computation model allows the exploitation of maximum parallelism in program execution. While this computational model holds great promise, several problems must be solved in order to achieve a high degree of program performance. The allocation and scheduling of programs on MIMD distributed memory parallel hardware, is necessary for the implementation of efficient parallel systems. Finding optimal solutions requires that maximum parallelism be achieved consistent with resource limits and minimizing communication costs, and has been proven to be in the class of NP-complete problems. This paper addresses the problem of static allocation of tasks to distributed memory MIMD systems where simultaneous computation and communication is a factor. This paper discusses similarities and differences between several recent heuristic allocation approaches and identifies common problems inherent in these approaches. This paper presents a new algorithm scheme and heuristics that resolves the identified problems and shows significant performance benefits

    A scheduling framework for heterogenous multiprocessor architectures based on industrial processors (DSP and microcontrollers)

    Get PDF
    Current VLSI and networking technology, the increase in computational power, and the rapid decrease in computational cost, enable the interconnection of VLSI processors, which can be arranged on a functional decomposition of the computational task to exploit the potential of multiprocessing. The use of multiprocessor systems in such way, provides a novel and cost effective solution in solving many practical problems in signal processing, control systems, instrumentation systems and robotics. In this article we present a framework that addresses the specificities of industrial processors, such as DSPs and microcontrollers and can easily be used to implement a huge range of scheduling algorithms
    corecore