    The cooperative parallel: A discussion about run-time schedulers for nested parallelism

    Nested parallelism is a well-known parallelization strategy for exploiting irregular parallelism in HPC applications. The strategy also fits critical real-time embedded systems, which are composed of a set of concurrent functionalities. In this case, nested parallelism can be used to further exploit the parallelism within each functionality. However, current run-time implementations of nested parallelism can produce inefficiencies and load imbalance. Moreover, in critical real-time embedded systems, they may lead to incorrect executions due to, for instance, a non-work-conserving scheduler. In both cases, the reason is that the teams of OpenMP threads are a black box for the scheduler, i.e., the scheduler that assigns OpenMP threads and tasks to the set of available computing resources is agnostic to the internal execution of each team. This paper proposes a new run-time scheduler that considers dynamic information about the OpenMP threads and tasks running within several concurrent teams, i.e., concurrent parallel regions. This information may include the existence of OpenMP threads waiting at a barrier and the priority of tasks ready to execute. By making the concurrent parallel regions cooperate, the shared computing resources can be better controlled and a work-conserving, priority-driven scheduler can be guaranteed. Peer reviewed. Postprint (author's final draft).
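
For readers unfamiliar with the setting, the following is a minimal sketch of nested parallelism in OpenMP, not the scheduler proposed in the paper. The outer loop stands in for the concurrent functionalities and the inner loop for the parallelism inside each one. The names `run_functionalities`, `N_FUNCS`, and `N_ITEMS` are hypothetical, and the workload (summing integers) is a placeholder chosen only so the result is easy to check.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

#define N_FUNCS 2     /* hypothetical number of concurrent functionalities */
#define N_ITEMS 1000  /* hypothetical amount of work per functionality */

/* Runs N_FUNCS "functionalities" in an outer parallel region; each one
   opens its own inner parallel region (a separate team of threads).
   Returns the sum of 1..N_ITEMS accumulated over all functionalities. */
long run_functionalities(void)
{
    long total = 0;

#ifdef _OPENMP
    /* Allow two levels of active parallel regions; otherwise the inner
       region may be serialized by the runtime. */
    omp_set_max_active_levels(2);
#endif

    #pragma omp parallel for num_threads(N_FUNCS) reduction(+:total)
    for (int f = 0; f < N_FUNCS; ++f) {
        long partial = 0;

        /* Inner team: the outer scheduler has no view of what happens here. */
        #pragma omp parallel for reduction(+:partial)
        for (int k = 1; k <= N_ITEMS; ++k)
            partial += k;

        total += partial;
    }
    return total;
}
```

Each inner team is created and scheduled independently of the others, which is the "black box" situation the abstract describes: a thread idling at one team's barrier cannot be handed work from another team. The code compiles and gives the same result without OpenMP enabled, in which case both loops simply run serially.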

    EFFICIENT SCHEDULING OF DYNAMIC PROGRAMMING ALGORITHMS ON MULTICORE ARCHITECTURES

    Dynamic programming is one of the Berkeley 13 dwarfs and is widely used for solving various combinatorial and optimization problems, including matrix chain multiplication, longest common subsequence (LCS), binary (0/1) knapsack and so on. Because the dependences inherent in dynamic programming algorithms are nonuniform, the subproblems must be scheduled effectively onto processing cores to make good use of multicore technology. The computational matrix of dynamic programming is divided into three parts: a growing region, a stable region and a shrinking region, depending on whether the number of subproblems increases, remains stable or decreases uniformly from phase to phase. We implement parallel versions of matrix chain multiplication, longest common subsequence and 0/1 knapsack on Intel Xeon X5650 and E5-2695 using OpenMP with different scheduling policies and adequate chunk sizes. We conclude that, for the growing and shrinking regions of the dynamic programming parallelization adopted in this article, the guided schedule performs better than the other scheduling schemes, while the static or dynamic schedule is better for the stable region. For dynamic programming problems in which all three regions are present, a mixed scheduling approach achieves more speedup than applying a single scheduling technique to the entire computation. In LCS, approximately 20% more speedup is achieved using the mixed scheduling technique over the conventional single scheduling approach on Intel Xeon E5-2695.
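
The mixed scheduling idea can be sketched for LCS as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes the phases are the anti-diagonals of the LCS table, treats a diagonal as belonging to the stable region when it has reached its maximum length, and uses default chunk sizes. The names `lcs_length` and `fill_cell` are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Fill one LCS table cell from its upper, left and upper-left neighbours. */
static void fill_cell(int *t, int cols, const char *a, const char *b,
                      int i, int j)
{
    if (a[i - 1] == b[j - 1]) {
        t[i * cols + j] = t[(i - 1) * cols + (j - 1)] + 1;
    } else {
        int up = t[(i - 1) * cols + j];
        int left = t[i * cols + (j - 1)];
        t[i * cols + j] = up > left ? up : left;
    }
}

/* Length of the longest common subsequence of a and b, or -1 if the
   table cannot be allocated. Cells on one anti-diagonal (i + j == d)
   are independent, so each diagonal is one parallel phase. */
int lcs_length(const char *a, const char *b)
{
    int n = (int)strlen(a), m = (int)strlen(b);
    int cols = m + 1;
    int *t = calloc((size_t)(n + 1) * (size_t)cols, sizeof *t);
    if (t == NULL)
        return -1;

    int full = n < m ? n : m;  /* diagonal length in the stable region */

    for (int d = 2; d <= n + m; ++d) {
        int i_min = d - m > 1 ? d - m : 1;
        int i_max = d - 1 < n ? d - 1 : n;
        int len = i_max - i_min + 1;

        if (len < full) {
            /* Growing or shrinking region: guided schedule (assumed mapping). */
            #pragma omp parallel for schedule(guided)
            for (int i = i_min; i <= i_max; ++i)
                fill_cell(t, cols, a, b, i, d - i);
        } else {
            /* Stable region: static schedule (assumed mapping). */
            #pragma omp parallel for schedule(static)
            for (int i = i_min; i <= i_max; ++i)
                fill_cell(t, cols, a, b, i, d - i);
        }
    }

    int result = t[n * cols + m];
    free(t);
    return result;
}
```

For example, `lcs_length("AGGTAB", "GXTXAYB")` returns 4 (the subsequence `GTAB`). Opening a parallel region per diagonal keeps the sketch short; a tuned version would more likely keep one region open across diagonals and choose chunk sizes per region, which is the kind of tuning the abstract refers to.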

    A Fourier Continuation Method for the Solution of Elliptic Eigenvalue Problems in General Domains

    We present a new computational method for the solution of elliptic eigenvalue problems with variable coefficients in general two-dimensional domains. The proposed approach is based on the use of the novel Fourier continuation method (which enables fast and highly accurate Fourier approximation of nonperiodic functions on equispaced grids without the limitations arising from the Gibbs phenomenon) in conjunction with an overlapping-patch domain decomposition strategy and Arnoldi iteration. A variety of examples demonstrate the versatility, accuracy, and generality of the proposed methodology.