1,767 research outputs found

    Adapting the interior point method for the solution of linear programs on high performance computers

    In this paper we describe a unified algorithmic framework for the interior point method (IPM) for solving linear programs (LPs), which allows us to adapt it to a range of high performance computer architectures. We set out the reasons why the IPM makes better use of high performance computer architectures than the sparse simplex method. In the inner iteration of the IPM, a search direction is computed using Newton or higher-order methods. Computationally, this involves solving a sparse symmetric positive definite (SSPD) system of equations. The choice of direct and indirect methods for solving this system, and the design of data structures that take advantage of coarse-grain parallel and massively parallel computer architectures, are considered in detail. Finally, we present experimental results from solving NETLIB test problems on examples of these architectures and put forward arguments as to why integration of the system within sparse simplex is beneficial.
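
The SSPD solve mentioned in this abstract can be illustrated with a minimal sketch. It assumes the system has the normal-equations form M = A D A^T (A is the constraint matrix, D a positive diagonal scaling), which is a common choice in IPM implementations but is not stated in the abstract. The sketch uses a dense Cholesky factorization as the direct method; the function names are hypothetical, and a real solver would use sparse data structures and a fill-reducing ordering.

```python
import math

def normal_matrix(A, d):
    """Form M = A * diag(d) * A^T for a dense row-major matrix A (assumed form)."""
    m, n = len(A), len(A[0])
    return [[sum(A[i][k] * d[k] * A[j][k] for k in range(n)) for j in range(m)]
            for i in range(m)]

def cholesky(M):
    """Return lower-triangular L with L * L^T = M; M must be SPD."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

def solve_spd(M, b):
    """Solve M x = b by Cholesky factorization and two triangular solves."""
    L = cholesky(M)
    n = len(b)
    # Forward substitution: L z = b
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    # Back substitution: L^T x = z
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x
```

Calling `solve_spd(normal_matrix(A, d), rhs)` with a small `A`, positive weights `d`, and a right-hand side `rhs` returns the search-direction component for that toy system. An indirect method, such as conjugate gradients, would replace `solve_spd` without changing how `M` is formed.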

    Joint multicast routing and channel assignment in multiradio multichannel wireless mesh networks using simulated annealing

    This is the post-print version of the article. Copyright @ 2008 Springer-Verlag. This paper proposes a simulated annealing (SA) based optimization approach to search for a minimum-interference multicast tree that satisfies the end-to-end delay constraint and optimizes the usage of scarce radio network resources in wireless mesh networks. In the proposed SA multicast algorithm, a path-oriented encoding method is adopted and each candidate solution is represented by a tree data structure (i.e., a set of paths). Since we seek multicast trees on which a minimum-interference channel assignment can be produced, a fitness function that returns the total channel conflict is devised. The techniques for controlling the annealing process are well developed. A simple yet effective channel assignment algorithm is proposed to reduce the channel conflict. Simulation results show that the proposed SA-based multicast algorithm produces multicast trees that perform better, in terms of both total channel conflict and tree cost, than those of a well-known multicast algorithm for wireless mesh networks. This work was supported by the Engineering and Physical Sciences Research Council (EPSRC) of the UK under Grant EP/E060722/1.
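
The annealing process described in this abstract can be sketched generically. The following is a minimal illustration under stated assumptions, not the paper's algorithm: it uses the standard Metropolis acceptance rule and a geometric cooling schedule, and every name and parameter value (`t0`, `alpha`, `steps_per_temp`, the toy cost) is hypothetical. In the paper's setting, the state would be a multicast tree and the cost would be the total channel conflict.

```python
import math
import random

def simulated_annealing(initial, cost, neighbor, t0=10.0, alpha=0.95,
                        steps_per_temp=20, t_min=1e-3, seed=0):
    """Minimize cost(state) with Metropolis acceptance and geometric cooling."""
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    t = t0
    while t > t_min:
        for _ in range(steps_per_temp):
            candidate = neighbor(current, rng)
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Always accept improvements; accept worse moves with
            # probability exp(-delta / t), which shrinks as t cools.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        t *= alpha  # geometric cooling (an assumed schedule)
    return best, best_cost

# Toy stand-ins for the real problem: integer states, quadratic cost.
def toy_cost(x):
    return (x - 7) ** 2

def toy_neighbor(x, rng):
    return x + rng.choice([-1, 1])
```

A call such as `simulated_annealing(0, toy_cost, toy_neighbor)` returns the best state visited and its cost. Tracking the best state separately from the current one means that late uphill moves cannot lose a good solution.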

    Joint QoS multicast routing and channel assignment in multiradio multichannel wireless mesh networks using intelligent computational methods

    Copyright @ 2010 Elsevier B.V. All rights reserved. In this paper, the quality of service multicast routing and channel assignment (QoS-MRCA) problem is investigated. It is proved to be an NP-hard problem. Previous work separates the multicast tree construction from the channel assignment and therefore has a severe drawback: the channel assignment cannot work well with an already determined multicast tree. In this paper, we integrate the two and solve the joint problem with intelligent computational methods. First, we develop a unified framework consisting of the problem formulation, the solution representation, the fitness function, and the channel assignment algorithm. Then, we propose three separate algorithms based on three representative intelligent computational methods (i.e., genetic algorithm, simulated annealing, and tabu search). These three algorithms aim to find minimum-interference multicast trees that also satisfy the end-to-end delay constraint and optimize the usage of scarce radio network resources in wireless mesh networks. To achieve this goal, optimization techniques based on a state-of-the-art genetic algorithm, and techniques to control the annealing process and the tabu search procedure, are developed separately. Simulation results show that the three proposed multicast algorithms all achieve better performance, in terms of both total channel conflict and tree cost, than the reference algorithms they are compared with. This work was supported by the Engineering and Physical Sciences Research Council (EPSRC) of the UK under Grant EP/E060722/1.
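
The fitness function in this framework returns the total channel conflict of a candidate multicast tree. A minimal sketch of such a function follows. It assumes a simplified interference model in which two links conflict when they share a node and use the same channel; the abstract does not define the interference model, so `shares_node`, the link representation, and the function names are all hypothetical.

```python
from itertools import combinations

def shares_node(link_a, link_b):
    """Assumed interference test: two links interfere if they share an endpoint."""
    return bool(set(link_a) & set(link_b))

def total_channel_conflict(links, channel_of, interferes=shares_node):
    """Count link pairs that interfere and are assigned the same channel."""
    conflict = 0
    for a, b in combinations(links, 2):
        if channel_of[a] == channel_of[b] and interferes(a, b):
            conflict += 1
    return conflict
```

For a three-link path `s-a`, `a-b`, `b-c`, assigning one channel to every link gives two conflicting pairs under this model, while alternating two channels gives none. A genetic algorithm, simulated annealing, or tabu search could each use this value as the quantity to minimize.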

    Requirements for implementing real-time control functional modules on a hierarchical parallel pipelined system

    Analysis of a robot control system leads to a broad range of processing requirements. One fundamental requirement of a robot control system is a microcomputer system that provides sufficient processing capability. The use of multiple processors in a parallel architecture is beneficial for a number of reasons, including better cost performance, modular growth, increased reliability through replication, and flexibility for testing alternate control strategies via different partitioning. A survey of the progression from low-level control synchronizing primitives to higher-level communication tools is presented. The system communication and control mechanisms of existing robot control systems are compared to the hierarchical control model. The impact of this design methodology on current robot control systems is explored.

    Programming Models' Support for Heterogeneous Architecture

    Accelerator-enhanced computing platforms have drawn a lot of attention due to their massive peak computational capacity. Heterogeneous systems equipped with accelerators such as GPUs have become the most prominent components of High Performance Computing (HPC) systems. Even at the node level, the significant heterogeneity of CPU and GPU, i.e. hardware and memory space differences, leads to challenges in fully exploiting such complex architectures. Extending beyond the node scope only escalates these challenges. Conventional programming models such as dataflow and message passing have been widely adopted in HPC communities. When moving towards heterogeneous systems, the lack of GPU integration causes such programming models to struggle with the heterogeneity of different computing units, leading to sub-optimal performance and a drastic decrease in developer productivity. To bridge the gap between underlying heterogeneous architectures and current programming paradigms, we propose to extend such programming paradigms with architecture-aware optimization. Two programming models are used to demonstrate the impact of heterogeneous architecture awareness. The PaRSEC task-based runtime, an adopter of the dataflow model, provides opportunities for overlapping communications with computations and minimizing data movements, as well as dynamically adapting the work granularity to the capability of the hardware. To meet the demand for an efficient and portable Message Passing Interface (MPI) implementation that communicates GPU data, a GPU-aware design is presented based on the Open MPI infrastructure. It supports efficient point-to-point and collective communications of GPU-resident data, for both contiguous and non-contiguous memory layouts, by leveraging GPU network topology and hardware capabilities such as GPUDirect. The tight integration of GPU support in a widely used programming environment frees developers from manually moving data into and out of host memory before and after relying on MPI routines for communications, allowing them to focus instead on algorithmic optimizations. Experimental results have confirmed that, supported by such a tight and transparent integration, conventional programming models can once again take advantage of state-of-the-art hardware and exhibit performance at the levels expected from the underlying hardware capabilities.

    Three Highly Parallel Computer Architectures and Their Suitability for Three Representative Artificial Intelligence Problems

    Virtually all current Artificial Intelligence (AI) applications are designed to run on sequential (von Neumann) computer architectures. As a result, current systems do not scale up. As knowledge is added to these systems, a point is reached where their performance quickly degrades. The performance of a von Neumann machine is limited by the bandwidth between memory and processor (the von Neumann bottleneck). The bottleneck is avoided by distributing the processing power across the memory of the computer. In this scheme the memory becomes the processor (a "smart memory"). This paper highlights the relationship between three representative AI application domains, namely knowledge representation, rule-based expert systems, and vision, and their parallel hardware realizations. Three machines, covering a wide range of fundamental properties of parallel processors, namely module granularity, concurrency control, and communication geometry, are reviewed: the Connection Machine (a fine-grained SIMD hypercube), DADO (a medium-grained MIMD/SIMD/MSIMD tree machine), and the Butterfly (a coarse-grained MIMD Butterfly-switch machine).

    State of the art baseband DSP platforms for Software Defined Radio: A survey

    Software Defined Radio (SDR) is an innovative approach that is becoming an increasingly promising technology for future mobile handsets. Several proposals in the field of embedded systems have been introduced by universities and industry to support SDR applications. This article presents an overview of current platforms and analyzes the related architectural choices, the current issues in SDR, and potential future trends. Peer reviewed.