1,045 research outputs found

    Software Development for Parallel and Multi-Core Processing

    Get PDF

    Requirements for implementing real-time control functional modules on a hierarchical parallel pipelined system

    Get PDF
    Analysis of a robot control system leads to a broad range of processing requirements. One fundamental requirement of a robot control system is a microcomputer system capable of providing sufficient processing capability. The use of multiple processors in a parallel architecture is beneficial for a number of reasons, including better cost performance, modular growth, increased reliability through replication, and flexibility for testing alternate control strategies via different partitioning. A survey of the progression from low-level control synchronizing primitives to higher-level communication tools is presented. The system communication and control mechanisms of existing robot control systems are compared to the hierarchical control model. The impact of this design methodology on current robot control systems is explored.
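    The progression the survey traces, from low-level synchronizing primitives to higher-level communication tools, can be made concrete with a small sketch. The following Python fragment is illustrative only (the module names and setpoint value are hypothetical, and it is not from the paper); it contrasts an event primitive guarding shared state with a message queue that carries the data itself.

```python
import threading
import queue

# --- Low level: a bare Event primitive synchronizes two control modules. ---
setpoint_ready = threading.Event()
shared_setpoint = [0.0]          # raw shared state, guarded only by the event

def planner_low_level():
    shared_setpoint[0] = 1.57    # hypothetical joint-angle setpoint
    setpoint_ready.set()         # signal the servo loop directly

def servo_low_level():
    setpoint_ready.wait()        # block on the primitive
    print("servo (event):", shared_setpoint[0])

# --- Higher level: a message queue carries both data and synchronization. ---
setpoints = queue.Queue()

def planner_high_level():
    setpoints.put(1.57)          # the message itself does the synchronizing

def servo_high_level():
    print("servo (queue):", setpoints.get())

# Run each consumer/producer pair; start both threads before joining either.
for pair in ((servo_low_level, planner_low_level),
             (servo_high_level, planner_high_level)):
    threads = [threading.Thread(target=f) for f in pair]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```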

    An Implementation of a Dual-Processor System on FPGA

    Get PDF
    In recent years, Field-Programmable Gate Arrays (FPGAs) have evolved rapidly, paving the way for a whole new range of computing paradigms. At the same time, computer applications are evolving, and there is a rising demand for a system that is general-purpose yet has the processing ability to accommodate current trends in application processing. This work proposes the design and implementation of a tightly-coupled FPGA-based dual-processor platform. We architect a platform that optimizes the utilization of FPGA resources and allows for the investigation of practical implementation issues such as cache design. The performance of the proposed prototype is then evaluated, as different configurations of a uniprocessor and a dual-processor system are studied and compared against each other and against published results for common industry-standard CPU platforms. The proposed implementation utilizes the Nios II 32-bit embedded soft-core processor architecture designed for the Altera Cyclone III family of FPGAs.
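    As a hedged illustration of the kind of cache-design investigation such a platform enables (not code from the paper; the cache geometries and address trace below are invented), a toy direct-mapped cache model can compare configurations on the same access pattern:

```python
class DirectMappedCache:
    """Toy direct-mapped cache: one tag per line, fill on miss."""

    def __init__(self, num_lines: int, line_bytes: int):
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.tags = [None] * num_lines
        self.hits = 0
        self.misses = 0

    def access(self, addr: int) -> bool:
        block = addr // self.line_bytes    # which memory block
        index = block % self.num_lines     # which cache line it maps to
        tag = block // self.num_lines
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag             # fill the line on a miss
        self.misses += 1
        return False

# Hypothetical address trace; compare two cache geometries on it.
trace = [0x00, 0x20, 0x00, 0x20, 0x04, 0x24]
for lines in (2, 4):
    cache = DirectMappedCache(num_lines=lines, line_bytes=16)
    for addr in trace:
        cache.access(addr)
    print(f"{lines} lines x 16 B: hits={cache.hits} misses={cache.misses}")
```

    On this trace the two-line cache thrashes on conflict misses while the four-line cache retains both blocks, the kind of trade-off an FPGA prototype lets one measure in hardware.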

    Design of a monitor for the debugging and development of multiprocessing process control systems : a thesis presented in partial fulfilment of the requirements for the degree of Master of Technology in Computing Technology at Massey University

    Get PDF
    This thesis describes the design of a general-purpose tool for debugging and developing multimicroprocessor process control systems. With the decreasing price of computers, multimicroprocessors are increasingly being used for process control. However, the lack of published information on multiprocessing and distributed systems has meant that methodologies and tools for debugging and developing such systems have been slow to develop. The monitor designed here is system-independent, a considerable advantage over other such tools currently available.

    Hyperswitch communication network

    Get PDF
    The Hyperswitch Communication Network (HCN) is a large-scale parallel computer prototype being developed at JPL; commercial versions of the HCN computer are planned. The HCN computer being designed is a message-passing multiple instruction multiple data (MIMD) computer, and offers many advantages in price-performance ratio, reliability and availability, and manufacturing over traditional uniprocessors and bus-based multiprocessors. The HCN operating system is designed as a uniquely flexible environment that combines both parallel processing and distributed processing. This programming paradigm can achieve a balance among the following competing factors: performance in processing and communications, user friendliness, and fault tolerance. The prototype is being designed to accommodate a maximum of 64 state-of-the-art microprocessors. The HCN is classified as a distributed supercomputer. The HCN system is described, and the performance/cost analysis and other competing factors within the system design are reviewed.
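    The message-passing MIMD style the HCN embodies can be sketched in a few lines: independent processes, each running its own instruction stream on its own data, interacting only through messages. The following Python sketch is purely illustrative and is not the HCN software.

```python
from multiprocessing import Pipe, Process

def producer(conn):
    # One instruction stream: compute values and send them as messages.
    for x in range(3):
        conn.send(x * x)
    conn.send(None)              # sentinel: no more messages
    conn.close()

def consumer(conn):
    # A different instruction stream, driven entirely by received messages.
    while True:
        msg = conn.recv()
        if msg is None:
            break
        print("received:", msg)

if __name__ == "__main__":
    left, right = Pipe()         # the only channel between the processes
    procs = [Process(target=producer, args=(left,)),
             Process(target=consumer, args=(right,))]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```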

    Many-Task Computing and Blue Waters

    Full text link
    This report discusses many-task computing (MTC) generically and in the context of the proposed Blue Waters system, which is planned to be the largest NSF-funded supercomputer when it begins production use in 2012. The aim of this report is to inform the BW project about MTC, including understanding aspects of MTC applications that can be used to characterize the domain and understanding the implications of these aspects for middleware and policies. Many MTC applications do not neatly fit the stereotypes of high-performance computing (HPC) or high-throughput computing (HTC) applications. Like HTC applications, MTC applications are by definition structured as graphs of discrete tasks, with explicit input and output dependencies forming the graph edges. However, MTC applications have significant features that distinguish them from typical HTC applications. In particular, different engineering constraints for hardware and software must be met in order to support these applications. HTC applications have traditionally run on platforms such as grids and clusters, through either workflow systems or parallel programming systems. MTC applications, in contrast, will often demand a short time to solution, may be communication-intensive or data-intensive, and may comprise very short tasks. Therefore, hardware and software for MTC must be engineered to support the additional communication and I/O and must minimize task dispatch overheads. The hardware of large-scale HPC systems, with its high degree of parallelism and support for intensive communication, is well suited for MTC applications. However, HPC systems often lack a dynamic resource-provisioning feature, are not ideal for task communication via the file system, and have an I/O system that is not optimized for MTC-style applications. Hence, additional software support is likely to be required to gain full benefit from the HPC hardware.
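    A minimal sketch may help fix the MTC picture: a job expressed as a graph of discrete tasks whose explicit input/output dependencies form the edges, with each task dispatched as soon as its inputs exist. The task names and functions below are hypothetical, and the thread pool merely stands in for a real MTC dispatcher.

```python
from concurrent.futures import ThreadPoolExecutor

# Task name -> (callable, names of the tasks whose outputs it consumes).
graph = {
    "load":    (lambda: 10,                                []),
    "double":  (lambda load: load * 2,                     ["load"]),
    "square":  (lambda load: load ** 2,                    ["load"]),
    "combine": (lambda double, square: double + square,    ["double", "square"]),
}

def run_graph(graph, max_workers=4):
    results, pending = {}, dict(graph)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending:
            # Dispatch every task whose inputs are all available.
            ready = [name for name, (_, deps) in pending.items()
                     if all(d in results for d in deps)]
            futures = {name: pool.submit(pending[name][0],
                                         *(results[d] for d in pending[name][1]))
                       for name in ready}
            for name, fut in futures.items():
                results[name] = fut.result()
                del pending[name]
    return results

print(run_graph(graph))
# {'load': 10, 'double': 20, 'square': 100, 'combine': 120}
```

    The per-iteration scan and synchronous result collection here are exactly the kind of dispatch overhead the report says MTC middleware must minimize when tasks are very short.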

    SKIRT: hybrid parallelization of radiative transfer simulations

    Full text link
    We describe the design, implementation and performance of the new hybrid parallelization scheme in our Monte Carlo radiative transfer code SKIRT, which has been used extensively for modeling the continuum radiation of dusty astrophysical systems including late-type galaxies and dusty tori. The hybrid scheme combines distributed memory parallelization, using the standard Message Passing Interface (MPI) to communicate between processes, with shared memory parallelization, providing multiple execution threads within each process to avoid duplication of data structures. The synchronization between multiple threads is accomplished through atomic operations without high-level locking (also called lock-free programming). This improves the scaling behavior of the code and substantially simplifies the implementation of the hybrid scheme. The result is an extremely flexible solution that adjusts to the number of available nodes, processors and memory, and consequently performs well on a wide variety of computing architectures. Comment: 21 pages, 20 figures.
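    The shape of the hybrid scheme can be sketched in Python, with the caveat that SKIRT itself is C++ and that Python exposes no atomic operations; per-thread partial histograms summed afterwards stand in here for SKIRT's lock-free updates. The sketch assumes the mpi4py package, and the packet and bin counts are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor
from mpi4py import MPI          # assumed dependency for this sketch
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

N_PACKETS, N_BINS, N_THREADS = 100_000, 64, 4
local = np.zeros(N_BINS)        # one shared result array per process

def trace_chunk(seed):
    # Each thread handles a chunk of "photon packets" (random numbers here)
    # and returns a partial histogram; no locking is ever needed.
    rng = np.random.default_rng(seed)
    chunk = N_PACKETS // (size * N_THREADS)
    return np.histogram(rng.random(chunk), bins=N_BINS, range=(0.0, 1.0))[0]

# Shared-memory level: threads within this process fill one array.
with ThreadPoolExecutor(max_workers=N_THREADS) as pool:
    seeds = range(rank * N_THREADS, (rank + 1) * N_THREADS)
    for partial in pool.map(trace_chunk, seeds):
        local += partial

# Distributed-memory level: MPI combines the per-process arrays.
total = np.zeros(N_BINS)
comm.Reduce(local, total, op=MPI.SUM, root=0)
if rank == 0:
    print("packets binned:", int(total.sum()))
```

    Run under `mpirun -n 4 python sketch.py`, this gives four processes of four threads each: sixteen workers, but only four copies of the result array, which is the memory saving the hybrid scheme is after.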

    TensorFlow Enabled Genetic Programming

    Full text link
    Genetic Programming, a kind of evolutionary computation and machine learning algorithm, is shown to benefit significantly from the application of vectorized data and the TensorFlow numerical computation library on both CPU and GPU architectures. The open-source Python package Karoo GP is employed for a series of 190 tests across 6 platforms, with real-world datasets ranging from 18 to 5.5M data points. This body of tests demonstrates that datasets measured in tens and hundreds of data points see 2-15x improvement when moving from the scalar/SymPy configuration to the vector/TensorFlow configuration, with a single core performing on par with or better than multiple CPU cores and GPUs. A dataset composed of 90,000 data points demonstrates a single vector/TensorFlow CPU core performing 875x better than 40 scalar/SymPy CPU cores. And a dataset containing 5.5M data points sees GPU configurations out-performing CPU configurations on average by 1.3x. Comment: 8 pages, 5 figures; presented at GECCO 2017, Berlin, Germany.
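    The core effect, evaluating one GP expression over the whole dataset at once rather than one data point at a time, can be reproduced with a small sketch. NumPy stands in here for the TensorFlow backend, the candidate expression is invented, and this is not Karoo GP's own code.

```python
import time
import numpy as np

X = np.random.rand(500_000, 2)             # hypothetical dataset: 500k points

def candidate_scalar(a, b):
    # One GP individual evaluated on a single data point.
    return a * a + 3.0 * b

t0 = time.perf_counter()
scalar_out = np.array([candidate_scalar(a, b) for a, b in X])
t_scalar = time.perf_counter() - t0

t0 = time.perf_counter()
vector_out = X[:, 0] ** 2 + 3.0 * X[:, 1]  # same expression, whole columns
t_vector = time.perf_counter() - t0

assert np.allclose(scalar_out, vector_out)
print(f"scalar: {t_scalar:.3f}s  vector: {t_vector:.4f}s  "
      f"speedup: {t_scalar / t_vector:.0f}x")
```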

    Simulation of Multiprocessor System Scheduling

    Get PDF
    The speed and performance of computers have become a major concern today. Multiprocessor systems are introduced to share the workload among processors; by sharing the workload, the speed and efficiency of the system can be increased drastically. To share the workload properly among the processors in a multiprocessor system, a suitable scheduling algorithm is needed. Hence, one of the major factors that influence the speed and efficiency of a multiprocessor system is scheduling. In this thesis, the main focus is on process scheduling for multiprocessor systems. The factors which influence scheduling, and scheduling algorithms themselves, are discussed. Based on this idea of sharing the load among processors, a simulation model for scheduling in a symmetric multiprocessor system was developed in the Department of Digital and Computer Systems at Tampere University of Technology. This model treats all the processors in the system equally by allocating equal processor time to all the processes in a task, and it evaluates the total execution time of the system for processing an input job. The scheduling algorithm in this simulation model is based on the priority of the input processes. The necessity of scheduling in multiprocessor systems is elaborated. The goal of this thesis is to analyse how process scheduling influences the speed of a multiprocessor system. Also, the difference in total execution time of the input jobs with different numbers of processors and different processor capacities is studied.
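    A minimal sketch in the spirit of the described simulation model (all parameters hypothetical, and this is not the thesis's code): processes are dispatched to identical processors in priority order, each receiving an equal time slice, and the total execution time is reported for different processor counts.

```python
import heapq

def simulate(burst_times, priorities, num_cpus, quantum=2):
    """Return the total execution time (makespan) for one input job."""
    # Ready queue ordered by priority (lower value = higher priority),
    # then by the instant the process next becomes available.
    ready = [(prio, 0, pid) for pid, prio in enumerate(priorities)]
    heapq.heapify(ready)
    remaining = list(burst_times)
    cpu_free_at = [0] * num_cpus            # when each processor goes idle
    finish = 0
    while ready:
        prio, available_at, pid = heapq.heappop(ready)
        cpu = min(range(num_cpus), key=cpu_free_at.__getitem__)
        start = max(cpu_free_at[cpu], available_at)
        run = min(quantum, remaining[pid])  # equal slice for every process
        end = start + run
        cpu_free_at[cpu] = end
        remaining[pid] -= run
        finish = max(finish, end)
        if remaining[pid] > 0:
            heapq.heappush(ready, (prio, end, pid))  # rejoin, same priority
    return finish

bursts, prios = [8, 4, 6, 2, 10], [2, 1, 3, 1, 2]
for cpus in (1, 2, 4):
    print(f"{cpus} processor(s): total time = {simulate(bursts, prios, cpus)}")
```

    Sweeping the processor count this way exposes how the total execution time of the same input job shrinks, and eventually saturates, as processors are added.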