12 research outputs found

    Parallel processing for scientific computations

    The main contribution of the last two years of this effort is the MOPPS system, which we introduced after an extensive literature search and describe next. MOPPS employs a new solution to the problem of managing programs for scientific and engineering applications in a distributed processing environment. With this solution, autonomous computers cooperate efficiently in solving large scientific problems. MOPPS does not assume any particular network topology or configuration, computer architecture, or operating system. It imposes little overhead on network and processor resources while managing programs concurrently and efficiently. The core of MOPPS is an intelligent program manager that builds a knowledge base of the execution performance of the parallel programs it manages under various conditions. The manager applies this knowledge to improve the performance of future runs; in this sense, the program manager learns from experience.
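The idea of a manager that records execution performance and reuses it for later runs can be illustrated with a small sketch. This is not the MOPPS implementation, which the abstract does not describe; the class name `ProgramManager`, its methods, and the "try untested configurations first, then pick the lowest mean run time" policy are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


class ProgramManager:
    """Toy knowledge base of run times per (program, configuration).

    Hypothetical sketch; names and selection policy are assumptions,
    not taken from MOPPS.
    """

    def __init__(self):
        # program name -> configuration label -> observed run times (seconds)
        self.history = defaultdict(lambda: defaultdict(list))

    def record_run(self, program, config, seconds):
        """Store one observed run time for a program under a configuration."""
        self.history[program][config].append(seconds)

    def best_config(self, program, candidates):
        """Return an untried candidate if any, else the lowest mean run time."""
        seen = self.history[program]
        for config in candidates:
            if config not in seen:
                return config  # no experience yet, so explore it
        return min(candidates, key=lambda c: mean(seen[c]))
```

A caller would invoke `record_run` after each execution and `best_config` before scheduling the next one, so the choice improves as more runs are observed.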

    Clock synchronization in multiprocessor systems

    A multiple-bus, active backplane architecture for multiprocessor systems

    This research investigates several problems associated with current multiprocessor interconnection networks, focusing primarily on general-purpose, shared-memory configurations. The project deals with all aspects of the interconnection, from the architectural level to the physical backplane. A bus-based architecture is presented as an alternative to the limitations of current schemes. This dissertation will focus on the physical layer implementation.
    For increased reliability, performance and scalability, a multiple-bus architecture is proposed. Each bus uses a word-serial approach to keep the total number of bus signals manageable. A source-synchronous transfer protocol allows data to be streamed at a high rate, thus increasing the pin-efficiency of the bus. The control acquisition scheme combines collision detection and priority arbitration to minimize bus access time without requiring additional signal lines. Cache coherence, message passing, and synchronization primitives are provided within the bus protocol to support multiple-processor systems.
    To reduce the capacitive loading on the bus, an active backplane is employed. This moves the transceiver and bus interface unit from the plug-in module down to the backplane. In addition to increasing the characteristic impedance of the bus, it also reduces the end-to-end propagation delay. Another advantage of moving the bus transceivers to the backplane is the uniform load presented to the bus, regardless of whether a slot is populated.
    Due to the reduction in drive current required, a custom CMOS transceiver, suitable for VLSI implementation, is used. It incorporates the collision detection circuitry required for the control acquisition scheme. Initial transceiver prototypes have been designed and fabricated in 2-µm CMOS. These have been successfully tested at transfer rates in excess of 50 MHz.
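The link between capacitive loading, characteristic impedance, and propagation delay can be made concrete with the standard approximation for a uniformly loaded transmission line: both quantities scale with sqrt(1 + C_load / C_line). The sketch below is not from the dissertation; the function name, parameter names, and units are illustrative assumptions.

```python
import math


def loaded_line(z0_ohm, delay_ns_per_m, c_line_pf_per_m, c_load_pf_per_m):
    """Return (loaded impedance, loaded delay per metre) of a bus line.

    Uses the textbook loaded-line approximation; all parameter names
    are hypothetical. c_load_pf_per_m is the added load capacitance
    (transceivers, connectors) spread per metre of line.
    """
    factor = math.sqrt(1.0 + c_load_pf_per_m / c_line_pf_per_m)
    loaded_z0 = z0_ohm / factor           # impedance drops as load grows
    loaded_delay = delay_ns_per_m * factor  # delay rises as load grows
    return loaded_z0, loaded_delay
```

Under this approximation, reducing the load capacitance per slot moves both values back toward those of the unloaded line, which is consistent with the two benefits the abstract attributes to the active backplane.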

    HyperForest: A high performance multi-processor architecture for real-time intelligent systems

    A logical layer protocol for ActiveBus architecture

    This research investigates several problems associated with current multiprocessor interconnection networks, focusing primarily on general-purpose, shared-memory configurations. The project deals with all aspects of the interconnection, from the architectural level to the physical backplane. A multiple-bus-based architecture is presented as an alternative to the limitations of current schemes. This dissertation will focus on the logical layer specification.
    The ActiveBus interconnection, a multiple, active bus, is proposed. Multiple buses increase the bandwidth as well as the reliability of the interconnection, while the active backplane presents a reduced and uniform capacitive load.
    A logical layer protocol was designed so that each bus works independently, to achieve fault tolerance. Each bus uses a word-serial approach to keep the total number of bus signal lines manageable. A dual clocking scheme is proposed. The faster clock is used for data transfer. The other clock, referred to as the sync clock, is used for arbitration and handshaking.
    The absence of discontinuities on the bus, coupled with a source-synchronous transfer protocol, allows data to be streamed at a high rate, thus increasing the pin-efficiency of the bus. The data transmission rate is limited only by clock skew. In addition, the ActiveBus interface unit and the source-synchronous protocol move the synchronization penalty from the shared bus to the private buffer in the unit.
    The protocol uses a new arbitration scheme, termed Previous Priority First. This hybrid control acquisition scheme combines collision detection and priority arbitration to minimize bus access time without requiring additional signal lines. Collision detection provides quick access in an unsaturated system, while priority arbitration guarantees the deterministic election of the master in a saturated system. The scheme also incorporates a fairness mode to minimize starvation and bus access delay in the system.
    The cache coherence scheme supports both copy-back and write-through policies to reduce the overhead. The MOESI protocol with snoopy caches, being the most general, is followed. Message passing and synchronization primitives are provided within the bus protocol to support multiple-processor systems. These primitives attempt to minimize the traffic generated by spin locks or memory hot spots.
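The general idea of a hybrid control acquisition scheme, where a lone requester gets the bus at once and a collision falls back to priority arbitration, can be sketched as follows. This is a generic illustration, not the Previous Priority First scheme, whose details the abstract does not give; the fairness mode is omitted, and the function name, the return labels, and the convention that a larger number means higher priority are assumptions.

```python
def acquire_bus(requesters):
    """Pick a bus master from the modules requesting in one cycle.

    requesters: dict mapping module id -> priority (larger wins; an
    assumption). Returns (winner, how), where how is "idle",
    "immediate" (no collision) or "arbitrated" (collision resolved).
    Hypothetical sketch; not the Previous Priority First protocol.
    """
    if not requesters:
        return None, "idle"
    if len(requesters) == 1:
        # Unsaturated case: no collision, so access is granted at once.
        (winner,) = requesters
        return winner, "immediate"
    # Collision detected: fall back to deterministic priority arbitration.
    winner = max(requesters, key=requesters.get)
    return winner, "arbitrated"
```

For example, `acquire_bus({"cpu0": 3})` grants the bus to `cpu0` without arbitration, while `acquire_bus({"cpu0": 3, "cpu1": 7})` resolves the collision in favour of `cpu1`.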