
    Shared versus distributed memory multiprocessors

    The question of whether multiprocessors should have shared or distributed memory has attracted a great deal of attention. Some researchers argue strongly for building distributed memory machines, while others argue just as strongly for programming shared memory multiprocessors. A great deal of research is underway on both types of parallel systems. Special emphasis is placed on systems with a very large number of processors for computation-intensive tasks, and research and implementation trends are considered. It appears that the two types of systems will likely converge to a common form for large scale multiprocessors.
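    A minimal sketch, not taken from the survey, of the distinction the abstract discusses: in the shared-memory model, workers coordinate through a common address space, while in the distributed-memory model each worker owns its data privately and communicates by message passing. The worker functions and the toy summation task below are illustrative assumptions.

```python
# Illustrative contrast of the two programming models discussed in the abstract.
# Shared memory: threads update a common result under a lock.
# Distributed memory: processes have separate address spaces and exchange messages.
import threading
import multiprocessing as mp


def shared_memory_sum(data):
    """Shared-memory style: workers see the same 'total' and synchronize on it."""
    total = [0]
    lock = threading.Lock()

    def worker(chunk):
        s = sum(chunk)
        with lock:                      # coordination via shared state
            total[0] += s

    mid = len(data) // 2
    threads = [threading.Thread(target=worker, args=(data[:mid],)),
               threading.Thread(target=worker, args=(data[mid:],))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


def _dist_worker(chunk, queue):
    queue.put(sum(chunk))               # coordination via explicit messages


def distributed_memory_sum(data):
    """Distributed-memory style: workers own private data and send partial results."""
    queue = mp.Queue()
    mid = len(data) // 2
    procs = [mp.Process(target=_dist_worker, args=(data[:mid], queue)),
             mp.Process(target=_dist_worker, args=(data[mid:], queue))]
    for p in procs:
        p.start()
    partials = [queue.get() for _ in procs]
    for p in procs:
        p.join()
    return sum(partials)


if __name__ == "__main__":
    data = list(range(1000))
    assert shared_memory_sum(data) == distributed_memory_sum(data) == sum(data)
```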

    Evaluation of a local strategy for high performance memory management

    Conventional operating systems, such as Silicon Graphics' IRIX and IBM's AIX, adopt a single memory management algorithm. The choice of this algorithm is usually based on its good performance for the set of programs executed on the computer, and some approximation of LRU (least recently used) is usually adopted. This choice can lead to situations in which the computer performs poorly because the algorithm behaves badly for certain programs. A possible solution for such cases is to let each program have a specific management algorithm (a local strategy) adapted to its memory access pattern. For example, programs with a sequential access pattern, such as SOR, should be managed by the MRU (most recently used) algorithm because of their poor performance under LRU. In this strategy it is very important to decide how memory is partitioned among the programs executing in a multiprogramming environment. Our strategy, named CAPR (Compiler-Aided Page Replacement), analyzes the pattern of memory references in an application's source program and communicates these characteristics to the operating system, which then chooses the best management algorithm and memory partitioning strategy. This paper evaluates the influence of the management algorithm and the memory partitioning strategy on global system performance and on the individual performance of each program. It also compares this local strategy with the classic global strategy and analyzes the viability of the approach. The results showed a difference of at least an order of magnitude in the number of page faults between the LRU and MRU algorithms under the global strategy. Then, starting from the analysis of each application's intrinsic behavior with respect to its memory access pattern and its number of page faults, an optimization procedure for memory system performance was developed for multiprogramming environments. This procedure makes it possible to set system performance parameters such as the memory partitioning strategy among the programs and the appropriate management algorithm for each program. The results showed that the local management strategy reduced the number of page faults by at least an order of magnitude and reduced mean memory usage by a factor of about 3 to 4 relative to the global strategy. This performance improvement shows the viability of our strategy. Some implementation aspects of this strategy in traditional operating systems are also presented.
    Subjects: Distributed Systems; Networks; Concurrency. Red de Universidades con Carreras en Informática (RedUNCI)
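    A minimal sketch, not the CAPR implementation, of the effect the abstract describes: on a cyclic sequential reference string (the access pattern it attributes to programs like SOR), LRU replacement faults on every reference once the sweep is larger than the available frames, while MRU retains part of the sweep and faults far less. The frame count and reference trace below are illustrative assumptions.

```python
# Count page faults for LRU and MRU replacement on a cyclic sequential trace.

def page_faults(references, num_frames, policy):
    """Simulate a replacement policy ('lru' or 'mru') and return the fault count."""
    frames = []          # resident pages, ordered from least to most recently used
    faults = 0
    for page in references:
        if page in frames:
            frames.remove(page)      # hit: refresh recency
            frames.append(page)
            continue
        faults += 1
        if len(frames) == num_frames:
            if policy == "lru":
                frames.pop(0)        # evict the least recently used page
            else:
                frames.pop()         # evict the most recently used page
        frames.append(page)
    return faults


if __name__ == "__main__":
    # A program sweeping pages 0..9 repeatedly, with only 5 frames available.
    trace = list(range(10)) * 20
    print("LRU faults:", page_faults(trace, 5, "lru"))   # 200: every reference misses
    print("MRU faults:", page_faults(trace, 5, "mru"))   # 105: about half are hits
```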

    A Classification and Survey of Computer System Performance Evaluation Techniques

    A classification and survey of computer system performance evaluation techniques.

    Scheduling issues on IBM p690: Performance Analysis with the PARbench Environment


    The susceptibility of programs to context switching


    Near-Memory Address Translation

    Memory and logic integration on the same chip is becoming increasingly cost effective, creating the opportunity to offload data-intensive functionality to processing units placed inside memory chips. The introduction of memory-side processing units (MPUs) into conventional systems faces virtual memory as the first big showstopper: without efficient hardware support for address translation, MPUs have highly limited applicability. Unfortunately, conventional translation mechanisms fall short of providing fast translations as contemporary memories exceed the reach of TLBs, making expensive page walks common. In this paper, we are the first to show that the historically important flexibility to map any virtual page to any page frame is unnecessary in today's servers. We find that limiting the associativity of the virtual-to-physical mapping incurs no penalty and, when combined with careful data placement in the MPU's memory, breaks the translate-then-fetch serialization, allowing translation and data fetch to proceed independently and in parallel. We propose the Distributed Inverted Page Table (DIPTA), a near-memory structure in which the smallest memory partition keeps the translation information for its data share, ensuring that the translation completes together with the data fetch. DIPTA completely eliminates the performance overhead of translation, achieving speedups of up to 3.81x and 2.13x over conventional translation using 4KB and 1GB pages, respectively.
    Comment: 15 pages, 9 figures
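    A minimal sketch, under assumed sizes, of the property that the abstract relies on: once virtual-to-physical associativity is limited, the set of frames a page may occupy, and hence the memory partition that holds both the data and its translation entry, is a pure function of the virtual address. The page size, set count, associativity, and partition count below are illustrative assumptions, not the paper's configuration.

```python
# With a set-associative virtual-to-physical mapping, the candidate frames and the
# home partition of an address can be computed directly from the address itself,
# so a request can be routed to the right partition without waiting for a page walk.

PAGE_SHIFT = 12          # assume 4 KB pages
NUM_SETS = 1 << 16       # frames grouped into sets; associativity limits placement
WAYS = 4                 # a page may live in only one of these ways
NUM_PARTITIONS = 8       # assumed number of memory partitions / MPUs


def candidate_frames(vaddr):
    """Frames this virtual address may map to under limited associativity."""
    vpn = vaddr >> PAGE_SHIFT
    set_index = vpn % NUM_SETS
    return [set_index * WAYS + way for way in range(WAYS)]


def home_partition(vaddr):
    """Partition whose local table holds the translation entry for this address."""
    set_index = (vaddr >> PAGE_SHIFT) % NUM_SETS
    return set_index % NUM_PARTITIONS


if __name__ == "__main__":
    addr = 0x7F3A_1C2D_4E80
    print("candidate frames:", candidate_frames(addr))
    print("translation lives in partition", home_partition(addr))
    # Because both results are pure functions of the address, translation lookup
    # and data fetch can be issued to the same partition and overlap in time.
```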