983,430 research outputs found

    The Longest Queue Drop Policy for Shared-Memory Switches is 1.5-competitive

    Full text link
    We consider the Longest Queue Drop memory management policy in shared-memory switches consisting of NN output ports. The shared memory of size MNM\geq N may have an arbitrary number of input ports. Each packet may be admitted by any incoming port, but must be destined to a specific output port and each output port may be used by only one queue. The Longest Queue Drop policy is a natural online strategy used in directing the packet flow in buffering problems. According to this policy and assuming unit packet values and cost of transmission, every incoming packet is accepted, whereas if the shared memory becomes full, one or more packets belonging to the longest queue are preempted, in order to make space for the newly arrived packets. It was proved in 2001 [Hahne et al., SPAA '01] that the Longest Queue Drop policy is 2-competitive and at least 2\sqrt{2}-competitive. It remained an open question whether a (2-\epsilon) upper bound for the competitive ratio of this policy could be shown, for any positive constant \epsilon. We show that the Longest Queue Drop online policy is 1.5-competitive

    HAPPY: Hybrid Address-based Page Policy in DRAMs

    Full text link
    Memory controllers have used static page closure policies to decide whether a row should be left open, open-page policy, or closed immediately, close-page policy, after the row has been accessed. The appropriate choice for a particular access can reduce the average memory latency. However, since application access patterns change at run time, static page policies cannot guarantee to deliver optimum execution time. Hybrid page policies have been investigated as a means of covering these dynamic scenarios and are now implemented in state-of-the-art processors. Hybrid page policies switch between open-page and close-page policies while the application is running, by monitoring the access pattern of row hits/conflicts and predicting future behavior. Unfortunately, as the size of DRAM memory increases, fine-grain tracking and analysis of memory access patterns does not remain practical. We propose a compact memory address-based encoding technique which can improve or maintain the performance of DRAMs page closure predictors while reducing the hardware overhead in comparison with state-of-the-art techniques. As a case study, we integrate our technique, HAPPY, with a state-of-the-art monitor, the Intel-adaptive open-page policy predictor employed by the Intel Xeon X5650, and a traditional Hybrid page policy. We evaluate them across 70 memory intensive workload mixes consisting of single-thread and multi-thread applications. The experimental results show that using the HAPPY encoding applied to the Intel-adaptive page closure policy can reduce the hardware overhead by 5X for the evaluated 64 GB memory (up to 40X for a 512 GB memory) while maintaining the prediction accuracy

    LERC: Coordinated Cache Management for Data-Parallel Systems

    Full text link
    Memory caches are being aggressively used in today's data-parallel frameworks such as Spark, Tez and Storm. By caching input and intermediate data in memory, compute tasks can witness speedup by orders of magnitude. To maximize the chance of in-memory data access, existing cache algorithms, be it recency- or frequency-based, settle on cache hit ratio as the optimization objective. However, unlike the conventional belief, we show in this paper that simply pursuing a higher cache hit ratio of individual data blocks does not necessarily translate into faster task completion in data-parallel environments. A data-parallel task typically depends on multiple input data blocks. Unless all of these blocks are cached in memory, no speedup will result. To capture this all-or-nothing property, we propose a more relevant metric, called effective cache hit ratio. Specifically, a cache hit of a data block is said to be effective if it can speed up a compute task. In order to optimize the effective cache hit ratio, we propose the Least Effective Reference Count (LERC) policy that persists the dependent blocks of a compute task as a whole in memory. We have implemented the LERC policy as a memory manager in Spark and evaluated its performance through Amazon EC2 deployment. Evaluation results demonstrate that LERC helps speed up data-parallel jobs by up to 37% compared with the widely employed least-recently-used (LRU) policy
    corecore