    Paging with Dynamic Memory Capacity

    We study a generalization of the classic paging problem that allows the amount of available memory to vary over time, capturing a fundamental property of many modern computing realities, from cloud computing to multi-core and energy-optimized processors. It turns out that good performance in the "classic" case provides no performance guarantees when memory capacity fluctuates: roughly speaking, moving from static to dynamic capacity can mean the difference between optimality within a factor 2 in space and time, and suboptimality by an arbitrarily large factor. More precisely, adopting the competitive analysis framework, we show that some online paging algorithms, despite having an optimal (h,k)-competitive ratio when capacity remains constant, are not (3,k)-competitive for any arbitrarily large k in the presence of minimal capacity fluctuations. In this light it is surprising that several classic paging algorithms perform remarkably well even if memory capacity changes adversarially, and in fact do so even without taking those changes into explicit account! In particular, we prove that LFD still achieves the minimum number of faults, and that several classic online algorithms such as LRU have a "dynamic" (h,k)-competitive ratio that is the best one can achieve without knowledge of future page requests, even if one had perfect knowledge of future capacity fluctuations. Thus, with careful management, knowing/predicting future memory resources appears far less crucial to performance than knowing/predicting future data accesses. We characterize the optimal "dynamic" (h,k)-competitive ratio exactly, and show it has a somewhat complex expression that is almost but not quite equal to the "classic" ratio k/(k-h+1), thus proving a strict if minuscule separation between online paging performance achievable in the presence or absence of capacity fluctuations.
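
    The quantities named in this abstract can be illustrated with a small simulation. The sketch below is not the paper's formal model: it assumes, purely for illustration, that capacity is given as one value per request (at least 1) and that a shrinking memory forces immediate evictions. The function names are hypothetical. For the "factor 2" remark, note that k = 2h gives k/(k-h+1) = 2h/(h+1), which is below 2.

```python
from collections import OrderedDict
from fractions import Fraction


def classic_ratio(h, k):
    """Classic (h,k)-competitive ratio k/(k-h+1) quoted in the abstract."""
    return Fraction(k, k - h + 1)


def lru_faults(requests, capacities):
    """Count LRU faults; capacities[t] is the memory size at request t.

    Assumed model: every capacity is >= 1, and pages are evicted
    immediately whenever the cache exceeds the current capacity.
    """
    cache = OrderedDict()  # keys ordered from least to most recently used
    faults = 0
    for page, cap in zip(requests, capacities):
        if page in cache:
            cache.move_to_end(page)
        else:
            faults += 1
            cache[page] = True
        while len(cache) > cap:
            cache.popitem(last=False)  # drop the least recently used page
    return faults


def lfd_faults(requests, capacities):
    """Count faults of LFD (evict the page requested furthest in the future)."""
    n = len(requests)
    next_use = [n] * n  # next_use[t]: index of the next request to requests[t]
    last_seen = {}
    for t in range(n - 1, -1, -1):
        next_use[t] = last_seen.get(requests[t], n)
        last_seen[requests[t]] = t
    cache = {}  # page -> index of its next request
    faults = 0
    for t, (page, cap) in enumerate(zip(requests, capacities)):
        if page not in cache:
            faults += 1
        cache[page] = next_use[t]
        while len(cache) > cap:
            # The page just requested must stay resident, so exclude it.
            victim = max((p for p in cache if p != page), key=cache.get)
            del cache[victim]
    return faults


requests = [1, 2, 3, 1, 2, 3]
print(lru_faults(requests, [2] * 6), lfd_faults(requests, [2] * 6))  # 6 4
print(classic_ratio(5, 10))  # 5/3
```

    On this toy trace LFD incurs fewer faults than LRU, as expected of an offline policy; the trace is far too small to say anything about the competitive ratios themselves.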

    Efficient caching algorithms for memory management in computer systems

    As disk performance continues to lag behind that of memory systems and processors, fully utilizing memory to reduce disk accesses is a highly effective way to improve overall system performance. Furthermore, to serve the applications running on a computer in distributed systems, not only the local memory but also the memory on remote servers must be effectively managed to minimize I/O operations. The critical challenges in effective memory cache management include: (1) insightfully understanding and quantifying the locality inherent in the memory access requests; (2) effectively utilizing the locality information in replacement algorithms; (3) intelligently placing and replacing data in the multi-level caches of a distributed system; (4) ensuring that the overheads of the proposed schemes are acceptable. This dissertation provides solutions and makes unique and novel contributions in application locality quantification, general replacement algorithms, low-cost replacement policy, thrashing protection, as well as multi-level cache management in a distributed system. First, the dissertation proposes a new method to quantify locality strength and to accurately identify the data with strong locality. It also provides a new replacement algorithm, which significantly outperforms existing algorithms. Second, considering the extremely low-cost requirements on replacement policies in virtual memory management, the dissertation proposes a policy that meets these requirements and considerably exceeds the performance of existing policies. Third, the dissertation provides an effective scheme to protect the system from thrashing when running memory-intensive applications. 
Finally, the dissertation provides a multi-level block placement and replacement protocol in a distributed client-server environment, exploiting non-uniform locality strengths in the I/O access requests. The methodology used in this study includes careful application behavior characterization, system requirement analysis, algorithm design, trace-driven simulation, and system implementation. A main conclusion of the work is that there is still much room for innovation and significant performance improvement in the seemingly mature and stable policies that have been broadly used in current operating system design.

    New Hybrid Approach to Exploit Localities: LRFU with Adaptive Prefetching

    Resource Management in Multi-Access Edge Computing (MEC)

    This PhD thesis investigates the effective ways of managing the resources of a Multi-Access Edge Computing Platform (MEC) in 5th Generation Mobile Communication (5G) networks. The main characteristics of MEC include distributed nature, proximity to users, and high availability. Based on these key features, solutions have been proposed for effective resource management. In this research, two aspects of resource management in MEC have been addressed. They are the computational resource and the caching resource which corresponds to the services provided by the MEC. MEC is a new 5G enabling technology proposed to reduce latency by bringing cloud computing capability closer to end-user Internet of Things (IoT) and mobile devices. MEC would support latency-critical user applications such as driverless cars and e-health. These applications will depend on resources and services provided by the MEC. However, MEC has limited computational and storage resources compared to the cloud. Therefore, it is important to ensure a reliable MEC network communication during resource provisioning by eradicating the chances of deadlock. Deadlock may occur due to a huge number of devices contending for a limited amount of resources if adequate measures are not put in place. It is crucial to eradicate deadlock while scheduling and provisioning resources on MEC to achieve a highly reliable and readily available system to support latency-critical applications. In this research, a deadlock avoidance resource provisioning algorithm has been proposed for industrial IoT devices using MEC platforms to ensure higher reliability of network interactions. The proposed scheme incorporates Banker’s resource-request algorithm using Software Defined Networking (SDN) to reduce communication overhead. 
Simulation and experimental results have shown that system deadlock can be prevented by applying the proposed algorithm, which ultimately leads to more reliable network interaction between mobile stations and MEC platforms. Additionally, this research explores the use of MEC as a caching platform, as it is proclaimed to be a key technology for reducing service processing delays in 5G networks. Caching on MEC decreases service latency and improves data content access by allowing direct content delivery through the edge without fetching data from the remote server. Caching on MEC is also deemed an effective approach that guarantees more reachability due to proximity to end users. In this regard, a novel hybrid content caching algorithm has been proposed for MEC platforms to increase their caching efficiency. The proposed algorithm is a unification of a modified Belady’s algorithm and a distributed cooperative caching algorithm to improve data access while reducing latency. A polynomial fit algorithm with Lagrange interpolation is employed to predict future request references for Belady’s algorithm. Experimental results show that the proposed algorithm obtains 4% more cache hits due to its selective caching approach when compared with case study algorithms. Results also show that the use of a cooperative algorithm can improve the total cache hits by up to 80%. Furthermore, this thesis has also explored another predictive caching scheme to further improve caching efficiency. The motivation was to investigate another predictive caching approach as an improvement on the former. A Predictive Collaborative Replacement (PCR) caching framework has been proposed as a result, which consists of three schemes. Each of the schemes addresses a particular problem. The proactive predictive scheme has been proposed to address the problem of continuous change in cache popularity trends. The collaborative scheme addresses the problem of cache redundancy in the collaborative space. 
Finally, the replacement scheme is a solution to evict cold cache blocks and increase the hit ratio. Simulation experiments have shown that the replacement scheme achieves 3% more cache hits than existing replacement algorithms such as Least Recently Used, Multi Queue and Frequency-based replacement. The PCR algorithm has been tested using a real dataset (MovieLens20M dataset) and compared with an existing contemporary predictive algorithm. Results show that PCR performs better, with a 25% increase in hit ratio and a 10% CPU utilization overhead.
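
    The deadlock-avoidance idea mentioned in this abstract builds on Banker's resource-request algorithm. The sketch below shows only the textbook form of that algorithm: a request is granted if and only if the resulting state is safe. It does not reproduce the thesis's SDN-based scheme, which the abstract does not detail. The function names and the five-process example data are illustrative assumptions.

```python
def is_safe(available, allocation, need):
    """Return True if some order lets every process finish (a safe state)."""
    work = list(available)
    finished = [False] * len(allocation)
    progressed = True
    while progressed:
        progressed = False
        for i, done in enumerate(finished):
            if not done and all(n <= w for n, w in zip(need[i], work)):
                # Process i can run to completion and release what it holds.
                work = [w + a for w, a in zip(work, allocation[i])]
                finished[i] = True
                progressed = True
    return all(finished)


def request_resources(pid, request, available, allocation, need):
    """Return the new (available, allocation, need) state if the request
    can be granted safely; return None if the process must wait."""
    if any(r > n for r, n in zip(request, need[pid])):
        raise ValueError("request exceeds the declared maximum need")
    if any(r > a for r, a in zip(request, available)):
        return None  # not enough free resources right now
    new_available = [a - r for a, r in zip(available, request)]
    new_allocation = [row[:] for row in allocation]
    new_need = [row[:] for row in need]
    new_allocation[pid] = [x + r for x, r in zip(allocation[pid], request)]
    new_need[pid] = [x - r for x, r in zip(need[pid], request)]
    if is_safe(new_available, new_allocation, new_need):
        return new_available, new_allocation, new_need
    return None  # granting would lead to an unsafe state


# Illustrative data: three resource types, five processes.
available = [3, 3, 2]
allocation = [[0, 1, 0], [2, 0, 0], [3, 0, 2], [2, 1, 1], [0, 0, 2]]
need = [[7, 4, 3], [1, 2, 2], [6, 0, 0], [0, 1, 1], [4, 3, 1]]

state = request_resources(1, [1, 0, 2], available, allocation, need)
print(state is not None)                              # True: granted
print(request_resources(0, [0, 2, 0], *state))        # None: would be unsafe
```

    The second request is refused even though enough resources are free, because no process could be guaranteed to finish afterwards; this refusal is what prevents deadlock.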

    Interpretability of AI in Computer Systems and Public Policy

    Advances in Artificial Intelligence (AI) have led to spectacular innovations and sophisticated systems for tasks that were once thought to be achievable only by humans. Examples include playing chess and Go, face and voice recognition, driving vehicles, and more. In recent years, the impact of AI has moved beyond offering mere predictive models into building interpretable models that appeal to human logic and intuition because they ensure transparency and simplicity and can be used to make meaningful decisions in real-world applications. A second trend in AI is characterized by important advancements in the realm of causal reasoning. Identifying causal relationships is an important aspect of scientific endeavors in a variety of fields. Causal models and Bayesian inference can help us gain better domain-specific insight and make better data-driven decisions because of their interpretability. The main objective of this dissertation was to adapt theoretically sound AI-based interpretable data-analytic approaches to solve domain-specific problems in the two unrelated fields of Storage Systems and Public Policy. For the first task, we considered the well-studied cache replacement problem in computing systems, which can be modeled as a variant of the well-known Multi-Armed Bandit (MAB) problem with delayed feedback and decaying costs, and developed an algorithm called EXP4-DFDC. We proved theoretically that EXP4-DFDC exhibits an important feature called vanishing regret. Based on the theoretical analysis, we designed a machine-learning algorithm called ALeCaR, with adaptive hyperparameters. We used extensive experiments on a wide range of workloads to show that ALeCaR performed better than LeCaR, the best machine learning algorithm for cache replacement at that time. We concluded that reinforcement machine learning can offer an outstanding approach for implementing cache management policies. 
For the second task, we used Bayesian networks to analyze the service request data from three 311 centers providing non-emergency services in the cities of Miami-Dade, New York City, and San Francisco. Using a causal inference approach, this study investigated the presence of inequities in the quality of the 311 services to neighborhoods with varying demographics and socioeconomic status. We concluded that the services provided by the local governments showed no detectable biases on the basis of race, ethnicity, or socioeconomic status.

    Profiler and compiler assisted adaptive I/O prefetching for shared storage caches

    I/O prefetching has been employed in the past as one of the mechanisms to hide large disk latencies. However, I/O prefetching in parallel applications is problematic when multiple CPUs share the same set of disks due to the possibility that prefetches from different CPUs can interact on shared memory caches in the I/O nodes in complex and unpredictable ways. In this paper, we (i) quantify the impact of compiler-directed I/O prefetching - developed originally in the context of sequential execution - on shared caches at I/O nodes. The experimental data collected shows that while I/O prefetching brings benefits, its effectiveness reduces significantly as the number of CPUs is increased; (ii) identify inter-CPU misses due to harmful prefetches as one of the main sources for this reduction in performance with the increased number of CPUs; and (iii) propose and experimentally evaluate a profiler and compiler assisted adaptive I/O prefetching scheme targeting shared storage caches. The proposed scheme obtains inter-thread data sharing information using profiling and, based on the captured data sharing patterns, divides the threads into clusters and assigns a separate (customized) I/O prefetcher thread for each cluster. In our approach, the compiler generates the I/O prefetching threads automatically. We implemented this new I/O prefetching scheme using a compiler and the PVFS file system running on Linux, and the empirical data collected clearly underline the importance of adapting I/O prefetching based on program phases. Specifically, our proposed scheme improves performance, on average, by 19.9%, 11.9% and 10.3% over the cases without I/O prefetching, with independent I/O prefetching (each CPU is performing compiler-directed I/O prefetching independently), and with one CPU prefetching (one CPU is reserved for prefetching on behalf of others), respectively, when 8 CPUs are used. Copyright 2008 ACM

    A Web Cache Replacement Strategy for Safety-Critical Systems

    A Safety-Critical System (SCS), such as a spacecraft, is usually a complex system. It produces a large amount of test data during a comprehensive testing process. This data is often managed by a comprehensive test data query system. The primary factor affecting the management experience of such a system is the performance of querying the test data. It is a big challenge to manage and maintain the huge and complex testing data. To address this challenge, a web cache replacement algorithm that can effectively improve the query performance and reduce the network latency is needed. However, a general-purpose web cache replacement algorithm usually cannot be directly applied to this type of system due to the low hit rate and low byte hit rate. In order to improve the hit rate and byte hit rate, a data stream mining technology is introduced, and a new web cache algorithm GDSF-DST (Greedy Dual-Size Frequency with Data Stream Technology) for the Safety-Critical System (SCS) is proposed based on the original GDSF algorithm. The experimental results show that, compared with state-of-the-art traditional algorithms, GDSF-DST achieves competitive performance and improves the hit rate and byte hit rate by about 20%.
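
    For orientation, the original GDSF policy that GDSF-DST extends can be sketched as follows. This covers only the baseline: each cached object gets priority L + F * C / S (inflation value L, access frequency F, retrieval cost C, size S), and the lowest-priority object is evicted first. The data-stream-mining component of GDSF-DST is not described in the abstract and is not modeled here. The class name, method names, and the default cost of 1.0 are illustrative assumptions.

```python
class GDSFCache:
    """Minimal Greedy Dual-Size Frequency cache (linear-scan eviction)."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.clock = 0.0   # inflation value L: priority of the last victim
        self.entries = {}  # key -> [priority, frequency, size, cost]

    def _priority(self, freq, size, cost):
        return self.clock + freq * cost / size

    def access(self, key, size, cost=1.0):
        """Record a request for `key`; return True on a hit, False on a miss."""
        if key in self.entries:
            entry = self.entries[key]
            entry[1] += 1  # one more access
            entry[0] = self._priority(entry[1], entry[2], entry[3])
            return True
        if size > self.capacity:
            return False  # object can never fit; do not cache it
        while self.used + size > self.capacity:
            victim = min(self.entries, key=lambda k: self.entries[k][0])
            self.clock = self.entries[victim][0]  # age the remaining objects
            self.used -= self.entries[victim][2]
            del self.entries[victim]
        self.entries[key] = [self._priority(1, size, cost), 1, size, cost]
        self.used += size
        return False


cache = GDSFCache(capacity_bytes=100)
cache.access("a", 50)
cache.access("b", 50)
cache.access("a", 50)       # hit: raises the priority of "a"
cache.access("c", 50)       # evicts "b", the lowest-priority object
print(sorted(cache.entries))  # ['a', 'c']
```

    A production cache would keep priorities in a heap rather than scanning for the minimum; the linear scan is used here only to keep the sketch short.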

    36th International Symposium on Theoretical Aspects of Computer Science: STACS 2019, March 13-16, 2019, Berlin, Germany

