112 research outputs found

    Contextual Bandit Modeling for Dynamic Runtime Control in Computer Systems

    Modern operating systems and microarchitectures provide a myriad of mechanisms for monitoring and affecting system operation and resource utilization at runtime. Dynamic runtime control of these mechanisms can tailor system operation to the characteristics and behavior of the current workload, resulting in improved performance. However, developing effective models for system control can be challenging. Existing methods often require extensive manual effort, computation time, and domain knowledge to identify relevant low-level performance metrics, to relate those metrics and high-level control decisions to workload performance, and to evaluate the resulting control models. This dissertation develops a general framework, based on the contextual bandit, for describing and learning effective models for runtime system control. Random profiling is used to characterize the relationship between workload behavior, system configuration, and performance. The framework is evaluated in the context of two applications of increasing complexity: first, the selection of paging modes (Shadow Paging, Hardware-Assisted Paging) in the Xen virtual machine memory manager; second, the utilization of hardware memory prefetching for multi-core, multi-tenant workloads with cross-core contention for shared memory resources, such as the last-level cache and memory bandwidth. The resulting models for both applications are competitive with existing runtime control approaches. For paging mode selection, the resulting model provides performance equivalent to the state of the art while substantially reducing the computation required for profiling. For hardware memory prefetcher utilization, the resulting models are the first to provide dynamic control of hardware prefetchers using workload statistics. Finally, a correlation-based feature selection method is evaluated for identifying relevant low-level performance metrics related to hardware memory prefetching.
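
    As an illustration of the kind of control loop such a framework implies, the sketch below pairs epsilon-greedy random profiling with one linear reward model per configuration. The paging-mode arms echo the Xen example above, but the counter names, the reward signal (negative cycles-per-instruction), and the epsilon-greedy/ridge-regression choices are illustrative assumptions, not the dissertation's actual design.

    # Minimal contextual-bandit sketch for runtime configuration selection (Python).
    # All names and modeling choices here are assumptions for illustration.
    import numpy as np

    ARMS = ["shadow_paging", "hardware_assisted_paging"]  # Xen paging modes

    class EpsilonGreedyLinearBandit:
        def __init__(self, n_features, epsilon=0.1, ridge=1.0):
            self.epsilon = epsilon
            # One ridge-regression reward model per arm (A theta = b).
            self.A = {a: ridge * np.eye(n_features) for a in ARMS}
            self.b = {a: np.zeros(n_features) for a in ARMS}

        def select(self, context):
            # Random profiling: occasionally pick a random configuration to keep
            # characterizing the workload/configuration/performance relationship.
            if np.random.rand() < self.epsilon:
                return np.random.choice(ARMS)
            scores = {a: context @ np.linalg.solve(self.A[a], self.b[a]) for a in ARMS}
            return max(scores, key=scores.get)

        def update(self, arm, context, reward):
            self.A[arm] += np.outer(context, context)
            self.b[arm] += reward * context

    # Once per control epoch: build a context from low-level performance counters
    # (hypothetical choices), act, then record the observed reward.
    bandit = EpsilonGreedyLinearBandit(n_features=3)
    context = np.array([0.12, 0.80, 0.05])     # e.g., TLB-miss, LLC-miss, page-fault rates
    mode = bandit.select(context)
    bandit.update(mode, context, reward=-1.7)  # e.g., negative CPI measured under `mode`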

    Machine Learning-based Orchestration Solutions for Future Slicing-Enabled Mobile Networks

    Fifth-generation mobile networks (5G) will incorporate novel technologies such as network programmability and virtualization, enabled by the Software-Defined Networking (SDN) and Network Function Virtualization (NFV) paradigms, which have recently attracted major interest from both academic and industrial stakeholders. Building on these concepts, Network Slicing has emerged as the main driver of a novel business model in which mobile operators may open, i.e., “slice”, their infrastructure to new business players and offer independent, isolated, and self-contained sets of network functions and physical/virtual resources tailored to specific service requirements. While Network Slicing has the potential to increase the revenue sources of service providers, it involves a number of technical challenges that must be carefully addressed. End-to-end (E2E) network slices encompass time and spectrum resources in the radio access network (RAN), transport resources on the fronthaul/backhaul links, and computing and storage resources at core and edge data centers. Additionally, the heterogeneity of vertical service requirements (e.g., high throughput, low latency, high reliability) exacerbates the need for novel orchestration solutions able to manage end-to-end network slice resources across different domains while satisfying stringent service level agreements and specific traffic requirements. An end-to-end network slicing orchestration solution shall i) admit network slice requests such that the overall system revenues are maximized, ii) provide the required resources across different network domains to fulfill the Service Level Agreements (SLAs), and iii) dynamically adapt the resource allocation based on the real-time traffic load, end-users' mobility, and instantaneous wireless channel statistics. Indeed, a mobile network is a fast-changing environment characterized by complex spatio-temporal relationships connecting end-users' traffic demand with social activities and the economy. Legacy models that aim to provide dynamic resource allocation based on traditional traffic demand forecasting techniques fail to capture these important aspects. To close this gap, machine learning-aided solutions are quickly emerging as promising technologies able to sustain, in a scalable manner, the set of operations required by the network slicing context. How to implement such resource allocation schemes among slices, while making the most efficient use of the networking resources composing the mobile infrastructure, is the key problem underlying the network slicing paradigm and the one addressed in this thesis.
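
    As a toy illustration of requirement i) above, the sketch below admits slice requests greedily by revenue per unit of normalized multi-domain demand, subject to per-domain capacity. The domain names, request fields, and greedy heuristic are illustrative assumptions only, not taken from the thesis.

    # Toy revenue-aware slice admission sketch (Python); all names are assumptions.
    from dataclasses import dataclass

    @dataclass
    class SliceRequest:
        name: str
        revenue: float
        demand: dict  # per-domain resource demand, e.g. {"ran_prb": 40, "edge_cpu": 8}

    def admit(requests, capacity):
        """Greedily admit requests by revenue per unit of normalized demand."""
        def density(r):
            used = sum(r.demand[d] / capacity[d] for d in r.demand)
            return r.revenue / used if used > 0 else float("inf")

        admitted, remaining = [], dict(capacity)
        for r in sorted(requests, key=density, reverse=True):
            if all(r.demand.get(d, 0.0) <= remaining[d] for d in remaining):
                admitted.append(r.name)
                for d, amount in r.demand.items():
                    remaining[d] -= amount
        return admitted

    capacity = {"ran_prb": 100, "fronthaul_gbps": 10, "edge_cpu": 64}
    requests = [
        SliceRequest("urllc_slice", revenue=5.0,
                     demand={"ran_prb": 20, "fronthaul_gbps": 2, "edge_cpu": 16}),
        SliceRequest("embb_slice", revenue=8.0,
                     demand={"ran_prb": 60, "fronthaul_gbps": 6, "edge_cpu": 32}),
    ]
    print(admit(requests, capacity))  # both fit here; under tighter capacity the denser request wins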

    Bao: Learning to Steer Query Optimizers

    Query optimization remains one of the most challenging problems in data management systems. Recent efforts to apply machine learning techniques to query optimization challenges have been promising, but have shown few practical gains due to substantive training overhead, inability to adapt to changes, and poor tail performance. Motivated by these difficulties and drawing upon a long history of research in multi-armed bandits, we introduce Bao (the BAndit Optimizer). Bao takes advantage of the wisdom built into existing query optimizers by providing per-query optimization hints. Bao combines modern tree convolutional neural networks with Thompson sampling, a decades-old and well-studied reinforcement learning algorithm. As a result, Bao automatically learns from its mistakes and adapts to changes in query workloads, data, and schema. Experimentally, we demonstrate that Bao can quickly (an order of magnitude faster than previous approaches) learn strategies that improve end-to-end query execution performance, including tail latency. In cloud environments, we show that Bao can offer both reduced costs and better performance compared with a sophisticated commercial system.
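
    The sketch below shows the Thompson-sampling control loop that such a hint selector implies. Bao's actual value model is a tree convolutional neural network over query plans; here each hint set is given an independent Gaussian latency model purely so the sampling-and-update logic stays visible, and the hint-set names and reward definition are assumptions.

    # Simplified Thompson-sampling loop over query-optimizer hint sets (Python).
    # This is a sketch, not Bao's implementation.
    import math, random

    HINT_SETS = ["default", "disable_nestloop", "disable_hashjoin", "force_index_scan"]

    class GaussianArm:
        """Normal posterior over the reward (negative log latency) of one hint set."""
        def __init__(self, prior_mean=0.0, prior_var=1.0, noise_var=1.0):
            self.mean, self.var, self.noise_var = prior_mean, prior_var, noise_var

        def sample(self):
            return random.gauss(self.mean, math.sqrt(self.var))

        def update(self, reward):
            # Conjugate Gaussian update with known observation noise.
            precision = 1.0 / self.var + 1.0 / self.noise_var
            self.mean = (self.mean / self.var + reward / self.noise_var) / precision
            self.var = 1.0 / precision

    arms = {h: GaussianArm() for h in HINT_SETS}

    def choose_hints():
        # Thompson sampling: draw one plausible value per hint set, act greedily.
        return max(arms, key=lambda h: arms[h].sample())

    def record(hint_set, latency_s):
        arms[hint_set].update(-math.log(latency_s))  # lower latency => higher reward

    # Per-query loop: pick hints, execute the query with them, feed back latency.
    h = choose_hints()
    record(h, latency_s=0.42)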

    Strategic and Blockchain-based Market Decisions for Cloud Computing

    The cloud computing market has been at the center of attention for years, with cloud providers striving to survive through either competition or cooperation. Some cloud providers choose to compete in a market dominated by a few large providers and try to maximize their profit without sacrificing service quality, which leads to higher user ratings. Many research proposals have tried to contribute to cloud market competition; however, the majority of these proposals focus only on pricing mechanisms, thus neglecting cloud service quality and user satisfaction. Meanwhile, cloud providers intend to form cloud federations to enhance their service quality and revenues. Nevertheless, traditional centralized cloud federations face serious challenges that might hinder members' motivation to participate, such as the formation of stable coalitions with long-term commitments, participants' trustworthiness, shared revenue, and the security of the managed data and services. For a stable and trustworthy federation, it is vital to avoid blind trust in the SLA guarantees claimed by the members and to monitor the quality of service while considering the various characteristics of cloud services. This thesis aims to tackle the issues of the cloud computing market from the two perspectives of competition and cooperation by: 1) modeling and solving the conflicting situation of revenue, user ratings, and service quality, to improve a provider's position in the market and increase future user demand; 2) proposing a user-centric game-theoretical framework that allows new and smaller cloud providers to gain a share of the market and increase user satisfaction by providing high-quality, added-value services; 3) motivating cloud providers to adopt coopetition behavior through a novel, fully distributed blockchain-based federation structure that enables them to trade their computing resources through smart contracts; 4) introducing a new oracle role as a verifier agent that monitors the quality of service and reports to the smart-contract agents deployed on the blockchain, while optimizing the cost of using oracles; and 5) developing a Bayesian bandit learning mechanism for oracle reliability that selects oracles smartly and optimizes the cost and reliability of the utilized oracles. All of the contributions are validated by simulations and implementations using real-world data.
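
    For contribution 5), a minimal Beta-Bernoulli Thompson-sampling sketch of cost-aware oracle selection is shown below. The oracle names, the linear cost penalty, and the reward definition are illustrative assumptions rather than the thesis's exact mechanism.

    # Beta-Bernoulli Thompson sampling for cost-aware oracle selection (Python sketch).
    import random

    class Oracle:
        def __init__(self, name, cost):
            self.name, self.cost = name, cost
            self.alpha, self.beta = 1.0, 1.0  # Beta(1, 1) prior on reliability

        def sample_reliability(self):
            return random.betavariate(self.alpha, self.beta)

        def update(self, report_was_correct):
            # Bernoulli outcome: did the oracle's QoS report match the verified truth?
            if report_was_correct:
                self.alpha += 1.0
            else:
                self.beta += 1.0

    def pick_oracle(oracles, cost_weight=0.1):
        # Thompson sampling with a simple linear penalty for oracle usage cost.
        return max(oracles, key=lambda o: o.sample_reliability() - cost_weight * o.cost)

    oracles = [Oracle("oracle_a", cost=1.0), Oracle("oracle_b", cost=0.3)]
    chosen = pick_oracle(oracles)
    chosen.update(report_was_correct=True)  # feedback once the report has been verified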

    Machine Learning-Powered Management Architectures for Edge Services in 5G Networks

    The abstract is provided in the attachment.

    Business process improvement with performance-based sequential experiments

    Various lifecycle approaches to Business Process Management (BPM) share the common assumption that a process is incrementally improved in the redesign phase. While this assumption is hardly questioned in BPM research, evidence from the field of AB testing shows that improvement ideas often do not lead to actual improvements. If incremental process improvement can only be achieved in a fraction of cases, there is a need to rapidly validate the assumed benefits. Contemporary BPM research does not provide techniques and guidelines for testing and validating the supposed improvements in a fair manner. In this research, we address these challenges by integrating business process execution concepts with ideas from the set of software engineering practices known as DevOps. We propose a business process improvement methodology named AB-BPM, together with a set of techniques that allow us to enact the steps of this methodology. As a first technique, we develop a simulation approach that estimates the performance of a new version in an offline setting using historical data from the old version. Since simulation results can be speculative, we propose shadow testing as the next step. Our shadow testing technique partially executes the new version in production alongside the old version in such a way that the new version does not throttle the old one. Finally, we develop techniques that offer AB testing of redesigned processes with immediate feedback at runtime. AB testing compares two versions of a deployed product (e.g., a Web page) by observing users' responses to versions A and B, and determines which one performs better. We propose two algorithms, LTAvgR and ProcessBandit, that dynamically adjust request allocation between the two versions during the test based on their performance.
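
    The sketch below is a generic bandit-based request router of the kind such AB testing requires. It is not LTAvgR or ProcessBandit; it is a plain UCB1 allocator that shifts incoming cases toward the better-performing process version as reward feedback (e.g., a normalized cycle time or customer rating) arrives, with names and reward scale chosen as assumptions.

    # Generic UCB1 request router between two deployed process versions (Python sketch).
    import math

    class ABRouter:
        def __init__(self, versions=("A", "B")):
            self.counts = {v: 0 for v in versions}
            self.reward_sum = {v: 0.0 for v in versions}

        def route(self):
            # Try each version once, then balance exploitation and exploration.
            for v, n in self.counts.items():
                if n == 0:
                    return v
            total = sum(self.counts.values())
            return max(self.counts, key=lambda v: self.reward_sum[v] / self.counts[v]
                       + math.sqrt(2.0 * math.log(total) / self.counts[v]))

        def feedback(self, version, reward):
            self.counts[version] += 1
            self.reward_sum[version] += reward

    router = ABRouter()
    v = router.route()               # decide which process version handles this case
    router.feedback(v, reward=0.8)   # e.g., a 0..1 score derived from process KPIs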