
    Strategic and Blockchain-based Market Decisions for Cloud Computing

    The cloud computing market has been at the center of attention for years, as cloud providers strive to survive through either competition or cooperation. Some cloud providers choose to compete in a market dominated by a few large providers and try to maximize their profit without sacrificing service quality, which leads to higher user ratings. Many research proposals have tried to contribute to cloud market competition; however, the majority focus only on pricing mechanisms, thus neglecting cloud service quality and user satisfaction. Meanwhile, cloud providers intend to form cloud federations to enhance their service quality and revenues. Nevertheless, traditional centralized cloud federations face serious challenges that might hinder members' motivation to participate, such as the formation of stable coalitions with long-term commitments, participants' trustworthiness, shared revenue, and the security of the managed data and services. For a stable and trustworthy federation, it is vital to avoid blindly trusting the SLA guarantees claimed by members and to monitor the quality of service, considering the various characteristics of cloud services. This thesis aims to tackle the issues of the cloud computing market from the two perspectives of competition and cooperation by: 1) modeling and solving the conflicting situation of revenue, user ratings, and service quality, to improve providers' position in the market and increase future users' demand; 2) proposing a user-centric game-theoretical framework that allows new and smaller cloud providers to gain a share of the market and increase user satisfaction by providing high-quality, added-value services; 3) motivating cloud providers to adopt a coopetition behavior through a novel, fully distributed blockchain-based federation structure that enables them to trade their computing resources through smart contracts; 4) introducing a new oracle role as a verifier agent that monitors the quality of service and reports to the smart contract agents deployed on the blockchain, while optimizing the cost of using oracles; and 5) developing a Bayesian bandit learning mechanism for oracle reliability that selects oracles smartly and optimizes the cost and reliability of the utilized oracles. All of the contributions are validated by simulations and implementations using real-world data.
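    The oracle-selection step in contribution 5 lends itself to a small illustration. Below is a minimal sketch, assuming a Beta-Bernoulli (Thompson sampling) model of oracle reliability and a simple linear cost penalty; the names (Oracle, select_oracle, cost_weight) and the feedback loop are illustrative assumptions, not the thesis's actual mechanism.

    import random
    from dataclasses import dataclass

    @dataclass
    class Oracle:
        name: str
        cost: float          # assumed cost per verification query
        alpha: float = 1.0   # Beta posterior: correct reports + 1
        beta: float = 1.0    # Beta posterior: incorrect reports + 1

    def select_oracle(oracles, cost_weight=0.1):
        # Thompson sampling: draw a reliability sample from each oracle's Beta
        # posterior and pick the best cost-adjusted score.
        def score(o):
            return random.betavariate(o.alpha, o.beta) - cost_weight * o.cost
        return max(oracles, key=score)

    def update(oracle, report_was_correct):
        # Update the posterior once the oracle's SLA report has been checked.
        if report_was_correct:
            oracle.alpha += 1.0
        else:
            oracle.beta += 1.0

    # Toy usage: learn which oracle to query for each monitoring round.
    oracles = [Oracle("o1", cost=0.5), Oracle("o2", cost=0.2), Oracle("o3", cost=0.8)]
    for _ in range(100):
        chosen = select_oracle(oracles)
        update(chosen, report_was_correct=(random.random() < 0.9))  # placeholder feedback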

    Optimizing Resource Management in Cloud Analytics Services

    The fundamental challenge in the cloud today is how to build and optimize machine learning and data analytics services. Machine learning and data analytics platforms are shifting computing infrastructure from expensive private data centers to easily accessible online services. These services pack user requests into jobs and run them on thousands of machines in parallel in geo-distributed clusters. The scale and complexity of emerging jobs lead to increasing challenges for the clusters at all levels, from power infrastructure to system architecture and the corresponding software framework design. These challenges come in many forms. Today's clusters are built on commodity hardware, and hardware failures are unavoidable. Resource competition, network congestion, and mixed generations of hardware make the hardware environment complex and hard to model and predict. Such heterogeneity becomes a crucial roadblock for efficient parallelization at both the task level and the job level. Another challenge comes from the increasing complexity of the applications. For example, machine learning services run jobs made up of multiple tasks with complex dependency structures. This complexity leads to difficulties in framework design. The scale, especially when services span geo-distributed clusters, is another important hurdle for cluster design. Challenges also come from the power infrastructure. Power infrastructure is very expensive and accounts for more than 20% of the total cost of building a cluster. Power-sharing optimization to maximize facility utilization and smooth peak-hour usage is another roadblock for cluster design. In this thesis, we focus on solutions for these challenges at the task level and the job level, with respect to geo-distributed data cloud design, and for power management in colocation data centers. At the task level, a crucial hurdle to achieving predictable performance is stragglers, i.e., tasks that take significantly longer than expected to run. Speculative execution has been widely adopted to mitigate the impact of stragglers in simple workloads. We apply straggler mitigation to approximation jobs for the first time. We present GRASS, which carefully uses speculation to mitigate the impact of stragglers in approximation jobs. GRASS's design is based on the analysis of a model we develop to capture the optimal speculation levels for approximation jobs. Evaluations with production workloads from Facebook and Microsoft Bing in an EC2 cluster of 200 nodes show that GRASS increases the accuracy of deadline-bound jobs by 47% and speeds up error-bound jobs by 38%. Moving from the task level to the job level, task-level speculation mechanisms are designed and operated independently of job scheduling when, in fact, scheduling a speculative copy of a task has a direct impact on the resources available for other jobs. We therefore present Hopper, a job-level speculation-aware scheduler that integrates the tradeoffs associated with speculation into job scheduling decisions, based on a model generalized from the task-level speculation model. We implement both centralized and decentralized prototypes of the Hopper scheduler and show that 50% (66%) improvements over state-of-the-art centralized (decentralized) schedulers and speculation strategies can be achieved through the coordination of scheduling and speculation. As computing resources move from local clusters to geo-distributed cloud services, we expect the same transformation for data storage.
    We study two crucial pieces of a geo-distributed data cloud system: data acquisition and data placement. Starting from the optimal algorithm for the case of a data cloud made up of a single data center, we propose a near-optimal, polynomial-time algorithm for a geo-distributed data cloud in general. We show, via a case study, that the resulting design, Datum, is near-optimal (within 1.6%) in practical settings. Efficient power management is a fundamental challenge for data centers providing reliable services. Power oversubscription in data centers is very common and may occasionally trigger an emergency when the aggregate power demand exceeds the capacity. We study power capping solutions for handling such emergencies in a colocation data center, where the operator supplies power to multiple tenants. We propose a novel market mechanism based on supply function bidding, called COOP, to financially incentivize and coordinate tenants' power reduction so as to minimize total performance loss while satisfying multiple power capping constraints. We demonstrate that COOP is "win-win", increasing the operator's profit (through oversubscription) and reducing tenants' costs (through financial compensation for their power reduction during emergencies).
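    As a rough illustration of the supply-function-bidding idea behind COOP, the sketch below assumes each tenant bids a linear supply function s_i(p) = b_i * p (kW of reduction offered at uniform price p) and the operator clears the price that meets the capping target; the variable names and the linear form are assumptions for illustration, not the thesis's exact mechanism.

    def clearing_price(bids, required_reduction_kw):
        # Each tenant i bids a coefficient b_i, offering to shed s_i(p) = b_i * p kW
        # at uniform price p; the operator picks p so total reductions meet the cap.
        total_b = sum(bids.values())
        if total_b <= 0:
            raise ValueError("no flexible capacity offered")
        return required_reduction_kw / total_b

    def allocations(bids, price):
        # Reduction asked of each tenant and the compensation it receives (price per kW).
        return {tenant: {"reduction_kw": b * price, "payment": b * price * price}
                for tenant, b in bids.items()}

    # Toy emergency: shed 120 kW across three tenants.
    bids = {"tenant_a": 2.0, "tenant_b": 1.0, "tenant_c": 3.0}
    p = clearing_price(bids, required_reduction_kw=120.0)
    print(p, allocations(bids, p))   # p = 20.0; reductions 40/20/60 kW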

    The Shape of Your Cloud: How to Design and Run Polylithic Cloud Applications

    Nowadays the major trend in IT dictates deploying applications in the cloud, cutting monolithic software into small, easily manageable and developable components, and running them in a microservice scheme. With these choices come the questions of which cloud service types to choose from the several available options and how to distribute the monolith so that it best fits the selected cloud features. We propose a model that represents monolithic applications in a novel way and focuses on key properties that are crucial in the development of cloud-native applications. The model focuses on the organization of scaling units, and it accounts for the cost of provisioned resources in scale-out periods and for invocation delays among the application components. We analyze disaggregated monolithic applications deployed in a cloud offering both Container-as-a-Service (CaaS) and Function-as-a-Service (FaaS) platforms. We showcase the efficiency of our proposed optimization solution by presenting the reduction in operating costs as an illustrative example. We propose to group components with similarly low scaling demands together in CaaS, while running dynamically scaled components in FaaS. By doing so, the price is decreased, as unnecessary memory provisioning is eliminated, while application response time does not show any degradation.
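    The grouping rule described above (stable, low-scale components on provisioned containers, bursty ones on functions) can be illustrated with a toy per-component cost comparison. The price constants and the provisioning model below are assumptions for illustration only, not the paper's actual cost model.

    import math

    # Assumed illustrative prices, not real provider tariffs.
    CAAS_PRICE_PER_GB_HOUR = 0.01         # provisioned container memory, per hour
    FAAS_PRICE_PER_GB_SECOND = 0.0000167  # function compute, per executed GB-second
    FAAS_PRICE_PER_REQUEST = 2e-7         # per-invocation fee

    def hourly_faas_cost(avg_requests_per_hour, exec_seconds, memory_gb):
        # FaaS bills only actual execution time plus a per-request fee.
        compute = avg_requests_per_hour * exec_seconds * memory_gb * FAAS_PRICE_PER_GB_SECOND
        return compute + avg_requests_per_hour * FAAS_PRICE_PER_REQUEST

    def hourly_caas_cost(peak_requests_per_hour, exec_seconds, memory_gb):
        # Containers are provisioned for peak concurrency and billed even when idle.
        peak_concurrency = peak_requests_per_hour * exec_seconds / 3600.0
        instances = max(1, math.ceil(peak_concurrency))
        return instances * memory_gb * CAAS_PRICE_PER_GB_HOUR

    def place(component):
        faas = hourly_faas_cost(component["avg_rph"], component["exec_s"], component["mem_gb"])
        caas = hourly_caas_cost(component["peak_rph"], component["exec_s"], component["mem_gb"])
        return "FaaS" if faas < caas else "CaaS"

    # A steadily busy component stays in CaaS; a rarely invoked one goes to FaaS.
    steady = {"avg_rph": 50000, "peak_rph": 60000, "exec_s": 0.2, "mem_gb": 0.5}
    bursty = {"avg_rph": 200,   "peak_rph": 20000, "exec_s": 0.2, "mem_gb": 0.5}
    print(place(steady), place(bursty))   # -> CaaS FaaS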

    AN ADAPTABILITY-DRIVEN ECONOMIC MODEL FOR SERVICE PROFITABILITY

    Ph.D. (Doctor of Philosophy)

    Cost optimization for data placement strategies in an analytical cloud service

    Analyzing large amounts of business-relevant data in near real time to assist decision making has become a crucial requirement for many businesses in recent years. Therefore, all major database system vendors offer solutions that assist customers with this requirement, using systems specially tuned for accelerating analytical workloads. Before deciding to buy such a large and expensive solution, customers are interested in a detailed workload analysis in order to estimate potential benefits. A more agile solution with lower barriers to entry is therefore desirable, one that allows customers to assess analytical solutions for their workloads and lets data scientists experiment with available data on test systems before rolling out valuable analytical reports on a production system. In such a scenario, where separate systems are deployed for handling the transactional workloads of the customers' daily business and for conducting business analytics on either a cloud service or a dedicated accelerator appliance, data management and placement strategies are of high importance. Multiple approaches exist for keeping the data set in sync and guaranteeing data coherence, each with unique characteristics regarding important metrics that impact query performance, such as the latency until data changes are propagated, the achievable throughput for larger data volumes, or the amount of CPU required to detect and deploy data changes. These important heuristics are analyzed and evolved in order to develop a general model for data placement and maintenance strategies. Based on this theoretical model, a prototype that predicts these metrics is also implemented.
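    As a rough sketch of how such a model can compare maintenance strategies on the metrics named above (propagation latency, achievable throughput, CPU overhead), the toy predictor below picks the cheapest strategy that still meets a freshness requirement; the strategy parameters and the linear CPU model are purely illustrative assumptions, not the prototype's actual predictor.

    from dataclasses import dataclass

    @dataclass
    class Strategy:
        name: str
        latency_s: float            # time until a change becomes visible for analytics
        max_throughput_mb_s: float  # change volume the strategy can keep up with
        cpu_per_mb: float           # CPU-seconds to detect and deploy 1 MB of changes

    def feasible(s, change_rate_mb_s, max_staleness_s):
        return s.max_throughput_mb_s >= change_rate_mb_s and s.latency_s <= max_staleness_s

    def cpu_cost(s, change_rate_mb_s, horizon_s=3600):
        return s.cpu_per_mb * change_rate_mb_s * horizon_s

    def cheapest_strategy(strategies, change_rate_mb_s, max_staleness_s):
        candidates = [s for s in strategies if feasible(s, change_rate_mb_s, max_staleness_s)]
        return min(candidates, key=lambda s: cpu_cost(s, change_rate_mb_s), default=None)

    # Toy comparison: keep the analytical copy at most 60 s stale at 10 MB/s of changes.
    strategies = [
        Strategy("trigger-based replication", latency_s=1,   max_throughput_mb_s=20,  cpu_per_mb=0.8),
        Strategy("log-based change capture",  latency_s=5,   max_throughput_mb_s=100, cpu_per_mb=0.2),
        Strategy("periodic batch load",       latency_s=300, max_throughput_mb_s=500, cpu_per_mb=0.05),
    ]
    print(cheapest_strategy(strategies, change_rate_mb_s=10, max_staleness_s=60).name)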