363 research outputs found

    A Deep Learning Approach for Amazon EC2 Spot Price Prediction

    © 2018 IEEE. Spot Instances (SI) are one of the mechanisms cloud service providers use to deal with idle resources in off-peak periods: these resources are auctioned off dynamically at low prices to customers with limited budgets. However, SI are poorly utilized due to issues like out-of-bid failures and bidding complexity. Effective SI price models are therefore of great importance to customers planning their bidding strategies. This paper proposes a deep learning approach for Amazon EC2 SI price prediction, which is a time-series analysis (TSA) problem. The proposed Long Short-Term Memory (LSTM) approach is compared with a well-known classical (i.e., non-deep-learning) approach for TSA, AutoRegressive Integrated Moving Average (ARIMA), using accuracy measures commonly used in TSA. The results show the superiority of the LSTM approach over the ARIMA approach in many aspects.
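The abstract's comparison rests on standard TSA accuracy measures. A minimal sketch of three common ones (RMSE, MAE, MAPE) applied to two forecast series; the price values and model forecasts below are invented for illustration, not from the paper:

```python
import numpy as np

def rmse(actual, pred):
    # Root mean squared error: penalises large deviations more heavily.
    return float(np.sqrt(np.mean((np.asarray(actual) - np.asarray(pred)) ** 2)))

def mae(actual, pred):
    # Mean absolute error: average magnitude of the forecast errors.
    return float(np.mean(np.abs(np.asarray(actual) - np.asarray(pred))))

def mape(actual, pred):
    # Mean absolute percentage error: scale-free, in percent.
    a, p = np.asarray(actual, float), np.asarray(pred, float)
    return float(np.mean(np.abs((a - p) / a)) * 100)

actual = [0.052, 0.050, 0.055, 0.053]  # hypothetical spot prices (USD/hour)
lstm   = [0.051, 0.050, 0.054, 0.053]  # hypothetical LSTM one-step forecasts
arima  = [0.049, 0.052, 0.051, 0.056]  # hypothetical ARIMA one-step forecasts

# The model with the lower error on each measure is the better forecaster.
print(rmse(actual, lstm), rmse(actual, arima))
```

With several measures computed side by side, one model can "win" on some and lose on others, which is why the paper reports superiority "in many aspects" rather than uniformly.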

    AWS PredSpot: Machine Learning for Predicting the Price of Spot Instances in AWS Cloud

    Elastic Compute Cloud (EC2) is one of the most well-known services provided by Amazon for provisioning cloud computing resources, also known as instances. Besides the classical on-demand scheme, where users purchase compute capacity at a fixed cost, EC2 supports so-called spot instances, which are offered under a bidding scheme in which users can save up to 90% of the cost of the on-demand instance. EC2 spot instances can be a useful alternative for attaining an important reduction in infrastructure cost, but designing bidding policies can be a difficult task, since bidding below the current price will either prevent users from provisioning instances or cause them to lose those they already own. To this end, accurate forecasting of spot instance prices is of outstanding interest for designing working bidding policies. In this paper, we propose the use of different machine learning techniques to estimate the future price of EC2 spot instances. These include linear, ridge and lasso regressions, multilayer perceptrons, K-nearest neighbors, extra trees and random forests. The obtained performance varies significantly between instance types, and root mean squared error ranges from values very close to zero up to values over 60 for some of the most expensive instances. Still, for most of the instances, forecasting performance is remarkably good, encouraging further research in this field of study.
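A minimal sketch of one of the simpler estimators the abstract lists (ridge regression) applied to lagged prices. The synthetic price series, lag count, train/test split and regularisation strength are all invented for illustration; the paper additionally evaluates lasso, KNN, MLPs and tree ensembles on real EC2 traces:

```python
import numpy as np

# Synthetic stand-in for a spot-price history: daily seasonality plus noise.
rng = np.random.default_rng(0)
t = np.arange(300)
price = 0.05 + 0.01 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 0.001, t.size)

# Lagged features: predict price[t] from the previous `lags` observations.
lags = 5
n = len(price) - lags
X = np.column_stack([price[i:i + n] for i in range(lags)])
y = price[lags:]

split = 200
X_train, y_train, X_test, y_test = X[:split], y[:split], X[split:], y[split:]

def ridge_fit(X, y, alpha=1e-6):
    # Closed-form ridge solution with an appended intercept column.
    Xb = np.column_stack([X, np.ones(len(X))])
    return np.linalg.solve(Xb.T @ Xb + alpha * np.eye(Xb.shape[1]), Xb.T @ y)

def ridge_predict(w, X):
    return np.column_stack([X, np.ones(len(X))]) @ w

w = ridge_fit(X_train, y_train)
rmse = float(np.sqrt(np.mean((ridge_predict(w, X_test) - y_test) ** 2)))
print(rmse)  # error floor is set by the injected noise level
```

On a well-behaved series like this, test RMSE sits near the noise scale; the abstract's wide per-instance spread (near zero up to over 60) reflects the very different price scales and volatilities of real instance types.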

    Reducing the price of resource provisioning using EC2 spot instances with prediction models

    The increasing demand for computing resources has boosted the use of cloud computing providers. This has raised a new dimension in which the connections between resource usage and costs have to be considered from an organizational perspective. As part of its EC2 service, Amazon introduced spot instances (SI) as a cheap public infrastructure, but at the price of not ensuring reliability of the service. In the Amazon SI model, hired instances can be abruptly terminated by the service provider when necessary. The interface for managing SI is based on a bidding strategy that depends on non-public Amazon pricing strategies, which makes it complicated for users to apply any scheduling or resource provisioning strategy based on such (cheaper) resources. Although it is believed that use of the EC2 SI infrastructure can reduce costs for final users, a deep review of the literature concludes that its characteristics and possibilities have not yet been deeply explored. In this work we present a framework for the analysis of the EC2 SI infrastructure that uses the price history of such resources to classify the SI availability zones and then generate price prediction models adapted to each class. The proposed models are validated through a formal experimentation process. These models are then applied to generate resource provisioning plans that obtain the optimal price when using the SI infrastructure in a real scenario. Finally, the recent changes Amazon has introduced in the SI model, and how this work can adapt to them, are discussed.
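A minimal sketch of the provisioning step only: given per-availability-zone price predictions, pick the zone with the lowest expected cost. The zone names and price histories are invented, and the per-zone mean below is a stand-in for the paper's class-specific prediction models:

```python
# Hypothetical recent spot prices per availability zone (USD/hour).
history = {
    "us-east-1a": [0.052, 0.048, 0.050],
    "us-east-1b": [0.061, 0.059, 0.064],
    "us-east-1c": [0.045, 0.047, 0.044],
}

def predict_price(prices):
    # Stand-in predictor: mean of the observed history for that zone.
    # The paper instead fits a model per availability-zone class.
    return sum(prices) / len(prices)

def provisioning_plan(history):
    # Choose the zone whose predicted price is lowest.
    predictions = {zone: predict_price(p) for zone, p in history.items()}
    best = min(predictions, key=predictions.get)
    return best, predictions[best]

zone, expected = provisioning_plan(history)
print(zone, round(expected, 4))
```

Classifying zones first and fitting a model per class, as the paper does, matters because zones with different volatility profiles would be poorly served by one global predictor.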

    HPC Cloud for Scientific and Business Applications: Taxonomy, Vision, and Research Challenges

    High Performance Computing (HPC) clouds are becoming an alternative to on-premise clusters for executing scientific applications and business analytics services. Most research efforts in HPC cloud aim to understand the cost-benefit of moving resource-intensive applications from on-premise environments to public cloud platforms. Industry trends show hybrid environments are the natural path to get the best of on-premise and cloud resources: steady (and sensitive) workloads can run on on-premise resources, and peak demand can leverage remote resources in a pay-as-you-go manner. Nevertheless, there are plenty of questions to be answered in HPC cloud, ranging from how to extract the best performance from an unknown underlying platform to what services are essential to make its usage easier. Moreover, the discussion on the right pricing and contractual models to fit small and large users is relevant for the sustainability of HPC clouds. This paper brings a survey and taxonomy of efforts in HPC cloud and a vision of what we believe is ahead of us, including a set of research challenges that, once tackled, can help advance businesses and scientific discoveries. This becomes particularly relevant due to the fast-increasing wave of new HPC applications coming from big data and artificial intelligence. (Comment: 29 pages, 5 figures; published in ACM Computing Surveys (CSUR).)

    Rolling Window Time Series Prediction Using MapReduce

    Prediction of time series data is an important application in many domains. Despite their inherent advantages, traditional databases and the MapReduce methodology are not ideally suited for this type of processing due to the dependencies introduced by the sequential nature of time series. In this thesis, a novel framework is presented to facilitate retrieval and rolling window prediction of irregularly sampled large-scale time series data. By introducing a new index pool data structure, processing of time series can be efficiently parallelised. The proposed framework is implemented in the R programming environment and utilises Hadoop to support parallelisation and fault tolerance. A systematic multi-predictor selection model is designed and applied in order to choose the best-fit algorithm for different circumstances. Additionally, a boosting method is deployed as a post-processing step to further optimise the predictive results. Experimental results on a cloud-based platform indicate that the proposed framework scales linearly up to 32 nodes and performs efficiently while maintaining prediction accuracy.
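A minimal sequential sketch of rolling-window one-step prediction, without the thesis's Hadoop/index-pool parallelisation or multi-predictor selection; the window mean below is an invented stand-in predictor:

```python
def rolling_window_forecast(series, w):
    # For each position past the first w points, predict the next value
    # from the preceding window of w observations.
    preds = []
    for i in range(w, len(series)):
        window = series[i - w:i]       # the last w observations
        preds.append(sum(window) / w)  # stand-in predictor: window mean
    return preds

series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(rolling_window_forecast(series, 3))
```

The dependency the abstract mentions is visible here: consecutive windows overlap, so naively splitting the series across MapReduce workers loses the observations that straddle partition boundaries, which is what a structure like the proposed index pool has to compensate for.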