5 research outputs found

    Cache policies for cloud-based systems: To keep or not to keep

    Full text link
    In this paper, we study cache policies for cloud-based caching. Cloud-based caching uses cloud storage services such as Amazon S3 as a cache for data items that would have been recomputed otherwise. Cloud-based caching departs from classical caching: cloud resources are potentially infinite and only paid when used, while classical caching relies on a fixed storage capacity and its main monetary cost comes from the initial investment. To deal with this new context, we design and evaluate a new caching policy that minimizes the overall cost of a cloud-based system. The policy takes into account the frequency of consumption of an item and the cloud cost model. We show that this policy is easier to operate, that it scales with the demand and that it outperforms classical policies managing a fixed capacity.Comment: Proceedings of IEEE International Conference on Cloud Computing 2014 (CLOUD 14

    Data storage security and privacy in cloud computing: A comprehensive survey

    Get PDF
    Cloud Computing is a form of distributed computing wherein resources and application platforms are distributed over the Internet through on demand and pay on utilization basis. Data Storage is main feature that cloud data centres are provided to the companies/organizations to preserve huge data. But still few organizations are not ready to use cloud technology due to lack of security. This paper describes the different techniques along with few security challenges, advantages and also disadvantages. It also provides the analysis of data security issues and privacy protection affairs related to cloud computing by preventing data access from unauthorized users, managing sensitive data, providing accuracy and consistency of data store

    A Cost-Effective Strategy for Intermediate Data Storage in Scientific Cloud Workflow Systems

    No full text
    Abstract—Many scientific workflows are data intensive where a large volume of intermediate data is generated during their execution. Some valuable intermediate data need to be stored for sharing or reuse. Traditionally, they are selectively stored according to the system storage capacity, determined manually. As doing science on cloud has become popular nowadays, more intermediate data can be stored in scientific cloud workflows based on a pay-foruse model. In this paper, we build an Intermediate data Dependency Graph (IDG) from the data provenances in scientific workflows. Based on the IDG, we develop a novel intermediate data storage strategy that can reduce the cost of the scientific cloud workflow system by automatically storing the most appropriate intermediate datasets in the cloud storage. We utilise Amazon’s cost model and apply the strategy to an astrophysics pulsar searching scientific workflow for evaluation. The results show that our strategy can reduce the overall cost of scientific cloud workflow execution significantly

    Partitioning workflow applications over federated clouds to meet non-functional requirements

    Get PDF
    PhD ThesisWith cloud computing, users can acquire computer resources when they need them on a pay-as-you-go business model. Because of this, many applications are now being deployed in the cloud, and there are many di erent cloud providers worldwide. Importantly, all these various infrastructure providers o er services with di erent levels of quality. For example, cloud data centres are governed by the privacy and security policies of the country where the centre is located, while many organisations have created their own internal \private cloud" to meet security needs. With all this varieties and uncertainties, application developers who decide to host their system in the cloud face the issue of which cloud to choose to get the best operational conditions in terms of price, reliability and security. And the decision becomes even more complicated if their application consists of a number of distributed components, each with slightly di erent requirements. Rather than trying to identify the single best cloud for an application, this thesis considers an alternative approach, that is, combining di erent clouds to meet users' non-functional requirements. Cloud federation o ers the ability to distribute a single application across two or more clouds, so that the application can bene t from the advantages of each one of them. The key challenge for this approach is how to nd the distribution (or deployment) of application components, which can yield the greatest bene ts. In this thesis, we tackle this problem and propose a set of algorithms, and a framework, to partition a work ow-based application over federated clouds in order to exploit the strengths of each cloud. The speci c goal is to split a distributed application structured as a work ow such that the security and reliability requirements of each component are met, whilst the overall cost of execution is minimised. To achieve this, we propose and evaluate a cloud broker for partitioning a work ow application over federated clouds. The broker integrates with the e-Science Central cloud platform to automatically deploy a work ow over public and private clouds. We developed a deployment planning algorithm to partition a large work ow appli- - i - cation across federated clouds so as to meet security requirements and minimise the monetary cost. A more generic framework is then proposed to model, quantify and guide the partitioning and deployment of work ows over federated clouds. This framework considers the situation where changes in cloud availability (including cloud failure) arise during work ow execution

    End-to-End Trust Fulfillment of Big Data Workflow Provisioning over Competing Clouds

    Get PDF
    Cloud Computing has emerged as a promising and powerful paradigm for delivering data- intensive, high performance computation, applications and services over the Internet. Cloud Computing has enabled the implementation and success of Big Data, a relatively recent phenomenon consisting of the generation and analysis of abundant data from various sources. Accordingly, to satisfy the growing demands of Big Data storage, processing, and analytics, a large market has emerged for Cloud Service Providers, offering a myriad of resources, platforms, and infrastructures. The proliferation of these services often makes it difficult for consumers to select the most suitable and trustworthy provider to fulfill the requirements of building complex workflows and applications in a relatively short time. In this thesis, we first propose a quality specification model to support dual pre- and post-cloud workflow provisioning, consisting of service provider selection and workflow quality enforcement and adaptation. This model captures key properties of the quality of work at different stages of the Big Data value chain, enabling standardized quality specification, monitoring, and adaptation. Subsequently, we propose a two-dimensional trust-enabled framework to facilitate end-to-end Quality of Service (QoS) enforcement that: 1) automates cloud service provider selection for Big Data workflow processing, and 2) maintains the required QoS levels of Big Data workflows during runtime through dynamic orchestration using multi-model architecture-driven workflow monitoring, prediction, and adaptation. The trust-based automatic service provider selection scheme we propose in this thesis is comprehensive and adaptive, as it relies on a dynamic trust model to evaluate the QoS of a cloud provider prior to taking any selection decisions. It is a multi-dimensional trust model for Big Data workflows over competing clouds that assesses the trustworthiness of cloud providers based on three trust levels: (1) presence of the most up-to-date cloud resource verified capabilities, (2) reputational evidence measured by neighboring users and (3) a recorded personal history of experiences with the cloud provider. The trust-based workflow orchestration scheme we propose aims to avoid performance degradation or cloud service interruption. Our workflow orchestration approach is not only based on automatic adaptation and reconfiguration supported by monitoring, but also on predicting cloud resource shortages, thus preventing performance degradation. We formalize the cloud resource orchestration process using a state machine that efficiently captures different dynamic properties of the cloud execution environment. In addition, we use a model checker to validate our monitoring model in terms of reachability, liveness, and safety properties. We evaluate both our automated service provider selection scheme and cloud workflow orchestration, monitoring and adaptation schemes on a workflow-enabled Big Data application. A set of scenarios were carefully chosen to evaluate the performance of the service provider selection, workflow monitoring and the adaptation schemes we have implemented. The results demonstrate that our service selection outperforms other selection strategies and ensures trustworthy service provider selection. The results of evaluating automated workflow orchestration further show that our model is self-adapting, self-configuring, reacts efficiently to changes and adapts accordingly while enforcing QoS of workflows
    corecore