    Efficient cloud computing system operation strategies

    Cloud computing systems have emerged as a new computing paradigm by providing on-demand services that draw on large-scale computing resources. Service providers offer Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) to users depending on their demand, and users pay only for the resources they use. The cloud system has become a successful business model and is expanding its scope through collaboration with various applications such as big data processing, Internet of Things (IoT), robotics, and 5G networks. Cloud computing systems are composed of large numbers of computing, network, and storage devices spread across geographically distributed areas, and multiple tenants use the cloud systems simultaneously with heterogeneous resource requirements. Thus, efficient operation of cloud computing systems is extremely difficult for service providers. In order to maximize service providers' profit, the cloud systems should be able to serve large numbers of tenants while minimizing the OPerational EXpenditure (OPEX). To serve as many tenants as possible with limited resources, service providers should implement efficient resource allocation for users' requirements. At the same time, cloud infrastructure consumes a significant amount of energy. According to recent disclosures, Google data centers consumed nearly 300 million watts and Facebook's data centers consumed 60 million watts. Traffic demand for data centers will keep increasing because of the expansion of mobile and cloud traffic requirements. If service providers do not develop efficient ways to manage energy in their infrastructures, running those infrastructures will consume significant power. In this thesis, we consider optimal data set allocation in distributed cloud computing systems. Our objective is to minimize processing time and cost.
Processing time includes virtual machine processing time, communication time, and data transfer time. In distributed cloud systems, communication time and data transfer time are important components of processing time because data centers are distributed geographically. Placing data sets far from each other increases the communication and data transfer time. The cost objective includes virtual machine cost, communication cost, and data transfer cost. Cloud service providers charge for virtual machine usage according to usage time. Communication cost and transfer cost are charged based on the transmission speed of data and the data set size. The problem of allocating data sets to VMs in distributed heterogeneous clouds is formulated as a linear programming model with two objectives: cost and processing time. After finding optimal solutions of each objective function, we use a heuristic approach to find the Pareto front of the multi-objective linear programming problem. In the simulation experiment, we consider a heterogeneous cloud infrastructure with five different types of cloud service provider resource information, and we optimize data set placement by guaranteeing Pareto optimality of the solutions. This thesis also proposes an adaptive data center activation model that jointly activates switches and hosts and is integrated with a statistical request prediction algorithm. The learning algorithm predicts user requests in each predetermined interval by using a cyclic window learning algorithm. Then the data center activates an optimal number of switches and hosts, based on the prediction, in order to minimize power consumption. We designed the adaptive data center activation model by using a cognitive cycle composed of three steps: data collection, prediction, and activation.
In the request prediction step, the prediction algorithm forecasts a Poisson distribution parameter lambda in every predetermined interval by using Maximum Likelihood Estimation (MLE) and Local Linear Regression (LLR) methods. Then, adaptive activation of the data center is implemented with the predicted parameter in every interval. The adaptive activation model is formulated as a Mixed Integer Linear Programming (MILP) model. Switches and hosts are modeled as M/M/1 and M/M/c queues. In order to minimize power consumption of data centers, the model minimizes the number of activated switches, hosts, and memory modules while guaranteeing Quality of Service (QoS). Since the problem is NP-hard, we use the Simulated Annealing algorithm to solve the model. We employ Google cluster trace data to simulate our prediction model. Then, the predicted data is used to test the adaptive activation model and to observe the energy saving rate in every interval. In the experiment, we observed that the adaptive activation model saves 30 to 50% of energy compared to the full operation state of the data center at practical data center utilization rates. Network Function Virtualization (NFV) emerged as a game changer in the network market for efficient operation of the network infrastructure. Since NFV transforms dedicated physical devices designed for specific network functions into software-based Virtual Machines (VMs), network operators expect to significantly reduce Capital Expenditure (CAPEX) and Operational Expenditure (OPEX). Softwarized VMs can be implemented on any commodity server, so network operators can design flexible and scalable network architectures through efficient VM placement and migration algorithms. In this thesis, we study the joint problem of Virtualized Network Function (VNF) resource allocation and NFV-Service Chain (NFV-SC) placement in Software Defined Network (SDN) based hyper-scale distributed cloud computing infrastructure.
The objective of the problem is to minimize the power consumption of the infrastructure while enforcing the Service Level Agreement (SLA) of users. We employ an M/G/1/K queuing network approximation analysis for the NFV-SC model. The communication time between VNFs is considered in the NFV-SC placement because it influences the performance of NFV-SC in a highly distributed infrastructure environment. The joint problem is modeled as a Mixed Integer Non-linear Programming (MINP) model. However, the problem is intractable in large infrastructures because it is NP-hard. We therefore propose a heuristic algorithm which splits the problem into two sub-problems: resource allocation and NFV-SC embedding. In the numerical analysis, we observed that the proposed algorithm outperforms traditional bin packing algorithms in terms of power consumption and SLA assurance. In this thesis, we propose efficient cloud infrastructure management strategies, ranging from a single data center point of view to hyper-scale distributed cloud computing infrastructure, for profitable cloud system operation. The management schemes are proposed with various objectives such as Quality of Service (QoS), performance, latency, and power consumption. We use efficient mathematical modeling strategies such as Linear Programming (LP), Mixed Integer Linear Programming (MILP), Mixed Integer Non-linear Programming (MINP), convex programming, queuing theory, and probabilistic modeling strategies, and demonstrate the efficiency of the proposed strategies through various simulations.
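
    The abstract above models hosts as M/M/c queues and activates only as many as QoS requires. As a rough illustration of that idea for a single queue, the sketch below scans for the smallest number of active hosts whose mean queueing delay stays under a target. It is not the thesis's MILP model or its Simulated Annealing solver: the function names, the mean-wait QoS criterion, and the brute-force scan are all assumptions made for illustration, and only the standard Erlang C formula is used.

```python
import math


def erlang_c(c: int, offered_load: float) -> float:
    """Probability that an arriving request has to wait in an M/M/c queue.

    offered_load is lambda / mu. The queue is only stable when
    offered_load < c; otherwise every request waits.
    """
    if offered_load >= c:
        return 1.0
    # Sum over the states in which at least one server is free.
    partial = sum(offered_load ** k / math.factorial(k) for k in range(c))
    # Weight of the states in which all c servers are busy.
    tail = offered_load ** c / math.factorial(c) * (c / (c - offered_load))
    return tail / (partial + tail)


def mean_wait(c: int, lam: float, mu: float) -> float:
    """Mean time a request spends queueing (excluding service) in M/M/c."""
    offered_load = lam / mu
    if offered_load >= c:
        return math.inf
    return erlang_c(c, offered_load) / (c * mu - lam)


def min_active_hosts(lam: float, mu: float, max_wait: float, max_hosts: int):
    """Smallest host count meeting the (assumed) mean-wait QoS target.

    Returns None when even max_hosts cannot meet the target.
    """
    for c in range(1, max_hosts + 1):
        if mean_wait(c, lam, mu) <= max_wait:
            return c
    return None


if __name__ == "__main__":
    # Hypothetical numbers: predicted arrival rate 40 requests/s,
    # each host serves 5 requests/s, target mean wait 50 ms.
    print(min_active_hosts(lam=40.0, mu=5.0, max_wait=0.05, max_hosts=64))
```

    With `c = 1` the same functions reduce to the M/M/1 case used for switches, where the waiting probability equals the utilization `lam / mu`.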

    Predictive modeling of PV energy production: How to set up the learning task for a better prediction?

    In this paper, we tackle the problem of power prediction of several photovoltaic (PV) plants spread over an extended geographic area and connected to a power grid. The paper is intended to be a comprehensive study of one-day-ahead forecasting of PV energy production along several dimensions of analysis: i) The consideration of the spatio-temporal autocorrelation, which characterizes geophysical phenomena, to obtain more accurate predictions. ii) The learning setting to be considered, i.e. using simple output prediction for each hour or structured output prediction for each day. iii) The learning algorithms: We compare artificial neural networks, most often used for PV forecasting, and regression trees for learning adaptive models. The results obtained on two PV power plant datasets show that: taking into account spatio-temporal autocorrelation is beneficial; the structured output prediction setting significantly outperforms the non-structured output prediction setting; and regression trees provide better models than artificial neural networks.

    Requests Prediction in Cloud with a Cyclic Window Learning Algorithm

    Automatic resource scaling is one advantage of cloud systems. Cloud systems are able to scale the number of physical machines depending on user requests. Therefore, accurate request prediction brings a great improvement in cloud systems' performance. If requests can be predicted accurately, the number of physical machines needed to accommodate the predicted requests can be activated, and cloud systems will save more energy by preventing excessive activation of physical machines. Cloud systems can also implement advanced load distribution with accurate request prediction. We propose a prediction model that predicts probability distribution parameters of requests for each time interval. Maximum Likelihood Estimation (MLE) and Local Linear Regression (LLR) are used to implement this algorithm. The proposed algorithm is evaluated with the Google cluster-trace data. Predictions are made for the number of task arrivals, CPU requests, and memory resource requests. The accuracy of prediction is then measured with Mean Absolute Percentage Error (MAPE) and Normalized Mean Squared Error (NMSE).
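
    To make the estimation and scoring steps named above concrete, here is a minimal sketch under stated assumptions. It assumes request counts per interval follow a Poisson distribution (the distribution is named in the related thesis abstract, not in this one), for which the MLE of the rate is the sample mean. The cyclic window selection and the LLR step are not reproduced. NMSE has several definitions in the literature; the variant below, which normalizes by the variance of the observed values, is an assumption, and all function names are hypothetical.

```python
from statistics import fmean


def poisson_mle(counts):
    """MLE of the Poisson rate: the sample mean of the observed counts."""
    if not counts:
        raise ValueError("need at least one observed count")
    return fmean(counts)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Assumes no actual value is zero.
    """
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * fmean(errors)


def nmse(actual, predicted):
    """Mean squared error divided by the variance of the actual values.

    This normalization is one common choice, assumed here.
    """
    mean_actual = fmean(actual)
    mse = fmean([(a - p) ** 2 for a, p in zip(actual, predicted)])
    variance = fmean([(a - mean_actual) ** 2 for a in actual])
    return mse / variance


if __name__ == "__main__":
    # Hypothetical task-arrival counts seen in one recurring time slot.
    window = [12, 15, 11, 14, 13]
    print(poisson_mle(window))
    print(mape([100, 200], [110, 180]), nmse([1, 2, 3], [2, 2, 2]))
```

    Under this NMSE definition, a predictor that always outputs the mean of the observed values scores 1, so values below 1 indicate an improvement over that baseline.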

    In-Network Distributed Solar Current Prediction

    Long-term sensor network deployments demand careful power management. While managing power requires understanding the amount of energy harvestable from the local environment, current solar prediction methods rely only on recent local history, which makes them susceptible to high variability. In this paper, we present a model and algorithms for distributed solar current prediction that use multiple linear regression to predict future solar current from local, in-situ climatic and solar measurements. These algorithms leverage spatial information from neighbors and adapt to the changing local conditions not captured by global climatic information. We implement these algorithms on our Fleck platform and run a 7-week-long experiment validating our work. In analyzing our results from this experiment, we determined that computing our model requires an increased energy expenditure of 4.5 mJ over simpler models (on the order of 10^{-7}% of the harvested energy) to gain a prediction improvement of 39.7%. Comment: 28 pages, accepted at TOSN and awaiting publication
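
    The multiple linear regression mentioned above can be sketched in a few lines. This is an illustrative ordinary-least-squares fit solved through the normal equations, not the paper's in-network distributed algorithm. The two features used in the example (a node's own recent solar current and a neighbor's reading) and the numbers are hypothetical stand-ins for the local and neighbor measurements the abstract refers to.

```python
def fit_linear(features, targets):
    """Ordinary least squares via the normal equations.

    Returns [intercept, w1, w2, ...]. Solved with Gauss-Jordan elimination
    and partial pivoting, which is adequate for a handful of features.
    """
    rows = [[1.0] + [float(v) for v in x] for x in features]
    n = len(rows[0])
    # Augmented matrix [A^T A | A^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, targets))]
        for i in range(n)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("features are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def predict(coeffs, x):
    """Apply the fitted model to one feature vector."""
    return coeffs[0] + sum(w * v for w, v in zip(coeffs[1:], x))


if __name__ == "__main__":
    # Hypothetical training rows: (own current yesterday, neighbor current).
    history = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
    observed = [1, 3, 4, 6, 8]
    model = fit_linear(history, observed)
    print(model, predict(model, (3, 2)))
```

    A closed-form solve like this keeps the per-node computation small, which matters for the energy budget the abstract discusses, but the actual cost on a sensor platform would depend on the implementation.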

    Green demand aware fog computing : a prediction-based dynamic resource provisioning approach

    Fog computing could potentially cause the next paradigm shift by extending cloud services to the edge of the network, bringing resources closer to the end-user. With its close proximity to end-users and its distributed nature, fog computing can significantly reduce latency. As more and more latency-stringent applications appear, we will witness an unprecedented amount of demand for fog computing in the near future. Undoubtedly, this will lead to an increase in the energy footprint of the network edge and access segments. To reduce energy consumption in fog computing without compromising performance, in this paper we propose the Green-Demand-Aware Fog Computing (GDAFC) solution. Our solution uses a prediction technique to identify the working fog nodes (nodes that serve requests as they arrive), standby fog nodes (nodes that take over when the computational capacity of the working fog nodes is no longer sufficient), and idle fog nodes in a fog computing infrastructure. Additionally, it assigns an appropriate sleep interval to the fog nodes, taking into account the delay requirement of the applications. Results obtained from the mathematical formulation show that our solution can save up to 65% of energy without violating the delay requirement. © 2022 by the authors. Licensee MDPI, Basel, Switzerland

    The Ubiquity of Large Graphs and Surprising Challenges of Graph Processing: Extended Survey

    Graph processing is becoming increasingly prevalent across many application domains. In spite of this prevalence, there is little research about how graphs are actually used in practice. We performed an extensive study that consisted of an online survey of 89 users, a review of the mailing lists, source repositories, and whitepapers of a large suite of graph software products, and in-person interviews with 6 users and 2 developers of these products. Our online survey aimed at understanding: (i) the types of graphs users have; (ii) the graph computations users run; (iii) the types of graph software users use; and (iv) the major challenges users face when processing their graphs. We describe the participants' responses to our questions, highlighting common patterns and challenges. Based on our interviews and survey of the rest of our sources, we were able to answer some new questions that were raised by participants' responses to our online survey and understand the specific applications that use graph data and software. Our study revealed surprising facts about graph processing in practice. In particular, real-world graphs represent a very diverse range of entities and are often very large; scalability and visualization are undeniably the most pressing challenges faced by participants; and data integration, recommendations, and fraud detection are very popular applications supported by existing graph software. We hope these findings can guide future research.