4 research outputs found

    An Artificial Intelligence Framework for Supporting Coarse-Grained Workload Classification in Complex Virtual Environments

    Get PDF
    This work concerns Cloud-based machine learning tools for enhanced Big Data applications. The main idea is to predict the "next" workload occurring against the target Cloud infrastructure via an innovative ensemble-based approach that combines the strengths of different well-known classifiers in order to enhance the overall accuracy of the final classification, which is highly relevant in the current context of Big Data. The so-called workload categorization problem plays a critical role in improving the efficiency and reliability of Cloud-based big data applications. Implementation-wise, our method deploys the Cloud entities that participate in the distributed classification approach on top of virtual machines, which represent classical "commodity" settings for Cloud-based big data applications. Given a number of known reference workloads and an unknown workload, this paper deals with the problem of finding the reference workload that is most similar to the unknown one. The depicted scenario is useful in a plethora of modern information system applications. We name this problem coarse-grained workload classification because, instead of characterizing the unknown workload in terms of finer behaviours, such as CPU-, memory-, disk-, or network-intensive patterns, we classify the whole unknown workload as one of the (possible) reference workloads. Reference workloads represent a category of workloads that are relevant in a given applicative environment. In particular, we focus on the classification problem described above in the special case of virtualized environments. Today, Virtual Machines (VMs) have become very popular because they offer important advantages to modern computing environments such as cloud computing and server farms.
In virtualization frameworks, workload classification is very useful for accounting, security, and user profiling. Hence, our research is particularly relevant in such environments, and it proves very useful in the emerging context of Cloud Computing. In this respect, our approach runs several machine-learning-based classifiers over different workload models and then derives the best classification via Dempster-Shafer fusion, in order to magnify the accuracy of the final result. Experimental assessment and analysis clearly confirm the benefits derived from our classification framework. The running programs that produce the unknown workloads to be classified are treated in a similar way. A fundamental aspect of this paper is the successful use of data fusion in workload classification: different types of metrics are fused together using the Dempster-Shafer theory of evidence combination, giving a classification accuracy of slightly less than 80%. The acquisition of data from the running process, the pre-processing algorithms, and the workload classification are described in detail. Several classical classification algorithms were applied to the workloads, and their results are compared.
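The fusion step the abstract describes rests on Dempster's rule of combination. As a minimal sketch (not the paper's implementation; the workload names and mass values are invented for illustration), two classifiers' basic probability assignments over reference workloads can be combined like this:

```python
# Sketch of Dempster's rule of combination for two classifiers' outputs.
# Hypotheses are frozensets of reference workload labels; masses sum to 1.

def dempster_combine(m1, m2):
    """Combine two basic probability assignments (dicts: frozenset -> mass)."""
    combined = {}
    conflict = 0.0
    for h1, v1 in m1.items():
        for h2, v2 in m2.items():
            inter = h1 & h2
            if inter:  # compatible evidence reinforces the intersection
                combined[inter] = combined.get(inter, 0.0) + v1 * v2
            else:      # incompatible evidence contributes to conflict
                conflict += v1 * v2
    if conflict >= 1.0:
        raise ValueError("total conflict: masses cannot be combined")
    norm = 1.0 - conflict  # renormalise over the non-conflicting mass
    return {h: v / norm for h, v in combined.items()}

# Two classifiers' beliefs over hypothetical reference workloads A and B:
m_cpu  = {frozenset({"A"}): 0.7, frozenset({"A", "B"}): 0.3}
m_disk = {frozenset({"A"}): 0.6, frozenset({"B"}): 0.1, frozenset({"A", "B"}): 0.3}
fused = dempster_combine(m_cpu, m_disk)  # belief in {"A"} is sharpened
```

After fusion the singleton {"A"} carries most of the mass, illustrating how agreement between classifiers magnifies the final classification's confidence.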

    Modelling energy efficiency and performance trade-offs

    Get PDF
    PhD Thesis. Power and energy consumption in data centres is a major concern for data centre providers. This work therefore considers the modelling and analysis of energy policies and scheduling schemes using the Markovian process algebra PEPA. The first emphasis was on modelling in PEPA an energy policy that dynamically powers servers on or off, in order to identify and reflect the trade-off between saving energy (by powering down servers) and performance cost: while powering down servers saves energy, it can increase the performance cost. The research analyses the effect of the policy on energy consumption and performance cost for different combinations of dynamic and static servers, under different scenarios including changes in job arrival rate, job arrival duration, and the time servers need to power on and start processing jobs. The results were interesting because every scenario is unique; no server combination was found to give low energy and high performance in all situations. The second focus was the impact of the scheduler's choice on performance and energy under unknown service demands. Three algorithms were examined: task assignment based on guessing size (TAGS), the shortest-queue strategy, and random allocation. These policies were modelled in PEPA to derive numerical solutions for a two-server system. Performance was analysed in terms of throughput, average response time, and server utilisation, while energy consumption was analysed in terms of total energy consumption and energy consumption per job. The intention was to analyse performance and energy consumption in both homogeneous and heterogeneous environments; the environment was assumed to be homogeneous at first.
In each policy, the service distribution was taken to be either negative exponential (hence relatively low variance) or two-phase hyper-exponential (relatively high variance). In all cases the arrival process was assumed to be a Poisson stream, and the queue lengths are finite (maximum size 10 jobs). The performance results showed that TAGS performs worst under the exponential distribution and best under the two-phase hyper-exponential: TAGS produces higher throughput and lower job loss when service demand has an H2 distribution. Our results show that, under the exponential distribution, servers running TAGS consume more energy than the other policies in terms of both total energy consumption and energy per job. In contrast, TAGS consumes less energy per job than random allocation when the arrival rate is high and the job size is variable (two-phase hyper-exponential). In the heterogeneous environment, and based on the homogeneous results, the performance metrics and energy consumption were analysed only under the two-phase hyper-exponential distribution. TAGS works well in all server configurations and achieves greater throughput than the shortest queue or weighted random allocation, even when the second server's speed was reduced by 40% relative to the first server's. TAGS outperforms both the shortest queue and weighted random whether their second server is faster or slower than the TAGS second server. The system's heterogeneity did not significantly improve or decrease the TAGS throughput results: whether the second server is faster or slower, there was approximately no effect, even when the arrival rate is less than 75% of the system capacity. On the other hand, the heterogeneity of the system has a notable effect on the throughput of the shortest queue and weighted random allocation: the decrease or increase in throughput follows the second server's performance capability.
In terms of total energy consumption, for all scheduling schemes, when the second server is slower than the first the energy consumption is the highest among all scenarios for each arrival rate; TAGS was the worst, consuming more energy than both the shortest-queue strategy and weighted random allocation. However, in terms of energy per job, when the servers are identical or server 2 is faster, the shortest queue is the optimal strategy as long as the rate of incoming jobs does not exceed 70% of the system capacity (arrival rate < 15), whereas TAGS was the best strategy when the rate of incoming tasks exceeds 70% of the system capacity. So, as more jobs are processed, the energy per job eventually decreases. The choice of energy policy or scheduling algorithm will thus affect energy consumption and performance either negatively or positively.

    An Energy-Efficient Multi-Cloud Service Broker for Green Cloud Computing Environment

    Get PDF
    The heavy demand on cloud computing resources has led to substantial growth in the energy consumed by the data transferred between cloud computing parties (i.e., providers, datacentres, users, and services) and by datacentre services, owing to the increasing load on those services. On the one hand, routing and transferring large amounts of data to a datacentre located far from the user's geographical location consumes more energy than just processing and storing the same data on the cloud datacentre. On the other hand, when a cloud user submits a job (a set of functional and non-functional requirements) to a cloud service provider (i.e., a datacentre) via a cloud services broker, the broker becomes responsible for finding the service that best fits the user's request, based mainly on the user's requirements and Quality of Service (QoS) attributes (e.g., response time, latency). It is therefore essential to locate the lowest-energy route between the user and the designated datacentre, together with the minimum possible number of most energy-efficient services that satisfy the user's request. Indeed, finding the most energy-efficient route to the datacentre and the most energy-efficient service(s) for the user are the biggest challenges in a multi-cloud broker environment. This thesis presents and evaluates a novel multi-cloud broker solution comprising three innovative models and their associated algorithms. The first finds the most energy-efficient route, among multiple possible routes, between the user and the cloud datacentre. The second finds and provides the lowest possible number of most energy-efficient services, in order to minimise data exchange, based on a bin-packing approach. The third creates an energy-aware composition plan by integrating the most energy-efficient services to fulfil the user's requirements.
The results demonstrated the favourable performance of these models in selecting the most energy-efficient route and in reaching the smallest possible number of services for an optimal, energy-efficient composition.
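The bin-packing idea behind the second model can be sketched with the classic first-fit-decreasing heuristic (a standard approach, not necessarily the thesis's exact algorithm; the capacities and data volumes below are invented for illustration):

```python
# First-fit decreasing: pack requested data volumes into as few
# fixed-capacity services as possible, minimising the number of
# services (and hence data exchange) that serve one user request.

def first_fit_decreasing(items, capacity):
    """Pack item sizes into the fewest bins of the given capacity (FFD heuristic)."""
    bins = []  # each bin: [remaining_capacity, packed_items]
    for size in sorted(items, reverse=True):  # largest items first
        for b in bins:
            if b[0] >= size:      # first bin with room takes the item
                b[0] -= size
                b[1].append(size)
                break
        else:                     # no bin fits: open a new one
            bins.append([capacity - size, [size]])
    return [b[1] for b in bins]

# e.g. hypothetical data volumes (GB) to place, each service handling up to 10 GB:
packing = first_fit_decreasing([7, 5, 4, 3, 2, 2], capacity=10)
# 23 GB of demand fits into three services here
```

FFD is a simple approximation (optimal packing is NP-hard), which makes it a plausible building block for a broker that must pick a small set of energy-efficient services online.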