
    Dimensioning and workload-distribution algorithms for lambda grids

    Grids consist of a collection of computing and storage elements that may be geographically dispersed but whose combined capacity is meant to be exploited. To that end, these elements must be interconnected by a network. Since many scientific applications make use of a Grid, and these applications typically process large volumes of data, a network is required that can transport such large data streams reliably. Optical transport networks are ideally suited to this task; Grids built on such a network are called lambda Grids. This thesis describes a framework in which the design and dimensioning of optical networks for lambda Grids can be formulated. It also discusses how workload can be distributed over a Grid once it has been dimensioned. A large part of the results was obtained through simulation, using a purpose-built Grid simulation package that focuses specifically on network and Grid elements. The design of this simulator and the accompanying implementation choices are therefore discussed in detail in this work.
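
    The thesis's own dimensioning and workload-distribution algorithms are not reproduced in this abstract. Purely as a minimal sketch of the workload-distribution step, the following assigns each job to the grid site with the most free compute capacity whose optical link can still carry the job's data stream; the site names, capacities, and job sizes are all hypothetical.

        # Hypothetical sketch: greedy workload distribution over a lambda Grid.
        # Each site offers compute slots and an optical link of limited bandwidth;
        # each job needs some compute and must stream its data over that link.

        class Site:
            def __init__(self, name, cpu_slots, link_gbps):
                self.name = name
                self.cpu_free = cpu_slots    # remaining compute slots
                self.link_free = link_gbps   # remaining link bandwidth (Gbit/s)

        def distribute(jobs, sites):
            """Send each job to the feasible site with the most free CPU."""
            placement = {}
            for job_id, cpu_need, gbps_need in jobs:
                feasible = [s for s in sites
                            if s.cpu_free >= cpu_need and s.link_free >= gbps_need]
                if not feasible:
                    placement[job_id] = None  # blocked: grid is under-dimensioned
                    continue
                best = max(feasible, key=lambda s: s.cpu_free)
                best.cpu_free -= cpu_need
                best.link_free -= gbps_need
                placement[job_id] = best.name
            return placement

        sites = [Site("siteA", 8, 10.0), Site("siteB", 4, 40.0)]
        jobs = [("j1", 2, 5.0), ("j2", 3, 8.0), ("j3", 4, 2.0)]
        print(distribute(jobs, sites))  # {'j1': 'siteA', 'j2': 'siteB', 'j3': 'siteA'}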

    Task assignment in server farms under realistic workload conditions

    Server farms have become very popular in recent years because they effectively address the large delays faced by many organisations whose systems receive high volumes of traffic. Recently, these server farms have been widely used in two main areas, namely Web hosting and scientific computing. The performance of such server farms relies heavily on the underlying task assignment policy, a specific set of rules that defines how incoming tasks are assigned to and processed at hosts. The aim of a task assignment policy is to optimise certain performance criteria such as the expected waiting time and slowdown. One of the key factors affecting the performance of these policies is the service time distribution of tasks. There is extensive evidence that the service times of modern computer workloads closely follow heavy-tailed distributions with high variance. In certain environments, however, the service time distributions of tasks are unknown, and imposing parametric assumptions in such cases can lead to inaccurate and unreliable inferences. Considerable effort has been devoted in recent years to devising efficient policies. Although these policies perform well under specific workload conditions, they have several major limitations: the assumption of known service times, an inability to efficiently assign tasks in time-sharing server farms, and poor performance both under changing workload conditions and across multiple server farms. This thesis proposes novel task assignment policies for server farms under two main classes of realistic workload conditions, namely heavy-tailed and arbitrary service time distributions; arbitrary service time distributions are assumed where the underlying distribution of task service times is unknown. First, we investigate ways to optimise the performance of a time-sharing server, concentrating on a particular scheduling policy called the multi-level time sharing policy (MLTP). We provide an extensive performance analysis of MLTP and show that it can yield significant performance improvements under certain traffic conditions. Second, we investigate how to improve performance in time-sharing server farms using MLTP, proposing three task assignment policies for such farms. Third, we investigate how to design efficient task assignment policies for multiple server farms. We propose MCTPM, a policy based on a multi-tier host architecture that supports preemptive task migration and controls the traffic flow into server farms via a global dispatching device so as to optimise performance. Finally, we investigate adaptive task assignment policies that make no assumptions about the underlying service time distribution of tasks. We propose a novel policy, called ADAPT-POLICY, which maintains a set of static task assignment policies for the server farm and adaptively switches among them to suit the most recent traffic conditions. Experimental performance analysis shows that ADAPT-POLICY outperforms other policies under a range of traffic conditions.
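
    None of the thesis's policies are specified in this abstract, so the following is only a hedged illustration of why heavy tails matter for task assignment: a two-host farm with FCFS hosts, Pareto-distributed service times, and a comparison of random dispatch against a SITA-style size-interval dispatch. The arrival rate, Pareto parameters, and size cutoff are invented for the example.

        # Illustration only: under heavy-tailed service times, size-aware
        # dispatch (SITA-like) tends to beat random dispatch in an FCFS farm.
        import random

        random.seed(1)

        def pareto(alpha=1.5, xm=1.0):
            """Heavy-tailed service time: Pareto(shape=alpha, scale=xm)."""
            return xm / ((1.0 - random.random()) ** (1.0 / alpha))

        def mean_wait(dispatch, n_jobs=200_000, lam=0.5, n_hosts=2):
            """Mean waiting time under a dispatch rule; hosts are FCFS."""
            t, free_at, total_wait = 0.0, [0.0] * n_hosts, 0.0
            for _ in range(n_jobs):
                t += random.expovariate(lam)   # Poisson arrivals
                s = pareto()                   # heavy-tailed service demand
                h = dispatch(s, n_hosts)
                start = max(t, free_at[h])     # exact FCFS waiting recursion
                total_wait += start - t
                free_at[h] = start + s
            return total_wait / n_jobs

        random_dispatch = lambda s, n: random.randrange(n)
        # Size-interval: short jobs to host 0, long jobs to host 1
        # (the 3.0 cutoff is a guess, not an optimised boundary).
        sita_dispatch = lambda s, n: 0 if s < 3.0 else 1

        print("random       :", mean_wait(random_dispatch))
        print("size-interval:", mean_wait(sita_dispatch))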

    Self-management for large-scale distributed systems

    Autonomic computing aims at making computing systems self-managing by using autonomic managers in order to reduce the obstacles caused by management complexity. This thesis presents the results of research on self-management for large-scale distributed systems, motivated by the increasing complexity of computing systems and their management. In the first part, we present our platform, called Niche, for programming self-managing component-based distributed applications. In our work on Niche, we have faced and addressed four challenges in achieving self-management in a dynamic environment characterized by volatile resources and high churn: resource discovery, robust and efficient sensing and actuation, management bottlenecks, and scale. Niche implements the autonomic computing architecture proposed by IBM in a fully decentralized way. It supports a network-transparent view of the system architecture, simplifying the design of distributed self-management, and provides a concise and expressive API for self-management. The implementation of the platform relies on the scalability and robustness of structured overlay networks. We then present a methodology for designing the management part of a distributed self-managing application, defining design steps that include the partitioning of management functions and the orchestration of multiple autonomic managers. In the second part, we discuss robustness of management and data consistency, both necessary in a distributed system. Dealing with the effect of churn on management increases the complexity of the management logic and thus makes its development time-consuming and error-prone. We propose the abstraction of Robust Management Elements, which are able to heal themselves under continuous churn. Our approach replicates a management element using finite state machine replication with a reconfigurable replica set, and our algorithm automates the reconfiguration (migration) of the replica set in order to tolerate continuous churn. For data consistency, we propose a majority-based distributed key-value store, built on a peer-to-peer network, that supports multiple consistency levels and enables a trade-off between high availability and data consistency. Using majorities avoids the potential drawbacks of master-based consistency control, namely a single point of failure and a potential performance bottleneck. In the third part, we investigate self-management for Cloud-based storage systems, focusing on elasticity control using elements of control theory and machine learning. We have studied several designs of an elasticity controller, including a state-space feedback controller and a controller that combines feedback and feedforward control. We describe the steps in designing an elasticity controller for a Cloud-based key-value store using a state-space model that enables trading off performance against cost, and we conclude with the design and evaluation of ElastMan, an elasticity controller for Cloud-based elastic key-value stores that combines feedforward and feedback control.
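
    ElastMan's actual controller design is not given in this abstract. As a hedged sketch of the combined feedforward-plus-feedback idea under invented assumptions, the controller below sizes a replicated store from a crude capacity model of the observed request rate (feedforward) and corrects that estimate with a proportional term on the measured latency error (feedback); the gain, capacity model, and latency target are illustrative only.

        # Hedged sketch of a feedforward + feedback elasticity controller.
        # Capacity model, gain, and setpoint are invented for illustration;
        # this is not ElastMan's actual design.

        class ElasticityController:
            def __init__(self, target_latency_ms, reqs_per_node, gain=0.25):
                self.target = target_latency_ms      # latency setpoint
                self.reqs_per_node = reqs_per_node   # feedforward capacity model
                self.gain = gain                     # proportional feedback gain

            def desired_nodes(self, req_rate, measured_latency_ms):
                ff = req_rate / self.reqs_per_node   # workload-driven term
                fb = self.gain * (measured_latency_ms - self.target)  # correction
                return max(1, round(ff + fb))

        ctl = ElasticityController(target_latency_ms=10.0, reqs_per_node=5000)
        # Latency 4 ms over target: scale from the model's 8.4 nodes up to 9.
        print(ctl.desired_nodes(req_rate=42_000, measured_latency_ms=14.0))  # -> 9

    Combining the two terms is what gives such controllers their appeal: the feedforward part reacts immediately to workload changes, while the feedback part compensates for inevitable errors in the capacity model.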

    Towards Loosely-Coupled Programming on Petascale Systems

    We have extended the Falkon lightweight task execution framework to make loosely coupled programming on petascale systems a practical and useful programming model. This work studies and measures the performance factors involved in applying this approach to enable the use of petascale systems by a broader user community, and with greater ease. Our work enables the execution of highly parallel computations composed of loosely coupled serial jobs with no modifications to the respective applications. This approach allows a new, and potentially far larger, class of applications to leverage petascale systems, such as the IBM Blue Gene/P supercomputer. We present the challenges of I/O performance encountered in making this model practical, and show results using both microbenchmarks and real applications from two domains: economic energy modeling and molecular dynamics. Our benchmarks show that we can scale up to 160K processor-cores with high efficiency, and can achieve sustained execution rates of thousands of tasks per second. (Comment: IEEE/ACM International Conference for High Performance Computing, Networking, Storage and Analysis (SuperComputing/SC) 2008)
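
    Falkon itself is not shown here; purely as a stand-in for the loosely coupled model the abstract describes, the sketch below dispatches many unmodified serial jobs over a local worker pool using only the Python standard library. The task list and command line are placeholders, and a single node's cores stand in for a petascale machine.

        # Stand-in for loosely coupled execution: unmodified serial programs
        # dispatched as independent tasks over a pool of workers.
        import subprocess
        from concurrent.futures import ProcessPoolExecutor, as_completed

        def run_task(cmd):
            """Run one serial job exactly as-is; the application is untouched."""
            result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
            return cmd, result.returncode

        if __name__ == "__main__":
            # Placeholder campaign; real runs comprise millions of such tasks.
            tasks = [f"echo simulate --seed {i}" for i in range(100)]
            with ProcessPoolExecutor() as pool:
                futures = [pool.submit(run_task, t) for t in tasks]
                ok = sum(1 for f in as_completed(futures) if f.result()[1] == 0)
            print(f"{ok}/{len(tasks)} tasks completed")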

    Climbing Up Cloud Nine: Performance Enhancement Techniques for Cloud Computing Environments

    With the transformation of cloud computing technologies from an attractive trend to a business reality, the need is more pressing than ever for efficient cloud service management tools and techniques. As cloud technologies continue to mature, the service model, resource allocation methodologies, energy efficiency models and general service management schemes are not yet settled. The burden of making all of this work falls on cloud providers. Economies of scale and the ability to leverage existing infrastructure and a large workforce are clear advantages, but beyond that point operation is far from straightforward: performance and service delivery still depend on the providers' algorithms and policies, which affect all operational areas. With that in mind, this thesis tackles a set of the more critical challenges faced by cloud providers with the purpose of enhancing cloud service performance and reducing providers' costs. This is done by exploring innovative resource allocation techniques and developing novel tools and methodologies in the context of cloud resource management, power efficiency, high availability and solution evaluation. Optimal and suboptimal solutions to the resource allocation problem in cloud data centers, from both the computational and the network sides, are proposed. Next, a deep dive into the energy efficiency challenge in cloud data centers is presented; consolidation-based and non-consolidation-based solutions containing a novel dynamic virtual machine idleness prediction technique are proposed and evaluated. An investigation of the problem of simulating cloud environments follows: available simulation solutions are comprehensively evaluated, and a novel design framework for cloud simulators covering multiple variations of the problem is presented. Moreover, the challenge of evaluating the performance of cloud resource management solutions in terms of high availability is addressed: an extensive framework is introduced for designing high availability-aware cloud simulators, and a prominent cloud simulator (GreenCloud) is extended to implement it. Finally, the evaluation of real cloud application scenarios is demonstrated using the new tool. The primary argument made in this thesis is that the proposed resource allocation and simulation techniques can serve as a basis for effective solutions that mitigate the performance and cost challenges faced by cloud providers pertaining to resource utilization, energy efficiency, and client satisfaction.
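
    The thesis's consolidation and idleness-prediction techniques are not detailed in this abstract; as a generic, hedged illustration of consolidation-based placement, the sketch below packs VM CPU demands onto as few hosts as possible with first-fit decreasing bin packing, so that the remaining hosts can be powered down. Capacities and demands are invented.

        # Generic consolidation sketch: first-fit-decreasing bin packing of
        # VM CPU demands onto hosts, freeing whole hosts to power down.
        # (Illustrative only; not the thesis's own placement algorithm.)

        def consolidate(vm_demands, host_capacity):
            """Pack VMs (fractional CPU demand each) onto few hosts."""
            hosts = []  # each entry: remaining capacity and the VMs packed on it
            for vm, demand in sorted(vm_demands.items(), key=lambda kv: -kv[1]):
                for h in hosts:
                    if h["free"] >= demand:    # first host that still fits
                        h["free"] -= demand
                        h["vms"].append(vm)
                        break
                else:                          # nothing fits: power on a new host
                    hosts.append({"free": host_capacity - demand, "vms": [vm]})
            return hosts

        vms = {"vm1": 0.6, "vm2": 0.3, "vm3": 0.5, "vm4": 0.2, "vm5": 0.4}
        for i, h in enumerate(consolidate(vms, host_capacity=1.0)):
            print(f"host{i}: {h['vms']} (spare capacity {h['free']:.1f})")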

    Separation Framework: An Enabler for Cooperative and D2D Communication for Future 5G Networks

    Soaring capacity and coverage demands dictate that future cellular networks will soon need to migrate towards ultra-dense networks. However, network densification comes with a host of challenges, including compromised energy efficiency, complex interference management, cumbersome mobility management, burdensome signaling overheads and higher backhaul costs. Interestingly, most of the problems that beleaguer network densification stem from one feature common to legacy networks, i.e., the tight coupling between the control and data planes regardless of their degree of heterogeneity and cell density. Consequently, in the wake of 5G, the control and data planes separation architecture (SARC) has recently been conceived as a promising paradigm with the potential to address most of the aforementioned challenges. In this article, we review the various proposals presented in the literature so far to enable SARC. More specifically, we analyze how, and to what degree, various SARC proposals address the four main challenges in network densification, namely energy efficiency, system-level capacity maximization, interference management and mobility management. We then focus on two salient features of future cellular networks that have not yet been adopted in legacy networks at wide scale and thus remain a hallmark of 5G, i.e., coordinated multipoint (CoMP) and device-to-device (D2D) communications. After providing the necessary background on CoMP and D2D, we analyze how SARC can act as a major enabler for both in the context of 5G. This article thus serves as both a tutorial and an up-to-date survey on SARC, CoMP and D2D. Most importantly, it provides an extensive outlook on the challenges and opportunities that lie at the crossroads of these three mutually entangled emerging technologies. (Comment: 28 pages, 11 figures, IEEE Communications Surveys & Tutorials 201)
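
    As a toy, hedged illustration of the energy argument for control/data plane separation (not a model from the article), the snippet below compares an always-on legacy deployment with a separated one in which a macro cell carries the control plane and small data cells draw power only while serving traffic; all power figures and densities are invented.

        # Toy model of the energy case for control/data plane separation:
        # the macro cell stays on for control-plane coverage, while small
        # data cells are active only when users have traffic. Numbers invented.

        MACRO_POWER_W = 1300.0      # always-on macro cell (control plane)
        SMALL_CELL_POWER_W = 30.0   # per powered-on small cell (data plane)

        def mean_network_power(n_small_cells, active_fraction, separated):
            if separated:
                # Only cells currently serving users draw data-plane power.
                active = n_small_cells * active_fraction
                return MACRO_POWER_W + active * SMALL_CELL_POWER_W
            # Legacy coupling: every cell stays on to provide coverage.
            return MACRO_POWER_W + n_small_cells * SMALL_CELL_POWER_W

        for separated in (False, True):
            label = "separated" if separated else "coupled  "
            print(label, f"{mean_network_power(50, 0.2, separated):7.1f} W")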

    Performance Modeling of Softwarized Network Services Based on Queuing Theory with Experimental Validation

    Network Functions Virtualization facilitates the automated scaling of softwarized network services (SNSs). However, realizing such a scenario requires a way to determine the amount of resources needed so that the SNSs' performance requirements are met for a given workload. This problem is known as resource dimensioning, and it can be efficiently tackled by performance modeling. In this vein, this paper describes an analytical model based on an open queueing network of G/G/m queues to evaluate the response time of SNSs. We validate our model experimentally for a virtualized Mobility Management Entity (vMME) with a three-tiered architecture running on a testbed that resembles a typical data center virtualization environment, and we detail our experimental setup and procedures. We solve the resulting queueing network using the Queueing Network Analyzer (QNA), Jackson's networks, and Mean Value Analysis methodologies, and compare them in terms of estimation error. Results show that, for medium and high workloads, the QNA method achieves less than half the estimation error of the standard techniques, while for low workloads all three methods produce errors below 10%. Finally, we experimentally demonstrate the usefulness of the model for the dynamic provisioning of the vMME. This work has been partially funded by the H2020 research and innovation project 5G-CLARITY (Grant No. 871428), the national research project 5G-City (TEC2016-76795-C6-4-R), and the Spanish Ministry of Education, Culture and Sport (FPU Grant 13/04833). We would also like to thank the reviewers for their valuable feedback, which enhanced the quality and contribution of this work.
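
    The paper's full queueing network is not reproduced in this abstract. For a single G/G/m stage, a standard textbook approximation in the spirit of QNA-type analyses (the Allen-Cunneen correction, shown here under invented traffic parameters rather than the paper's measured ones) scales the M/M/m mean waiting time by the mean of the squared coefficients of variation of inter-arrival and service times:

        # Allen-Cunneen-style approximation for one G/G/m queue: the M/M/m
        # mean wait (via Erlang C) scaled by (Ca^2 + Cs^2) / 2.
        import math

        def erlang_c(m, a):
            """P(queueing) in M/M/m with offered load a = lam/mu Erlangs."""
            rho = a / m
            head = sum(a**k / math.factorial(k) for k in range(m))
            tail = a**m / (math.factorial(m) * (1.0 - rho))
            return tail / (head + tail)

        def ggm_response_time(lam, mu, m, ca2, cs2):
            """Approximate mean response time (s) of a G/G/m station."""
            a = lam / mu
            assert a < m, "station must be stable"
            wq_mmm = erlang_c(m, a) / (m * mu - lam)  # M/M/m mean wait
            wq = wq_mmm * (ca2 + cs2) / 2.0           # SCV correction
            return wq + 1.0 / mu                      # wait + service

        # Invented vMME-like tier: 3 workers at 1200 req/s each, bursty input.
        print(ggm_response_time(lam=3000.0, mu=1200.0, m=3, ca2=1.8, cs2=0.7))

    In a full QNA analysis the squared coefficients of variation are themselves propagated between stations through traffic equations; the single-station correction above is only the building block.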