20 research outputs found

    Doctor of Philosophy

    As the base of the software stack, system-level software is expected to provide efficient and scalable storage, communication, security and resource management functionalities. However, many functionalities at the system level, such as encryption, packet inspection, and error correction, are computationally expensive and require substantial computing power. Moreover, today's application workloads have entered gigabyte and terabyte scales, which demand even more computing power. To meet the rapidly growing demand for computing power at the system level, this dissertation proposes using parallel graphics processing units (GPUs) in system software. GPUs excel at parallel computing, and their parallel performance is improving much faster than that of central processing units (CPUs). However, system-level software was originally designed to be latency-oriented, whereas GPUs are designed for long-running computation and large-scale data processing and are therefore throughput-oriented. This mismatch makes it difficult to fit GPUs into system-level software. This dissertation presents generic principles of system-level GPU computing developed while creating two general frameworks for integrating GPU computing into storage and network packet processing. The principles are generic design techniques and abstractions for dealing with common system-level GPU computing challenges. They have been evaluated in concrete cases, including storage and network packet processing applications augmented with GPU computing. The significant performance improvements found in the evaluation show the effectiveness and efficiency of the proposed techniques and abstractions. The dissertation also presents a literature survey of the relatively young system-level GPU computing area, introducing the state of the art in both applications and techniques, as well as their future potential.
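    The abstract does not spell out the design techniques here, but a recurring way to reconcile latency-oriented system software with a throughput-oriented GPU is to batch many small requests before offloading them. The Python sketch below only illustrates that idea; the BatchingOffloader class, the simulated gpu_process_batch call, and all parameters are hypothetical and not taken from the dissertation.

```python
import queue
import time

# Hypothetical stand-in for a GPU kernel launch: processes a whole batch at once.
# In a real system this would be a CUDA/OpenCL call; here it is only simulated.
def gpu_process_batch(batch):
    time.sleep(0.001)            # fixed per-launch overhead dominates tiny batches
    return [item * 2 for item in batch]

class BatchingOffloader:
    """Collects small, latency-oriented requests and offloads them in batches."""

    def __init__(self, max_batch=64, max_wait_s=0.002):
        self.requests = queue.Queue()
        self.max_batch = max_batch
        self.max_wait_s = max_wait_s

    def submit(self, item, result_slot):
        self.requests.put((item, result_slot))

    def run_once(self):
        # Gather up to max_batch requests, stopping once the wait budget is spent,
        # so latency-sensitive callers are not starved while we chase throughput.
        batch, slots = [], []
        deadline = time.monotonic() + self.max_wait_s
        while len(batch) < self.max_batch and time.monotonic() < deadline:
            try:
                item, slot = self.requests.get(timeout=self.max_wait_s)
                batch.append(item)
                slots.append(slot)
            except queue.Empty:
                break
        if batch:
            for slot, result in zip(slots, gpu_process_batch(batch)):
                slot.append(result)

if __name__ == "__main__":
    offloader = BatchingOffloader()
    results = [[] for _ in range(10)]
    for i, slot in enumerate(results):
        offloader.submit(i, slot)
    offloader.run_once()
    print(results)
```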

    Optimal resource allocation for multi-queue systems with a shared server pool

    We study optimal allocation of servers for a system with multiple service facilities and a shared pool of servers. Each service facility poses a constraint on the maximum expected sojourn time of a job. A central decision maker can dynamically allocate servers to each facility, where adding more servers results in faster processing speeds but at higher utilization costs. The objective is to dynamically allocate the servers over the different facilities such that the sojourn-time constraints are met at minimal cost. This situation occurs frequently in practice, e.g., in Grid systems for real-time image processing (iris scans, fingerprints). We model this problem as a Markov decision process and derive structural properties of the relative value function. These properties, which are hard to derive for multi-dimensional systems, give a full characterization of the optimal policy. We demonstrate the effectiveness of these policies through extensive numerical experiments.
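    The paper's model is multi-dimensional, and its structural results are what make the optimal policy tractable. Purely to illustrate the Markov-decision-process formulation, the sketch below runs plain value iteration on a heavily simplified single-facility version: the state is the queue length, the action is the number of allocated servers, and the holding cost stands in for the sojourn-time constraint. The slot-based dynamics, costs and all parameters are assumptions made for the example, not the authors' model.

```python
import numpy as np

# Simplified single-facility allocation problem, solved by value iteration.
# All parameters below are illustrative assumptions.
MAX_QUEUE = 30          # truncate the queue length to get a finite state space
MAX_SERVERS = 5
ARRIVAL_P = 0.6         # probability of one arrival per time slot
SERVICE_P = 0.2         # per-server probability of completing one job per slot
SERVER_COST = 1.0       # utilization cost per allocated server per slot
HOLDING_COST = 0.5      # cost per waiting job per slot (proxy for sojourn time)
DISCOUNT = 0.95

def transition(q, servers):
    """Return (next_state, probability) pairs for a birth-death style slot model."""
    depart_p = min(1.0, servers * SERVICE_P) if q > 0 else 0.0
    outcomes = {}
    for arr, pa in ((1, ARRIVAL_P), (0, 1 - ARRIVAL_P)):
        for dep, pd in ((1, depart_p), (0, 1 - depart_p)):
            nq = min(MAX_QUEUE, max(0, q + arr - dep))
            outcomes[nq] = outcomes.get(nq, 0.0) + pa * pd
    return outcomes.items()

V = np.zeros(MAX_QUEUE + 1)
for _ in range(500):                      # value-iteration sweeps
    newV = np.empty_like(V)
    for q in range(MAX_QUEUE + 1):
        best = np.inf
        for s in range(MAX_SERVERS + 1):
            cost = SERVER_COST * s + HOLDING_COST * q
            cost += DISCOUNT * sum(p * V[nq] for nq, p in transition(q, s))
            best = min(best, cost)
        newV[q] = best
    V = newV

# Greedy policy with respect to the converged value function.
policy = [min(range(MAX_SERVERS + 1),
              key=lambda s: SERVER_COST * s + HOLDING_COST * q +
                            DISCOUNT * sum(p * V[nq] for nq, p in transition(q, s)))
          for q in range(MAX_QUEUE + 1)]
print("servers to allocate per queue length:", policy)
```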

    VNF performance modelling: from stand-alone to chained topologies

    One of the main incentives for deploying network functions on a virtualized or cloud-based infrastructure is the ability to orchestrate them on demand and to scale resources elastically with the workload. This can also be combined with a multi-party service creation cycle: the service provider sources various network functions from different vendors or developers and combines them into a modular network service. This way, multiple virtual network functions (VNFs) are connected into more complex topologies called service chains. Deployment speed is important here, and it is therefore beneficial if the service provider can limit extra validation testing of the combined service chain and rely on the profiling results supplied for the individual VNFs. Our research shows, however, that it is not always straightforward to accurately predict the performance of a complete service chain from the isolated benchmark or profiling tests of its individual network functions. To mitigate this, we propose a two-step deployment workflow: first, a general trend estimate of the chain performance is derived from the stand-alone VNF profiling results, together with an initial resource allocation. This information then optimizes the second phase, where online monitored data of the service chain is used to quickly adjust the estimated performance model where needed. Our tests show that this can lead to a more efficient VNF chain deployment, needing fewer scaling iterations to meet the chain performance specification, while avoiding a complete proactive and time-consuming VNF chain validation.
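    As a rough illustration of the two-step workflow (offline estimate from stand-alone profiles, then online correction), consider the sketch below. The profile numbers, the bottleneck-VNF assumption and the blending rule are invented for the example and are not the authors' actual performance model.

```python
# Step 1: predict chain throughput from stand-alone VNF profiles.
# Step 2: refine the prediction with online monitored throughput.
# Profile data and the correction rule are illustrative assumptions.

# Stand-alone profiles: measured throughput (Mpps) per vCPU allocation, per VNF.
standalone_profiles = {
    "firewall": {1: 1.2, 2: 2.3, 4: 4.1},
    "nat":      {1: 1.8, 2: 3.4, 4: 6.0},
    "dpi":      {1: 0.6, 2: 1.1, 4: 2.0},
}

def predict_chain_throughput(chain, allocation):
    """Naive offline estimate: the chain is limited by its slowest VNF."""
    return min(standalone_profiles[vnf][allocation[vnf]] for vnf in chain)

def refine_prediction(predicted, measured, alpha=0.5):
    """Blend the offline estimate with online monitored throughput."""
    correction = measured / predicted if predicted > 0 else 1.0
    return predicted * ((1 - alpha) + alpha * correction)

chain = ["firewall", "nat", "dpi"]
allocation = {"firewall": 2, "nat": 1, "dpi": 4}

estimate = predict_chain_throughput(chain, allocation)  # from offline profiles
measured = 1.4                                          # e.g. from online monitoring
print("initial estimate:", estimate)
print("refined estimate:", round(refine_prediction(estimate, measured), 3))
```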

    Hybrid token-CDMA MAC protocol for wireless networks

    Thesis (Ph.D.) - University of KwaZulu-Natal, Durban, 2009. Ad hoc networks commonly implement the IEEE 802.11 standard as their medium access control (MAC) protocol. It is well known that token-passing MAC schemes outperform carrier-sense multiple access (CSMA) schemes, and token-passing MAC protocols have therefore gained popularity in recent years, with research extending the token-passing concept to wireless settings because of its potential to achieve higher channel utilization than CSMA-type schemes. In this thesis, a hybrid Token-CDMA MAC protocol based on a token-passing scheme with the incorporation of code division multiple access (CDMA) is introduced. Using a dynamic code distribution algorithm and a modified leaky-bucket policing system, the hybrid protocol is able to provide both Quality of Service (QoS) and high network resource utilization, while ensuring the stability of the network. The thesis begins with the introduction of a new MAC protocol based on a token-passing strategy. The input traffic model used in the simulation is a two-state Markov Modulated Poisson Process (MMPP). The data-rate QoS is enforced by implementing a modified leaky-bucket mechanism in the proposed MAC scheme. The simulation also accounts for link errors on the wireless channel by implementing a multi-layered Gilbert-Elliott model. The performance of the proposed MAC scheme is examined by simulation and compared to the performance of other MAC protocols published in the literature. Simulation results demonstrate that the proposed hybrid MAC scheme is effective in decreasing packet delay and significantly shortens queue lengths. The thesis continues with the analytical model for the hybrid Token-CDMA protocol. The proposed MAC scheme is analytically modelled as a multiserver multiqueue (MSMQ) system with a gated service discipline. The analytical model is divided into three parts, viz. the vacation model, the input model and the buffer model. The throughput and delay performance are then computed and shown to closely match the simulation results. Lastly, cross-layer optimization between the physical (PHY) and MAC layers for the hybrid Token-CDMA scheme is discussed. The proposed joint PHY-MAC approach is based on the interaction between the two layers, enabling stations to dynamically adjust their transmission parameters, which reduces mutual interference and yields optimum system performance.
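    To make the traffic and channel models concrete, the sketch below generates a two-state MMPP arrival stream and a two-state Gilbert-Elliott error process, the building blocks named in the abstract. All rates and transition probabilities are illustrative assumptions, not the thesis parameters (the thesis also uses a multi-layered channel model rather than this single two-state chain).

```python
import numpy as np

def mmpp_arrivals(num_slots, rates=(0.2, 2.0), switch_p=(0.05, 0.10), seed=0):
    """Two-state MMPP: number of packet arrivals per slot.

    rates[i]    -- Poisson arrival rate while the modulating chain is in state i
    switch_p[i] -- probability of leaving state i at the end of a slot
    """
    rng = np.random.default_rng(seed)
    state = 0
    arrivals = np.empty(num_slots, dtype=int)
    for t in range(num_slots):
        arrivals[t] = rng.poisson(rates[state])
        if rng.random() < switch_p[state]:
            state = 1 - state
    return arrivals

def gilbert_elliott_errors(num_slots, p_gb=0.02, p_bg=0.2,
                           err_good=0.001, err_bad=0.2, seed=1):
    """Two-state Gilbert-Elliott link: per-slot packet error indicator."""
    rng = np.random.default_rng(seed)
    bad = False
    errors = np.empty(num_slots, dtype=bool)
    for t in range(num_slots):
        errors[t] = rng.random() < (err_bad if bad else err_good)
        # Good -> Bad with probability p_gb; Bad stays Bad with probability 1 - p_bg.
        bad = (rng.random() < p_gb) if not bad else (rng.random() >= p_bg)
    return errors

traffic = mmpp_arrivals(10_000)
errors = gilbert_elliott_errors(10_000)
print("mean arrivals/slot:", traffic.mean(), "slot error rate:", errors.mean())
```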

    Managing the demand for public housing

    Thesis (Ph.D.) - Massachusetts Institute of Technology, Dept. of Urban Studies and Planning, 1984. Microfiche copy available in Archives and Rotch. Bibliography: leaves 285-286. By Edward Harris Kaplan. Ph.D.

    A Multi-Swarm PSO Approach to Large-Scale Task Scheduling in a Sustainable Supply Chain Datacenter

    Supply chain management is a vital part of ensuring service quality and production efficiency in industrial applications. With the development of cloud computing and data intelligence in modern industries, datacenters have become an important foundation for intelligent applications. However, the growing number and complexity of tasks confront datacenters with increasingly heavy processing demands, leading to long task completion times and long response times during task scheduling. A multi-swarm particle swarm optimization task scheduling approach based on load balancing is proposed in this paper, aiming to reduce the maximum completion time and response time in task scheduling. The proposed approach improves the fitness evaluation function of the particle swarms to facilitate load balancing. A new adaptive inertia weight and initialization method improve the search efficiency and convergence speed of the particles, while the multi-swarm design helps the particles avoid becoming trapped in local optima. Finally, the proposed algorithm is evaluated experimentally on a task dataset released by the Alibaba datacenter and compared with other benchmark algorithms. The results show that the proposed algorithm improves the task scheduling performance of the datacenter in supply chain management when handling different workloads and changes in the number of elastic machines.
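    As a hedged sketch of the general technique, the code below runs a multi-swarm PSO that assigns tasks to machines and minimizes the makespan. The encoding, fitness function and parameters are generic PSO choices made for illustration, not the paper's exact design (which additionally builds load balancing into the fitness and uses an adaptive inertia weight).

```python
import numpy as np

rng = np.random.default_rng(42)
NUM_TASKS, NUM_MACHINES = 60, 8
task_len = rng.uniform(1.0, 10.0, NUM_TASKS)         # task processing requirements
machine_speed = rng.uniform(0.8, 1.5, NUM_MACHINES)  # relative machine speeds

def makespan(position):
    """Decode a continuous position into a task-to-machine assignment and score it."""
    assign = np.floor(position).astype(int) % NUM_MACHINES
    load = np.zeros(NUM_MACHINES)
    np.add.at(load, assign, task_len)
    return np.max(load / machine_speed)

def multi_swarm_pso(num_swarms=4, swarm_size=20, iters=200, w=0.7, c1=1.5, c2=1.5):
    global_best, global_best_fit = None, np.inf
    swarms = []
    for _ in range(num_swarms):
        pos = rng.uniform(0, NUM_MACHINES, (swarm_size, NUM_TASKS))
        fit = np.array([makespan(p) for p in pos])
        swarms.append({"pos": pos, "vel": np.zeros_like(pos),
                       "best": pos.copy(), "best_fit": fit.copy()})
    for _ in range(iters):
        for s in swarms:
            # Each swarm follows its own leader, which limits premature convergence.
            leader = s["best"][np.argmin(s["best_fit"])]
            r1, r2 = rng.random(s["pos"].shape), rng.random(s["pos"].shape)
            s["vel"] = (w * s["vel"] + c1 * r1 * (s["best"] - s["pos"])
                        + c2 * r2 * (leader - s["pos"]))
            s["pos"] = np.clip(s["pos"] + s["vel"], 0, NUM_MACHINES - 1e-9)
            fit = np.array([makespan(p) for p in s["pos"]])
            improved = fit < s["best_fit"]
            s["best"][improved], s["best_fit"][improved] = s["pos"][improved], fit[improved]
            if s["best_fit"].min() < global_best_fit:
                global_best_fit = s["best_fit"].min()
                global_best = s["best"][np.argmin(s["best_fit"])].copy()
    return global_best, global_best_fit

best, best_fit = multi_swarm_pso()
print("best makespan found:", round(best_fit, 3))
```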

    Operating System Support for High-Performance Solid State Drives


    Optimal resource allocation algorithms for cloud computing

    Cloud computing is emerging as an important platform for business, personal and mobile computing applications. We consider a stochastic model of a cloud computing cluster, where jobs arrive according to a random process and request virtual machines (VMs), which are specified in terms of resources such as CPU, memory and storage space. Jobs are routed to one of the servers when they arrive and are queued there. Each server then chooses a set of jobs from its queues such that it has enough resources to serve all of them simultaneously. There are many design issues associated with such systems. One important issue is the resource allocation problem, i.e., the design of algorithms for load balancing among servers and for scheduling VM configurations. Given our model of a cloud, we define its capacity, i.e., the maximum rates at which jobs can be processed in such a system. An algorithm is said to be throughput-optimal if it can stabilize the system whenever the load is within the capacity region. We show that the widely used Best-Fit scheduling algorithm is not throughput-optimal. We first consider the problem where jobs must be scheduled non-preemptively on servers. Under the assumptions that job sizes are known and bounded, we present algorithms that achieve any arbitrary fraction of the capacity region of the cloud. We then relax these assumptions and present a load balancing and scheduling algorithm that is throughput-optimal when job sizes are unknown; in this case, job sizes (durations) are modeled as random variables with possibly unbounded support. In practice, delay is a more important metric than throughput optimality. However, analyzing the delay of resource allocation algorithms is difficult, so we study the system in the asymptotic limit as the load approaches the boundary of the capacity region, known as the heavy-traffic regime. Assuming that jobs can be preempted once after several time slots, we present delay-optimal resource allocation algorithms in the heavy-traffic regime. We study the delay performance of our algorithms through simulations.
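    For reference, the sketch below shows the Best-Fit placement rule discussed in the abstract: each arriving VM request goes to the feasible server with the least remaining capacity. The resource vectors and requests are illustrative only; the paper's point is that this heuristic is not throughput-optimal.

```python
def best_fit(servers, request):
    """servers: list of dicts with remaining 'cpu' and 'mem'; request: same keys.
    Returns the index of the chosen server, or None if no server fits."""
    best_idx, best_slack = None, None
    for i, s in enumerate(servers):
        if s["cpu"] >= request["cpu"] and s["mem"] >= request["mem"]:
            # Prefer the feasible server with the least leftover capacity.
            slack = (s["cpu"] - request["cpu"]) + (s["mem"] - request["mem"])
            if best_slack is None or slack < best_slack:
                best_idx, best_slack = i, slack
    return best_idx

servers = [{"cpu": 8, "mem": 32}, {"cpu": 4, "mem": 16}, {"cpu": 16, "mem": 64}]
requests = [{"cpu": 2, "mem": 8}, {"cpu": 4, "mem": 4}, {"cpu": 8, "mem": 32}]

for req in requests:
    idx = best_fit(servers, req)
    if idx is None:
        print("queued:", req)            # no server can host the VM right now
    else:
        servers[idx]["cpu"] -= req["cpu"]
        servers[idx]["mem"] -= req["mem"]
        print("placed", req, "on server", idx)
```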