Doctor of Philosophy dissertation
As the base of the software stack, system-level software is expected to provide efficient and scalable storage, communication, security and resource management functionalities. However, many functionalities at the system level, such as encryption, packet inspection, and error correction, are computationally expensive and require substantial computing power. Moreover, today's application workloads have entered gigabyte and terabyte scales, which demand even more computing power. To meet the rapidly increasing computing power demand at the system level, this dissertation proposes using parallel graphics processing units (GPUs) in system software. GPUs excel at parallel computing and also show a much faster growth trend in parallel performance than central processing units (CPUs). However, system-level software was originally designed to be latency-oriented, while GPUs are designed for long-running computation and large-scale data processing, which are throughput-oriented. This mismatch makes it difficult to fit system-level software to GPUs. This dissertation presents generic principles of system-level GPU computing developed during the process of creating our two general frameworks for integrating GPU computing in storage and network packet processing. The principles are generic design techniques and abstractions for dealing with common system-level GPU computing challenges. These principles have been evaluated in concrete cases, including storage and network packet processing applications that have been augmented with GPU computing. The significant performance improvement found in the evaluation shows the effectiveness and efficiency of the proposed techniques and abstractions. This dissertation also presents a literature survey of the relatively young system-level GPU computing area, introducing the state of the art in both applications and techniques, as well as their future potential.
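The latency/throughput mismatch described above is commonly bridged by batching: many small, latency-oriented system requests are accumulated and handed to the GPU as one large, throughput-oriented workload. The sketch below is illustrative only, not the dissertation's framework; the class name and parameters are hypothetical, and the GPU kernel launch is elided.

```python
import time
from collections import deque

class GpuBatcher:
    """Hypothetical request batcher: accumulates small system-level requests
    (e.g., packets or I/O blocks) and releases them as one large batch,
    which is the shape of work GPUs process efficiently."""

    def __init__(self, max_batch=64, max_wait_s=0.001):
        self.max_batch = max_batch    # flush when this many requests queue up
        self.max_wait_s = max_wait_s  # or when the oldest request waited this long
        self.queue = deque()
        self.deadline = None

    def submit(self, request):
        if not self.queue:
            self.deadline = time.monotonic() + self.max_wait_s
        self.queue.append(request)

    def maybe_flush(self):
        """Return a batch when it is full or the wait deadline has passed."""
        full = len(self.queue) >= self.max_batch
        timed_out = self.queue and time.monotonic() >= self.deadline
        if full or timed_out:
            n = min(len(self.queue), self.max_batch)
            return [self.queue.popleft() for _ in range(n)]
        return None

b = GpuBatcher(max_batch=4, max_wait_s=10.0)
for r in range(4):
    b.submit(r)
batch = b.maybe_flush()
print(batch)  # full batch released: [0, 1, 2, 3]
```

The `max_wait_s` bound caps the extra latency that batching imposes, which is the central trade-off when fitting latency-oriented system software onto a throughput-oriented device.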
Optimal resource allocation for multi-queue systems with a shared server pool
We study optimal allocation of servers for a system with multiple service facilities and
with a shared pool of servers. Each service facility poses a constraint on the maximum
expected sojourn time of a job. A central decision maker can dynamically allocate servers
to each facility, where adding more servers results in faster processing speeds but at
higher utilization costs. The objective is to dynamically allocate the servers over the
different facilities such that the sojourn-time constraints are met at minimal costs. This
situation occurs frequently in practice, e.g., in Grid systems for real-time image processing
(iris scans, fingerprints). We model this problem as a Markov decision process and derive
structural properties of the relative value function. These properties, which are hard to
derive for multi-dimensional systems, give a full characterization of the optimal policy.
We demonstrate the effectiveness of these policies through extensive numerical experiments.
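The Markov decision process formulation can be illustrated with a toy value-iteration sketch for a single facility: the state is the queue length and the action is the number of servers drawn from the shared pool. All rates, costs, and the uniformization below are assumptions for illustration, not the paper's model; the resulting policy exhibits the kind of monotone structure that a characterization of the optimal policy would establish.

```python
# Toy MDP: state q = queue length, action s = servers allocated (assumed values).
LAM, MU = 0.6, 0.5            # arrival rate, per-server service rate
MAX_Q, MAX_S = 10, 3          # queue cap, shared pool size
C_SERVER, C_HOLD = 1.0, 2.0   # utilization cost per server, holding cost per job
GAMMA = 0.95                  # discount factor
U = LAM + MAX_S * MU          # uniformization constant (same clock for all actions)

V = [0.0] * (MAX_Q + 1)
for _ in range(400):          # value iteration to (near) convergence
    newV, policy = [], []
    for q in range(MAX_Q + 1):
        best = None
        for s in range(1, MAX_S + 1):
            p_arr = LAM / U
            p_dep = s * MU / U if q > 0 else 0.0
            p_stay = 1.0 - p_arr - p_dep
            nxt = (p_arr * V[min(q + 1, MAX_Q)]
                   + p_dep * V[max(q - 1, 0)]
                   + p_stay * V[q])
            cost = C_SERVER * s + C_HOLD * q + GAMMA * nxt
            if best is None or cost < best[0]:
                best = (cost, s)
        newV.append(best[0])
        policy.append(best[1])
    V = newV

print(policy)  # servers allocated for each queue length 0..MAX_Q
```

With a convex holding cost the relative value function is convex, so the optimal allocation is a nondecreasing (threshold) function of the queue length; that is precisely the kind of structural property that makes the multi-dimensional problem tractable.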
VNF performance modelling: from stand-alone to chained topologies
One of the main incentives for deploying network functions on a virtualized or cloud-based infrastructure is the ability for on-demand orchestration and elastic resource scaling following the workload demand. This can also be combined with a multi-party service creation cycle: the service provider sources various network functions from different vendors or developers and combines them into a modular network service. This way, multiple virtual network functions (VNFs) are connected into more complex topologies called service chains. Deployment speed is important here, so it is beneficial if the service provider can limit extra validation testing of the combined service chain and rely on the provided profiling results of the supplied single VNFs. Our research shows, however, that it is not always straightforward to accurately predict the performance of a complete service chain from the isolated benchmark or profiling tests of its discrete network functions. To mitigate this, we propose a two-step deployment workflow: first, a general trend estimate of the chain performance is derived from the stand-alone VNF profiling results, together with an initial resource allocation. This information then optimizes the second phase, in which online monitored data of the service chain is used to quickly adjust the estimated performance model where needed. Our tests show that this can lead to a more efficient VNF chain deployment, needing fewer scaling iterations to meet the chain performance specification while avoiding the need for a complete proactive and time-consuming VNF chain validation.
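The two-step workflow can be sketched in a few lines: phase one derives a bottleneck estimate from the stand-alone profiles, phase two blends that estimate with live chain measurements. The function names, VNF names, throughput figures, and the blending rule are all hypothetical, chosen only to make the shape of the workflow concrete.

```python
# Hypothetical sketch of the two-step workflow: (1) estimate chain performance
# from stand-alone VNF profiles, (2) correct the estimate online from
# monitored chain data. All names and numbers are illustrative.

def chain_estimate(standalone_tputs):
    # Phase 1: a chain can go no faster than its slowest VNF, so take the
    # bottleneck of the stand-alone profiling results as the initial trend.
    return min(standalone_tputs)

def online_correct(estimate, monitored_tput, alpha=0.5):
    # Phase 2: blend the model with live measurements; chaining overhead
    # (shared vSwitch, cache contention) usually pulls the real value down.
    return (1 - alpha) * estimate + alpha * monitored_tput

profiles = {"firewall": 9.0, "dpi": 4.0, "nat": 7.5}  # Gbit/s, stand-alone
est = chain_estimate(profiles.values())
print(est)        # bottleneck estimate: 4.0
corrected = online_correct(est, monitored_tput=3.2)
print(corrected)  # estimate adjusted toward the measurement: 3.6
```

In the paper's terms, the initial estimate cheaply seeds the resource allocation, and the online correction then converges to the true chain behaviour without a full proactive chain validation.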
Hybrid token-CDMA MAC protocol for wireless networks.
Thesis (Ph.D.)-University of KwaZulu-Natal, Durban, 2009.
Ad hoc networks are commonly known to implement the IEEE 802.11 standard as their medium
access control (MAC) protocol. It is well known that token passing MAC schemes
outperform carrier-sense multiple access (CSMA) schemes, and token passing MAC
protocols have therefore gained popularity in recent years. Research has extended the
concept of token passing schemes to wireless settings, since they have the potential of
achieving higher channel utilization than CSMA-type schemes.
In this thesis, a hybrid Token-CDMA MAC protocol that is based on a token passing scheme
with the incorporation of code division multiple access (CDMA) is introduced. Using a
dynamic code distribution algorithm and a modified leaky-bucket policing system, the
hybrid protocol is able to provide both Quality of Service (QoS) and high network resource
utilization, while ensuring the stability of a network. This thesis begins with the introduction
of a new MAC protocol based on a token-passing strategy. The input traffic model used in
the simulation is a two-state Markov Modulated Poisson Process (MMPP). The data rate
QoS is enforced by implementing a modified leaky bucket mechanism in the proposed MAC
scheme. The simulation also takes into account channel link errors caused by the wireless
link by implementing a multi-layered Gilbert-Elliot model. The performance of the proposed
MAC scheme is examined by simulation, and compared to the performance of other MAC
protocols published in the literature. Simulation results demonstrate that the proposed hybrid
MAC scheme is effective in decreasing packet delay and significantly shortens the length of
the queue.
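The two stochastic models named above, a two-state MMPP for input traffic and a Gilbert-Elliot good/bad channel for link errors, can be sketched as a minimal simulation. All rates and transition probabilities below are assumed for illustration; the thesis's actual simulation uses a multi-layered Gilbert-Elliot model and its own parameterization.

```python
import random

# Illustrative simulation of the two models used in the thesis's evaluation:
# a two-state Markov Modulated Poisson Process (MMPP) for arrivals and a
# Gilbert-Elliot good/bad channel for wireless link errors. Rates are assumed.

def mmpp_arrivals(steps, rates=(0.2, 2.0), p_switch=0.1, rng=None):
    """Per-slot Poisson arrival counts whose rate is modulated by a
    two-state Markov chain (low-rate state 0, high-rate state 1)."""
    rng = rng or random.Random(42)
    state, counts = 0, []
    for _ in range(steps):
        if rng.random() < p_switch:          # Markov modulation of the rate
            state = 1 - state
        lam, n, t = rates[state], 0, rng.expovariate(rates[state])
        while t < 1.0:                       # count exponential inter-arrivals
            n += 1
            t += rng.expovariate(lam)
        counts.append(n)
    return counts

def gilbert_elliot(steps, p_gb=0.05, p_bg=0.3, per=(0.001, 0.2), rng=None):
    """Per-slot packet-error indicators from a good(0)/bad(1) channel chain."""
    rng = rng or random.Random(7)
    state, errs = 0, []
    for _ in range(steps):
        if state == 0:
            state = 1 if rng.random() < p_gb else 0
        else:
            state = 0 if rng.random() < p_bg else 1
        errs.append(rng.random() < per[state])
    return errs

arrivals = mmpp_arrivals(1000)
errors = gilbert_elliot(1000)
print(len(arrivals), len(errors))
```

Feeding such arrival traces through the MAC queue while masking slots flagged by the channel model is the standard way to obtain the delay and queue-length statistics reported in the thesis.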
The thesis continues with the discussion of the analytical model for the hybrid Token CDMA
protocol. The proposed MAC scheme is analytically modelled as a multiserver
multiqueue (MSMQ) system with a gated service discipline. The analytical model is
categorized into three sections viz. the vacation model, the input model and the buffer model.
The throughput and delay performance are then computed and shown to closely match the
simulation results. Lastly, cross-layer optimization between the physical (PHY) and MAC
layers for the hybrid token-CDMA scheme is discussed. The proposed joint PHY-MAC
approach is based on the interaction between the two layers in order to enable the stations to
dynamically adjust the transmission parameters resulting in reduced mutual interference and
optimum system performance.
Managing the demand for public housing
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Urban Studies and Planning, 1984. Microfiche copy available in Archives and Rotch. Bibliography: leaves 285-286. By Edward Harris Kaplan, Ph.D.
A Multi-Swarm PSO Approach to Large-Scale Task Scheduling in a Sustainable Supply Chain Datacenter
Supply chain management is a vital part of ensuring service quality and production efficiency in industrial applications. With the development of cloud computing and data intelligence in modern industries, datacenters have become an important foundation for intelligent applications. However, the increasing number and complexity of tasks confront datacenters with increasingly heavy processing demands; as a result, task scheduling in the datacenter suffers from long task completion times and long response times. A multi-swarm particle swarm optimization task scheduling approach based on load balancing is proposed in this paper, aiming to reduce the maximum completion time and response time in task scheduling. The proposed approach improves the fitness evaluation function of the particle swarms to facilitate load balancing, and its new adaptive inertia weight and initialization method improve the search efficiency and convergence speed of the particles. Meanwhile, the multi-swarm design avoids, as far as possible, the problem of particles falling into local optima. Finally, the proposed algorithm is verified experimentally using the task dataset released by the Alibaba datacenter and compared with other benchmark algorithms. The results show that the proposed algorithm can improve the task scheduling performance of the datacenter in supply chain management when dealing with different workloads and changes in the number of elastic machines.
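The core mechanics of a multi-swarm PSO with decaying inertia can be sketched compactly. This is an illustrative skeleton, not the paper's algorithm: the objective below is a stand-in (in the paper a particle would encode a task-to-machine assignment and the fitness would combine makespan, response time, and load balance), and the inertia schedule and coefficients are the classic textbook settings.

```python
import random

def fitness(x):
    """Stand-in objective to minimize; replace with a makespan/load-balance
    fitness over a task-to-machine encoding for the scheduling use case."""
    return sum(v * v for v in x)

def multi_swarm_pso(dim=5, swarms=3, particles=10, iters=200, seed=1):
    rng = random.Random(seed)
    best_val = float("inf")
    for _ in range(swarms):                 # independent swarms lower the risk
        pos = [[rng.uniform(-5, 5) for _ in range(dim)]  # of a shared local optimum
               for _ in range(particles)]
        vel = [[0.0] * dim for _ in range(particles)]
        pbest = [p[:] for p in pos]
        gbest = min(pbest, key=fitness)[:]
        for t in range(iters):
            w = 0.9 - 0.5 * t / iters       # inertia decays: explore, then exploit
            for i in range(particles):
                for d in range(dim):
                    vel[i][d] = (w * vel[i][d]
                                 + 2.0 * rng.random() * (pbest[i][d] - pos[i][d])
                                 + 2.0 * rng.random() * (gbest[d] - pos[i][d]))
                    pos[i][d] += vel[i][d]
                if fitness(pos[i]) < fitness(pbest[i]):
                    pbest[i] = pos[i][:]
                    if fitness(pbest[i]) < fitness(gbest):
                        gbest = pbest[i][:]
        best_val = min(best_val, fitness(gbest))
    return best_val

val = multi_swarm_pso()
print(val)  # best fitness over all swarms, near the optimum at 0
```

Taking the best result across swarms is the simplest multi-swarm design; richer variants migrate particles or re-seed stagnant swarms, which is closer in spirit to what the paper proposes.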
Optimal resource allocation algorithms for cloud computing
Cloud computing is emerging as an important platform for business, personal and mobile computing applications. We consider a stochastic model of a cloud computing cluster, where jobs arrive according to a random process and request virtual machines (VMs), which are specified in terms of resources
such as CPU, memory and storage space. The jobs are first routed to one of the servers when they arrive and are queued at the servers. Each server then
chooses a set of jobs from its queues so that it has enough resources to serve all of them simultaneously.
There are many design issues associated with such systems. One important issue is the resource allocation problem, i.e., the design of algorithms for load
balancing among servers, and algorithms for scheduling VM configurations. Given our model of a cloud, we define its capacity, i.e., the maximum rates at which jobs can be processed in such a system. An algorithm is said
to be throughput-optimal if it can stabilize the system whenever the load is within the capacity region. We show that the widely-used Best-Fit scheduling
algorithm is not throughput-optimal.
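For concreteness, the Best-Fit rule discussed above can be sketched as follows: each arriving VM request is placed on the feasible server that would be left with the least residual capacity. The capacities and VM sizes are illustrative; the point is that this simple greedy rule, despite wide use, is shown in the dissertation not to be throughput-optimal.

```python
# Sketch of Best-Fit VM placement: choose the feasible server that would
# have the least total leftover capacity. Resource vectors are illustrative.

def best_fit(servers, vm):
    """servers: list of free-capacity vectors, vm: requested resource vector.
    Returns the index of the tightest feasible server, or None if none fits."""
    best, best_left = None, None
    for i, free in enumerate(servers):
        if all(f >= r for f, r in zip(free, vm)):       # feasibility check
            left = sum(f - r for f, r in zip(free, vm)) # residual capacity
            if best is None or left < best_left:
                best, best_left = i, left
    return best

servers = [[4, 8], [2, 2]]   # (CPU cores, GB RAM) free on each server
vm = (2, 2)
i = best_fit(servers, vm)
print(i)  # 1: the server left with zero residual capacity is chosen
```

Intuitively, Best-Fit preserves large contiguous capacity, yet adversarial arrival sequences can still force it to idle resources that a throughput-optimal policy would use, which is why the dissertation turns to alternative load balancing and scheduling algorithms.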
We first consider the problem where the jobs need to be scheduled nonpreemptively on servers. Under the assumptions that the job sizes are known
and bounded, we present algorithms that achieve any arbitrary fraction of the capacity region of the cloud. We then relax these assumptions and present
a load balancing and scheduling algorithm that is throughput optimal when job sizes are unknown. In this case, job sizes (durations) are modeled as
random variables with possibly unbounded support.
In practice, delay is a more important metric than throughput optimality. However, the delay analysis of resource allocation algorithms is difficult, so we
study the system in the asymptotic limit as the load approaches the boundary of the capacity region. This limit is called the heavy-traffic regime. Assuming
that the jobs can be preempted once after several time slots, we present delay optimal resource allocation algorithms in the heavy traffic regime. We study
delay performance of our algorithms through simulations.