FCFS Parallel Service Systems and Matching Models
We consider three parallel service models in which customers of several types
are served by several types of servers subject to a bipartite compatibility
graph, and the service policy is first come first served. Two of the models
have a fixed set of servers. The first is a queueing model in which arriving
customers are assigned to the longest idling compatible server if available, or
else queue up in a single queue, and servers that become available pick the
longest waiting compatible customer, as studied by Adan and Weiss, 2014. The
second is a redundancy service model where arriving customers split into copies
that queue up at all the compatible servers, and are served in each queue on
FCFS basis, and leave the system when the first copy completes service, as
studied by Gardner et al., 2016. The third model is a matching queueing model
with a random stream of arriving servers. Arriving customers queue in a single
queue and arriving servers match with the first compatible customer and leave
immediately with the customer, or they leave without a customer. The last model
is relevant to organ transplants, to housing assignments, to adoptions and many
other situations.
We study the relations between these models, and show that they are closely
related to the FCFS infinite bipartite matching model, in which two infinite
sequences of customers and servers of several types are matched FCFS according
to a bipartite compatibility graph, as studied by Adan et al., 2017. We also
introduce a directed bipartite matching model in which we embed the queueing
systems. This leads to a generalization of Burke's theorem to parallel service
systems.
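The core FCFS matching discipline described above can be sketched in a few lines. This is a toy illustration of our own (not the authors' code): finite prefixes of typed customer and server sequences are matched first come first served under a compatibility graph, as in the third (matching queueing) model where each arriving server takes the earliest compatible waiting customer or leaves.

```python
# Toy sketch of FCFS bipartite matching under a compatibility graph.
# Customers wait in a single queue; each arriving server matches the
# earliest unmatched compatible customer, or leaves without one.

def fcfs_match(customers, servers, compatible):
    """customers, servers: type labels in arrival order.
    compatible: dict mapping server type -> set of customer types it serves.
    Returns (customer_index, server_index) matches."""
    matched = [False] * len(customers)
    matches = []
    for j, s_type in enumerate(servers):
        for i, c_type in enumerate(customers):
            if not matched[i] and c_type in compatible[s_type]:
                matched[i] = True
                matches.append((i, j))
                break  # server leaves immediately, matched or not
    return matches

# "N"-shaped compatibility graph: type "a" serves {1}, type "b" serves {1, 2}
compat = {"a": {1}, "b": {1, 2}}
print(fcfs_match([1, 2, 1], ["b", "a", "b"], compat))  # [(0, 0), (2, 1), (1, 2)]
```

Note how the second server (type "a") skips the incompatible type-2 customer and matches the later type-1 customer, which is exactly the overtaking behaviour that makes these FCFS systems non-trivial to analyse.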
Recursive Hardware-as-a-Service (rHaaS) and Fast Provisioning
Hardware as a Service (HaaS) is a new service being developed by the Massachusetts Open Cloud (MOC) to allow physical servers to be allocated to clients in the same way that virtual servers are in existing IaaS clouds.
This poster describes the new recursive HaaS project and the fast provisioning customization we are developing. Recursive HaaS allows a HaaS service to be layered on top of an existing one. It will allow new features to be tested at production performance and scale without affecting the production service. It will also allow clients to host their own HaaS on top of a base HaaS to provide potentially customized services to their users.
An example customization we are developing is a fast provisioning service that can be used between tenants that have some degree of trust in each other. It will allow nodes to be moved between customers (and a service installed) in seconds, rather than the minutes required by the base HaaS.
Scheduling Storms and Streams in the Cloud
Motivated by emerging big streaming data processing paradigms (e.g., Twitter
Storm, Streaming MapReduce), we investigate the problem of scheduling graphs
over a large cluster of servers. Each graph is a job, where nodes represent
compute tasks and edges indicate data-flows between these compute tasks. Jobs
(graphs) arrive randomly over time, and upon completion, leave the system. When
a job arrives, the scheduler needs to partition the graph and distribute it
over the servers to satisfy load balancing and cost considerations.
Specifically, neighboring compute tasks in the graph that are mapped to
different servers incur load on the network; thus a mapping of the jobs among
the servers incurs a cost that is proportional to the number of "broken edges".
We propose a low complexity randomized scheduling algorithm that, without
service preemptions, stabilizes the system with graph arrivals/departures; more
importantly, it allows a smooth trade-off between minimizing average
partitioning cost and average queue lengths. Interestingly, to avoid service
preemptions, our approach does not rely on a Gibbs sampler; instead, we show
that the corresponding limiting invariant measure has an interpretation
stemming from a loss system. Comment: 14 pages.
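The partitioning cost the scheduler trades off against queue lengths can be made concrete. Below is a minimal sketch of our own (not the paper's algorithm) of the "broken edges" cost: neighbouring compute tasks placed on different servers load the network, so a mapping is charged one unit per cut edge.

```python
# Cost of a job placement: number of graph edges whose endpoints
# land on different servers ("broken edges").

def broken_edge_cost(edges, placement):
    """edges: list of (u, v) task pairs; placement: dict task -> server id."""
    return sum(1 for u, v in edges if placement[u] != placement[v])

# A 4-task chain job 0-1-2-3 split across two servers: only edge (1, 2) is cut.
edges = [(0, 1), (1, 2), (2, 3)]
print(broken_edge_cost(edges, {0: "s1", 1: "s1", 2: "s2", 3: "s2"}))  # 1
```

Mapping the whole chain to one server gives cost 0 but concentrates load; splitting it balances load at the price of one broken edge, which is the trade-off the randomized scheduler smooths between.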
ARES: Adaptive, Reconfigurable, Erasure coded, atomic Storage
Atomicity or strong consistency is one of the fundamental, most intuitive,
and hardest to provide primitives in distributed shared memory emulations. To
ensure survivability, scalability, and availability of a storage service in the
presence of failures, traditional approaches for atomic memory emulation, in
message passing environments, replicate the objects across multiple servers.
Compared to replication-based algorithms, erasure code-based atomic memory
algorithms have much lower storage and communication costs, but they are
usually harder to design. The difficulty of designing atomic memory algorithms
further grows, when the set of servers may be changed to ensure survivability
of the service over software and hardware upgrades, while avoiding service
interruptions. Atomic memory algorithms that support server reconfiguration
in replicated systems are few and complex, and remain an active area of
research; reconfigurations of erasure-code-based algorithms are non-existent.
In this work, we present ARES, an algorithmic framework that allows
reconfiguration of the underlying servers, and is particularly suitable for
erasure-code based algorithms emulating atomic objects. ARES introduces new
configurations while keeping the service available. For use with ARES, we also
propose TREAS, a new and, to our knowledge, the first two-round erasure
code-based algorithm for emulating multi-writer, multi-reader (MWMR) atomic objects
in asynchronous, message-passing environments, with near-optimal communication
and storage costs. Our algorithms can tolerate crash failures of any client and
some fraction of the servers, and yet guarantee safety and liveness properties.
Moreover, by bringing together the advantages of ARES and TREAS, we propose an
optimized algorithm where new configurations can be installed without the
objects' values passing through the reconfiguration clients.
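The write ordering that MWMR atomic register emulations of this kind rely on can be sketched with logical tags. This is a hedged toy of our own (the class and method names are our invention, not TREAS's actual protocol messages; TREAS additionally erasure-codes the value and waits only for quorums rather than all replicas): a tag is a (counter, writer-id) pair ordered lexicographically, so concurrent writes by different writers are still totally ordered.

```python
# Toy tag-based ordering for a multi-writer atomic register.
# Round 1: the writer learns the maximum tag; round 2: it stores the
# value under a strictly larger tag. Ties between writers break on id.

class Replica:
    def __init__(self):
        self.tag, self.value = (0, ""), None

    def query(self):
        return self.tag

    def store(self, tag, value):
        if tag > self.tag:              # keep only the newest version
            self.tag, self.value = tag, value

def write(replicas, writer_id, value):
    counter = max(r.query() for r in replicas)[0] + 1   # round 1
    tag = (counter, writer_id)
    for r in replicas:                  # round 2 (a real algorithm awaits a quorum)
        r.store(tag, value)
    return tag

replicas = [Replica() for _ in range(3)]
write(replicas, "w1", "x")
write(replicas, "w2", "y")
print(replicas[0].value)  # "y" -- the later write carries the larger tag
```

The two rounds here mirror the "near-optimal" two-round structure claimed for TREAS, though the real algorithm disperses coded fragments instead of full copies.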
C2MS: Dynamic Monitoring and Management of Cloud Infrastructures
Server clustering is a common design principle employed by many organisations
that require high availability, scalability and easier management of their
infrastructure. Servers are typically clustered according to the service they
provide, whether that be the application(s) installed, the role of the server,
or server accessibility, for example. In order to optimize performance, manage load
and maintain availability, servers may migrate from one cluster group to
another making it difficult for server monitoring tools to continuously monitor
these dynamically changing groups. Server monitoring tools are usually
statically configured, and any change of group membership requires manual
reconfiguration; this is an unreasonable task to undertake on large-scale cloud
infrastructures.
In this paper we present the Cloudlet Control and Management System (C2MS); a
system for monitoring and controlling dynamic groups of physical or virtual
servers within cloud infrastructures. The C2MS extends Ganglia - an open source
scalable system performance monitoring tool - by allowing system administrators
to define, monitor and modify server groups without the need for server
reconfiguration. In turn administrators can easily monitor group and individual
server metrics on large-scale dynamic cloud infrastructures where roles of
servers may change frequently. Furthermore, we complement group monitoring with
a control element that allows administrator-specified actions to be performed
over servers within service groups, and we introduce further customized
monitoring metrics. This paper outlines the design, implementation and
evaluation of the C2MS. Comment: Proceedings of the 5th IEEE International Conference on Cloud
Computing Technology and Science (CloudCom 2013), 8 pages.
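The central idea of resolving group membership at query time, rather than baking it into the monitor's configuration, can be sketched as follows. This is a minimal illustration of our own (not C2MS or Ganglia code): because the monitor looks servers up in a registry on every query, migrating a server between groups needs no reconfiguration of the monitoring tool.

```python
# Toy dynamic-group registry: group membership is resolved at query
# time, so moving a server between groups is a single reassignment.

class GroupRegistry:
    def __init__(self):
        self.group_of = {}                 # server -> current group name

    def assign(self, server, group):
        self.group_of[server] = group      # migration = reassignment, no restart

    def members(self, group):
        return sorted(s for s, g in self.group_of.items() if g == group)

reg = GroupRegistry()
reg.assign("node1", "web")
reg.assign("node2", "web")
reg.assign("node2", "db")                  # node2 migrates to the db group
print(reg.members("web"))  # ['node1']
```

A statically configured monitor would instead hard-code node2 into the "web" group's config file and require manual editing on every migration, which is the problem the paper identifies at cloud scale.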
Does Tipping Help to Attract and Retain Better Service Workers?
A survey of several hundred restaurant servers in the United States found that servers’ attitudes toward working for tips and average tip sizes were weakly related (at best) to their service-orientation, intended job-tenure, and occupational-tenure. These findings suggest that tipping does not substantially help to attract and retain more service-oriented workers. Restaurateurs can eliminate tipping at their restaurants without fear that doing so will reduce the quality of their wait-staff.
Holistic Resource Management for Sustainable and Reliable Cloud Computing: An Innovative Solution to Global Challenge
Minimizing the energy consumption of servers within cloud computing systems is of utmost importance to cloud providers towards reducing operational costs and enhancing service sustainability by consolidating services onto fewer active servers. Moreover, providers must also provision high levels of availability and reliability, hence cloud services are frequently replicated across servers, which subsequently increases server energy consumption and resource overhead. These two objectives can present a potential conflict within cloud resource management decision making, which must balance between service consolidation and replication to minimize energy consumption whilst maximizing server availability and reliability, respectively. In this paper, we propose a cuckoo optimization-based energy-reliability aware resource scheduling technique (CRUZE) for holistic management of cloud computing resources including servers, networks, storage, and cooling systems. CRUZE clusters and executes heterogeneous workloads on provisioned cloud resources, enhances energy-efficiency, and reduces the carbon footprint in datacenters without adversely affecting cloud service reliability. We evaluate the effectiveness of CRUZE against existing state-of-the-art solutions using the CloudSim toolkit. Results indicate that our proposed technique is capable of reducing energy consumption by 20.1% whilst improving reliability and CPU utilization by 17.1% and 15.7%, respectively, without affecting other Quality of Service parameters.
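The consolidation-versus-replication conflict the paper describes can be illustrated with a toy objective. This is our own illustration, not CRUZE's actual model (the power figures and weights below are invented for the example): consolidating onto fewer active servers cuts energy, while replicating for reliability keeps more servers powered.

```python
# Toy energy/reliability trade-off: active servers draw power, and a
# weighted score credits reliability (replica count) against energy.

def energy_cost(active_servers, idle_power=100, busy_power=200):
    # every powered-on server draws power, busy ones draw more
    return sum(busy_power if load > 0 else idle_power
               for load in active_servers)

def score(active_servers, replicas, alpha=1.0, beta=50.0):
    """Lower is better: weighted energy minus a credit for reliability."""
    return alpha * energy_cost(active_servers) - beta * replicas

# consolidated placement: 2 busy servers, 1 replica of the service
# replicated placement:   4 busy servers, 3 replicas of the service
print(score([1, 1], 1), score([1, 1, 1, 1], 3))  # 350.0 650.0
```

A metaheuristic such as the cuckoo-based search used by CRUZE would explore placements against an objective of this general shape, with the weights encoding how the provider values energy savings against reliability.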
Analysis of Multiserver Retrial Queueing System: A Martingale Approach and an Algorithm of Solution
The paper studies a multiserver retrial queueing system. The arrival process
is a point process with strictly stationary and ergodic increments. A customer
arriving to the system occupies one of the free servers. If upon arrival all
servers are busy, then the customer goes to the secondary queue, the orbit,
and after some random time retries repeatedly to occupy a server. The service
time of each customer is an exponentially distributed random variable, and the
time between retrials is exponentially distributed for each customer. Using a
martingale approach, the paper provides an analysis of this system. The paper
establishes the stability condition and studies the behavior of the limiting
queue-length distributions as the number of servers increases to infinity. In
this limit, the paper also proves the convergence of appropriate queue-length
distributions to those of the associated `usual' multiserver queueing system
without retrials. An algorithm for the numerical solution of the equations
associated with the limiting queue-length distribution of retrial systems is
provided. Comment: To appear in "Annals of Operations Research" 141 (2006) 19-52.
The replacement corrects a small number of misprints.
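The retrial dynamics described above can be sketched as a simple stochastic simulation. This is a hedged illustration of our own (not the paper's martingale analysis, and with Poisson arrivals rather than the more general stationary ergodic arrivals the paper allows): blocked arrivals join an orbit and attempt to re-enter service after exponential delays.

```python
import random

# Toy simulation of a multiserver retrial queue: arrivals that find all
# m servers busy join the orbit and retry after exponential delays.

def simulate(m, lam, mu, gamma, horizon, seed=0):
    """m servers, arrival rate lam, service rate mu, retrial rate gamma."""
    rng = random.Random(seed)
    t, busy, orbit = 0.0, 0, 0
    arrivals = served = 0
    while t < horizon:
        rates = [lam, busy * mu, orbit * gamma]   # arrival / completion / retry
        t += rng.expovariate(sum(rates))
        event = rng.choices(range(3), weights=rates)[0]
        if event == 0:                    # new arrival
            arrivals += 1
            if busy < m:
                busy += 1
            else:
                orbit += 1                # all servers busy: go to orbit
        elif event == 1:                  # service completion
            busy -= 1
            served += 1
        elif busy < m:                    # successful retrial from orbit
            busy += 1
            orbit -= 1
            # a retry that finds all servers busy simply stays in orbit
    return arrivals, served, busy, orbit

a, s, b, o = simulate(m=3, lam=2.0, mu=1.0, gamma=0.5, horizon=1000)
print(a == s + b + o)  # True: customers are conserved
```

Raising the number of servers while keeping the load fixed drives the orbit toward emptiness, consistent with the paper's convergence to the ordinary multiserver queue without retrials.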
Deceit: A flexible distributed file system
Deceit, a distributed file system (DFS) being developed at Cornell, focuses on flexible file semantics in relation to efficiency, scalability, and reliability. Deceit servers are interchangeable and collectively provide the illusion of a single, large server machine to any clients of the Deceit service. Non-volatile replicas of each file are stored on a subset of the file servers. The user is able to set parameters on a file to achieve different levels of availability, performance, and one-copy serializability. Deceit also supports a file version control mechanism. In contrast with many recent DFS efforts, Deceit can behave like a plain Sun Network File System (NFS) server and can be used by any NFS client without modifying any client software. The current Deceit prototype uses the ISIS Distributed Programming Environment for all communication and process group management, an approach that reduces system complexity and increases system robustness.
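The per-file replication parameter described above can be sketched as a placement rule. This is a small illustration of our own (not Deceit's implementation; the hash-ranking scheme here is an assumption chosen only to make placement deterministic): each file's non-volatile replicas live on a user-chosen number of servers drawn from the pool.

```python
import hashlib

# Toy per-file replica placement: rank servers by a stable hash of
# (filename, server) and keep the top replication_factor of them.

def place_replicas(filename, servers, replication_factor):
    key = lambda s: hashlib.md5(f"{filename}:{s}".encode()).hexdigest()
    return sorted(servers, key=key)[:replication_factor]

pool = ["srv1", "srv2", "srv3", "srv4", "srv5"]
print(len(place_replicas("report.txt", pool, 3)))  # 3 replicas
```

Because the ranking depends only on the file name and server names, any client recomputes the same replica set without coordination; a higher replication factor buys availability at the cost of storage, which mirrors Deceit's tunable per-file availability/performance parameters.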