15,653 research outputs found
Video Streaming in Distributed Erasure-coded Storage Systems: Stall Duration Analysis
The demand for global video has been burgeoning across industries. With the
expansion and improvement of video-streaming services, cloud-based video is
evolving into a necessary feature of any successful business for reaching
internal and external audiences. This paper considers video streaming over
distributed systems where the video segments are encoded using an erasure code
for better reliability, making this, to the best of our knowledge, the first
work to consider video streaming over erasure-coded distributed cloud systems.
The download time of each coded chunk of each video segment is characterized,
and order statistics over the choice of the erasure-coded chunks are used to
obtain the playback time of different video segments. Using the playback times,
bounds on the moment generating function of the stall duration are used to
bound the mean stall duration. Moment-generating-function-based bounds on the
order statistics are also used to bound the stall duration tail probability,
which is the probability that the stall time exceeds a pre-defined threshold.
These two metrics, mean stall duration and stall duration tail probability,
are important quality-of-experience (QoE) measures for the end users. Based on
these metrics, we formulate an optimization problem that jointly minimizes a
convex combination of the two QoE metrics, averaged over all requests, over
the placement and access of the video content. The resulting non-convex
problem is solved using an efficient iterative algorithm. Numerical results
show significant improvement in QoE metrics for cloud-based video as compared
to the considered baselines.
Comment: 18 pages, accepted to IEEE/ACM Transactions on Networking
Design and Implementation of Intelligent Community System Based on Thin Client and Cloud Computing
With the continuous development of science and technology, the intelligent
development of community systems has become a trend. Meanwhile, smart mobile
devices and cloud computing technology are increasingly used in intelligent
information systems. However, smart mobile devices such as smartphones and
smart pads, also known as thin clients, are limited by their capacities (CPU,
memory, or battery) or their network resources and do not always meet users'
expectations of mobile services. Mobile cloud computing, in which
resource-rich virtual machines backing smart mobile devices are provided to
customers as a service, can be an effective solution for overcoming the
limitations of physical smart mobile devices, but its resource utilization
rate is low and information cannot be shared easily. To address these
problems, this paper proposes an information system for intelligent
communities composed of thin clients, a broadband network, and cloud computing
servers. On one hand, the thin clients, characterized by energy efficiency,
high robustness, and high computing capacity, can avoid the problems
encountered in PC architectures and mobile devices. On the other hand, the
cloud computing servers in the proposed system remove the barriers to resource
sharing. Finally, the system is deployed in a real environment to evaluate its
performance. We deploy the proposed system in a community with more than 2,000
residents and demonstrate that it is robust and efficient.
Kubernetes as an Availability Manager for Microservice Applications
The move towards the microservice-based architecture is well underway. In
this architectural style, small and loosely coupled modules are developed,
deployed, and scaled independently to compose cloud-native applications.
However, for carrier-grade service providers to migrate to the microservices
architectural style, availability remains a concern. Kubernetes is an open
source platform that defines a set of building blocks which collectively
provide mechanisms for deploying, maintaining, scaling, and healing
containerized microservices. Thus, Kubernetes hides the complexity of
microservice orchestration while managing their availability. In a preliminary
work, we evaluated Kubernetes, using its default configuration, from the
availability perspective in a private cloud setting. In this paper, we
investigate more architectures and conduct more experiments to evaluate the
availability that Kubernetes delivers for its managed microservices. We present
different architectures for public and private clouds. We evaluate the
availability achievable through the healing capability of Kubernetes. We
investigate the impact of adding redundancy on the availability of microservice
based applications. We conduct experiments under the default configuration of
Kubernetes as well as under its most responsive one. We also perform a
comparative evaluation with the Availability Management Framework (AMF), a
proven middleware solution for managing high availability. The results of our
investigations show that in certain cases, the service outage for applications
managed with Kubernetes is significantly high.
Comment: paper submitted to Journal of Network and Computer Applications
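As a rough illustration of why the redundancy experiments above matter, the classical steady-state availability model can be sketched as follows. This is a textbook model, not the paper's measurement methodology, and the MTTF/MTTR figures are hypothetical:

```python
# Sketch of steady-state availability with and without redundancy.
# Illustrative only: the MTTF/MTTR numbers below are hypothetical,
# not measurements from the paper.

def availability(mttf: float, mttr: float) -> float:
    """Steady-state availability of a single instance."""
    return mttf / (mttf + mttr)

def redundant_availability(a: float, replicas: int) -> float:
    """Availability of `replicas` independent instances, assuming the
    service is up as long as at least one instance is up."""
    return 1.0 - (1.0 - a) ** replicas

single = availability(mttf=1000.0, mttr=1.0)   # one replica
triple = redundant_availability(single, 3)     # three replicas
print(f"single:     {single:.6f}")
print(f"3 replicas: {triple:.9f}")
```

The model makes the trade-off visible: redundancy multiplies the "nines" only when failures are independent, which is why the paper's experiments vary both the architecture and the repair (healing) configuration.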
Common Metrics for Analyzing, Developing and Managing Telecommunication Networks
Metrics play an increasingly fundamental role in the design, development,
deployment and operation of telecommunication systems. Despite their
importance, the studies of metrics are usually limited to a narrow area or a
well-defined objective. Our study aims to more broadly survey the metrics that
are commonly used for analyzing, developing and managing telecommunication
networks in order to facilitate understanding of the current metrics landscape.
The metrics are simple abstractions of systems, and they directly influence how
the systems are perceived by different stakeholders. However, defining and
using metrics for telecommunication systems of ever-increasing complexity is
a complicated matter that has not so far been systematically and
comprehensively considered in the literature. Common sources of metrics are
identified, and how metrics are used and selected is discussed. The most
commonly used metrics for telecommunication systems are categorized and
presented as energy and power metrics, quality-of-service metrics,
quality-of-experience metrics, security metrics, and reliability and resilience
metrics. Finally, research directions and recommendations on how metrics can
evolve and be defined and used more effectively are outlined.
Comment: 5 figures, 18 tables
Improving Robustness of Heterogeneous Serverless Computing Systems Via Probabilistic Task Pruning
Cloud-based serverless computing is an increasingly popular computing
paradigm. In this paradigm, different services have diverse computing
requirements that justify deploying an inconsistently Heterogeneous Computing
(HC) system to efficiently process them. In an inconsistently HC system, each
task needed for a given service potentially exhibits different execution times
on each type of machine. An ideal resource allocation system must be aware of
such uncertainties in execution times and be robust against them, so that
Quality of Service (QoS) requirements of users are met. This research aims to
maximize the robustness of an HC system utilized to offer a serverless
computing system, particularly when the system is oversubscribed. Our strategy
to maximize robustness is to develop a task pruning mechanism that can be added
to existing task-mapping heuristics without altering them. Pruning tasks with a
low probability of meeting their deadlines improves the likelihood of other
tasks meeting their deadlines, thereby increasing system robustness and overall
QoS. To evaluate the impact of the pruning mechanism, we examine it on various
configurations of heterogeneous and homogeneous computing systems. Evaluation
results indicate a considerable improvement (up to 35%) in the system
robustness.
Comment: IPDPSW '1
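The pruning idea above can be sketched as a filter placed in front of any existing task-mapping heuristic. The sketch below is a minimal illustration, not the paper's mechanism: the task fields, the normal model for completion times, and the threshold value are all assumptions made for the example:

```python
# Minimal sketch of probabilistic task pruning: drop tasks whose
# estimated probability of meeting their deadline falls below a
# threshold, so the remaining tasks are likelier to finish on time.
# The task model and the threshold here are illustrative assumptions.
import math
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    mean_completion: float  # estimated mean completion time
    std_completion: float   # estimated std. dev. of completion time
    deadline: float

def prob_meets_deadline(t: Task) -> float:
    """P(completion <= deadline) under a normal approximation."""
    z = (t.deadline - t.mean_completion) / t.std_completion
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prune(tasks: list, threshold: float = 0.25) -> list:
    """Keep only tasks likely enough to meet their deadlines."""
    return [t for t in tasks if prob_meets_deadline(t) >= threshold]

queue = [
    Task("a", mean_completion=5.0, std_completion=1.0, deadline=10.0),
    Task("b", mean_completion=12.0, std_completion=1.0, deadline=10.0),
]
survivors = prune(queue)
print([t.name for t in survivors])  # task "b" is pruned
```

Because the filter only removes tasks before mapping, it composes with any heuristic unchanged, which is the property the abstract emphasizes.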
A Comparative Taxonomy and Survey of Public Cloud Infrastructure Vendors
An increasing number of technology enterprises are adopting cloud-native
architectures to offer their web-based products, by moving away from
privately-owned data-centers and relying exclusively on cloud service
providers. As a result, the number of cloud vendors has lately increased,
along with their estimated annual revenue. However, in the process of selecting a
provider's cloud service over the competition, we observe a lack of universal
common ground in terms of terminology, functionality of services and billing
models. This is an important gap, especially under the new reality of the
industry, where each cloud provider has moved towards its own service taxonomy,
while the number of specialized services has grown exponentially. This work
discusses cloud services offered by four dominant, in terms of their current
market share, cloud vendors. We provide a taxonomy of their services and
sub-services that designates major service families namely computing, storage,
databases, analytics, data pipelines, machine learning, and networking. The aim
of such clustering is to indicate similarities, common design approaches and
functional differences of the offered services. The outcomes are essential both
for individual researchers and bigger enterprises in their attempt to identify
the set of cloud services that will fully meet their needs without
compromises. While we acknowledge that this is a dynamic industry,
where new services arise constantly, and old ones experience important updates,
this study paints a solid image of the current offerings and gives prominence
to the directions that cloud service providers are following.
Leveraging Deep Learning to Improve the Performance Predictability of Cloud Microservices
Performance unpredictability is a major roadblock towards cloud adoption, and
has performance, cost, and revenue ramifications. Predictable performance is
even more critical as cloud services transition from monolithic designs to
microservices. Detecting QoS violations after they occur in systems with
microservices results in long recovery times, as hotspots propagate and amplify
across dependent services. We present Seer, an online cloud performance
debugging system that leverages deep learning and the massive amount of tracing
data cloud systems collect to learn spatial and temporal patterns that
translate to QoS violations. Seer combines lightweight distributed RPC-level
tracing with detailed low-level hardware monitoring to signal an upcoming QoS
violation, and diagnose the source of unpredictable performance. Once an
imminent QoS violation is detected, Seer notifies the cluster manager to take
action to avoid performance degradation altogether. We evaluate Seer both in
local clusters, and in large-scale deployments of end-to-end applications built
with microservices with hundreds of users. We show that Seer correctly
anticipates QoS violations 91% of the time, and avoids the QoS violation to
begin with in 84% of cases. Finally, we show that Seer can identify
application-level design bugs, and provide insights on how to better architect
microservices to achieve predictable performance.
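Seer's actual detector is a deep learning model over massive RPC-level traces. As a much simpler illustration of the core idea of acting before a violation occurs rather than after, a toy trend-based early warning might look like the following (all names and numbers are hypothetical, not from the paper):

```python
# Toy illustration of early QoS-violation warning: flag a service
# whose measured latency is trending toward its QoS target before the
# target is actually crossed. Seer itself uses deep learning over
# distributed traces; this linear-trend check is only a sketch.

def imminent_violation(latencies: list, qos_target: float,
                       horizon: int = 3) -> bool:
    """Extrapolate the recent latency trend `horizon` steps ahead and
    report whether it would exceed the QoS target."""
    if len(latencies) < 2:
        return False
    slope = latencies[-1] - latencies[-2]   # simple one-step trend
    projected = latencies[-1] + horizon * slope
    return projected > qos_target

# Latency is still under the 100 ms target, but rising fast.
samples = [40.0, 55.0, 70.0, 85.0]
print(imminent_violation(samples, qos_target=100.0))  # True
```

The point of the sketch is the timing: a warning raised while the metric is still within its target gives the cluster manager room to react, which is what distinguishes Seer from after-the-fact QoS-violation detection.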
Massivizing Computer Systems: a Vision to Understand, Design, and Engineer Computer Ecosystems through and beyond Modern Distributed Systems
Our society is digital: industry, science, governance, and individuals
depend, often transparently, on the inter-operation of large numbers of
distributed computer systems. Although society takes them almost for
granted, these computer ecosystems are not available for all, may not be
affordable for long, and raise numerous other research challenges. Inspired by
these challenges and by our experience with distributed computer systems, we
envision Massivizing Computer Systems, a domain of computer science focused on
successfully understanding, controlling, and evolving such ecosystems. Beyond
establishing and growing a body of knowledge about computer ecosystems and
their constituent systems, the community in this domain should also aim to
educate many about design and engineering for this domain, and all people about
its principles. This is a call to the entire community: there is much to
discover and achieve.
Big Data Computing Using Cloud-Based Technologies, Challenges and Future Perspectives
The massive amounts of data generated by devices and Internet-based sources
on a regular basis constitute big data. This data can be processed and
analyzed to develop useful applications for specific domains. Several
mathematical and data analytics techniques have found use in this sphere. This
has given rise to the development of computing models and tools for big data
computing. However, the storage and processing requirements are overwhelming
for traditional systems and technologies. Therefore, there is a need for
infrastructures that can adjust the storage and processing capability in
accordance with the changing data dimensions. Cloud Computing serves as a
potential solution to this problem. However, big data computing in the cloud
has its own set of challenges and research issues. This chapter surveys the big
data concept, discusses the mathematical and data analytics techniques that can
be used for big data, and gives a taxonomy of the existing tools, frameworks and
platforms available for different big data computing models. Besides this, it
also evaluates the viability of cloud-based big data computing, examines
existing challenges and opportunities, and provides future research directions
in this field.
When Social Sensing Meets Edge Computing: Vision and Challenges
This paper overviews the state of the art, research challenges, and future
opportunities in an emerging research direction: Social Sensing based Edge
Computing (SSEC). Social sensing has emerged as a new sensing application
paradigm where measurements about the physical world are collected from humans
or from devices on their behalf. The advent of edge computing pushes the
frontier of computation, service, and data along the cloud-to-things continuum.
The merging of these two technical trends generates a set of new research
challenges that need to be addressed. In this paper, we first define the new
SSEC paradigm that is motivated by a few underlying technology trends. We then
present a few representative real-world case studies of SSEC applications and
several key research challenges that exist in those applications. Finally, we
envision a few exciting research directions in future SSEC. We hope this paper
will stimulate discussions of this emerging research direction in the
community.
Comment: This manuscript has been accepted to ICCCN 201