3,621 research outputs found
Stochastic Reward Net-based Modeling Approach for Availability Quantification of Data Center Systems
Availability quantification and prediction of IT infrastructure in data centers are of paramount importance for online business enterprises. In this chapter, we present comprehensive availability models for practical case studies in order to demonstrate a state-space stochastic reward net model for typical data center systems for quantitative assessment of system availability. We present stochastic reward net models of a virtualized server system, a data center network based on DCell topology, and a conceptual data center for disaster tolerance. The systems are then evaluated against various metrics of interest, including steady state availability, downtime and downtime cost, and sensitivity analysis
Model-based sensitivity analysis of IaaS cloud availability
The increasing shift of various critical services towards Infrastructure-as-a-Service (IaaS) cloud data centers (CDCs) creates a need for analyzing CDCs’ availability, which is affected by various factors including repair policy and system parameters. This paper aims to apply analytical modeling and sensitivity analysis techniques to investigate the impact of these factors on the availability of a large-scale IaaS CDC, which (1) consists of active and two kinds of standby physical machines (PMs), (2) allows PM moving among active and two kinds of standby PM pools, and (3) allows active and two kinds of standby PMs to have different mean repair times. Two repair policies are considered: (P1) all pools share a repair station and (P2) each pool uses its own repair station. We develop monolithic availability models for each repair policy by using Stochastic Reward Nets and also develop the corresponding scalable two-level models in order to overcome the monolithic model''s limitations, caused by the large-scale feature of a CDC and the complicated interactions among CDC components. We also explore how to apply differential sensitivity analysis technique to conduct parametric sensitivity analysis in the case of interacting sub-models. Numerical results of monolithic models and simulation results are used to verify the approximate accuracy of interacting sub-models, which are further applied to examine the sensitivity of the large-scale CDC availability with respect to repair policy and system parameters
A Taxonomy for Management and Optimization of Multiple Resources in Edge Computing
Edge computing is promoted to meet increasing performance needs of
data-driven services using computational and storage resources close to the end
devices, at the edge of the current network. To achieve higher performance in
this new paradigm one has to consider how to combine the efficiency of resource
usage at all three layers of architecture: end devices, edge devices, and the
cloud. While cloud capacity is elastically extendable, end devices and edge
devices are to various degrees resource-constrained. Hence, an efficient
resource management is essential to make edge computing a reality. In this
work, we first present terminology and architectures to characterize current
works within the field of edge computing. Then, we review a wide range of
recent articles and categorize relevant aspects in terms of 4 perspectives:
resource type, resource management objective, resource location, and resource
use. This taxonomy and the ensuing analysis is used to identify some gaps in
the existing research. Among several research gaps, we found that research is
less prevalent on data, storage, and energy as a resource, and less extensive
towards the estimation, discovery and sharing objectives. As for resource
types, the most well-studied resources are computation and communication
resources. Our analysis shows that resource management at the edge requires a
deeper understanding of how methods applied at different levels and geared
towards different resource types interact. Specifically, the impact of mobility
and collaboration schemes requiring incentives are expected to be different in
edge architectures compared to the classic cloud solutions. Finally, we find
that fewer works are dedicated to the study of non-functional properties or to
quantifying the footprint of resource management techniques, including
edge-specific means of migrating data and services.Comment: Accepted in the Special Issue Mobile Edge Computing of the Wireless
Communications and Mobile Computing journa
Fault-Tolerance in the Scope of Cloud Computing
Fault-tolerance methods are required to ensure high availability and high reliability in cloud computing environments. In this survey, we address fault-tolerance in the scope of cloud computing. Recently, cloud computing-based environments have presented new challenges to support fault-tolerance and opened new paths to develop novel strategies, architectures, and standards. We provide a detailed background of cloud computing to establish a comprehensive understanding of the subject, from basic to advanced. We then highlight fault-tolerance components and system-level metrics and identify the needs and applications of fault-tolerance in cloud computing. Furthermore, we discuss state-of-the-art proactive and reactive approaches to cloud computing fault-tolerance. We further structure and discuss current research efforts on cloud computing fault-tolerance architectures and frameworks. Finally, we conclude by enumerating future research directions specific to cloud computing fault-tolerance development.publishe
- …