1,468 research outputs found
Datacenter Traffic Control: Understanding Techniques and Trade-offs
Datacenters provide cost-effective and flexible access to scalable compute
and storage resources necessary for today's cloud computing needs. A typical
datacenter is made up of thousands of servers connected with a large network
and usually managed by one operator. To provide quality access to the variety
of applications and services hosted on datacenters and maximize performance, it
deems necessary to use datacenter networks effectively and efficiently.
Datacenter traffic is often a mix of several classes with different priorities
and requirements. This includes user-generated interactive traffic, traffic
with deadlines, and long-running traffic. To this end, custom transport
protocols and traffic management techniques have been developed to improve
datacenter network performance.
In this tutorial paper, we review the general architecture of datacenter
networks, various topologies proposed for them, their traffic properties,
general traffic control challenges in datacenters and general traffic control
objectives. The purpose of this paper is to bring out the important
characteristics of traffic control in datacenters and not to survey all
existing solutions (as it is virtually impossible due to massive body of
existing research). We hope to provide readers with a wide range of options and
factors while considering a variety of traffic control mechanisms. We discuss
various characteristics of datacenter traffic control including management
schemes, transmission control, traffic shaping, prioritization, load balancing,
multipathing, and traffic scheduling. Next, we point to several open challenges
as well as new and interesting networking paradigms. At the end of this paper,
we briefly review inter-datacenter networks that connect geographically
dispersed datacenters which have been receiving increasing attention recently
and pose interesting and novel research problems.Comment: Accepted for Publication in IEEE Communications Surveys and Tutorial
Dynamic Thermal and Power Management: From Computers to Buildings
Thermal and power management have become increasingly important for both computing and physical systems. Computing systems from real-time embedded systems to data centers require effective thermal and power management to prevent overheating and save energy. In the mean time, as a major consumer of energy buildings face challenges to reduce the energy consumption for air conditioning while maintaining comfort of occupants. In this dissertation we investigate dynamic thermal and power management for computer systems and buildings. (1) We present thermal control under utilization bound (TCUB), a novel control-theoretic thermal management algorithm designed for single core real-time embedded systems. A salient feature of TCUB is to maintain both desired processor temperature and real-time performance. (2) To address unique challenges posed by multicore processors, we develop the real-time multicore thermal control (RT-MTC) algorithm. RT-MTC employs a feedback control loop to enforce the desired temperature and CPU utilization of the multicore platform via dynamic frequency and voltage scaling. (3) We research dynamic thermal management for real-time services running on server clusters. We develop the control-theoretic thermal balancing (CTB) to dynamically balance temperature of servers via distributing clients\u27 service requests to servers. Next, (4) we propose CloudPowerCap, a power cap management system for virtualized cloud computing infrastructure. The novelty of CloudPowerCap lies in an integrated approach to coordinate power budget management and resource management in a cloud computing environment. Finally we expand our research to physical environment by exploring several fundamental problems of thermal and power management on buildings. We analyze spatial and temporal data acquired from an real-world auditorium instrumented by a multi-modal sensor network. We propose a data mining technique to determine the appropriate number and location of temperature sensors for estimating the spatiotemporal temperature distribution of the auditorium. Furthermore, we explore the potential energy savings that can be achieved through occupancy-based HVAC scheduling based on real occupancy data of the auditorium
Energy-efficient Transitional Near-* Computing
Studies have shown that communication networks, devices accessing the Internet, and data centers account for 4.6% of the worldwide electricity consumption.
Although data centers, core network equipment, and mobile devices are getting more energy-efficient, the amount of data that is being processed, transferred, and stored is vastly increasing.
Recent computer paradigms, such as fog and edge computing, try to improve this situation by processing data near the user, the network, the devices, and the data itself.
In this thesis, these trends are summarized under the new term near-* or near-everything computing.
Furthermore, a novel paradigm designed to increase the energy efficiency of near-* computing is proposed: transitional computing.
It transfers multi-mechanism transitions, a recently developed paradigm for a highly adaptable future Internet, from the field of communication systems to computing systems.
Moreover, three types of novel transitions are introduced to achieve gains in energy efficiency in near-* environments, spanning from private Infrastructure-as-a-Service (IaaS) clouds, Software-defined Wireless Networks (SDWNs) at the edge of the network, Disruption-Tolerant Information-Centric Networks (DTN-ICNs) involving mobile devices, sensors, edge devices as well as programmable components on a mobile System-on-a-Chip (SoC).
Finally, the novel idea of transitional near-* computing for emergency response applications is presented
to assist rescuers and affected persons during an emergency event or a disaster, although connections to cloud services and social networks might be disturbed by network outages, and network bandwidth and battery power of mobile devices might be limited
Optimizing Virtual Machine I/O Performance in Cloud Environments
Maintaining closeness between data sources and data consumers is crucial for workload I/O performance. In cloud environments, this kind of closeness can be violated by system administrative events and storage architecture barriers. VM migration events are frequent in cloud environments. VM migration changes VM runtime inter-connection or cache contexts, significantly degrading VM I/O performance. Virtualization is the backbone of cloud platforms. I/O virtualization adds additional hops to workload data access path, prolonging I/O latencies. I/O virtualization overheads cap the throughput of high-speed storage devices and imposes high CPU utilizations and energy consumptions to cloud infrastructures. To maintain the closeness between data sources and workloads during VM migration, we propose Clique, an affinity-aware migration scheduling policy, to minimize the aggregate wide area communication traffic during storage migration in virtual cluster contexts. In host-side caching contexts, we propose Successor to recognize warm pages and prefetch them into caches of destination hosts before migration completion. To bypass the I/O virtualization barriers, we propose VIP, an adaptive I/O prefetching framework, which utilizes a virtual I/O front-end buffer for prefetching so as to avoid the on-demand involvement of I/O virtualization stacks and accelerate the I/O response. Analysis on the traffic trace of a virtual cluster containing 68 VMs demonstrates that Clique can reduce inter-cloud traffic by up to 40%. Tests of MPI Reduce_scatter benchmark show that Clique can keep VM performance during migration up to 75% of the non-migration scenario, which is more than 3 times of the Random VM choosing policy. In host-side caching environments, Successor performs better than existing cache warm-up solutions and achieves zero VM-perceived cache warm-up time with low resource costs. At system level, we conducted comprehensive quantitative analysis on I/O virtualization overheads. Our trace replay based simulation demonstrates the effectiveness of VIP for data prefetching with ignorable additional cache resource costs
Networking - A Statistical Physics Perspective
Efficient networking has a substantial economic and societal impact in a
broad range of areas including transportation systems, wired and wireless
communications and a range of Internet applications. As transportation and
communication networks become increasingly more complex, the ever increasing
demand for congestion control, higher traffic capacity, quality of service,
robustness and reduced energy consumption require new tools and methods to meet
these conflicting requirements. The new methodology should serve for gaining
better understanding of the properties of networking systems at the macroscopic
level, as well as for the development of new principled optimization and
management algorithms at the microscopic level. Methods of statistical physics
seem best placed to provide new approaches as they have been developed
specifically to deal with non-linear large scale systems. This paper aims at
presenting an overview of tools and methods that have been developed within the
statistical physics community and that can be readily applied to address the
emerging problems in networking. These include diffusion processes, methods
from disordered systems and polymer physics, probabilistic inference, which
have direct relevance to network routing, file and frequency distribution, the
exploration of network structures and vulnerability, and various other
practical networking applications.Comment: (Review article) 71 pages, 14 figure
Advances in Grid Computing
This book approaches the grid computing with a perspective on the latest achievements in the field, providing an insight into the current research trends and advances, and presenting a large range of innovative research papers. The topics covered in this book include resource and data management, grid architectures and development, and grid-enabled applications. New ideas employing heuristic methods from swarm intelligence or genetic algorithm and quantum encryption are considered in order to explain two main aspects of grid computing: resource management and data management. The book addresses also some aspects of grid computing that regard architecture and development, and includes a diverse range of applications for grid computing, including possible human grid computing system, simulation of the fusion reaction, ubiquitous healthcare service provisioning and complex water systems
Recommended from our members
Resource allocation optimization in large scale distributed systems
We studied the problem of resource allocation in large scale distributed applications such as Online Social Networks (OSN) and Cloud Computing. In such settings, resource allocation schemes need to efficient as well as adaptive to the time-varying environments. The abstract resource allocation problem concerns with how to optimally use resources for different tasks. In the context of this dissertation, the resources are servers and the tasks are (a) the virtual machines in the cloud computing setting, and or users for on-line social network applications. It is well-known that the general resource allocation problem is NP-hard. Therefore, in this dissertation, we study a number of heuristic algorithms designed for two primary objectives: 1) achieve reliability via load balancing among resource providers and 2) minimizing the energy consumption by reducing unnecessary intercommunication loads among the servers.
Specifically, the dissertation has three main components. The first component deals with optimal assignment of user data to servers to maximize load balance and minimize power consumption. In this component, we propose a novel Distributed Perturbed Greedy Search (DPGS) algorithm which combine both deterministic search and random search to speed the convergence while avoiding local optimum. The empirical shows that the DPGS has a fast convergence rate to the near optimal solution even when the environment changes. The second component deals with the analysis on the convergence rates of a general simulated annealing algorithm via the notion of adiabatic time. We then apply the results to characterize the convergence rates for simulated annealing algorithm when applied to the optimal assignment in the component one. Finally, the third component of the dissertation is concerned with optimal assignment of virtual machines to servers in the context of cloud computing, in order to minimize the energy subject to a given performance requirement. We show that the problem can be approximated well as a convex problem, and propose convex relaxation technique to find the optimal solution
- …