Datacenter Traffic Control: Understanding Techniques and Trade-offs
Datacenters provide cost-effective and flexible access to scalable compute
and storage resources necessary for today's cloud computing needs. A typical
datacenter is made up of thousands of servers connected with a large network
and usually managed by one operator. To provide quality access to the variety
of applications and services hosted on datacenters and maximize performance, it
is necessary to use datacenter networks effectively and efficiently.
Datacenter traffic is often a mix of several classes with different priorities
and requirements. This includes user-generated interactive traffic, traffic
with deadlines, and long-running traffic. To this end, custom transport
protocols and traffic management techniques have been developed to improve
datacenter network performance.
In this tutorial paper, we review the general architecture of datacenter
networks, various topologies proposed for them, their traffic properties,
general traffic control challenges in datacenters and general traffic control
objectives. The purpose of this paper is to bring out the important
characteristics of traffic control in datacenters and not to survey all
existing solutions (as it is virtually impossible due to the massive body of
existing research). We hope to provide readers with a wide range of options and
factors while considering a variety of traffic control mechanisms. We discuss
various characteristics of datacenter traffic control including management
schemes, transmission control, traffic shaping, prioritization, load balancing,
multipathing, and traffic scheduling. Next, we point to several open challenges
as well as new and interesting networking paradigms. At the end of this paper,
we briefly review inter-datacenter networks that connect geographically
dispersed datacenters which have been receiving increasing attention recently
and pose interesting and novel research problems.
Comment: Accepted for Publication in IEEE Communications Surveys and Tutorials
Evaluator services for optimised service placement in distributed heterogeneous cloud infrastructures
Optimal placement of demanding real-time interactive applications in a distributed heterogeneous cloud very quickly results in a complex tradeoff between the application constraints and resource capabilities. This requires very detailed information about the various requirements and capabilities of the applications and available resources. In this paper, we present a mathematical model for the service optimization problem and study the concept of evaluator services as a flexible and efficient solution for this complex problem. An evaluator service is a service probe that is deployed in particular runtime environments to assess the feasibility and cost-effectiveness of deploying a specific application in such an environment. We discuss how this concept can be incorporated in a general framework such as the FUSION architecture and discuss the key benefits and tradeoffs of evaluator-based optimal service placement in widely distributed heterogeneous cloud environments.
Real-Time Data Processing With Lambda Architecture
Data has evolved immensely in recent years, in type, volume and velocity. Several frameworks exist to handle big data applications. The project focuses on the Lambda Architecture (LA) proposed by Marz and its application to real-time data processing. The architecture unites the benefits of batch and stream processing techniques. Data can be processed historically with high precision and involved algorithms without loss of short-term information, alerts and insights. The Lambda Architecture can serve a wide range of use cases and workloads while withstanding hardware failures and human mistakes. The layered architecture enhances loose coupling and flexibility in the system. This is a huge benefit that allows understanding the trade-offs and application of various tools and technologies across the layers. The approach to building the LA has advanced due to improvements in the underlying tools. The project demonstrates a simplified architecture for the LA that is maintainable.
R^3: On-device Real-Time Deep Reinforcement Learning for Autonomous Robotics
Autonomous robotic systems, like autonomous vehicles and robotic search and
rescue, require efficient on-device training for continuous adaptation of Deep
Reinforcement Learning (DRL) models in dynamic environments. This research is
fundamentally motivated by the need to understand and address the challenges of
on-device real-time DRL, which involves balancing timing and algorithm
performance under memory constraints, as exposed through our extensive
empirical studies. This intricate balance requires co-optimizing two pivotal
parameters of DRL training -- batch size and replay buffer size. Configuring
these parameters significantly affects timing and algorithm performance, while
both (unfortunately) require substantial memory allocation to achieve
near-optimal performance.
This paper presents R^3, a holistic solution for managing timing, memory, and
algorithm performance in on-device real-time DRL training. R^3 employs (i) a
deadline-driven feedback loop with dynamic batch sizing for optimizing timing,
(ii) efficient memory management to reduce memory footprint and allow larger
replay buffer sizes, and (iii) a runtime coordinator guided by heuristic
analysis and a runtime profiler for dynamically adjusting memory resource
reservations. These components collaboratively tackle the trade-offs in
on-device DRL training, improving timing and algorithm performance while
minimizing the risk of out-of-memory (OOM) errors.
We implemented and evaluated R^3 extensively across various DRL frameworks
and benchmarks on three hardware platforms commonly adopted by autonomous
robotic systems. Additionally, we integrate R^3 with a popular realistic
autonomous car simulator to demonstrate its real-world applicability.
Evaluation results show that R^3 achieves efficacy across diverse platforms,
ensuring consistent latency performance and timing predictability with minimal
overhead.
Comment: Accepted by RTSS 202
Improving the Scalability of DPWS-Based Networked Infrastructures
The Devices Profile for Web Services (DPWS) specification enables seamless
discovery, configuration, and interoperability of networked devices in various
settings, ranging from home automation and multimedia to manufacturing
equipment and data centers. Unfortunately, the sheer simplicity of the event
notification mechanisms, which makes DPWS fit for resource-constrained devices,
also makes it hard to scale to large infrastructures with more stringent
dependability requirements, ironically where self-configuration would be most
useful. In this report, we address this challenge with a proposal to integrate
gossip-based dissemination in DPWS, thus maintaining compatibility with
original assumptions of the specification, and avoiding a centralized
configuration server or custom black-box middleware components. In detail, we
show how our approach provides an evolutionary and non-intrusive solution to
the scalability limitations of DPWS and experimentally evaluate it with an
implementation based on the Web Services for Devices (WS4D) Java Multi
Edition DPWS Stack (JMEDS).
Comment: 28 pages, Technical Report