A Reconfigurable High-Performance Optical Data Center Architecture
Optical data center network architectures are becoming attractive because of
their low energy consumption, large bandwidth, and low cabling complexity.
In \cite{Xu1605:PODCA}, an AWGR-based passive optical data center architecture
(PODCA) is presented. Compared with other optical data center architectures,
e.g., DOS \cite{ye2010scalable}, Proteus \cite{singla2010proteus}, and Petabit
\cite{xia2010petabit}, PODCA can save up to 90% on power consumption and
88% in cost. Also, average latency can be as low as 9 $\mu$s at close to
100% throughput. However, PODCA is not reconfigurable and cannot adapt
the network topology to dynamic traffic.
In this paper, we present a novel, scalable and flexible reconfigurable
architecture called RODCA. RODCA is built on and augments PODCA with a flexible
localized intra-cluster optical network. With the reconfigurable intra-cluster
network, racks exchanging heavy mutual traffic can be placed in the same
cluster and share the large bandwidth of the intra-cluster network. We present
an algorithm for DCN topology reconfiguration, and present simulation results
to demonstrate the effectiveness of reconfiguration.
On the Optimality of Scheduling Dependent MapReduce Tasks on Heterogeneous Machines
MapReduce is the most popular big-data computation framework, motivating many
research topics. A MapReduce job consists of two successive phases, i.e., a map
phase and a reduce phase. Each phase can be divided into multiple tasks. A reduce
task can only start when all the map tasks finish processing. A job is
successfully completed when all its map and reduce tasks are complete.
Optimally scheduling these tasks on different servers to minimize the weighted
completion time is an open problem, and is the focus of this paper. We give an
approximation algorithm whose competitive ratio depends on the number of
servers and on the task-skewness product. We implement the proposed algorithm
on the Hadoop framework,
and compare it with three baseline schedulers. Results show that our DMRS
algorithm outperforms the baseline schedulers.
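The weighted-completion-time objective above can be illustrated with the classic weighted-shortest-processing-time (WSPT) rule, which is optimal on a single machine. This is a hedged sketch of a textbook baseline, not the paper's DMRS algorithm, and the job instance is invented for illustration:

```python
# Weighted-completion-time scheduling on one machine via the WSPT rule:
# sort jobs by weight/processing-time ratio (descending), then sum w_j * C_j.
# Classic baseline only -- NOT the DMRS algorithm from the abstract.

def wspt_schedule(jobs):
    """jobs: list of (weight, processing_time) pairs.
    Returns (order, total weighted completion time)."""
    order = sorted(jobs, key=lambda j: j[0] / j[1], reverse=True)
    t, total = 0.0, 0.0
    for w, p in order:
        t += p            # completion time C_j of this job
        total += w * t    # accumulate w_j * C_j
    return order, total

# Hypothetical instance: (weight, processing_time) pairs.
jobs = [(3, 1), (1, 2), (2, 2)]
order, obj = wspt_schedule(jobs)
print(order, obj)  # order (3,1), (2,2), (1,2); objective 3*1 + 2*3 + 1*5 = 14
```

Handling precedence between map and reduce tasks on heterogeneous servers, as the paper does, is considerably harder than this single-machine special case.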
Chronos: A Unifying Optimization Framework for Speculative Execution of Deadline-critical MapReduce Jobs
Meeting desired application deadlines in cloud processing systems such as
MapReduce is crucial as the nature of cloud applications is becoming
increasingly mission-critical and deadline-sensitive. It has been shown that
the execution times of MapReduce jobs are often adversely impacted by a few
slow tasks, known as stragglers, which result in high latency and deadline
violations. While a number of strategies have been developed in existing work
to mitigate stragglers by launching speculative or clone task attempts, none of
them provides a quantitative framework that optimizes the speculative execution
for offering guaranteed Service Level Agreements (SLAs) to meet application
deadlines. In this paper, we bring several speculative scheduling strategies
together under a unifying optimization framework, called Chronos, which defines
a new metric, Probability of Completion before Deadlines (PoCD), to measure the
probability that MapReduce jobs meet their desired deadlines. We systematically
analyze PoCD for popular strategies including Clone, Speculative-Restart, and
Speculative-Resume, and quantify their PoCD in closed-form. The result
illuminates an important tradeoff between PoCD and the cost of speculative
execution, measured by the total (virtual) machine time required under
different strategies. We propose an optimization problem to jointly optimize
PoCD and execution cost in different strategies, and develop an algorithmic
solution that is guaranteed to be optimal. Chronos is prototyped on Hadoop
MapReduce and evaluated against three baseline strategies using both
experiments and trace-driven simulations, achieving 50% net utility increase
with up to 80% PoCD and 88% cost improvements.
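The PoCD metric can be estimated by straightforward Monte Carlo simulation. The sketch below assumes, purely for illustration, i.i.d. exponential task attempt times and the Clone strategy (each task launches r identical copies and finishes when the fastest copy does); the rates and parameters are invented, not taken from the paper:

```python
import random

def pocd_clone(num_tasks, deadline, clones, mean_task_time, trials=20000, seed=1):
    """Monte Carlo estimate of the Probability of Completion before Deadline
    (PoCD) under a Clone strategy: each of num_tasks tasks runs `clones`
    copies with i.i.d. exponential times; the job meets the deadline if its
    slowest task (fastest copy per task) finishes by `deadline`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        job_time = max(
            min(rng.expovariate(1.0 / mean_task_time) for _ in range(clones))
            for _ in range(num_tasks)
        )
        if job_time <= deadline:
            hits += 1
    return hits / trials

# More clones raise PoCD but cost proportionally more machine time --
# the PoCD-vs-cost tradeoff the Chronos framework optimizes over.
for r in (1, 2, 3):
    print(r, pocd_clone(num_tasks=10, deadline=2.0, clones=r, mean_task_time=1.0))
```

With exponential attempt times the estimate can be checked in closed form, since per-task completion is the minimum of r exponentials; for general distributions the Monte Carlo route is the practical one.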
Trading Off Computation with Transmission in Status Update Systems
This paper is motivated by emerging edge computing applications in which
generated data are pre-processed at the source and then transmitted to an edge
server. In such a scenario, there is typically a tradeoff between the amount of
pre-processing and the amount of data to be transmitted. We model such a system
by considering two non-preemptive queues in tandem whose service times are
independent over time but the transmission service time is dependent on the
computation service time in mean value. The first queue is in M/GI/1/1 form
with a single server, memoryless exponential arrivals, general independent
service and no extra buffer to save incoming status update packets. The second
queue is in GI/M/1/2* form with a single server receiving packets from the
first queue, memoryless service and a single data buffer to save incoming
packets. Additionally, mean service times of the first and second queues are
dependent through a deterministic monotonic function. We perform stationary
distribution analysis in this system and obtain closed form expressions for
average age of information (AoI) and average peak AoI. Our numerical results
illustrate the analytical findings and highlight the tradeoff between average
AoI and average peak AoI generated by the tandem nature of the queueing system
with dependent service times.
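Average AoI expressions of this kind can be sanity-checked by discrete-event simulation. The sketch below simulates a single M/M/1/1 status-update queue, a deliberate simplification of the paper's tandem M/GI/1/1 → GI/M/1/2* model (one stage, exponential service, arrivals dropped while the server is busy), and computes the time-average AoI from the area under the sawtooth; all rates are illustrative assumptions:

```python
import random

def average_aoi_mm11(lam, mu, num_departures=50000, seed=7):
    """Simulate an M/M/1/1 status-update queue (arrivals during service are
    dropped) and return the time-average Age of Information at the receiver.
    AoI follows a sawtooth: it grows linearly and drops to (delivery time -
    generation time) whenever an update is delivered."""
    rng = random.Random(seed)
    t = 0.0                        # current time
    area = 0.0                     # integral of AoI over time
    last_dep, last_gen = 0.0, 0.0  # previous delivery and its generation time
    for _ in range(num_departures):
        t += rng.expovariate(lam)  # next accepted arrival (server is idle)
        gen = t                    # its generation time stamp
        t += rng.expovariate(mu)   # service completes, update delivered
        # Trapezoid of the AoI sawtooth between two consecutive deliveries:
        width = t - last_dep
        area += width * (last_dep - last_gen) + width * width / 2.0
        last_dep, last_gen = t, gen
    return area / t

avg = average_aoi_mm11(lam=1.0, mu=1.0)
print(avg)  # roughly 2.5 for lam = mu = 1
```

The renewal structure used here (idle period, then service, then delivery) is what makes the stationary-distribution analysis in the paper tractable; the tandem case adds dependence between the two stages' service times.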
On Age and Value of Information in Status Update Systems
Motivated by the inherent value of packets arising in many cyber-physical
applications (e.g., due to precision of the information content or an alarm
message), we consider status update systems with update packets carrying values
as well as their generation time stamps. Once generated, a status update packet
has a random initial value and a deterministic deadline after which it is not
useful (ultimate staleness). In our model, value of a packet decreases in time
(even after reception) starting from its generation to ultimate staleness when
it vanishes. The value of information (VoI) at the receiver is additive in that
the VoI is the sum of the current values of all packets held by the receiver.
We investigate various queuing disciplines under potential dependence between
value and service time and provide closed form expressions for average VoI at
the receiver. Numerical results illustrate the average VoI for different
scenarios and the contrast between average age of information (AoI) and average
VoI.
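The additive VoI notion can be made concrete with a tiny deterministic computation: each received packet's value decays linearly from its initial value at generation to zero at ultimate staleness, and the receiver's VoI is the sum of current values. The linear decay profile, packet tuples, and numbers below are illustrative assumptions, not the paper's exact model:

```python
def value_of_information(packets, now):
    """Additive VoI at the receiver at time `now`.
    Each packet is (generation_time, initial_value, deadline_after_generation);
    its current value decays linearly to zero at ultimate staleness, and stale
    packets contribute nothing."""
    voi = 0.0
    for gen, v0, ttl in packets:
        age = now - gen
        if 0.0 <= age < ttl:
            voi += v0 * (1.0 - age / ttl)  # linear decay from v0 to 0
    return voi

# Three received packets: generated at t = 0, 2, 4 with initial values
# 10, 6, 8 and deadlines 5, 5, 10 time units after generation.
packets = [(0.0, 10.0, 5.0), (2.0, 6.0, 5.0), (4.0, 8.0, 10.0)]
print(value_of_information(packets, now=4.0))
# 10*(1-4/5) + 6*(1-2/5) + 8*(1-0/10) = 2 + 3.6 + 8 = 13.6
```

Unlike AoI, which depends only on the freshest delivered packet, this quantity keeps crediting older packets until their deadlines pass, which is why the two metrics can disagree.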
Optimizing Information Freshness Through Computation-Transmission Tradeoff and Queue Management in Edge Computing
Edge computing applications typically require generated data to be
preprocessed at the source and then transmitted to an edge server. In such
cases, transmission time and preprocessing time are coupled, yielding a
tradeoff between them to achieve the targeted objective. This paper presents
analysis of such a system with the objective of optimizing freshness of
received data at the edge server. We model this system as two queues in tandem
whose service times are independent over time but the transmission service time
is monotonically dependent on the computation service time in mean value. This
dependence captures the natural decrease in transmission time due to lower
offloaded computation. We analyze various queue management schemes in this
tandem queue where the first queue has a single server, Poisson packet
arrivals, general independent service and no extra buffer to save incoming
status update packets. The second queue has a single server receiving packets
from the first queue and service is memoryless. We consider the second queue in
two forms: (i) No data buffer and (ii) One unit data buffer and last come first
serve with discarding. We analyze various non-preemptive as well as preemptive
cases. We perform stationary distribution analysis and obtain closed form
expressions for average age of information (AoI) and average peak AoI. Our
numerical results illustrate analytical findings on how computation and
transmission times could be traded off to optimize AoI and reveal a consequent
tradeoff between average AoI and average peak AoI.
Waiting before Serving: A Companion to Packet Management in Status Update Systems
In this paper, we explore the potential of server waiting before packet
transmission in improving the Age of Information (AoI) in status update
systems. We consider a non-preemptive queue with Poisson arrivals and
independent general service distribution and we incorporate waiting before
serving in two packet management schemes: M/GI/1/1 and M/GI/1/2*. In the
M/GI/1/1 scheme, the server waits for a deterministic time immediately after a
packet enters the server. In the M/GI/1/2* scheme, depending on whether the
system is idle or busy, the server waits for a deterministic time before
starting service of the packet. In both cases, if a newer arrival is captured
during the wait, the existing packet is discarded. Different from most existing
works, we analyze AoI
evolution by indexing the incoming packets, which is enabled by an alternative
method of partitioning the area under the evolution of instantaneous AoI to
calculate its time average. We obtain expressions for average and average peak
AoI for both queueing disciplines with waiting. Our numerical results
demonstrate that waiting before service can bring significant improvement in
average age, particularly, for heavy-tailed service distributions. This
improvement comes at the expense of an increase in average peak AoI. We
highlight the trade-off between average and average peak AoI generated by
waiting before serving.
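The effect of waiting before service can be probed numerically. The sketch below simulates an M/GI/1/1-style discipline with heavy-tailed (Pareto) service and a deterministic server wait w during which fresher arrivals replace the held packet; it is a simplification of the paper's schemes, and the arrival rate, Pareto shape, and wait values are all illustrative assumptions:

```python
import random

def avg_aoi_wait(lam, wait, trials=40000, seed=11, alpha=2.5):
    """Average AoI in an M/GI/1/1-style update queue where the server holds a
    packet for a deterministic `wait` before serving it. Arrivals during the
    wait replace the held packet (fresher is better); arrivals during service
    are dropped. Service is Pareto-distributed (heavy-tailed) with shape
    `alpha` (> 2 so the second moment, and hence average AoI, is finite)."""
    rng = random.Random(seed)
    t = area = 0.0
    last_dep = last_gen = 0.0
    for _ in range(trials):
        t += rng.expovariate(lam)     # a packet enters the idle server
        gen = t
        start_srv = t + wait          # service begins after the waiting period
        # Fresher arrivals during the wait replace the held packet.
        nxt = t + rng.expovariate(lam)
        while nxt < start_srv:
            gen = nxt
            nxt = nxt + rng.expovariate(lam)
        service = rng.paretovariate(alpha)  # heavy-tailed service time
        t = start_srv + service             # delivery instant
        # Trapezoid of the AoI sawtooth between consecutive deliveries:
        width = t - last_dep
        area += width * (last_dep - last_gen) + width * width / 2.0
        last_dep, last_gen = t, gen
    return area / t

for w in (0.0, 0.5, 1.0):
    print(w, avg_aoi_wait(lam=1.0, wait=w))
```

Sweeping w exposes the tradeoff the abstract describes: waiting can lower average AoI by letting fresher packets into service, while stretching inter-delivery gaps and thus raising average peak AoI.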
On double-link failure recovery in WDM optical networks
Network survivability is a crucial requirement in high-speed optical networks. Typical approaches to providing survivability have considered the failure of a single component such as a link or a node. In this paper, we consider a failure model in which any two links in the network may fail in an arbitrary order. Three loopback methods of recovering from double-link failures are presented. The first two methods require the identification of the failed links, while the third one does not. However, precomputing the backup paths for the third method is more difficult than for the first two. A heuristic algorithm that precomputes backup paths for links is presented. Numerical results comparing the performance of our algorithm with other approaches suggest that it is possible to achieve 100% recovery from double-link failures with a modest increase in backup capacity. Index Terms: Wavelength division multiplexing (WDM), loopback recovery, restoration, double-link failures, 3-edge-connected graph.
A Hierarchical WDM-based Scalable Data Center Network Architecture
Massive data centers are at the heart of the Internet. The rapid growth of
Internet traffic and the abundance of rich data-driven applications have raised
the need for enormous network bandwidth. Towards meeting this growing traffic
demand, optical interconnects have gained significant attention, as they can
provide high throughput, low latency, and scalability. In particular, optical
Wavelength Division Multiplexing (WDM) provides the possibility to build data
centers comprising millions of servers, while providing hundreds of terabits
per second bandwidth.
In this paper, we propose a WDM-based Reconfigurable Hierarchical Optical
Data Center Architecture (RHODA) that can satisfy future Internet traffic
demands. To improve scalability, our DCN architecture is hierarchical, as it
groups server racks into clusters. Cluster membership is reconfigurable through
the use of optical switches. Each cluster enables heavy-traffic communication
among the racks within. To support varying traffic patterns, the inter-cluster
network topology and link capacities are also reconfigurable, which is achieved
through the use of optical space switches and Wavelength Selective Switches
(WSSs). Our simulation results demonstrate that in terms of average hop
distance, RHODA outperforms OSA, FatTree and WaveCube by up to 81%, 66% and
60%, respectively.
Maintaining Information Freshness in Power-Efficient Status Update Systems
This paper is motivated by emerging edge computing systems which consist of
sensor nodes that acquire and process information and then transmit status
updates to an edge receiver for possible further processing. As power is a
scarce resource at the sensor nodes, the system is modeled as a tandem
computation-transmission queue with power-efficient computing. Jobs arrive at
the computation server according to a Poisson process with rate $\lambda$; there is no
available data buffer. The computation server can be in one of three states:
(i) OFF: the server is turned off and no jobs are observed or processed, (ii)
ON-Idle: the server is turned on but there is no job in the server, (iii)
ON-Busy: the server is turned on and a job is processed in the server. These
states cost zero, one, and more than one unit of power, respectively. Under a long-term
power constraint, the computation server switches from one state to another in
sequence: first for a deterministic duration in the OFF state, then waiting for
a job arrival in ON-Idle state and then in ON-Busy state for an independent
identically distributed compute time duration. The transmission server has a
single unit data buffer to save incoming packets and applies last come first
serve with discarding as well as a packet deadline to discard a sitting packet
for maintaining information freshness, which is measured by the Age of
Information (AoI). Additionally, there is a monotonic functional relation
between the mean time spent in ON-Busy state and the mean transmission time. We
obtain closed-form expressions for average AoI and average peak AoI. Our
numerical results illustrate various regimes of operation for the best AoI
performance, optimized over packet deadlines, in relation to power efficiency.