Data-Driven Robust Optimization
The last decade witnessed an explosion in the availability of data for
operations research applications. Motivated by this growing availability, we
propose a novel schema for utilizing data to design uncertainty sets for robust
optimization using statistical hypothesis tests. The approach is flexible and
widely applicable, and robust optimization problems built from our new sets are
computationally tractable, both theoretically and practically. Furthermore,
optimal solutions to these problems enjoy a strong, finite-sample probabilistic
guarantee. We describe concrete procedures for choosing an appropriate
set for a given application and for applying our approach to multiple
uncertain constraints. Computational evidence in portfolio management and
queuing confirms that our data-driven sets significantly outperform
traditional robust optimization techniques whenever data is available.
Comment: 38 pages, 15 page appendix, 7 figures. This version updated as of
Oct. 201
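The idea of building an uncertainty set from data and then optimizing against its worst case can be illustrated with a much simpler stand-in than the paper's hypothesis-test constructions: a per-asset box set taken from empirical quantiles, applied to a long-only portfolio. All data, quantile levels, and function names below are illustrative assumptions, not the paper's actual procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical daily return samples for 3 assets (illustrative data only).
data = rng.normal(loc=[0.001, 0.002, 0.0015],
                  scale=[0.01, 0.02, 0.015], size=(500, 3))

def box_uncertainty_set(samples, alpha=0.05):
    """Per-asset box set from empirical quantiles: a crude stand-in for
    the paper's hypothesis-test-based set constructions."""
    lo = np.quantile(samples, alpha / 2, axis=0)
    hi = np.quantile(samples, 1 - alpha / 2, axis=0)
    return lo, hi

def robust_worst_case_return(w, lo, hi):
    """For long-only weights w >= 0, the worst case over the box is
    attained at the lower corner, so the robust objective is w . lo."""
    return float(w @ lo)

lo, hi = box_uncertainty_set(data)
w = np.array([0.5, 0.3, 0.2])
print(robust_worst_case_return(w, lo, hi))
```

The robust objective is always at most the nominal (sample-mean) return, which is the price paid for the probabilistic guarantee; richer sets like those in the paper trade off this conservatism against coverage.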
COST-EFFICIENT RESOURCE PROVISIONING FOR CLOUD-ENABLED SCHEDULERS
Over the last decade, public cloud platforms have rapidly become the de facto computing platform for our society. To support the wide range of users and their diverse applications, public cloud platforms began offering the same VMs under many purchasing options that differ in cost, performance, availability, and time commitments. Popular purchasing options include on-demand, reserved, and transient VM types. Reserved VMs require long time commitments, whereas users can acquire and release on-demand (and transient) VMs at any time. While transient VMs cost significantly less than on-demand VMs, platforms may revoke them at any time. In general, the stronger the commitment, i.e., the longer and less flexible, the lower the price. However, longer and less flexible time commitments can increase cloud costs for users if future workloads cannot utilize the VMs they committed to buying. Interestingly, this wide range of purchasing options provides opportunities for cost savings. However, large cloud customers often find it challenging to choose the right mix of purchasing options to minimize their long-term costs while retaining the ability to adjust their capacity up and down in response to workload variations. Thus, optimizing cloud costs requires users to select a mix of VM purchasing options based on their short- and long-term expectations of workload utilization. Notably, hybrid clouds combine multiple VM purchasing options, or private clusters with public cloud VMs, to optimize cloud costs based on workload expectations. In this thesis, we address the challenge of choosing a mix of different VM purchasing options in the context of large cloud customers, thereby optimizing their cloud costs.
To this end, we make the following contributions: (i) design and implement a container orchestration platform (using Kubernetes) to optimize the cost of executing mixed interactive and batch workloads on cloud platforms using on-demand and transient VMs, (ii) develop simple analytical models for different straggler mitigation techniques to better understand the cost of synchronization in distributed machine learning workloads and compare their cost and performance on on-demand and transient VMs, (iii) design multiple policies to optimize long-term cloud costs by selecting a mix of VM purchasing options based on short- and long-term expectations of workload utilization (with no job waiting), (iv) introduce the concept of a waiting policy for cloud-enabled schedulers, and show that provisioning long-term resources (e.g., reserved VMs) to optimize cloud costs depends on it, and (v) design and implement speculative execution and ML-based waiting time predictions (for waiting policies) to show that optimizing job waiting in the cloud is possible without accurate job runtime predictions.
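The core trade-off between reserved and on-demand VMs can be sketched in a few lines: pay a lower rate for capacity committed over every hour, or a higher rate only for hours actually used. The prices and demand trace below are hypothetical placeholders, not figures from the thesis, and the brute-force search is a minimal stand-in for the thesis's policies.

```python
# Hypothetical prices (USD/hour); real cloud prices vary widely by
# region, VM type, and commitment term.
RESERVED_RATE = 0.06   # effective hourly rate, paid for every hour of the term
ON_DEMAND_RATE = 0.10  # paid only for hours actually used

def total_cost(demand, num_reserved):
    """Cost of serving an hourly demand trace with a fixed reserved pool
    plus on-demand VMs for any overflow above the pool."""
    reserved_cost = num_reserved * RESERVED_RATE * len(demand)
    overflow_hours = sum(max(d - num_reserved, 0) for d in demand)
    return reserved_cost + overflow_hours * ON_DEMAND_RATE

def best_reserved_count(demand):
    """Brute-force the reserved pool size that minimizes total cost."""
    return min(range(max(demand) + 1), key=lambda r: total_cost(demand, r))

demand = [4, 6, 10, 12, 9, 5, 3, 8]  # VMs needed per hour (illustrative)
r = best_reserved_count(demand)
print(r, total_cost(demand, r))
```

The optimum sits strictly between zero reserved VMs (all demand at the higher on-demand rate) and peak provisioning (idle committed capacity), which is exactly the tension the thesis's short- versus long-term utilization expectations are meant to resolve.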
Cooperative Deep Reinforcement Learning for Multiple-Group NB-IoT Networks Optimization
NarrowBand-Internet of Things (NB-IoT) is an emerging cellular-based
technology that offers a range of flexible configurations for massive IoT radio
access from groups of devices with heterogeneous requirements. A configuration
specifies the amount of radio resources allocated to each group of devices for
random access and for data transmission. Assuming no knowledge of the traffic
statistics, the problem is to determine, in an online fashion at each
Transmission Time Interval (TTI), the configurations that maximize the
long-term average number of IoT devices that are able to both access and
deliver data. Given the complexity of optimal algorithms, a Cooperative
Multi-Agent Deep Neural Network based Q-learning (CMA-DQN) approach is
developed, whereby each DQN agent independently controls a configuration
variable for each group. The DQN agents are cooperatively trained in the same
environment based on feedback regarding transmission outcomes. CMA-DQN is seen
to considerably outperform conventional heuristic approaches based on load
estimation.
Comment: Submitted for conference publication
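The cooperative structure — one learner per configuration variable, all trained on shared feedback from the same environment — can be shown with a toy stateless version: two tabular Q-learners in place of the paper's deep networks, and a made-up reward with a hidden optimal configuration standing in for the NB-IoT transmission outcomes. Everything here (the environment, the optimum, the hyperparameters) is an illustrative assumption.

```python
import random

random.seed(0)

# Toy stand-in environment: reward is highest when the two configuration
# variables (e.g., random-access vs. data-transmission resources) hit a
# hidden optimum; the real reward would come from transmission outcomes.
OPTIMUM = (2, 1)
ACTIONS = range(4)  # each agent picks one of 4 configuration levels

def reward(a0, a1):
    return -abs(a0 - OPTIMUM[0]) - abs(a1 - OPTIMUM[1])

class Agent:
    """One tabular Q-learner per configuration variable (a stateless
    bandit simplification of the paper's per-variable DQN agents)."""
    def __init__(self, eps=0.1, lr=0.2):
        self.q = [0.0] * len(ACTIONS)
        self.eps, self.lr = eps, lr

    def act(self):
        if random.random() < self.eps:      # epsilon-greedy exploration
            return random.choice(list(ACTIONS))
        return max(ACTIONS, key=lambda a: self.q[a])

    def update(self, a, r):
        self.q[a] += self.lr * (r - self.q[a])

agents = [Agent(), Agent()]
for _ in range(2000):
    acts = [ag.act() for ag in agents]
    r = reward(*acts)                # shared feedback, as in cooperative training
    for ag, a in zip(agents, acts):
        ag.update(a, r)              # each agent updates only its own variable

greedy = tuple(max(ACTIONS, key=lambda a: ag.q[a]) for ag in agents)
print(greedy)
```

Because each agent sees only the shared reward, the action space factorizes across agents instead of growing combinatorially — the same motivation the paper gives for one DQN per configuration variable rather than one network over all joint configurations.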
Metascheduling of HPC Jobs in Day-Ahead Electricity Markets
High performance grid computing is a key enabler of large scale collaborative
computational science. With the promise of exascale computing, high performance
grid systems are expected to incur electricity bills that grow super-linearly
over time. In order to achieve cost effectiveness in these systems, it is
essential for the scheduling algorithms to exploit electricity price
variations, both in space and time, that are prevalent in the dynamic
electricity price markets. In this paper, we present a metascheduling algorithm
to optimize the placement of jobs in a compute grid which consumes electricity
from the day-ahead wholesale market. We formulate the scheduling problem as a
Minimum Cost Maximum Flow problem and leverage queue waiting time and
electricity price predictions to accurately estimate the cost of job execution
at a system. Using trace based simulation with real and synthetic workload
traces, and real electricity price data sets, we demonstrate our approach on
two currently operational grids, XSEDE and NorduGrid. Our experimental setup
collectively constitutes more than 433K processors spread across 58 compute
systems in 17 geographically distributed locations. Experiments show that our
approach simultaneously optimizes the total electricity cost and the average
response time of the grid, without being unfair to users of the local batch
systems.
Comment: Appears in IEEE Transactions on Parallel and Distributed Systems
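The Minimum Cost Maximum Flow formulation can be sketched on a tiny instance: unit-demand jobs on one side, compute systems with capacities on the other, and a per-job cost standing in for the paper's predicted electricity price times energy plus queue-wait penalty. The solver below is a plain successive-shortest-paths implementation, and all node names, capacities, and costs are made up for illustration.

```python
from collections import defaultdict

class MinCostMaxFlow:
    """Successive shortest augmenting paths (Bellman-Ford), enough for
    the small bipartite job-to-system assignment graphs sketched here."""
    def __init__(self):
        self.graph = defaultdict(dict)  # u -> v -> [capacity, cost]

    def add_edge(self, u, v, cap, cost):
        self.graph[u][v] = [cap, cost]
        self.graph[v].setdefault(u, [0, -cost])  # residual edge

    def solve(self, s, t):
        flow = cost = 0
        while True:
            # Bellman-Ford: cheapest path in the residual graph
            dist, parent = {s: 0}, {}
            nodes = list(self.graph)
            for _ in range(len(nodes)):
                updated = False
                for u in nodes:
                    if u not in dist:
                        continue
                    for v, (cap, c) in self.graph[u].items():
                        if cap > 0 and dist[u] + c < dist.get(v, float("inf")):
                            dist[v], parent[v] = dist[u] + c, u
                            updated = True
                if not updated:
                    break
            if t not in dist:
                return flow, cost        # no augmenting path left
            path, v = [], t
            while v != s:
                path.append((parent[v], v))
                v = parent[v]
            aug = min(self.graph[u][v][0] for u, v in path)
            for u, v in path:
                self.graph[u][v][0] -= aug
                self.graph[v][u][0] += aug
                cost += aug * self.graph[u][v][1]
            flow += aug

mcmf = MinCostMaxFlow()
# Hypothetical per-job cost at each system, standing in for
# (predicted electricity price x energy) + queue-wait penalty.
system_cost = {"sysA": 5, "sysB": 8}
system_cap = {"sysA": 2, "sysB": 3}
jobs = ["j1", "j2", "j3"]
for j in jobs:
    mcmf.add_edge("src", j, 1, 0)        # each job carries one unit of flow
    for s in system_cost:
        mcmf.add_edge(j, s, 1, system_cost[s])
for s in system_cap:
    mcmf.add_edge(s, "sink", system_cap[s], 0)

flow, cost = mcmf.solve("src", "sink")
print(flow, cost)
```

On this instance the cheap system fills to capacity (two jobs at cost 5) and the remaining job overflows to the expensive one (cost 8), mirroring how the metascheduler routes jobs toward cheap-electricity systems until their queues make them unattractive.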