Staffing and Scheduling to Differentiate Service in Many-Server Service Systems
This dissertation contributes to the study of a queueing system with a single pool of multiple homogeneous servers to which multiple classes of customers arrive in independent streams. The objective is to devise staffing and scheduling policies that achieve specified class-dependent service levels, expressed as tail probabilities of delay. Here staffing means specifying a time-varying number of servers, and scheduling means assigning a newly idle server to a waiting customer from one of K classes. For this purpose, we propose new staffing-and-scheduling solutions under the critically loaded and overloaded regimes. In both cases, the proposed solutions are both time-dependent (coping with the time variability of the arrival pattern) and state-dependent (capturing the stochastic variability of service and arrival times). We prove heavy-traffic limit theorems to substantiate the effectiveness of the proposed staffing and scheduling policies, and we conduct computer simulation experiments to provide engineering confirmation and practical insight.
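To give a concrete feel for the setting, the sketch below simulates a small two-class many-server queue in which a newly idle server takes the head-of-line customer whose class-weighted delay is largest. The scheduling rule, the fixed staffing level, and all rates and weights are illustrative assumptions, not the dissertation's actual policies.

```python
import heapq
import random

# Illustrative sketch (not the dissertation's policy): an M/M/n queue with
# K customer classes, where a newly idle server serves the head-of-line
# customer whose class-weighted waiting time is largest.

random.seed(0)

K = 2                       # number of customer classes (assumed)
ARRIVAL_RATES = [3.0, 2.0]  # per-class Poisson arrival rates (assumed)
SERVICE_RATE = 1.0          # exponential service rate per server (assumed)
N_SERVERS = 6               # fixed staffing level for this sketch (assumed)
WEIGHTS = [1.0, 2.0]        # class weights encoding service-level priorities
HORIZON = 10_000.0

events = []                 # (time, kind, class) min-heap of future events
for k in range(K):
    heapq.heappush(events, (random.expovariate(ARRIVAL_RATES[k]), "arrival", k))

queues = [[] for _ in range(K)]  # per-class FIFO queues of arrival times
waits = [[] for _ in range(K)]   # recorded waiting times per class
busy = 0

def start_service(now):
    """Give a free server to the class with the largest weighted delay."""
    waiting = [k for k in range(K) if queues[k]]
    if not waiting:
        return False
    k = max(waiting, key=lambda c: WEIGHTS[c] * (now - queues[c][0]))
    arrived = queues[k].pop(0)
    waits[k].append(now - arrived)
    heapq.heappush(events, (now + random.expovariate(SERVICE_RATE), "departure", k))
    return True

while events:
    now, kind, k = heapq.heappop(events)
    if now > HORIZON:
        break
    if kind == "arrival":
        queues[k].append(now)
        heapq.heappush(events, (now + random.expovariate(ARRIVAL_RATES[k]), "arrival", k))
        if busy < N_SERVERS and start_service(now):
            busy += 1
    else:  # a departure frees a server
        if not start_service(now):
            busy -= 1

for k in range(K):
    w = waits[k]
    tail = sum(d > 0.5 for d in w) / len(w)  # P(wait > 0.5)
    print(f"class {k}: mean wait {sum(w)/len(w):.3f}, P(wait > 0.5) = {tail:.3f}")
```

The printed tail probabilities P(wait > 0.5) are the kind of class-dependent service-level metric the abstract refers to; varying WEIGHTS shifts delay between the two classes.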
Extracting Reward Functions from Diffusion Models
Diffusion models have achieved remarkable results in image generation, and
have similarly been used to learn high-performing policies in sequential
decision-making tasks. Decision-making diffusion models can be trained on
lower-quality data, and then be steered with a reward function to generate
near-optimal trajectories. We consider the problem of extracting a reward
function by comparing a decision-making diffusion model that models low-reward
behavior and one that models high-reward behavior; a setting related to inverse
reinforcement learning. We first define the notion of a relative reward
function of two diffusion models and show conditions under which it exists and
is unique. We then devise a practical learning algorithm for extracting it by
aligning the gradients of a reward function -- parametrized by a neural network
-- to the difference in outputs of both diffusion models. Our method finds
correct reward functions in navigation environments, and we demonstrate that
steering the base model with the learned reward functions results in
significantly increased performance in standard locomotion benchmarks. Finally,
we demonstrate that our approach generalizes beyond sequential decision-making
by learning a reward-like function from two large-scale image generation
diffusion models. The extracted reward function successfully assigns lower
rewards to harmful images.
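The gradient-alignment step described above can be sketched in a few lines of PyTorch: fit a reward network whose input gradient matches the difference between the two diffusion models' outputs. The network architecture, the stand-in score models, and all hyperparameters below are placeholder assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn

# Sketch of the gradient-alignment objective: fit a reward network r_theta
# so that grad_x r_theta(x, t) matches the difference between the outputs
# of a high-reward and a low-reward diffusion model. The score models and
# all shapes below are placeholder assumptions.

class RewardNet(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, 128), nn.SiLU(),
            nn.Linear(128, 128), nn.SiLU(),
            nn.Linear(128, 1),
        )

    def forward(self, x, t):
        # Scalar reward; the diffusion time t is appended so the reward
        # can be noise-level-aware.
        return self.net(torch.cat([x, t[:, None]], dim=-1)).squeeze(-1)

def relative_reward_loss(reward_net, score_low, score_high, x, t):
    """Align grad_x r(x, t) with score_high(x, t) - score_low(x, t)."""
    x = x.requires_grad_(True)
    r = reward_net(x, t).sum()
    grad_r, = torch.autograd.grad(r, x, create_graph=True)
    target = score_high(x, t) - score_low(x, t)  # difference of model outputs
    return ((grad_r - target.detach()) ** 2).mean()

# Toy usage with stand-in "models" (assumptions, not trained diffusion nets):
dim = 4
reward_net = RewardNet(dim)
score_low = lambda x, t: -x            # placeholder low-reward model
score_high = lambda x, t: -(x - 1.0)   # placeholder high-reward model
opt = torch.optim.Adam(reward_net.parameters(), lr=1e-3)

for step in range(200):
    x = torch.randn(64, dim)
    t = torch.rand(64)
    loss = relative_reward_loss(reward_net, score_low, score_high, x, t)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In this toy setup the output difference is constant, so the learned reward becomes approximately linear in x; with real diffusion models the same loss shapes the reward wherever the two models disagree.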
A stochastic analysis of resource sharing with logarithmic weights
The paper investigates the properties of a class of resource allocation
algorithms for communication networks: if a node of this network has x
requests to transmit, then it receives a fraction of the capacity
proportional to log(1+x), the logarithm of its current load. A detailed
fluid scaling analysis of such a network with two nodes is presented. It is
shown that the interaction of several time scales plays an important role in
the evolution of such a system; in particular, its coordinates may live on
very different time and space scales. As a consequence, the associated
stochastic processes turn out to have unusual scaling behaviors. A heavy
traffic limit theorem for the invariant distribution is also proved.
Finally, we present a generalization of the resource sharing algorithm in
which the logarithm is replaced by a general increasing function. Possible
generalizations of these results to networks with more than two nodes, or
with the logarithm replaced by another slowly increasing function, are
discussed.

Comment: Published at http://dx.doi.org/10.1214/14-AAP1057 in the Annals of
Applied Probability (http://www.imstat.org/aap/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
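As a concrete illustration of the allocation rule, the sketch below divides capacity in proportion to log(1 + load); the normalization and the example loads are assumptions made only for illustration.

```python
import math

def log_weight_allocation(loads, capacity=1.0):
    """Split capacity among nodes in proportion to log(1 + load).

    A node with x pending requests gets capacity * log(1+x) / sum_j log(1+x_j).
    Illustrative sketch of the allocation rule described in the abstract.
    """
    weights = [math.log(1 + x) for x in loads]  # log(1+0) = 0: idle nodes get nothing
    total = sum(weights)
    if total == 0:
        return [0.0] * len(loads)
    return [capacity * w / total for w in weights]

# Two-node example, as in the fluid analysis of the paper:
print(log_weight_allocation([10, 1_000]))  # ~[0.258, 0.742]
```

Because the logarithm grows slowly, even a node with a vastly larger backlog receives only a modestly larger share of the capacity, which is the source of the very different time and space scales of the two coordinates.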