Understanding heavy tails in a bounded world or, is a truncated heavy tail heavy or not?
We address the important question of the extent to which random variables and
vectors with truncated power tails retain the characteristic features of random
variables and vectors with power tails. We define two truncation regimes, soft
truncation regime and hard truncation regime, and show that, in the soft
truncation regime, truncated power tails behave, in important respects, as if
no truncation took place. On the other hand, in the hard truncation regime much
of "heavy tailedness" is lost. We show how to estimate consistently the tail
exponent when the tails are truncated, and suggest statistical tests to decide
on whether the truncation is soft or hard. Finally, we apply our methods to two
recent data sets arising from computer networks.
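As a rough illustration of the estimation problem described above, the sketch below applies the classical Hill estimator to samples from a bounded Pareto distribution. The abstract does not say which estimator the authors use, so the Hill estimator is a stand-in here, and the truncation mechanism (conditioning on the value not exceeding a bound), the function names, and all parameter values are assumptions made for this example.

```python
import math
import random


def hill_estimator(sample, k):
    """Hill estimate of the tail exponent from the k largest observations."""
    if not 1 <= k < len(sample):
        raise ValueError("k must satisfy 1 <= k < len(sample)")
    ordered = sorted(sample, reverse=True)
    threshold = ordered[k]  # the (k+1)-th largest observation
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:k]) / k
    return 1.0 / mean_log_excess


def truncated_pareto(alpha, bound, rng):
    """Draw from P(X > x) = x**(-alpha), x >= 1, conditioned on X <= bound.

    Hypothetical truncation model for this sketch: rejection sampling.
    """
    while True:
        x = (1.0 - rng.random()) ** (-1.0 / alpha)  # inverse-CDF Pareto draw
        if x <= bound:
            return x


if __name__ == "__main__":
    rng = random.Random(0)
    # Bound far beyond the sample range versus a bound that cuts into the sample.
    for bound in (1e6, 5.0):
        sample = [truncated_pareto(1.5, bound, rng) for _ in range(20000)]
        print(bound, hill_estimator(sample, 500))
```

With the bound far beyond the range of the sample, the estimate should sit close to the true exponent of 1.5; with the tight bound, the largest observations pile up just below the bound and the estimate is strongly inflated, which is one way to see the loss of "heavy tailedness" under hard truncation.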
Characterizing Heavy-Tailed Distributions Induced by Retransmissions
Consider a generic data unit of random size L that needs to be transmitted
over a channel of unit capacity. The channel availability dynamics is modeled
as an i.i.d. sequence {A, A_i}, i > 0, that is independent of L. During each period
of time that the channel is available, say A_i, we attempt to transmit the
data unit. If L < A_i, the transmission is considered successful; otherwise, we
wait for the next available period and attempt to retransmit the data from the
beginning. We investigate the asymptotic properties of the number of
retransmissions N and the total transmission time T until the data is
successfully transmitted. In the context of studying the completion times in
systems with failures where jobs restart from the beginning, it was shown that
this model results in power law and, in general, heavy-tailed delays. The main
objective of this paper is to uncover the detailed structure of this class of
heavy-tailed distributions induced by retransmissions. More precisely, we study
how the functional dependence between P[L>x] and P[A>x] impacts the
distributions of N and T. In particular, we discover several functional
criticality points that separate classes of different functional behavior of
the distribution of N. We also discuss the engineering implications of our
results on communication networks since retransmission strategy is a
fundamental component of the existing network protocols on all communication
layers, from the physical to the application one.
Comment: 39 pages, 2 figure
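The retransmission model in this abstract can be sketched as a short simulation. This is a minimal reading of the description, not the authors' code: N is taken to be the index of the first period with L < A_i, and T is the sum of the failed periods plus L. The exponential distributions, their rates, and the function names are illustrative assumptions.

```python
import random


def simulate_transmission(draw_size, draw_availability, rng, max_attempts=10**6):
    """Return (N, T): number of attempts and total time until success."""
    size = draw_size(rng)  # L is drawn once and kept across retransmissions
    elapsed = 0.0
    for attempt in range(1, max_attempts + 1):
        period = draw_availability(rng)  # A_i, i.i.d. and independent of L
        if size < period:
            # Success: the data unit fits into the current available period.
            return attempt, elapsed + size
        elapsed += period  # the whole period is lost; restart from the beginning
    raise RuntimeError("no successful transmission within max_attempts")


def empirical_tail(values, threshold):
    """Fraction of values strictly greater than threshold."""
    return sum(1 for v in values if v > threshold) / len(values)


if __name__ == "__main__":
    rng = random.Random(1)
    # Assumed example: L ~ Exp(rate 2), A ~ Exp(rate 1).
    runs = [
        simulate_transmission(
            lambda r: r.expovariate(2.0),
            lambda r: r.expovariate(1.0),
            rng,
        )
        for _ in range(20000)
    ]
    attempts = [n for n, _ in runs]
    for n in (1, 10, 100):
        print(n, empirical_tail(attempts, n))
```

Both inputs are light-tailed in this example, yet the printed tail of N decays polynomially rather than geometrically, because the same L is retried against fresh A_i. For these particular rates the first attempt succeeds with probability P[L < A] = 2/3.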
Propagation of Input Tail Uncertainty in Rare-Event Estimation: A Light versus Heavy Tail Dichotomy
We consider the estimation of small probabilities or other risk quantities
associated with rare but catastrophic events. In the model-based literature,
much of the focus has been devoted to efficient Monte Carlo computation or
analytical approximation assuming the model is accurately specified. In this
paper, we study a distinct direction on the propagation of model uncertainty
and how it impacts the reliability of rare-event estimates. Specifically, we
consider the basic setup of the exceedance of an i.i.d. sum, and investigate how
a lack of tail information on each input summand can affect the output
probability. We argue, on the basis of large-deviations behavior and numerical
evidence, that heavy-tailed problems are much more vulnerable to input
uncertainty than light-tailed problems. We also investigate some
approaches to quantify model errors in this problem using a combination of the
bootstrap and extreme value theory, showing some positive outcomes but also
uncovering some statistical challenges.
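For orientation, the two standard large-deviations regimes for an i.i.d. sum can be written as follows. These are textbook statements (Cramér's theorem and the subexponential single-big-jump asymptotic), not formulas quoted from the paper; the notation S_n = X_1 + ... + X_n and the rate function I are assumptions of this note.

```latex
% Light tails: moment generating function finite near 0, level a > E[X]
\lim_{n\to\infty} \frac{1}{n} \log P(S_n > na) = -I(a),
\qquad
I(a) = \sup_{\theta \ge 0} \bigl\{ \theta a - \log E\bigl[e^{\theta X}\bigr] \bigr\}

% Heavy tails: X nonnegative and subexponential, n fixed
P(S_n > x) \sim n \, P(X > x), \qquad x \to \infty
```

In the heavy-tailed case the exceedance probability is asymptotically proportional to the input tail evaluated at the same extreme level x, which is one plausible way to read the sensitivity to tail misspecification that the abstract describes.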
Data-Driven Methods and Applications for Optimization under Uncertainty and Rare-Event Simulation
For most decisions and system designs in practice, there is some chance of severe hazards or system failures that can be catastrophic. The occurrence of such hazards is usually uncertain, and hence it is important to measure and analyze the associated risks. Rare-event simulation techniques are a powerful tool for estimating risks: they improve the efficiency of estimation when the risk occurs with an extremely small probability. Furthermore, one can use the risk measurements to reach better decisions or designs. This can be done by modeling the task as a chance-constrained optimization problem, which optimizes an objective subject to a controlled risk level. However, recent problems in practice have become more data-driven and hence pose new challenges to the existing literature in these two domains. In this dissertation, we discuss challenges and remedies in data-driven problems for rare-event simulation and chance-constrained problems. We propose a robust-optimization-based framework for approaching chance-constrained optimization problems in a data-driven setting. We also analyze the impact of tail uncertainty in data-driven rare-event simulation tasks.
On the other hand, due to recent breakthroughs in machine learning techniques, the development of intelligent physical systems, e.g. autonomous vehicles, has been actively investigated. Since failures of these systems can be catastrophic for public safety, the evaluation of their machine learning components and system performance is crucial. This dissertation covers problems arising in the evaluation of such systems. We propose an importance sampling scheme for estimating rare events defined by machine learning predictors. Lastly, we discuss an application project on evaluating the safety of autonomous vehicle driving algorithms.
PhD, Industrial & Operations Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/163270/1/zhyhuang_1.pd