Risk-averse multi-armed bandits and game theory
The multi-armed bandit (MAB) and game theory literature is mainly focused on the expected cumulative reward and the expected payoffs in a game, respectively. In contrast, the rewards and the payoffs are often random variables whose expected values only capture a vague idea of the overall distribution. The focus of this dissertation is to study the fundamental limits of the existing bandits and game theory problems in a risk-averse framework and propose new ideas that address the shortcomings. The author believes that human beings are mostly risk-averse, so studying multi-armed bandits and game theory from the point of view of risk aversion, rather than expected reward/payoff, better captures reality. In this manner, a specific class of multi-armed bandits, called explore-then-commit bandits, and stochastic games are studied in this dissertation, which are based on the notion of Risk-Averse Best Action Decision with Incomplete Information (R-ABADI, Abadi is the maiden name of the author's mother). The goal of the classical multi-armed bandits is to exploit the arm with the maximum score defined as the expected value of the arm reward. Instead, we propose a new definition of score that is derived from the joint distribution of all arm rewards and captures the reward of an arm relative to those of all other arms. We use a similar idea for games and propose a risk-averse R-ABADI equilibrium in game theory that is possibly different from the Nash equilibrium. The payoff distributions are taken into account to derive the risk-averse equilibrium, while the expected payoffs are used to find the Nash equilibrium. The fundamental properties of games, e.g. pure and mixed risk-averse R-ABADI equilibrium and strict dominance, are studied in the new framework and the results are expanded to finite-time games. Furthermore, the stochastic congestion games are studied from a risk-averse perspective and three classes of equilibria are proposed for such games. 
It is shown by examples that the risk-averse behavior of travelers in a stochastic congestion game can improve the price of anarchy in Pigou and Braess networks. Furthermore, the Braess paradox does not occur to the extent proposed originally when travelers are risk-averse.
We also study an online affinity scheduling problem with no prior knowledge of the task arrival rates and processing rates of different task types on different servers. We propose the Blind GB-PANDAS algorithm that utilizes an exploration-exploitation scheme to load balance incoming tasks on servers in an online fashion. We prove that Blind GB-PANDAS is throughput optimal, i.e., it stabilizes the system as long as the task arrival rates are inside the capacity region. The Blind GB-PANDAS algorithm is compared to the FCFS, Max-Weight, and c-mu-rule algorithms in terms of average task completion time through simulations, where the same exploration-exploitation approach as Blind GB-PANDAS is used for Max-Weight and c-mu-rule. The extensive simulations show that the Blind GB-PANDAS algorithm conspicuously outperforms the three other algorithms at high loads.
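The contrast drawn in this abstract between an expected-value score and a score that compares an arm's reward with those of all other arms can be illustrated with a small sketch. The abstract does not give the dissertation's exact score definition, so the sketch assumes one possible relative score: the probability that an arm's reward is strictly the highest. It also assumes independent arms with finite reward distributions. The names `expected_values`, `prob_best`, and `ARMS` are hypothetical.

```python
from itertools import product

# Each arm is a list of (reward, probability) pairs. Hypothetical example data:
# arm 0 pays 10 with probability 0.2 and 0 otherwise; arm 1 always pays 1.
ARMS = [
    [(10.0, 0.2), (0.0, 0.8)],
    [(1.0, 1.0)],
]


def expected_values(arms):
    """Classical score: the expected reward of each arm."""
    return [sum(reward * prob for reward, prob in arm) for arm in arms]


def prob_best(arms):
    """Assumed relative score: P(arm i has strictly the highest reward).

    The joint distribution is built by assuming the arms are independent.
    """
    scores = [0.0] * len(arms)
    for outcome in product(*arms):
        rewards = [reward for reward, _ in outcome]
        joint_prob = 1.0
        for _, prob in outcome:
            joint_prob *= prob
        top = max(rewards)
        winners = [i for i, reward in enumerate(rewards) if reward == top]
        if len(winners) == 1:  # ties credit no arm in this sketch
            scores[winners[0]] += joint_prob
    return scores
```

Under these assumptions the two scores disagree on `ARMS`: arm 0 has the higher expected reward, while arm 1 is more likely to deliver the higher reward in any single draw.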
Risk-Aware Linear Bandits: Theory and Applications in Smart Order Routing
Motivated by practical considerations in machine learning for financial
decision-making, such as risk-aversion and large action space, we initiate the
study of risk-aware linear bandits. Specifically, we consider regret
minimization under the mean-variance measure when facing a set of actions whose
rewards can be expressed as linear functions of (initially) unknown parameters.
Driven by the variance-minimizing G-optimal design, we propose the Risk-Aware
Explore-then-Commit (RISE) algorithm and the Risk-Aware Successive Elimination
(RISE++) algorithm. Then, we rigorously analyze their regret upper bounds to
show that, by leveraging the linear structure, the algorithms can dramatically
reduce the regret when compared to existing methods. Finally, we demonstrate
the performance of the algorithms by conducting extensive numerical experiments
in a synthetic smart order routing setup. Our results show that both RISE and
RISE++ can outperform the competing methods, especially in complex
decision-making scenarios.
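As a rough illustration of committing under a mean-variance criterion, the sketch below runs a plain explore-then-commit loop over a finite set of arms. It is not the RISE algorithm: it ignores the linear reward structure and the G-optimal design, and it assumes the mean-variance score takes the form "empirical mean minus rho times empirical variance", which the abstract does not specify. All function names and the parameter `rho` are hypothetical.

```python
import itertools
import statistics
from typing import Callable, Sequence


def mean_variance(samples: Sequence[float], rho: float) -> float:
    """Assumed score: empirical mean minus rho times population variance."""
    return statistics.fmean(samples) - rho * statistics.pvariance(samples)


def explore_then_commit(
    arms: Sequence[Callable[[], float]], pulls_per_arm: int, rho: float
) -> int:
    """Pull every arm `pulls_per_arm` times, then return the index of the
    arm with the highest empirical mean-variance score."""
    explored = [[arm() for _ in range(pulls_per_arm)] for arm in arms]
    scores = [mean_variance(samples, rho) for samples in explored]
    return max(range(len(arms)), key=scores.__getitem__)


def alternating(low: float, high: float) -> Callable[[], float]:
    """Deterministic toy arm that alternates between two rewards."""
    cycle = itertools.cycle([low, high])
    return lambda: next(cycle)


def constant(value: float) -> Callable[[], float]:
    """Deterministic toy arm with zero variance."""
    return lambda: value
```

With the toy arms `alternating(0.0, 4.0)` and `constant(1.5)`, a risk-neutral learner (`rho = 0`) commits to the volatile arm with the higher mean, whereas `rho = 1` commits to the steady arm.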
Fast and Regret Optimal Best Arm Identification: Fundamental Limits and Low-Complexity Algorithms
This paper considers a stochastic multi-armed bandit (MAB) problem with dual
objectives: (i) quick identification and commitment to the optimal arm, and
(ii) reward maximization throughout a sequence of consecutive rounds.
Though each objective has been individually well-studied, i.e., best arm
identification for (i) and regret minimization for (ii), the simultaneous
realization of both objectives remains an open problem, despite its practical
importance. This paper introduces Regret Optimal Best Arm Identification
(ROBAI) which aims to achieve these dual objectives. To solve ROBAI with both
pre-determined stopping time and adaptive stopping time requirements, we
present the algorithm and its variants respectively, which not
only achieve asymptotic optimal regret in both Gaussian and general bandits,
but also commit to the optimal arm in rounds with
pre-determined stopping time and rounds with adaptive
stopping time. We further characterize lower bounds on the commitment time
(equivalent to sample complexity) of ROBAI, showing that and
its variants are sample optimal with pre-determined stopping time, and almost
sample optimal with adaptive stopping time. Numerical results confirm our
theoretical analysis and reveal an interesting "over-exploration" phenomenon
carried by classic algorithms, such that has
smaller regret even though it stops exploration much earlier than
( versus ), which suggests
over-exploration is unnecessary and potentially harmful to system performance.
A Novel Approach to the Behavioral Aspects of Cybersecurity
The Internet and cyberspace are inseparable aspects of everyone's life.
Cyberspace is a concept that describes widespread, interconnected, and online
digital technology. Cyberspace refers to the online world that is separate from
everyday reality. Since the internet is a recent advance in human lives, there
are many unknown and unpredictable aspects to it that sometimes can be
catastrophic to users in financial aspects, high-tech industry, and healthcare.
Cybersecurity failures are usually caused by human error or by users' lack of
knowledge. According to the International Business Machines Corporation (IBM)
X-Force Threat Intelligence Index in 2020, around 8.5 billion records were
compromised in 2019 due to failures of insiders, which is an increase of more
than 200 percent compared to the compromised records in 2018. In another survey
performed by the Ernst and Young Global Information Security during 2018-2019,
it is reported that 34% of the organizations stated that employees who are
inattentive or do not have the necessary knowledge are the principal
vulnerabilities of cybersecurity, and 22% of the organizations indicated that
phishing is the main threat to them. Inattentive users are one of the reasons
for data breaches and cyberattacks. The National Cyber Security Centre (NCSC)
in the United Kingdom observed that 23.2 million users who were victims of
cybersecurity attacks used a carelessly selected password, which is 123456, as
their account password. The Annual Cybersecurity Report published by Cisco in
2018 announced that phishing and spear phishing emails are the root causes of
many cybersecurity attacks in recent years. Hence, enhancing the cybersecurity
behaviors of both personal users and organizations can protect vulnerable users
from cyber threats. Both human factors and technological aspects of
cybersecurity should be addressed in organizations for a safer environment.
Developing Hybrid Machine Learning Models to Assign Health Score to Railcar Fleets for Optimal Decision Making
A large amount of data is generated during the operation of a railcar fleet,
which can easily lead to the curse of dimensionality and reduce the resiliency of the
railcar network. To solve these issues and offer predictive maintenance, this
research introduces a hybrid fault diagnosis expert system method that combines
density-based spatial clustering of applications with noise (DBSCAN) and
principal component analysis (PCA). Firstly, the DBSCAN method is used to
cluster categorical data that are similar to one another within the same group.
Secondly, the PCA algorithm is applied to reduce the dimensionality of the data and
eliminate redundancy in order to improve the accuracy of fault diagnosis.
Finally, we explain the engineered features and evaluate the selected models by
using the Gain Chart and Area Under Curve (AUC) metrics. We use the hybrid
expert system model to enhance maintenance planning decisions by assigning a
health score to the railcar system of the North American Railcar Owner (NARO).
According to the experimental results, our expert model can detect 96.4% of
failures within 50% of the sample. This suggests that our method is effective
at diagnosing failures in railcar fleets.
Comment: 21 pages, 7 figures, 3 tables
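A figure such as "96.4% of failures within 50% of the sample" is a point on a cumulative gain chart. The sketch below shows how such a point could be computed. It assumes each unit carries a risk score where a higher value means failure is more likely, and a binary failure label. The function name, the score convention, and the example data are hypothetical and do not come from the paper.

```python
import math
from typing import Sequence


def cumulative_gain(
    scores: Sequence[float], labels: Sequence[int], fraction: float
) -> float:
    """Share of all failures found in the top `fraction` of units,
    ranked by descending risk score (labels: 1 = failure, 0 = no failure)."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    total_failures = sum(labels)
    if total_failures == 0:
        raise ValueError("at least one failure is required")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    cutoff = math.ceil(fraction * len(ranked))
    captured = sum(label for _, label in ranked[:cutoff])
    return captured / total_failures


# Hypothetical toy data: eight units, four of which failed.
EXAMPLE_SCORES = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2, 0.1, 0.05]
EXAMPLE_LABELS = [1, 1, 0, 1, 0, 0, 1, 0]
```

On the toy data, inspecting the top half of the ranked units captures three of the four failures, a gain of 0.75 at the 50% mark.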
Entrepreneurial Operations Management
In the presence of tight capital, time and talent constraints, many traditional operational challenges are reinforced (and sometimes redefined) in the entrepreneurial setting. This dissertation addresses some of these challenges by examining theoretically and experimentally several problems in entrepreneurship and innovation for which the existing literature offers little guidance. The dissertation is organized into three chapters.
When tight time-to-market constraints are binding, an important question in product development is how much time a development team should spend on generating new ideas and designs versus executing the idea, and who should make that decision. In the first chapter of this dissertation I develop an experimental approach to examining this question. Entrepreneurial ventures can have limited (often zero) cash inflow and limited access to capital, and so use equity ownership to compensate founders and early employees. In the second chapter I focus on the challenges of equity-based incentive design, examining the effects of contract form (equal vs. non-equal equity splits) and time (upfront vs. delayed contracting) on effort and value generation in startups. In "technology-push" (relative to "demand-pull") innovation, technology teams often develop a new capability that may find voice in a wide range of industrial settings. However, the team may lack the appropriate marketing budget to explore each in great depth, or even all of them at any depth. In the third chapter I study entrepreneurial market identification, developing and testing search strategies for choosing a market for a new technology when the number of potential markets is large but the search budget is small.
PhD, Business Administration, University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/145946/1/ekagan_1.pd