Quantitative Measures of Regret and Trust in Human-Robot Collaboration Systems
Human-robot collaboration (HRC) systems integrate the strengths of both humans and robots to improve joint system performance. In this thesis, we focus on social human-robot interaction (sHRI) factors, in particular regret and trust. Humans experience regret during decision-making under uncertainty when they feel that a better result could have been obtained had they chosen differently. A framework to quantitatively measure regret is proposed in this thesis. We embed quantitative regret analysis into Bayesian sequential decision-making (BSD) algorithms for HRC shared vision tasks in both domain search and assembly tasks. The BSD method has been used for robot decision-making tasks, but it has been shown to differ substantially from human decision-making patterns. Instead, regret theory qualitatively models humans' rational decision-making behaviors under uncertainty. Moreover, it has been shown that the joint performance of a team improves if all members share the same decision-making logic. Trust plays a critical role in determining the level of a human's acceptance, and hence utilization, of a robot. A dynamic-network-based trust model combining the time-series trust model is first implemented in a multi-robot motion planning task with a human-in-the-loop. However, in this model, the trust estimate for each robot is independent, which fails to model the correlative trust in multi-robot collaboration. To address this issue, the above model is extended to interdependent multi-robot Dynamic Bayesian Networks
Risk Minimization, Regret Minimization and Progressive Hedging Algorithms
This paper begins with a study on the dual representations of risk and regret
measures and their impact on modeling multistage decision making under
uncertainty. A relationship between risk envelopes and regret envelopes is
established by using the Lagrangian duality theory. Such a relationship opens a
door to a decomposition scheme, called progressive hedging, for solving
multistage risk minimization and regret minimization problems. In particular,
the classical progressive hedging algorithm is modified in order to handle a
new class of linkage constraints that arises from reformulations and other
applications of risk and regret minimization problems. Numerical results are
provided to show the efficiency of the progressive hedging algorithms. Comment: 21 pages, 2 figures
Generalizing the Min-Max Regret Criterion using Ordered Weighted Averaging
In decision making under uncertainty, several criteria have been studied to
aggregate the performance of a solution over multiple possible scenarios,
including the ordered weighted averaging (OWA) criterion and min-max regret.
This paper introduces a novel generalization of min-max regret, leveraging the
modeling power of OWA to enable a more nuanced expression of preferences in
handling regret values. This new OWA regret approach is studied both
theoretically and numerically. We derive several properties, including
polynomially solvable and hard cases, and introduce an approximation algorithm.
Through computational experiments using artificial and real-world data, we
demonstrate the advantages of our OWAR method over the conventional min-max
regret approach, alongside the effectiveness of the proposed clustering
heuristics
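As a rough illustration of the two criteria compared in this abstract, the sketch below computes scenario regrets from a small cost table and aggregates them with an OWA operator. It is not the paper's OWAR method or its approximation algorithm; the function names, the cost table, and the weight vectors are assumptions chosen for illustration. With weights (1, 0, ..., 0) the OWA of the sorted regrets reduces to the maximum regret, so min-max regret appears as a special case.

```python
def regrets(costs):
    """costs[x][s] is the cost of solution x under scenario s (hypothetical data).

    The regret of x in scenario s is its cost minus the best cost
    achievable in that scenario by any listed solution.
    """
    n_scenarios = len(costs[0])
    best = [min(row[s] for row in costs) for s in range(n_scenarios)]
    return [[row[s] - best[s] for s in range(n_scenarios)] for row in costs]


def owa(values, weights):
    """Ordered weighted average: weights are applied to values sorted from largest to smallest."""
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def best_by_owa_regret(costs, weights):
    """Return (index of the solution with the smallest OWA regret, all scores)."""
    scores = [owa(r, weights) for r in regrets(costs)]
    return min(range(len(costs)), key=scores.__getitem__), scores


# Three candidate solutions evaluated under three scenarios (made-up numbers).
costs = [[10, 2, 6], [7, 6, 6], [3, 9, 8]]
minmax_choice, _ = best_by_owa_regret(costs, [1, 0, 0])          # worst case only
average_choice, _ = best_by_owa_regret(costs, [1/3, 1/3, 1/3])   # all regrets weighted equally
```

On this table the worst-case weights and the uniform weights select different solutions, which is the kind of preference trade-off an OWA weight vector lets a decision-maker express.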
The role of anticipated regret in choosing for others
In everyday life, people sometimes find themselves making decisions on behalf of others, taking risks on another's behalf, accepting the responsibility for these choices and possibly suffering regret for what they could have done differently. Previous research has extensively studied how people deal with risk when making decisions for others or when being observed by others. Here, we asked whether making decisions for present others is affected by regret avoidance. We studied value-based decision making under uncertainty, manipulating both whether decisions benefited the participant or a partner (beneficiary effect) and whether the partner watched the participant's choices (audience effect) and their factual and counterfactual outcomes. Computational behavioural analysis revealed that participants were less mindful of regret (and more strongly driven by bigger risks) when choosing for others vs for themselves. Conversely, they chose more conservatively (regarding both regret and risk) when being watched vs alone. The effects of beneficiary and audience on anticipated regret counteracted each other, suggesting that participants' financial and reputational interests impacted the feeling of regret independently
Online Convex Optimization with Binary Constraints
We consider online optimization with binary decision variables and convex
loss functions. We design a new algorithm, binary online gradient descent
(bOGD) and bound its expected dynamic regret. We provide a regret bound that
holds for any time horizon and a specialized bound for finite time horizons.
First, we present the regret as the sum of the relaxed, continuous round
optimum tracking error and the rounding error of our update in which the former
asymptotically decreases with time under certain conditions. Then, we derive
a finite-time bound that is sublinear in time and linear in the cumulative
variation of the relaxed, continuous round optima. We apply bOGD to demand
response with thermostatically controlled loads, in which binary constraints
model discrete on/off settings. We also model uncertainty and varying load
availability, which depend on temperature deadbands, lockout of cooling units
and manual overrides. We test the performance of bOGD in several simulations
based on demand response. The simulations corroborate that the use of
randomization in bOGD does not significantly degrade performance while making
the problem more tractable
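The decomposition described in this abstract (a relaxed, continuous iterate plus a rounding step) can be pictured with the following minimal sketch. It is not the paper's bOGD algorithm: it assumes a plain projected online gradient step on the box [0, 1]^n followed by independent Bernoulli rounding, and the function names, step size, and quadratic loss are hypothetical.

```python
import random


def project_unit_box(z):
    """Clip each coordinate to [0, 1], the continuous relaxation of {0, 1}."""
    return [min(1.0, max(0.0, v)) for v in z]


def relaxed_ogd_with_rounding(grad_fns, dim, step=0.1, seed=0):
    """Play one binary decision per round.

    grad_fns: one gradient callable per round, evaluated at the relaxed point
              (an assumption of this sketch).
    Returns the list of binary decisions and the final relaxed iterate.
    """
    rng = random.Random(seed)
    z = [0.5] * dim                      # relaxed iterate, starts at the box centre
    decisions = []
    for grad in grad_fns:
        # Randomized rounding: coordinate i is 1 with probability z[i].
        decisions.append([1 if rng.random() < p else 0 for p in z])
        g = grad(z)
        # Gradient step on the relaxation, then projection back onto the box.
        z = project_unit_box([zi - step * gi for zi, gi in zip(z, g)])
    return decisions, z


# Hypothetical loss: squared distance to a fixed on/off target for two units.
target = [1.0, 0.0]
grads = [lambda z: [2 * (zi - ti) for zi, ti in zip(z, target)]] * 50
decisions, z_final = relaxed_ogd_with_rounding(grads, dim=2)
```

Under this toy loss the relaxed iterate moves toward the target, so the rounded decisions agree with it more and more often; the gap between the relaxed point and the sampled binary vector plays the role of the rounding error mentioned above.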
Optimal Learning for Structured Bandits
We study structured multi-armed bandits, which is the problem of online
decision-making under uncertainty in the presence of structural information. In
this problem, the decision-maker needs to discover the best course of action
despite observing only uncertain rewards over time. The decision-maker is aware
of certain structural information regarding the reward distributions and would
like to minimize their regret by exploiting this information, where the regret
is the performance difference against a benchmark policy that knows the best
action ahead of time. In the absence of structural information, the classical
upper confidence bound (UCB) and Thompson sampling algorithms are well known to
suffer only minimal regret. As recently pointed out, neither algorithm is,
however, capable of exploiting structural information that is commonly
available in practice. We propose a novel learning algorithm that we call DUSA
whose worst-case regret matches the information-theoretic regret lower bound up
to a constant factor and can handle a wide range of structural information. Our
algorithm DUSA solves a dual counterpart of the regret lower bound at the
empirical reward distribution and follows its suggested play. Our proposed
algorithm is the first computationally viable learning policy for structured
bandit problems that has asymptotic minimal regret
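For readers unfamiliar with the unstructured baseline mentioned in this abstract, here is a minimal sketch of the classical UCB1 index policy together with the regret measured against a benchmark that knows the best arm. It does not implement DUSA and uses no structural information; the `pull` callable, the arm means, and the horizon are assumptions for illustration.

```python
import math
import random


def ucb1(pull, n_arms, horizon):
    """Run UCB1 for `horizon` rounds; `pull(arm)` returns a reward in [0, 1]."""
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1                  # initialisation: play every arm once
        else:
            # Empirical mean plus an exploration bonus that shrinks with the pull count.
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2 * math.log(t) / counts[a]),
            )
        sums[arm] += pull(arm)
        counts[arm] += 1
    return counts


def pseudo_regret(counts, means):
    """Expected reward lost relative to always playing the arm with the highest mean."""
    best = max(means)
    return sum(c * (best - m) for c, m in zip(counts, means))


# Hypothetical Bernoulli arms; the means are unknown to the policy.
means = [0.2, 0.5, 0.8]
rng = random.Random(0)
counts = ucb1(lambda a: 1.0 if rng.random() < means[a] else 0.0, len(means), 2000)
regret = pseudo_regret(counts, means)
```

Each arm is treated independently here, which is exactly the limitation the abstract points to: knowledge that links the arms' reward distributions cannot be used by this index.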
Probabilistic hesitant fuzzy multiple attribute decision-making based on regret theory for the evaluation of venture capital projects
The selection of venture capital investment projects is one of the
most important decision-making activities for venture capitalists.
Due to the complexity of the investment market and people's limited
cognition, most venture capital investment decision
problems are highly uncertain, and venture capitalists are
often boundedly rational under uncertainty. To address such problems,
this article presents an approach based on regret theory to
probabilistic hesitant fuzzy multiple attribute decision-making.
Firstly, when the information on the occurrence probabilities of
all the elements in the probabilistic hesitant fuzzy element
(P.H.F.E.) is unknown or partially known, two different mathematical
programming models based on water-filling theory and the
maximum entropy principle are provided to handle these complex
situations. Secondly, to capture the psychological behaviours
of venture capitalists, the regret theory is utilised to solve the
problem of selection of venture capital investment projects.
Finally, comparative analysis with the existing approaches is conducted
to demonstrate the feasibility and applicability of the proposed
method
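To make the regret-theory ingredient concrete, the sketch below scores alternatives by a perceived utility that adds a regret-rejoice term to each alternative's own value. The exponential form R(d) = 1 - exp(-delta * d) is a common choice in the regret-theory literature and is an assumption here, not something stated in this abstract; the aversion coefficient and the example values are hypothetical, and the probabilistic hesitant fuzzy machinery is omitted entirely.

```python
import math


def regret_rejoice(delta_v, aversion=0.3):
    """Regret-rejoice value for a utility difference delta_v.

    Negative differences (regret) are penalised more strongly than positive
    differences (rejoicing) of the same size are rewarded.
    """
    return 1.0 - math.exp(-aversion * delta_v)


def perceived_utilities(values, aversion=0.3):
    """Perceived utility of each alternative relative to the best alternative."""
    best = max(values)
    return [v + regret_rejoice(v - best, aversion) for v in values]


# Hypothetical aggregated scores for three venture projects.
scores = [0.9, 0.5, 0.7]
utilities = perceived_utilities(scores)
ranking = sorted(range(len(scores)), key=utilities.__getitem__, reverse=True)
```

The best alternative keeps its own value because its difference to the best is zero, while every other alternative is pushed down by the regret it would cause.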