TSEC: a framework for online experimentation under experimental constraints
Thompson sampling is a popular algorithm for solving multi-armed bandit
problems, and has been applied in a wide range of applications, from website
design to portfolio optimization. In such applications, however, the number of
choices (or arms) can be large, and the data needed to make adaptive
decisions require expensive experimentation. One is then faced with the
constraint of experimenting on only a small subset of arms within
each time period, which poses a problem for traditional Thompson sampling. We
propose a new Thompson Sampling under Experimental Constraints (TSEC) method,
which addresses this so-called "arm budget constraint". TSEC makes use of a
Bayesian interaction model with effect hierarchy priors, to model correlations
between rewards on different arms. This fitted model is then integrated within
Thompson sampling, to jointly identify a good subset of arms for
experimentation and to allocate resources over these arms. We demonstrate the
effectiveness of TSEC in two problems with arm budget constraints. The first is
a simulated website optimization study, where TSEC shows noticeable
improvements over industry benchmarks. The second is a portfolio optimization
application on industry-based exchange-traded funds, where TSEC provides more
consistent and greater wealth accumulation than standard investment strategies.
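To make the "arm budget constraint" concrete, here is a minimal sketch of Thompson sampling in which only a fixed number of arms may be tried per time period. It is not the TSEC method: the abstract's Bayesian interaction model with effect hierarchy priors is replaced here by independent Beta-Bernoulli posteriors, and the names `thompson_top_m`, `run`, `budget`, and `true_rates` are hypothetical, introduced only for illustration.

```python
import random


def thompson_top_m(successes, failures, budget, rng):
    """Draw one sample per arm from its Beta posterior and return the
    indices of the `budget` arms with the largest draws."""
    # Beta(1, 1) prior, so the posterior is Beta(successes + 1, failures + 1).
    draws = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    ranked = sorted(range(len(draws)), key=lambda i: draws[i], reverse=True)
    return ranked[:budget]


def run(true_rates, budget, rounds, seed=0):
    """Simulate Bernoulli arms, experimenting on `budget` arms per round.

    ASSUMPTION: arms are independent; TSEC instead models correlations
    between arms, which this sketch does not attempt.
    """
    rng = random.Random(seed)
    k = len(true_rates)
    successes = [0] * k
    failures = [0] * k
    for _ in range(rounds):
        for arm in thompson_top_m(successes, failures, budget, rng):
            if rng.random() < true_rates[arm]:
                successes[arm] += 1
            else:
                failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    s, f = run([0.1, 0.2, 0.8, 0.3, 0.15], budget=2, rounds=200)
    print("pulls per arm:", [a + b for a, b in zip(s, f)])
```

With independent posteriors, nothing learned on one arm informs the others, which is exactly the limitation that a model of correlated rewards is meant to address when the number of arms is large.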
Exploration vs Exploitation vs Safety: Risk-averse Multi-Armed Bandits
Motivated by applications in energy management, this paper presents the
Multi-Armed Risk-Aware Bandit (MARAB) algorithm. With the goal of limiting the
exploration of risky arms, MARAB takes as arm quality its conditional value at
risk. When the user-supplied risk level goes to 0, the arm quality tends toward
the essential infimum of the arm distribution density, and MARAB tends toward
the MIN multi-armed bandit algorithm, aimed at the arm with maximal minimal
value. As a first contribution, this paper presents a theoretical analysis of
the MIN algorithm under mild assumptions, establishing its robustness
relative to UCB. The analysis is supported by extensive experimental
validation of MIN and MARAB compared to UCB and state-of-the-art risk-aware MAB
algorithms on artificial and real-world problems.
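The arm-quality measure described above can be illustrated with an empirical conditional value at risk. The sketch below is an assumption-laden illustration, not the MARAB algorithm itself: it computes only the quality score (the mean of the worst `alpha` fraction of observed rewards) and omits any exploration term. The function names `empirical_cvar` and `safest_arm` are hypothetical.

```python
import math


def empirical_cvar(rewards, alpha):
    """Mean of the worst ceil(alpha * n) rewards (lower tail).

    As alpha approaches 0 this reduces to the minimum observed reward,
    mirroring the limit toward the MIN criterion; alpha = 1 gives the
    plain sample mean.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    ordered = sorted(rewards)
    k = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:k]) / k


def safest_arm(history, alpha):
    """Return the index of the arm whose empirical CVaR is largest.

    `history` is a list of reward lists, one per arm (hypothetical layout).
    """
    scores = [empirical_cvar(r, alpha) for r in history]
    return max(range(len(scores)), key=lambda i: scores[i])


if __name__ == "__main__":
    history = [[0.0, 10.0, 10.0, 10.0], [4.0, 5.0, 5.0, 6.0]]
    print(safest_arm(history, alpha=1.0))   # ranks by mean reward
    print(safest_arm(history, alpha=0.25))  # ranks by worst observed reward
```

In the usage example the first arm has the higher mean but the worse tail, so lowering `alpha` switches the preference to the second arm.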
Corporate social responsibility in portfolio selection: A "goal games" against nature approach
Nowadays, there is rising social pressure on large companies to incorporate elements of so-called social responsibility into their decision-making processes. Among the many implications, a relevant one is the need to include this new element in classic portfolio selection models. This paper meets this challenge by formulating a model that combines goal programming with "goal games" against nature, in a scenario where social responsibility is defined through a battery of sustainability indicators amalgamated into a synthetic index. The result is an efficient model that requires solving only a small number of linear programming problems. The proposed approach has been tested and illustrated with a case study on the selection of securities in international markets.