
    TSEC: a framework for online experimentation under experimental constraints

    Thompson sampling is a popular algorithm for solving multi-armed bandit problems, and has been applied in a wide range of applications, from website design to portfolio optimization. In such applications, however, the number of choices (or arms) $N$ can be large, and the data needed to make adaptive decisions require expensive experimentation. One is then faced with the constraint of experimenting on only a small subset of $K \ll N$ arms within each time period, which poses a problem for traditional Thompson sampling. We propose a new Thompson Sampling under Experimental Constraints (TSEC) method, which addresses this so-called "arm budget constraint". TSEC makes use of a Bayesian interaction model with effect hierarchy priors to model correlations between rewards on different arms. This fitted model is then integrated within Thompson sampling to jointly identify a good subset of arms for experimentation and to allocate resources over these arms. We demonstrate the effectiveness of TSEC in two problems with arm budget constraints. The first is a simulated website optimization study, where TSEC shows noticeable improvements over industry benchmarks. The second is a portfolio optimization application on industry-based exchange-traded funds, where TSEC provides more consistent and greater wealth accumulation than standard investment strategies.
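
    To make the arm budget constraint concrete, here is a minimal sketch of plain Beta-Bernoulli Thompson sampling restricted to K arms per period. It is not TSEC: it treats arms as independent and omits the Bayesian interaction model and effect hierarchy priors described above. The function name `budgeted_thompson`, the Bernoulli reward model, and the Beta(1, 1) priors are assumptions made for illustration.

```python
import random

def budgeted_thompson(true_probs, k, periods, seed=0):
    """Beta-Bernoulli Thompson sampling that pulls only k arms per period.

    Illustrative baseline only: independent Beta(1, 1) priors per arm,
    not the interaction model described in the abstract.
    """
    rng = random.Random(seed)
    n = len(true_probs)
    successes = [0] * n
    failures = [0] * n
    for _ in range(periods):
        # Draw one posterior sample per arm.
        draws = [rng.betavariate(successes[i] + 1, failures[i] + 1)
                 for i in range(n)]
        # Arm budget: experiment only on the k arms with the largest draws.
        chosen = sorted(range(n), key=lambda i: draws[i], reverse=True)[:k]
        for i in chosen:
            # Simulated Bernoulli reward (hypothetical environment).
            if rng.random() < true_probs[i]:
                successes[i] += 1
            else:
                failures[i] += 1
    # Number of times each arm was pulled.
    return [successes[i] + failures[i] for i in range(n)]
```

    Calling `budgeted_thompson([0.1, 0.2, 0.3, 0.8, 0.9], k=2, periods=200)` returns per-arm pull counts that always sum to `k * periods`. Because the arms are modeled independently, this baseline cannot share information between related arms, which is the gap the abstract says TSEC targets.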

    Exploration vs Exploitation vs Safety: Risk-averse Multi-Armed Bandits

    Motivated by applications in energy management, this paper presents the Multi-Armed Risk-Aware Bandit (MARAB) algorithm. With the goal of limiting the exploration of risky arms, MARAB takes as arm quality its conditional value at risk. When the user-supplied risk level goes to 0, the arm quality tends toward the essential infimum of the arm distribution density, and MARAB tends toward the MIN multi-armed bandit algorithm, aimed at the arm with maximal minimal value. As a first contribution, this paper presents a theoretical analysis of the MIN algorithm under mild assumptions, establishing its robustness compared to UCB. The analysis is supported by extensive experimental validation of MIN and MARAB compared to UCB and state-of-the-art risk-aware MAB algorithms on artificial and real-world problems. Comment: 16 pages
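
    The link between conditional value at risk and the minimum can be illustrated with a simple empirical estimator: the mean of the worst fraction `alpha` of the rewards observed on an arm. This is a sketch under assumptions, not MARAB's actual arm-quality formula (the abstract does not give the estimator or any exploration term); the function name `empirical_cvar` is hypothetical.

```python
import math

def empirical_cvar(rewards, alpha):
    """Mean of the worst ceil(alpha * n) observed rewards (lower tail).

    Assumed estimator for illustration; larger rewards are better.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    ordered = sorted(rewards)
    # Always keep at least one observation in the tail.
    m = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:m]) / m
```

    With rewards `[1, 2, 3, 4]`, `alpha = 1` gives the plain mean, `alpha = 0.5` averages the two worst rewards, and a very small `alpha` returns the minimum observed reward, mirroring the limit in which MARAB tends toward MIN.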

    Corporate social responsibility in portfolio selection: A "goal games" against nature approach

    Nowadays, there is rising social pressure on big companies to incorporate elements of so-called social responsibility into their decision-making processes. Among the many implications of this fact, a notable one is the need to include this new element in classic portfolio selection models. This paper meets this challenge by formulating a model that combines goal programming with "goal games" against nature, in a scenario where social responsibility is defined through a battery of sustainability indicators amalgamated into a synthetic index. In this way, we obtain an efficient model that only requires solving a small number of linear programming problems. The proposed approach has been tested and illustrated using a case study on the selection of securities in international markets.