97 research outputs found

    Solving Non-Stationary Bandit Problems by Random Sampling from Sibling Kalman Filters

    The multi-armed bandit problem is a classical optimization problem where an agent sequentially pulls one of multiple arms attached to a gambling machine, with each pull resulting in a random reward. The reward distributions are unknown, and thus, one must balance between exploiting existing knowledge about the arms and obtaining new information. Dynamically changing (non-stationary) bandit problems are particularly challenging because each change of the reward distributions may progressively degrade the performance of any fixed strategy. Although computationally intractable in many cases, Bayesian methods provide a standard for optimal decision making. This paper proposes a novel solution scheme for bandit problems with non-stationary normally distributed rewards. The scheme is inherently Bayesian in nature, yet avoids computational intractability by relying simply on updating the hyperparameters of sibling Kalman Filters, and on random sampling from these posteriors. Furthermore, it is able to track the better actions, thus supporting non-stationary bandit problems. Extensive experiments demonstrate that our scheme outperforms recently proposed bandit playing algorithms, not only in non-stationary environments, but also in stationary environments. Furthermore, our scheme is robust to inexact parameter settings. We thus believe that our methodology opens avenues for obtaining improved novel solutions.
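
The scheme outlined in this abstract (one Kalman filter per arm, with arm selection by sampling from each posterior) might look roughly like the following Python sketch. It is an illustration under assumptions the abstract does not state, not the authors' implementation: each arm's true mean is assumed to drift as a random walk with known variance `drift_var`, the observation noise variance `obs_var` is assumed known, and every arm starts from a wide N(0, 100) prior. The class and parameter names are hypothetical.

```python
import random


class KalmanArm:
    """Gaussian belief (mean, variance) about one arm's current expected reward."""

    def __init__(self, mean=0.0, var=100.0):
        # Assumed prior: wide and centred on zero.
        self.mean = mean
        self.var = var


class KalmanThompsonBandit:
    """Sketch: sibling Kalman filters combined with posterior (Thompson) sampling."""

    def __init__(self, n_arms, obs_var=1.0, drift_var=0.01, seed=None):
        self.arms = [KalmanArm() for _ in range(n_arms)]
        self.obs_var = obs_var      # assumed known observation noise variance
        self.drift_var = drift_var  # assumed known random-walk drift variance
        self.rng = random.Random(seed)

    def select_arm(self):
        # Draw one sample from each arm's posterior and play the largest.
        samples = [self.rng.gauss(a.mean, a.var ** 0.5) for a in self.arms]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, pulled, reward):
        for i, arm in enumerate(self.arms):
            # Prediction step: every arm may have drifted, so uncertainty grows.
            predicted_var = arm.var + self.drift_var
            if i == pulled:
                # Correction step: standard scalar Kalman update with the reward.
                gain = predicted_var / (predicted_var + self.obs_var)
                arm.mean += gain * (reward - arm.mean)
                arm.var = (1.0 - gain) * predicted_var
            else:
                arm.var = predicted_var
```

A caller would alternate `select_arm()` and `update(arm, reward)` once per round. Because the variance of unplayed arms keeps growing, their samples occasionally exceed the current favourite's, which is what lets the sketch re-explore arms whose rewards may have changed.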

    Bandit strategies in social search: the case of the DARPA red balloon challenge

    Collective search for people and information has tremendously benefited from emerging communication technologies that leverage the wisdom of the crowds, and has been increasingly influential in solving time-critical tasks such as the DARPA Network Challenge (DNC, also known as the Red Balloon Challenge). However, while collective search often invests significant resources in encouraging the crowd to contribute new information, the effort invested in verifying this information is comparable, yet often neglected in crowdsourcing models. This paper studies how the exploration-verification trade-off displayed by the teams modulated their success in the DNC, as teams had limited human resources that they had to divide between recruitment (exploration) and verification (exploitation). Our analysis suggests that team performance in the DNC can be modelled as a modified multi-armed bandit (MAB) problem, where information reaches the team from sources of different levels of veracity that need to be assessed in real time. We use these insights to build a data-driven agent-based model, based on the DNC’s data, to simulate team performance. The simulation results match the observed teams’ behavior and demonstrate how to achieve the best balance between exploration and exploitation for general time-critical collective search tasks.

    Vivienne Westwood and the ethics of consuming fashion

    Our paper examines ethical consumption using the case study of Vivienne Westwood, the fashion designer, and her eponymous firm, and shows how consumers of fashion might be considered ethical. The fashion industry has figured prominently in ethical debates, notably its role in encouraging overconsumption of resources and promoting an idealised lifestyle that is often neither materially nor psychically sustainable for consumers (Buchholz, 1998). We acknowledge this, yet suggest the purchase and use of clothing carries with it the potential to be ethical insofar as customers find themselves personally implicated with and caring for a designer's work.

    European Space Agency experiments on thermodiffusion of fluid mixtures in space

    This paper describes the European Space Agency (ESA) experiments devoted to studying thermodiffusion of fluid mixtures in a microgravity environment, where sedimentation and convection do not affect the mass flow induced by the Soret effect. First, the experiments performed on binary mixtures in the IVIDIL and GRADFLEX experiments are described. Then, further experiments on ternary mixtures and complex fluids performed in DCMIX and planned to be performed in the context of the NEUF-DIX project are presented. Finally, multi-component mixtures studied in the SCCO project are detailed.

    Dynamic Pricing and Learning: Historical Origins, Current Research, and New Directions
