12 research outputs found

    Freshness-Aware Thompson Sampling

    To follow the dynamics of the user's content, researchers have recently started to model interactions between users and Context-Aware Recommender Systems (CARS) as a bandit problem in which the system must deal with the exploration-exploitation dilemma. In this sense, we propose to study the freshness of the user's content in CARS through the bandit problem. We introduce in this paper an algorithm named Freshness-Aware Thompson Sampling (FA-TS) that manages the recommendation of fresh documents according to the risk of the user's situation. The intensive evaluation and the detailed analysis of the experimental results reveal several important discoveries in the exploration/exploitation (exr/exp) behaviour.
    Comment: 21st International Conference on Neural Information Processing. arXiv admin note: text overlap with arXiv:1409.772

    Bandit Models of Human Behavior: Reward Processing in Mental Disorders

    Drawing inspiration from behavioral studies of human decision making, we propose here a general parametric framework for the multi-armed bandit problem, which extends the standard Thompson Sampling approach to incorporate reward processing biases associated with several neurological and psychiatric conditions, including Parkinson's and Alzheimer's diseases, attention-deficit/hyperactivity disorder (ADHD), addiction, and chronic pain. We demonstrate empirically that the proposed parametric approach can often outperform the baseline Thompson Sampling on a variety of datasets. Moreover, from the behavioral modeling perspective, our parametric framework can be viewed as a first step towards a unifying computational model capturing reward processing abnormalities across multiple mental conditions.
    Comment: Conference on Artificial General Intelligence, AGI-1

    UCB1 Based Reinforcement Learning Model for Adaptive Energy Management in Buildings

    This paper proposes a reinforcement learning model for intelligent energy management in buildings, using a UCB1 based approach. Energy management in buildings has become a critical task in recent years, due to incentives to increase energy efficiency and the penetration of renewable energy sources. Managing energy consumption, generation and storage in this domain is, however, an arduous task, due to the large uncertainty of the different resources together with the dynamic characteristics of this environment. In this scope, reinforcement learning is a promising solution for making energy management methods adaptive, by learning from the ongoing changes in the environment. The model proposed in this paper aims at supporting decisions on the best actions to take at each moment, regarding building energy management. A UCB1 based algorithm is applied, and the results are compared to those of an EXP3 approach and a simple reinforcement learning algorithm. Results show that the proposed approach achieves higher-quality results, reaching a higher rate of successful action identification than the other reference approaches considered.
    This work has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 641794 (project DREAM-GO) and from Project SIMOCE (ANI|P2020 17690).
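
As a rough illustration of the UCB1 rule this abstract refers to, the following is a minimal sketch on simulated Bernoulli arms. It is not the paper's model: the arms stand in for hypothetical energy management actions, and `ucb1_select`, `run_ucb1`, and the success probabilities are names and values assumed purely for the example.

```python
import math
import random


def ucb1_select(counts, values, t):
    """Return the arm chosen by UCB1 at round t (1-based)."""
    # Play every arm once before relying on the confidence bound.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Empirical mean plus the exploration bonus sqrt(2 ln t / n).
    scores = [
        values[a] + math.sqrt(2.0 * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_ucb1(success_probs, rounds, seed=0):
    """Simulate UCB1 on Bernoulli arms and return the pull count per arm.

    success_probs are hypothetical probabilities that each action
    turns out to be a successful one.
    """
    rng = random.Random(seed)
    k = len(success_probs)
    counts = [0] * k      # number of times each arm was played
    values = [0.0] * k    # running mean reward of each arm
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, values, t)
        reward = 1.0 if rng.random() < success_probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts


if __name__ == "__main__":
    print(run_ucb1([0.2, 0.5, 0.8], rounds=5000))
```

The bonus term shrinks as an arm is played more often, so rarely tried actions keep being revisited while the action with the best observed success rate receives most of the plays.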

    A Multi-Armed Bandit to Smartly Select a Training Set from Big Medical Data

    With the availability of big medical image data, the selection of an adequate training set is becoming more important to address the heterogeneity of different datasets. Simply including all the data not only incurs high processing costs but can even harm the prediction. We formulate the smart and efficient selection of a training dataset from big medical image data as a multi-armed bandit problem, solved by Thompson sampling. Our method assumes that image features are not available at the time of the selection of the samples, and therefore relies only on meta information associated with the images. Our strategy simultaneously exploits data sources with high chances of yielding useful samples and explores new data regions. For our evaluation, we focus on the application of estimating the age from a brain MRI. Our results on 7,250 subjects from 10 datasets show that our approach leads to higher accuracy while only requiring a fraction of the training data.
    Comment: MICCAI 2017 Proceeding
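
The exploit/explore behaviour described in this abstract can be sketched with a standard Beta-Bernoulli Thompson sampler. This is only an assumed simplification, not the authors' method: each arm is a hypothetical data source, a reward of 1 means a drawn sample turned out to be useful, and `thompson_select`, `run_thompson`, and the probabilities are illustrative.

```python
import random


def thompson_select(successes, failures, rng):
    """Draw one sample from each arm's Beta posterior and pick the largest."""
    # Beta(s + 1, f + 1) is the posterior under a uniform Beta(1, 1) prior.
    samples = [
        rng.betavariate(s + 1, f + 1)
        for s, f in zip(successes, failures)
    ]
    return max(range(len(samples)), key=samples.__getitem__)


def run_thompson(useful_probs, rounds, seed=0):
    """Simulate Thompson sampling and return the pull count per source.

    useful_probs are hypothetical probabilities that a sample drawn
    from each source is useful.
    """
    rng = random.Random(seed)
    k = len(useful_probs)
    successes = [0] * k
    failures = [0] * k
    for _ in range(rounds):
        arm = thompson_select(successes, failures, rng)
        if rng.random() < useful_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]


if __name__ == "__main__":
    print(run_thompson([0.1, 0.4, 0.8], rounds=3000))
```

Sources with few observations have wide posteriors, so they are still drawn occasionally, while sources that have repeatedly yielded useful samples are chosen most of the time.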

    Contextual Simulated Annealing Q-Learning for Pre-negotiation of Agent-Based Bilateral Negotiations

    Electricity markets are complex environments, which have been undergoing continuous transformation due to the increase of renewable-based generation and the introduction of new players in the system. In this context, players are forced to rethink their behavior and learn how to act in this dynamic environment in order to get as much benefit as possible from market negotiations. This paper introduces a new learning model that enables players to identify the expected prices of future bilateral agreements, as a way to improve the decision-making process of choosing which opponent players to approach for actual negotiations. The proposed model introduces a contextual dimension into the well-known Q-Learning algorithm, and includes a simulated annealing process to accelerate convergence. The proposed model is integrated in a multi-agent decision support system for electricity market players' negotiations, enabling experimentation using real data from the Iberian electricity market.
    This work has received funding from the European Union's Horizon 2020 research and innovation programme under project DOMINOES (grant agreement No 771066) and from FEDER Funds through COMPETE program and from National Funds through FCT under the project UID/EEA/00760/2019.
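
One common way to combine Q-Learning with simulated annealing is to accept a random action with a Metropolis probability that shrinks as a temperature cools. The sketch below applies that idea to a Q-table indexed by (context, action). It is an assumption about the general technique, not the paper's formulation: the contexts, opponent names, reward table, and cooling schedule are all hypothetical.

```python
import math
import random
from collections import defaultdict


def sa_choose(q, context, actions, temperature, rng):
    """Pick an action using a simulated-annealing (Metropolis) criterion."""
    greedy = max(actions, key=lambda a: q[(context, a)])
    candidate = rng.choice(actions)
    # delta <= 0, so the acceptance probability lies in (0, 1].
    delta = q[(context, candidate)] - q[(context, greedy)]
    if rng.random() < math.exp(delta / temperature):
        return candidate
    return greedy


def q_update(q, context, action, reward, next_context, actions,
             alpha=0.1, gamma=0.9):
    """Standard Q-Learning update on a (context, action) table."""
    if next_context is None:  # terminal step: no future value
        best_next = 0.0
    else:
        best_next = max(q[(next_context, a)] for a in actions)
    target = reward + gamma * best_next
    q[(context, action)] += alpha * (target - q[(context, action)])


def train(rewards, actions, steps=2000, seed=0):
    """Train on one-step episodes; rewards maps (context, action) to a payoff."""
    rng = random.Random(seed)
    q = defaultdict(float)
    contexts = sorted({c for c, _ in rewards})
    temperature = 1.0
    for _ in range(steps):
        context = rng.choice(contexts)
        action = sa_choose(q, context, actions, temperature, rng)
        q_update(q, context, action, rewards[(context, action)], None, actions)
        temperature = max(0.01, temperature * 0.995)  # geometric cooling
    return q


if __name__ == "__main__":
    # Hypothetical payoffs for approaching each opponent in each context.
    actions = ["opponent_a", "opponent_b", "opponent_c"]
    rewards = {
        ("peak", "opponent_a"): 1.0, ("peak", "opponent_b"): 0.2,
        ("peak", "opponent_c"): 0.1, ("off_peak", "opponent_a"): 0.1,
        ("off_peak", "opponent_b"): 0.3, ("off_peak", "opponent_c"): 0.9,
    }
    q = train(rewards, actions)
    for ctx in ("peak", "off_peak"):
        print(ctx, max(actions, key=lambda a: q[(ctx, a)]))
```

At high temperature almost any random action is accepted, which gives broad exploration early on. As the temperature falls, the rule approaches greedy selection within each context.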

    Case-based decision support system with contextual bandits learning for similarity retrieval model selection

    Case-based reasoning has become one of the sought-after approaches supporting the development of personalized medicine. It trains on previous experience in the form of resolved cases to provide a solution to a new problem. In developing a case-based decision support system using case-based reasoning methodology, it is critical to have a good similarity retrieval model to retrieve the cases most similar to the query case. Various factors, including feature selection and weighting, similarity functions, case representation and knowledge model, need to be considered in developing a similarity retrieval model. It is difficult to build a single most reliable similarity retrieval model, as this may differ according to the context of the user, demographic and query case. To address this challenge, the present work presents a case-based decision support system with multiple similarity retrieval models and proposes a contextual bandit learning algorithm to dynamically choose the most appropriate similarity retrieval model based on the context of the user, query patient and demographic data. The proposed framework is designed for the DESIREE project, whose goal is to develop a web-based software ecosystem for the multidisciplinary management of primary breast cancer.

    Contextual Bandits for Context-Based Information Retrieval
