884,432 research outputs found

    Contextual Bandits with Cross-learning

    In the classical contextual bandits problem, in each round $t$, a learner observes some context $c$, chooses some action $a$ to perform, and receives some reward $r_{a,t}(c)$. We consider the variant of this problem where in addition to receiving the reward $r_{a,t}(c)$, the learner also learns the values of $r_{a,t}(c')$ for all other contexts $c'$; i.e., the rewards that would have been achieved by performing that action under different contexts. This variant arises in several strategic settings, such as learning how to bid in non-truthful repeated auctions (in this setting the context is the decision maker's private valuation for each auction). We call this problem the contextual bandits problem with cross-learning. The best algorithms for the classical contextual bandits problem achieve $\tilde{O}(\sqrt{CKT})$ regret against all stationary policies, where $C$ is the number of contexts, $K$ the number of actions, and $T$ the number of rounds. We demonstrate algorithms for the contextual bandits problem with cross-learning that remove the dependence on $C$ and achieve regret $O(\sqrt{KT})$ (when contexts are stochastic with known distribution), $\tilde{O}(K^{1/3}T^{2/3})$ (when contexts are stochastic with unknown distribution), and $\tilde{O}(\sqrt{KT})$ (when contexts are adversarial but rewards are stochastic). Comment: 48 pages, 5 figures
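
    The cross-learning feedback model described in this abstract can be illustrated with a small simulation. The sketch below is not the paper's algorithm: it is a plain UCB1-style learner, assuming uniformly random contexts and Bernoulli rewards, with hypothetical names (`run_ucb`, `make_means`) chosen for illustration. The only difference between the two modes is whether the chosen action's reward is recorded for every context or only for the observed one.

```python
import math
import random


def make_means(num_contexts, num_actions):
    """Toy instance (an assumption): the best action for context c is c % num_actions."""
    return [
        [0.8 if a == c % num_actions else 0.3 for a in range(num_actions)]
        for c in range(num_contexts)
    ]


def run_ucb(means, horizon, cross_learning, seed=0):
    """Run a UCB1-style learner and return its pseudo-regret.

    means[c][a] is the expected reward of action a under context c.
    With cross_learning=True, playing action a reveals a reward sample
    r_{a,t}(c') for every context c', not just the observed one.
    """
    rng = random.Random(seed)
    num_contexts, num_actions = len(means), len(means[0])
    counts = [[0] * num_actions for _ in range(num_contexts)]
    sums = [[0.0] * num_actions for _ in range(num_contexts)]
    regret = 0.0

    for t in range(1, horizon + 1):
        c = rng.randrange(num_contexts)  # stochastic context, uniform here

        def index(a):
            n = counts[c][a]
            if n == 0:
                return math.inf  # force one initial sample per (context, action)
            return sums[c][a] / n + math.sqrt(2.0 * math.log(t) / n)

        a = max(range(num_actions), key=index)
        regret += max(means[c]) - means[c][a]

        # Classical feedback updates only context c; cross-learning updates all.
        observed = range(num_contexts) if cross_learning else [c]
        for c2 in observed:
            reward = 1.0 if rng.random() < means[c2][a] else 0.0
            counts[c2][a] += 1
            sums[c2][a] += reward

    return regret


if __name__ == "__main__":
    means = make_means(num_contexts=30, num_actions=3)
    print("classical feedback:", run_ucb(means, 3000, cross_learning=False))
    print("cross-learning:    ", run_ucb(means, 3000, cross_learning=True))
```

    On this toy instance the classical learner has to explore each context separately, while the cross-learning learner shares every sample across contexts. That sharing is the intuition behind removing the dependence on $C$ from the regret bound.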

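    Pengaruh Pembelajaran Kontekstual berbasis Outing Class terhadap Maharah Kalam pada Siswa di MTs Muhammadiyah 1 Malang [The Effect of Outing Class-Based Contextual Learning on Maharah Kalam among Students at MTs Muhammadiyah 1 Malang]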

    The purpose of this study is to examine the implementation of outing class-based contextual learning and its impact on maharah kalam among class VIIB students at MTs Muhammadiyah 1 Malang. The study uses a quantitative, correlational design; that is, it focuses only on whether there is a substantial relationship or impact between outing class-based contextual learning (variable X) and maharah kalam (variable Y). The data were analyzed using a normality test, a linearity test, and simple linear regression. The results are: (1) Outing class-based contextual learning was implemented by taking class VIIB students to see the school facilities directly, in line with the material to be studied, namely الْمَرَافِقُ الْمَدْرَسِيَّةُ (school facilities). (2) The data analysis showed that the outing class-based contextual learning variable had a significant impact of 26.6% on the maharah kalam variable. It can be inferred that the maharah kalam of class VIIB pupils at MTs Muhammadiyah 1 Malang is affected by outing class-based contextual learning.

    Here today, gone tomorrow - adaptation to change in memory-guided visual search

    Visual search for a target object can be facilitated by the repeated presentation of an invariant configuration of nontargets ('contextual cueing'). Here, we tested adaptation of learned contextual associations after a sudden, but permanent, relocation of the target. After an initial learning phase, targets were relocated within their invariant contexts and repeatedly presented at new locations, before they returned to the initial locations. Contextual cueing for relocated targets was observed neither after numerous presentations nor after insertion of an overnight break. Further experiments investigated whether learning of additional, previously unseen context-target configurations is comparable to adaptation of existing contextual associations to change. In contrast to the lack of adaptation to changed target locations, contextual cueing developed for additional invariant configurations under identical training conditions. Moreover, across all experiments, presenting relocated targets or additional contexts did not interfere with contextual cueing of initially learned invariant configurations. Overall, the adaptation of contextual memory to changed target locations was severely constrained and unsuccessful in comparison to learning of an additional set of contexts, which suggests that contextual cueing facilitates search for only one repeated target location.

    Learning Contextual Reward Expectations for Value Adaptation

    Substantial evidence indicates that subjective value is adapted to the statistics of reward expected within a given temporal context. However, how these contextual expectations are learned is poorly understood. To examine such learning, we exploited a recent observation that participants performing a gambling task adjust their preferences as a function of context. We show that, in the absence of contextual cues providing reward information, an average reward expectation was learned from recent past experience. Learning dependent on contextual cues emerged when two contexts alternated at a fast rate, whereas both cue-independent and cue-dependent forms of learning were apparent when two contexts alternated at a slower rate. Motivated by these behavioral findings, we reanalyzed a previous fMRI data set to probe the neural substrates of learning contextual reward expectations. We observed a form of reward prediction error related to average reward such that, at option presentation, activity in ventral tegmental area/substantia nigra and ventral striatum correlated positively and negatively, respectively, with the actual and predicted value of options. Moreover, an inverse correlation between activity in ventral tegmental area/substantia nigra (but not striatum) and predicted option value was greater in participants showing enhanced choice adaptation to context. The findings help in understanding the mechanisms underlying the learning of contextual reward expectations.

    Contextual Bandit Learning with Predictable Rewards

    Contextual bandit learning is a reinforcement learning problem where the learner repeatedly receives a set of features (context), takes an action, and receives a reward based on the action and context. We consider this problem under a realizability assumption: there exists a function in a (known) function class that is always capable of predicting the expected reward, given the action and context. Under this assumption, we show three things. We present a new algorithm, Regressor Elimination, with a regret similar to the agnostic setting (i.e., in the absence of the realizability assumption). We prove a new lower bound showing that no algorithm can achieve superior performance in the worst case, even with the realizability assumption. However, we do show that for any set of policies (mapping contexts to actions), there is a distribution over rewards (given context) such that our new algorithm has constant regret, unlike the previous approaches.

    Effect of Contextual Learning Ability Against Students Understanding Math Concepts SMP

    This study aims to determine whether contextual learning influences students' ability to understand math concepts. The subjects of this study are the seventh-grade students of SMP Negeri 10 Palembang. The research method used is an experiment. The variable of this study is the students' ability to understand concepts. Data were collected using a written test and analyzed using a t test. The results show that contextual learning influences junior high school students' ability to understand mathematical concepts. Key Words: Contextual Learning, understanding the concept