
    The Value-of-Information in Matching with Queues

    We consider the problem of \emph{optimal matching with queues} in dynamic systems and investigate the value-of-information. In such systems, operators match tasks and resources stored in queues, with the objective of maximizing the system utility of the matching reward profile minus the average matching cost. This problem appears in many practical systems, and the main challenges are the no-underflow constraints and the lack of matching-reward information and system-dynamics statistics. We develop two online matching algorithms to resolve both challenges: Learning-aided Reward optimAl Matching ($\mathtt{LRAM}$) and Dual-$\mathtt{LRAM}$ ($\mathtt{DRAM}$). Both algorithms are equipped with a learning module for estimating the matching-reward information, while $\mathtt{DRAM}$ incorporates an additional module for learning the system dynamics. We show that both algorithms achieve an $O(\epsilon+\delta_r)$ close-to-optimal utility performance for any $\epsilon>0$, while $\mathtt{DRAM}$ achieves faster convergence and better delay than $\mathtt{LRAM}$: $O(\delta_z/\epsilon + \log(1/\epsilon)^2)$ delay and $O(\delta_z/\epsilon)$ convergence under $\mathtt{DRAM}$, compared to $O(1/\epsilon)$ delay and convergence under $\mathtt{LRAM}$ ($\delta_r$ and $\delta_z$ are the maximum estimation errors for the reward and the system dynamics, respectively). Our results reveal that information about different system components can play very different roles in algorithm performance, and they provide a systematic way to design joint learning-control algorithms for dynamic systems.
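The abstract stays high-level, so here is a minimal Python sketch of the general pattern it describes, not the paper's actual LRAM/DRAM: a drift-plus-penalty-style matching loop that estimates rewards from observed samples and matches only from non-empty queues (the no-underflow constraint). The trade-off parameter `V`, the cost model, and the `reward_draw` callback are all illustrative assumptions.

```python
import random

class LearningAidedMatcher:
    """Sketch: one queue per task type; rewards estimated online from observations."""

    def __init__(self, num_types, V=50.0):
        self.V = V                            # utility vs. backlog trade-off knob (assumed)
        self.queues = [0] * num_types         # per-type backlog
        self.reward_sum = [0.0] * num_types   # running sums for empirical means
        self.reward_cnt = [0] * num_types

    def reward_estimate(self, k):
        # Empirical-mean reward estimate; its error plays the role of delta_r.
        return self.reward_sum[k] / self.reward_cnt[k] if self.reward_cnt[k] else 0.0

    def step(self, arrivals, cost, reward_draw):
        # Admit this slot's arrivals into the queues.
        for k, a in enumerate(arrivals):
            self.queues[k] += a
        # Score each type by V * (estimated reward - cost) + backlog, considering
        # only non-empty queues, which enforces the no-underflow constraint.
        best, best_score = None, 0.0
        for k, q in enumerate(self.queues):
            if q == 0:
                continue
            score = self.V * (self.reward_estimate(k) - cost) + q
            if score > best_score:
                best, best_score = k, score
        if best is None:
            return None                       # matching not worthwhile this slot
        # Perform one match, observe the realized reward, update the estimator.
        self.queues[best] -= 1
        r = reward_draw(best)
        self.reward_sum[best] += r
        self.reward_cnt[best] += 1
        return best, r

# Toy simulation with made-up arrival and reward processes.
matcher = LearningAidedMatcher(num_types=3)
for _ in range(1000):
    arrivals = [random.randint(0, 1) for _ in range(3)]
    matcher.step(arrivals, cost=0.2,
                 reward_draw=lambda k: random.gauss(1.0 + 0.5 * k, 0.1))
```

The design choice worth noting: the backlog term in the score is what trades delay against utility, which is why the abstract's bounds couple the $O(1/\epsilon)$-style delay with the $O(\epsilon+\delta_r)$ utility gap.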

    Learning in experimental 2×2 games

    In this paper we introduce four new learning models: impulse balance learning, impulse matching learning, action-sampling learning, and payoff-sampling learning. Together with self-tuning EWA learning and reinforcement learning, we use these models to run simulations of 12 different 2×2 games and compare the results with the experimental data obtained by Selten & Chmura (2008). Our results are two-fold: while the simulations, especially those with action-sampling learning and impulse matching learning, successfully replicate the experimental data on the aggregate level, they fail to describe individual behavior. A simple inertia rule beats the learning models in describing individual behavior.
    Keywords: Learning, Action-sampling, Payoff-sampling, Impulse balance, Impulse matching, Reinforcement, Self-tuning EWA, 2×2 games, Experimental data
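As a concrete reference point for the model family being compared, here is a minimal Python sketch of cumulative reinforcement learning in the Roth-Erev style. The payoff matrices, initial propensities, and round count are illustrative assumptions, not the parameterization or games used by the authors.

```python
import random

# Payoffs for an illustrative 2x2 game: entry [i][j] is the payoff when the
# row player picks action i and the column player picks action j.
ROW_PAY = [[10, 0], [9, 10]]
COL_PAY = [[8, 18], [9, 8]]

def choose(prop):
    # Play each action with probability proportional to its propensity.
    return 0 if random.random() < prop[0] / (prop[0] + prop[1]) else 1

def simulate(rounds=200):
    row_prop, col_prop = [1.0, 1.0], [1.0, 1.0]   # initial propensities (assumed)
    for _ in range(rounds):
        i, j = choose(row_prop), choose(col_prop)
        # Reinforce each player's chosen action with the payoff it earned.
        row_prop[i] += ROW_PAY[i][j]
        col_prop[j] += COL_PAY[i][j]
    # Final propensities imply each player's long-run choice frequencies.
    return row_prop, col_prop

print(simulate())
```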

    Learning in experimental 2 x 2 games

    In this paper, we introduce two new learning models: impulse-matching learning and action-sampling learning. These two models, together with the models of self-tuning EWA and reinforcement learning, are applied to 12 different 2 x 2 games, and their results are compared with the results from experimental data. We test whether the models are capable of replicating the aggregate distribution of behavior, as well as correctly predicting individuals' round-by-round behavior. Our results are two-fold: while the simulations with impulse-matching and action-sampling learning successfully replicate the experimental data on the aggregate level, individual behavior is best described by self-tuning EWA. Nevertheless, impulse-matching learning has the second-highest score on the individual data. In addition, only self-tuning EWA and impulse-matching learning lead to better round-by-round predictions than the aggregate frequencies, which means they adjust their predictions correctly over time.
    Keywords: Learning, 2 x 2 games, Experimental data
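For illustration, here is a minimal Python sketch of the action-sampling idea as we read it from the abstract: each round, a player draws a small sample of the opponent's past actions and best-responds to that empirical mix. The sample size, payoffs, and tie-breaking rule are assumptions for the sketch, not the paper's specification.

```python
import random

ROW_PAY = [[10, 0], [9, 10]]   # illustrative payoffs from the row player's view

def action_sampling_choice(opponent_history, sample_size=6):
    # Draw a sample (without replacement) of the opponent's past actions (0 or 1).
    sample = random.sample(opponent_history, min(sample_size, len(opponent_history)))
    n1 = sum(sample)               # how often the opponent played action 1 in the sample
    n0 = len(sample) - n1
    # Expected payoff of each own action against the sampled mix.
    u = [ROW_PAY[a][0] * n0 + ROW_PAY[a][1] * n1 for a in (0, 1)]
    if u[0] == u[1]:
        return random.randint(0, 1)   # tie (or empty history): randomize
    return 0 if u[0] > u[1] else 1

move = action_sampling_choice([0, 1, 1, 0, 1])
```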

    Context-Dependent Memory under Stressful Conditions: The Case of Skydiving

    Two experiments examined the effect of differing levels of emotional arousal on learning and memory for words in matching and mismatching contexts. In Experiment 1, experienced skydivers learned words either in the air or on the ground and recalled them in the same context or in the other context. Experiment 2 replicated the stimuli and design of the first experiment, except that participants were shown a skydiving video in lieu of skydiving. Recall was poor in the air-learning conditions with actual skydiving, but when lists were learned on land, recall was higher in the matching context than in the mismatching context. In the skydiving-video experiment, recall was higher in matching learn-recall contexts regardless of the situation in which learning occurred. We propose that under extremely emotionally arousing circumstances, environmental and/or mood cues are unlikely to become encoded or linked to newly acquired information and thus cannot serve as retrieval cues. The results can be applied to understanding variations in context-dependent memory in occupations (e.g., police, military special operations, and Special Weapons and Tactics teams) in which the worker experiences considerable emotional stress while learning or recalling new information.