1,099 research outputs found

    Online Meta-learning by Parallel Algorithm Competition

    The efficiency of reinforcement learning algorithms depends critically on a few meta-parameters that modulate the learning updates and the trade-off between exploration and exploitation. The adaptation of the meta-parameters is an open question in reinforcement learning, which arguably has become more of an issue recently with the success of deep reinforcement learning in high-dimensional state spaces. The long learning times in domains such as Atari 2600 video games make it infeasible to perform comprehensive searches for appropriate meta-parameter values. We propose the Online Meta-learning by Parallel Algorithm Competition (OMPAC) method. In the OMPAC method, several instances of a reinforcement learning algorithm are run in parallel with small differences in the initial values of the meta-parameters. After a fixed number of episodes, the instances are selected based on their performance in the task at hand. Before continuing the learning, Gaussian noise is added to the meta-parameters with a predefined probability. We validate the OMPAC method by improving the state-of-the-art results in stochastic SZ-Tetris and in standard Tetris with a smaller, 10×10, board, by 31% and 84%, respectively, and by improving the results for deep Sarsa(λ) agents in three Atari 2600 games by 62% or more. The experiments also show the ability of the OMPAC method to adapt the meta-parameters according to the learning progress in different tasks.
    Comment: 15 pages, 10 figures. arXiv admin note: text overlap with arXiv:1702.0311
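The select-then-perturb loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how instances are selected, so truncation selection (keep the better half, copy it over the rest) is an assumption here, as are the function name `ompac_round` and the default values of `noise_prob` and `noise_std`. Only the meta-parameters are modelled, not the learned weights of each agent.

```python
import random


def ompac_round(instances, scores, noise_prob=0.5, noise_std=0.05, rng=None):
    """One selection-and-perturbation round over meta-parameter dicts.

    instances: list of dicts mapping meta-parameter name -> float value
    scores:    performance of each instance over the last block of episodes
    """
    rng = rng or random.Random()
    # Rank instance indices from best to worst score.
    ranked = sorted(range(len(instances)), key=lambda i: scores[i], reverse=True)
    # ASSUMPTION: truncation selection; the better half survives.
    keep = max(1, len(instances) // 2)
    survivors = [instances[i] for i in ranked[:keep]]

    next_generation = []
    for k in range(len(instances)):
        parent = survivors[k % keep]
        child = {}
        for name, value in parent.items():
            # With a predefined probability, add Gaussian noise to the value.
            if rng.random() < noise_prob:
                value += rng.gauss(0.0, noise_std)
            child[name] = value
        next_generation.append(child)
    return next_generation


if __name__ == "__main__":
    population = [{"alpha": 0.1}, {"alpha": 0.2}, {"alpha": 0.3}, {"alpha": 0.4}]
    print(ompac_round(population, scores=[1, 4, 3, 2], rng=random.Random(0)))
```

In a full run, each instance would continue learning for another fixed block of episodes between calls to `ompac_round`.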

    How Fast Can We Play Tetris Greedily With Rectangular Pieces?

    Consider a variant of Tetris played on a board of width w and infinite height, where the pieces are axis-aligned rectangles of arbitrary integer dimensions, the pieces can only be moved before letting them drop, and a row does not disappear once it is full. Suppose we want to follow a greedy strategy: let each rectangle fall where it will end up the lowest given the current state of the board. To do so, we want a data structure which can always suggest a greedy move. In other words, we want a data structure which maintains a set of O(n) rectangles, supports queries which return where to drop the rectangle, and updates which insert a rectangle dropped at a certain position and return the height of the highest point in the updated set of rectangles. We show via a reduction to the Multiphase problem [Pătraşcu, 2010] that on a board of width w = Θ(n), if the OMv conjecture [Henzinger et al., 2015] is true, then both operations cannot be supported in time O(n^{1/2-ε}) simultaneously. The reduction also implies polynomial bounds from the 3-SUM conjecture and the APSP conjecture. On the other hand, we show that there is a data structure supporting both operations in O(n^{1/2} log^{3/2} n) time on boards of width n^{O(1)}, matching the lower bound up to an n^{o(1)} factor.
    Comment: Correction of typos and other minor correction
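To make the two operations concrete, here is a naive baseline sketch of the query and update described above. It is not the data structure from the paper: it keeps one height per column and scans the board, so each query costs O(w · rect_w) time rather than the sublinear bound quoted in the abstract. The function names `query` and `update`, and breaking ties by the leftmost position, are assumptions made for illustration.

```python
def query(heights, rect_w):
    """Return the leftmost column x where a rectangle of width rect_w rests lowest."""
    if rect_w < 1 or rect_w > len(heights):
        raise ValueError("rectangle does not fit on the board")
    best_x, best_base = 0, None
    for x in range(len(heights) - rect_w + 1):
        # The rectangle rests on the tallest column underneath it.
        base = max(heights[x:x + rect_w])
        if best_base is None or base < best_base:
            best_x, best_base = x, base
    return best_x


def update(heights, x, rect_w, rect_h):
    """Drop a rect_w by rect_h rectangle at column x; return the new maximum height."""
    base = max(heights[x:x + rect_w])
    for col in range(x, x + rect_w):
        heights[col] = base + rect_h
    return max(heights)


if __name__ == "__main__":
    board = [0] * 5  # board of width 5, initially empty
    for width, height in [(2, 3), (2, 1), (3, 2)]:
        x = query(board, width)
        top = update(board, x, width, height)
        print(f"dropped {width}x{height} at column {x}, highest point {top}")
```

Rows never disappear in this variant, so the column heights are the only state the greedy strategy needs.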

    Maternal mental health and memory (re)consolidation following a traumatic childbirth

    Objectives: The overall aim of this thesis was to contribute to the development of clinical interventions to prevent or reduce maternal symptoms of childbirth-related post-traumatic stress disorder (CB-PTSD). To do so, it relied on the literature on memory (re)consolidation, which corresponds to a set of processes potentially involved in the development and maintenance of CB-PTSD. The ambition of this thesis was to translate the research on memory (re)consolidation, mainly based on laboratory studies, into applied clinical proposals. Several avenues were explored: 1. Identifying factors that may modulate the consolidation of the traumatic birth memory (TBM) such as prenatal insomnia symptoms (Study 1), administration of nitrous oxide gas (N2O) or morphine during childbirth (Study 2), and CB-PTSD symptoms; and 2. Testing the effectiveness of brief visuospatial task-based interventions, which are assumed to interfere with the (re)consolidation of the TBM, in preventing (Study 3) or reducing (Study 4) CB-PTSD symptoms. Methods: Studies 1 (n = 1,610) and 2 (n = 2,070) were based on a prospective population-based cohort study (secondary data analyses), following women from pregnancy to eight weeks postpartum. Variables were measured via self-report questionnaires and patients' medical records. CB-PTSD was assessed at eight weeks postpartum. Study 3 (n = 144) is an ongoing multicentre, double-blind, randomised controlled trial (thus, results are not available yet). The intervention tested is delivered within six hours postpartum, and its effectiveness is primarily measured by a childbirth-related intrusive traumatic memories (ITMs) diary over the first week postpartum and an assessment of CB-PTSD symptoms at six weeks postpartum. Finally, Study 4 (n = 18) was a single-group pre-post study. 
The benefits of the intervention were measured with an ITMs diary over two weeks before and six weeks after the intervention, and CB-PTSD symptoms were measured with a self-report questionnaire, five days before and one month after the intervention. Results: In Study 1, prenatal insomnia symptoms were associated with CB-PTSD symptom severity, and this relationship was fully mediated by a negative subjective birth experience, as well as by postnatal insomnia symptoms. In Study 2, N2O administration during childbirth predicted less severe CB-PTSD symptoms. This was marginally the case with morphine. However, both analgesics predicted more CB-PTSD symptoms when combined with very severe pain during childbirth. Finally, participants in Study 4 reported a large reduction in their number of ITMs, and it persisted for up to six weeks post-intervention. Their CB-PTSD symptoms were also greatly reduced. Clinical implications: The results of this thesis suggest a number of avenues for preventing or reducing CB-PTSD symptoms through brief, simple, cost-effective, and innovative interventions. These could potentially be implemented throughout the perinatal period and notably pave the way for pharmacological (Study 2) or psychological (Studies 1 and 3) strategies for CB-PTSD prevention, for which there is currently no evidence-based intervention

    How to design good Tetris players

    In this paper, we propose to use evolutionary algorithms, more specifically the covariance matrix adaptation evolution strategy, to design artificial players for the game of Tetris. The learned strategies are among the best-performing players at this time, scoring several million lines. We also describe different mechanisms to reduce the evolution time, which can be an important issue for this learning problem.

    Interference-based methods to mitigate gambling craving: a proof-of-principle pilot study

    Craving is central in the prognosis of gambling disorder. The elaborated intrusion theory (EIT) provides a sound framework to account for craving in addictive disorders, and interference methods inspired by the EIT have proven effective in mitigating substance and food-related cravings. The principle of these methods is to recruit the cognitive resources underlying craving (e.g., visuospatial skills, mental imagery) for another competitive and cognitively demanding task, thus reducing the vividness and overwhelming nature of craving. Here we conducted two experiments employing a between-subjects design to test the efficacy of interference methods for reducing laboratory-induced craving. In these experiments, gamblers (n = 38 for both experiments) first followed a craving induction procedure. They then performed either a visuospatial interference task (making a mental and vivid image of a bunch of keys [experiment 1] or playing the video game Tetris [experiment 2]; experimental conditions) or another task supposed not to recruit visuospatial skills and mental imagery (exploding bubble pack [experiment 1] or counting backwards [experiment 2]; control conditions). Results show that all methods successfully mitigated induced craving. Although previous research has shown visuospatial tasks to be superior in reducing substance-related craving, our findings question their superiority in the context of gambling craving.

    EEG and ECG nonlinear and spectral multiband analysis to explore the effect of videogames against anxiety

    Video games are increasingly used for purposes beyond entertainment and have been gaining prominence in the health area. It was hypothesized that it is possible to discriminate biological signals, namely electrocardiographic (ECG) and electroencephalographic (EEG) signals, collected from different participants stimulated by three different commercial video games: Tetris, Bejeweled and Energy. To test this hypothesis, a protocol was developed that used the Trier Social Stress Test to induce stress in the subjects and bring it to similar levels before each game session, so that the effects of the three test games (3 study groups) could be observed at the physiological level. Initially collected at 2000 Hz, the signals were resampled to 500 Hz and filtered using a Butterworth low-pass filter. Several representative features were then extracted from the filtered signals. These features consisted of nonlinear metrics such as the Lyapunov exponent and correlation dimension; self-similarity metrics such as the Hurst exponent and detrended fluctuation analysis; fractal dimensions such as the Katz and Higuchi fractal dimensions; metrics of signal chaos and activity such as signal energy, logarithmic entropy and Shannon entropy; and a number of spectral metrics for the EEG signal, which should help identify any differences in the stress response. As a final result, the three study groups were discriminated with 100% accuracy, using the top 20% of features selected by the F-score technique and the coarse K Nearest Neighbor classifier.
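As an illustration of one feature named in this abstract, the sketch below computes the Shannon entropy of a signal from an amplitude histogram. The abstract does not say how the entropy was estimated, so the histogram approach, the function name `shannon_entropy`, and the default of 16 bins are all assumptions rather than the study's method.

```python
import math


def shannon_entropy(signal, bins=16):
    """Shannon entropy (in bits) of a signal's amplitude histogram.

    ASSUMPTION: amplitudes are grouped into equal-width bins between the
    minimum and maximum sample; the study may have used another estimator.
    """
    if not signal:
        raise ValueError("signal must not be empty")
    lo, hi = min(signal), max(signal)
    if hi == lo:
        return 0.0  # a constant signal carries no amplitude uncertainty
    counts = [0] * bins
    for sample in signal:
        # Map the sample to a bin index; the maximum falls in the last bin.
        idx = min(int((sample - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    total = len(signal)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


if __name__ == "__main__":
    print(shannon_entropy([0.0, 0.0, 1.0, 1.0], bins=2))
```

A signal whose samples spread evenly across the bins gives the highest value, log2(bins), while a constant signal gives zero.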