
    Finding Approximate Nash Equilibria of Bimatrix Games via Payoff Queries

    We study the deterministic and randomized query complexity of finding approximate equilibria in a k × k bimatrix game. We show that the deterministic query complexity of finding an ϵ-Nash equilibrium when ϵ < ½ is Ω(k²), even in zero-one constant-sum games. In combination with previous results [Fearnley et al. 2013], this provides a complete characterization of the deterministic query complexity of approximate Nash equilibria. We also study randomized querying algorithms. We give a randomized algorithm for finding a ((3 − √5)/2 + ϵ)-Nash equilibrium using O(k log k / ϵ²) payoff queries, which shows that the ½ barrier for deterministic algorithms can be broken by randomization. For well-supported Nash equilibria (WSNE), we first give a randomized algorithm for finding an ϵ-WSNE of a zero-sum bimatrix game using O(k log k / ϵ⁴) payoff queries, and we then use this to obtain a randomized algorithm for finding a (⅔ + ϵ)-WSNE in a general bimatrix game using O(k log k / ϵ⁴) payoff queries. Finally, we initiate the study of lower bounds against randomized algorithms in the context of bimatrix games, by showing that randomized algorithms require Ω(k²) payoff queries in order to find an ϵ-Nash equilibrium with ϵ < 1/(4k), even in zero-one constant-sum games. In particular, this rules out query-efficient randomized algorithms for finding exact Nash equilibria.
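
For readers unfamiliar with the solution concept, the following is a minimal sketch of how one might check whether a mixed-strategy profile is an ϵ-Nash equilibrium of a bimatrix game with payoffs in [0, 1]. It is not the paper's query algorithm: it assumes both payoff matrices are fully known, and the function names `nash_regret` and `is_epsilon_nash` are illustrative choices, not taken from the source.

```python
import numpy as np

def nash_regret(R, C, x, y):
    """Largest gain either player could get by deviating to a pure strategy.

    R, C: k x k payoff matrices of the row and column player (assumed known).
    x, y: mixed strategies (probability vectors) of the row and column player.
    """
    R = np.asarray(R, dtype=float)
    C = np.asarray(C, dtype=float)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    row_payoffs = R @ y   # payoff of each pure row against y
    col_payoffs = x @ C   # payoff of each pure column against x
    row_regret = row_payoffs.max() - x @ row_payoffs
    col_regret = col_payoffs.max() - col_payoffs @ y
    return float(max(row_regret, col_regret))

def is_epsilon_nash(R, C, x, y, eps):
    """True if no player can gain more than eps by deviating."""
    return nash_regret(R, C, x, y) <= eps + 1e-12
```

In a zero-one constant-sum game such as matching pennies (R = [[1, 0], [0, 1]], C = 1 − R), the uniform profile has regret 0, while a pure profile leaves one player with regret 1.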

    An Empirical Study of Finding Approximate Equilibria in Bimatrix Games

    While there have been a number of studies about the efficacy of methods to find exact Nash equilibria in bimatrix games, there has been little empirical work on finding approximate Nash equilibria. Here we provide such a study that compares a number of approximation methods and exact methods. In particular, we explore the trade-off between the quality of approximate equilibrium and the required running time to find one. We found that the existing library GAMUT, which has been the de facto standard for testing exact methods, is insufficient as a test bed for approximation methods, since many of its games have pure equilibria or other easy-to-find good approximate equilibria. We extend the breadth and depth of our study by including new interesting families of bimatrix games, and studying bimatrix games up to size 2000 × 2000. Finally, we provide new close-to-worst-case examples for the best-performing algorithms for finding approximate Nash equilibria.

    Lenient multi-agent deep reinforcement learning

    Much of the success of single-agent deep reinforcement learning (DRL) in recent years can be attributed to the use of experience replay memories (ERM), which allow Deep Q-Networks (DQNs) to be trained efficiently through sampling stored state transitions. However, care is required when using ERMs for multi-agent deep reinforcement learning (MA-DRL), as stored transitions can become outdated because agents update their policies in parallel [11]. In this work we apply leniency [23] to MA-DRL. Lenient agents map state-action pairs to decaying temperature values that control the amount of leniency applied towards negative policy updates that are sampled from the ERM. This introduces optimism in the value-function update, and has been shown to facilitate cooperation in tabular fully-cooperative multi-agent reinforcement learning problems. We evaluate our Lenient-DQN (LDQN) empirically against the related Hysteretic-DQN (HDQN) algorithm [22] as well as a modified version we call scheduled-HDQN, which uses average reward learning near terminal states. Evaluations take place in extended variations of the Coordinated Multi-Agent Object Transportation Problem (CMOTP) [8], which include fully-cooperative sub-tasks and stochastic rewards. We find that LDQN agents are more likely to converge to the optimal policy in a stochastic reward CMOTP compared to standard and scheduled-HDQN agents.
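
The leniency mechanism described above can be illustrated with a small tabular sketch. This is not the paper's Lenient-DQN: it is a simplified table-based update, and the leniency function `1 - exp(-k * T)`, the constants `k`, `alpha` and `decay`, and all function names are assumptions made for illustration only.

```python
import math

def leniency(temperature, k=2.0):
    """Leniency in [0, 1): a higher temperature gives a more forgiving agent."""
    return 1.0 - math.exp(-k * temperature)

def lenient_update(q, temp, state, action, target, draw,
                   alpha=0.1, k=2.0, decay=0.9):
    """Apply one lenient update to the tables q and temp (both dicts).

    draw is a number in [0, 1), normally sampled uniformly at random; it is
    passed in explicitly here so that the behaviour is reproducible.
    """
    key = (state, action)
    value = q.setdefault(key, 0.0)
    t = temp.setdefault(key, 1.0)
    delta = target - value
    # Positive updates are always applied. Negative updates are applied only
    # when the draw exceeds the current leniency for this state-action pair.
    if delta > 0 or draw > leniency(t, k):
        q[key] = value + alpha * delta
    # The temperature decays on every visit, so leniency fades over time.
    temp[key] = decay * t
    return q[key]
```

With a fresh table, a negative update at temperature 0.9 is ignored for a low draw, because the leniency is still about 0.83; the same update would be applied once the temperature has decayed far enough or the draw is high.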

    The Longitudinal Properties of a Solar Energetic Particle Event Investigated Using Modern Solar Imaging

    We use combined high-cadence, high-resolution, and multi-point imaging by the Solar-Terrestrial Relations Observatory (STEREO) and the Solar and Heliospheric Observatory to investigate the hour-long eruption of a fast and wide coronal mass ejection (CME) on 2011 March 21 when the twin STEREO spacecraft were located beyond the solar limbs. We analyze the relation between the eruption of the CME, the evolution of an Extreme Ultraviolet (EUV) wave, and the onset of a solar energetic particle (SEP) event measured in situ by the STEREO and near-Earth orbiting spacecraft. Combined ultraviolet and white-light images of the lower corona reveal that in an initial CME lateral "expansion phase," the EUV disturbance tracks the laterally expanding flanks of the CME, both moving parallel to the solar surface with speeds of ~450 km s⁻¹. When the lateral expansion of the ejecta ceases, the EUV disturbance carries on propagating parallel to the solar surface but devolves rapidly into a less coherent structure. Multi-point tracking of the CME leading edge and the effects of the launched compression waves (e.g., pushed streamers) give anti-sunward speeds that initially exceed 900 km s⁻¹ at all measured position angles. We combine our analysis of ultraviolet and white-light images with a comprehensive study of the velocity dispersion of energetic particles measured in situ by particle detectors located at STEREO-A (STA) and the first Lagrange point (L1), to demonstrate that the delayed solar particle release times at STA and L1 are consistent with the time required (30-40 minutes) for the CME to perturb the corona over a wide range of longitudes. This study finds an association between the longitudinal extent of the perturbed corona (in EUV and white light) and the longitudinal extent of the SEP event in the heliosphere.

    Traumatic brain injury leads to alterations in contusional cortical miRNAs involved in dementia

    There is compelling evidence that head injury is a significant environmental risk factor for Alzheimer's disease (AD) and that a history of traumatic brain injury (TBI) accelerates the onset of AD. Amyloid-β plaques and tau aggregates have been observed in the post-mortem brains of TBI patients; however, the mechanisms leading to AD neuropathology in TBI are still unknown. In this study, we hypothesized that focal TBI induces changes in miRNA expression in and around affected areas, resulting in the altered expression of genes involved in neurodegeneration and AD pathology. For this purpose, we performed a miRNA array in extracts from rats subjected to experimental TBI, using the controlled cortical impact (CCI) model. In and around the contusion, we observed alterations of miRNAs associated with dementia/AD, compared to the contralateral side. Specifically, the expression of miR-9 was significantly upregulated, while miR-29b, miR-34a, miR-106b, miR-181a and miR-107 were downregulated. Via qPCR, we confirmed these results in an additional group of injured rats when compared to naïve animals. Interestingly, the changes in those miRNAs were concomitant with alterations in the expression of mRNAs involved in amyloid generation and tau pathology, such as β-APP cleaving enzyme (BACE1) and glycogen synthase kinase-3β (GSK3β). In addition, increased levels of neuroinflammatory markers (TNF-α), glial activation, neuronal loss, and tau phosphorylation were observed in pericontusional areas. Therefore, our results suggest that the secondary injury cascade in TBI affects miRNAs regulating the expression of genes involved in AD dementia.

    Speeds and arrival times of solar transients approximated by self-similar expanding circular fronts

    The NASA STEREO mission opened up the possibility to forecast the arrival times, speeds and directions of solar transients from outside the Sun-Earth line. In particular, we are interested in predicting potentially geo-effective Interplanetary Coronal Mass Ejections (ICMEs) from observations of density structures at large observation angles from the Sun (with the STEREO Heliospheric Imager instrument). We contribute to this endeavor by deriving analytical formulas concerning a geometric correction for the ICME speed and arrival time for the technique introduced by Davies et al. (2012, ApJ, in press) called Self-Similar Expansion Fitting (SSEF). This model assumes that a circle propagates outward, along a plane specified by a position angle (e.g. the ecliptic), with constant angular half width (lambda). This is an extension of earlier, simpler models: Fixed-Phi Fitting (lambda = 0 degrees) and Harmonic Mean Fitting (lambda = 90 degrees). In contrast to previous models, this approach makes it possible to assess clearly whether a particular location in the heliosphere, such as a planet or spacecraft, might be expected to be hit by the ICME front. Our correction formulas are especially significant for glancing hits, where small differences in the direction greatly influence the expected speeds (up to 100-200 km/s) and arrival times (up to two days later than the apex). For very wide ICMEs (2 lambda > 120 degrees), the geometric correction becomes very similar to the one derived by Möstl et al. (2011, ApJ, 741, id. 34) for the Harmonic Mean model. These analytic expressions can also be used for empirical or analytical models to predict the 1 AU arrival time of an ICME by correcting for effects of hits by the flank rather than the apex, if the width and direction of the ICME in a plane are known and a circular geometry of the ICME front is assumed. Comment: 15 pages, 5 figures, accepted for publication in "Solar Physics".
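
The relation between the three geometries can be made concrete with a short sketch. The abstract does not state any formula; the elongation-to-distance conversion below is the form commonly quoted for the self-similar expansion geometry and should be read as an assumption, as should the function name and parameter names. It does not reproduce the paper's correction formulas for speed and arrival time.

```python
import math

def sse_apex_distance(elongation_deg, phi_deg, lambda_deg, observer_au=1.0):
    """Radial distance (AU) of the apex of a circular front.

    elongation_deg: observed elongation angle of the front from the Sun.
    phi_deg: angle between the observer-Sun line and the propagation direction.
    lambda_deg: angular half width of the circle as seen from the Sun.
    observer_au: heliocentric distance of the observer.
    """
    eps = math.radians(elongation_deg)
    phi = math.radians(phi_deg)
    lam = math.radians(lambda_deg)
    # Assumed conversion: r = d sin(eps) (1 + sin(lam)) / (sin(eps + phi) + sin(lam))
    return (observer_au * math.sin(eps) * (1.0 + math.sin(lam))
            / (math.sin(eps + phi) + math.sin(lam)))
```

Setting `lambda_deg=0` reduces the expression to the point-like Fixed-Phi case, d sin(ε) / sin(ε + φ), and `lambda_deg=90` reduces it to the Harmonic Mean case, 2 d sin(ε) / (1 + sin(ε + φ)).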