
    Multi-step Reinforcement Learning: A Unifying Algorithm

    Unifying seemingly disparate algorithmic ideas to produce better-performing algorithms has been a longstanding goal in reinforcement learning. As a primary example, TD(λ) elegantly unifies one-step TD prediction with Monte Carlo methods through the use of eligibility traces and the trace-decay parameter λ. Currently, there are a multitude of algorithms that can be used to perform TD control, including Sarsa, Q-learning, and Expected Sarsa. These methods are often studied in the one-step case, but they can be extended across multiple time steps to achieve better performance. Each of these algorithms is seemingly distinct, and no one dominates the others for all problems. In this paper, we study a new multi-step action-value algorithm called Q(σ), which unifies and generalizes these existing algorithms while subsuming them as special cases. A new parameter, σ, is introduced to allow the degree of sampling performed by the algorithm at each step during its backup to be continuously varied, with Sarsa existing at one extreme (full sampling) and Expected Sarsa existing at the other (pure expectation). Q(σ) is generally applicable to both on- and off-policy learning, but in this work we focus on experiments in the on-policy case. Our results show that an intermediate value of σ, which results in a mixture of the existing algorithms, performs better than either extreme. The mixture can also be varied dynamically, which can result in even greater performance. Comment: Appeared at the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18
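
The abstract above describes σ as blending a sampled backup (Sarsa) with an expected backup (Expected Sarsa). The following is a minimal sketch of that idea for the one-step case only, not the paper's multi-step algorithm. The function name `q_sigma_target` and all of its parameters are hypothetical names chosen for illustration, and the linear mixing form is an assumption consistent with the two extremes the abstract names.

```python
from typing import Sequence


def q_sigma_target(
    reward: float,
    gamma: float,
    sigma: float,
    q_next: Sequence[float],   # action values Q(S', a) for every action a (assumed tabular)
    pi_next: Sequence[float],  # policy probabilities pi(a | S'), assumed to sum to 1
    next_action: int,          # the action A' actually sampled in S'
) -> float:
    """One-step backup target mixing a sampled and an expected next value."""
    # Sampled part: value of the action that was actually taken (Sarsa-style).
    sampled = q_next[next_action]
    # Expected part: policy-weighted average over all actions (Expected Sarsa-style).
    expected = sum(p * q for p, q in zip(pi_next, q_next))
    # sigma = 1 gives full sampling, sigma = 0 gives pure expectation.
    return reward + gamma * (sigma * sampled + (1.0 - sigma) * expected)


if __name__ == "__main__":
    q_next = [1.0, 3.0]
    pi_next = [0.5, 0.5]
    for sigma in (1.0, 0.5, 0.0):
        print(sigma, q_sigma_target(1.0, 0.5, sigma, q_next, pi_next, next_action=1))
```

With σ = 1 the target depends only on the sampled action, with σ = 0 it depends only on the policy-weighted average, and intermediate values interpolate between the two.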

    Impurity segregation in graphene nanoribbons

    The electronic properties of low-dimensional materials can be engineered by doping, but in the case of graphene nanoribbons (GNR) the proximity of two symmetry-breaking edges introduces an additional dependence on the location of an impurity across the width of the ribbon. This introduces energetically favorable locations for impurities, leading to a degree of spatial segregation in the impurity concentration. We develop a simple model to calculate the change in energy of a GNR system with an arbitrary impurity as that impurity is moved across the ribbon and validate its findings by comparison with ab initio calculations. Although our results agree with previous works predicting the dominance of edge disorder in GNR, we argue that the distribution of adsorbed impurities across a ribbon may be controllable by external factors, namely an applied electric field. We propose that this control over impurity segregation may allow manipulation and fine-tuning of the magnetic and transport properties of GNRs. Comment: 5 pages, 4 figures, submitte

    The silicon stable isotope distribution along the GEOVIDE section (GEOTRACES GA-01) of the North Atlantic Ocean

    The stable isotope composition of dissolved silicon in seawater (δ³⁰Si_DSi) was examined at 10 stations along the GEOVIDE section (GEOTRACES GA-01), spanning the North Atlantic Ocean (40–60° N) and Labrador Sea. Variations in δ³⁰Si_DSi below 500 m were closely tied to the distribution of water masses. Higher δ³⁰Si_DSi values are associated with intermediate and deep water masses of northern Atlantic or Arctic Ocean origin, whilst lower δ³⁰Si_DSi values are associated with DSi-rich waters sourced ultimately from the Southern Ocean. Correspondingly, the lowest δ³⁰Si_DSi values were observed in the deep and abyssal eastern North Atlantic, where dense southern-sourced waters dominate. The extent to which the spreading of water masses influences the δ³⁰Si_DSi distribution is marked clearly by Labrador Sea Water (LSW), whose high δ³⁰Si_DSi signature is visible not only within its region of formation within the Labrador and Irminger seas, but also throughout the mid-depth western and eastern North Atlantic Ocean. Both δ³⁰Si_DSi and hydrographic parameters document the circulation of LSW into the eastern North Atlantic, where it overlies southern-sourced Lower Deep Water. The GEOVIDE δ³⁰Si_DSi distribution thus provides a clear view of the direct interaction between subpolar/polar water masses of northern and southern origin, and allows examination of the extent to which these far-field signals influence the local δ³⁰Si_DSi distribution.

    Re-estimation of argon isotope ratios leading to a revised estimate of the Boltzmann constant

    In 2013, NPL, SUERC and Cranfield University published an estimate for the Boltzmann constant [1] based on a measurement of the limiting low-pressure speed of sound in argon gas. Subsequently, an extensive investigation by Yang et al [2] revealed that there was likely to have been an error in the estimate of the molar mass of the argon used in the experiment. Responding to [2], de Podesta et al revised their estimate of the molar mass [3]. The shift in the estimated molar mass, and of the estimate of k_B, was large: -2.7 parts in 10⁶, nearly four times the original uncertainty estimate. The work described here was undertaken to understand the cause of this shift, and our conclusion is that the original samples were probably contaminated with argon from atmospheric air.
 In this work we have repeated the measurement reported in [1] on the same gas sample that was examined in [2, 3]. However, in this work we have used a different technique for sampling the gas that has allowed us to eliminate the possibility of contamination of the argon samples. We have repeated the sampling procedure three times, and examined samples on two mass spectrometers. This procedure confirms the isotopic ratio estimates of Yang et al [2] but with lower uncertainty, particularly in the relative abundance ratio R_38:36.
 Our new estimate of the molar mass of the argon used in Isotherm 5 in [1] is 39.947 727(15) g mol⁻¹, which differs by +0.50 parts in 10⁶ from the estimate 39.947 707(28) g mol⁻¹ made in [3]. This new estimate of the molar mass leads to a revised estimate of the Boltzmann constant of k_B = 1.380 648 60(97) × 10⁻²³ J K⁻¹, which differs from the 2014 CODATA value by +0.05 parts in 10⁶.
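
The quoted shift of +0.50 parts in 10⁶ between the two molar mass estimates can be checked with simple arithmetic. The sketch below only reproduces that relative difference from the two values stated in the abstract. The helper name `relative_shift_ppm` is a hypothetical name, and taking the earlier estimate from [3] as the reference value is an assumption.

```python
def relative_shift_ppm(new_value: float, reference_value: float) -> float:
    """Relative difference of new_value from reference_value, in parts per 10^6."""
    return (new_value - reference_value) / reference_value * 1e6


# Molar mass estimates in g/mol, as quoted in the abstract.
molar_mass_new = 39.947727       # this work
molar_mass_previous = 39.947707  # estimate made in [3]

if __name__ == "__main__":
    shift = relative_shift_ppm(molar_mass_new, molar_mass_previous)
    print(f"{shift:+.2f} parts in 10^6")
```

The absolute difference is 0.000 020 g/mol, and dividing by roughly 39.95 g/mol gives about 5.0 × 10⁻⁷, which agrees with the stated +0.50 parts in 10⁶.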

    Early and late systolic wall stress differentially relate to myocardial contraction and relaxation in middle-aged adults: the Asklepios study

    Experimental studies implicate late systolic load as a determinant of impaired left ventricular (LV) relaxation. We aimed to assess the relationship between the myocardial loading sequence and LV contraction and relaxation. Time-resolved central pressure and time-resolved LV geometry were measured with carotid tonometry and speckle-tracking echocardiography, respectively, for computation of time-resolved ejection-phase myocardial wall stress (EP-MWS) among 1,214 middle-aged adults without manifest cardiovascular disease from the general population. Early diastolic and systolic annular velocities were measured with tissue Doppler imaging, and segment-averaged longitudinal strain was measured with speckle-tracking echocardiography. After adjustment for age, gender and potential confounders, late EP-MWS was negatively associated with early diastolic mitral annular velocity (e', standardized β=-0.25; P<0.0001) and mitral inflow propagation velocity (Vpe, standardized β=-0.13; P=0.02). In contrast, early EP-MWS was positively associated with e' (standardized β=0.18; P<0.0001) and Vpe (standardized β=0.22; P<0.0001). A higher late EP-MWS predicted a lower systolic mitral annular velocity (S', standardized β=-0.31; P<0.0001) and lesser myocardial longitudinal strain (standardized β=0.32; P<0.0001), whereas a higher early EP-MWS was associated with a higher S' (standardized β=0.16; P=0.002) and greater longitudinal strain (standardized β=-0.24; P=0.002). The loading sequence remained independently associated with e' after adjustment for S' or systolic longitudinal strain. In the context of available experimental data, our findings support the role of the myocardial loading sequence as a determinant of LV systolic and diastolic function. A loading sequence characterized by prominent late systolic wall stress was associated with lower longitudinal systolic function and diastolic relaxation.

    Opinions on the use of technology to improve tablet taking in >65-year-old patients on cardiovascular medications.

    Objective: This study was performed to evaluate the perceptions of the use of technology to improve cardiovascular medicine taking among patients aged >65 years. Methods: This qualitative study used focus groups with people aged >65 years taking cardiovascular medications from two East London community centres. Thematic analysis was informed by the Perceptions and Practicalities Approach framework. Results: Participants welcomed technologies they considered familiar, accessible, and easy to use. They valued the opportunity to receive alerts to help with forgetting and monitoring their treatment. More advanced technologies such as ingestible sensor systems were considered helpful for elderly people with significant cognitive impairments still living in the community because of improved monitoring by caregivers and clinicians and prolonging independence. Although generally adapting to the increase in technology in everyday life, participants raised a number of concerns that included potential reduction in face-to-face communication, data security, becoming dependent on technology, and worrying about the consequences of technological failure. Conclusions: Participants raised a number of concerns and practical barriers that would need to be addressed for technologies to be accepted and adopted in this patient group.