Multi-step Reinforcement Learning: A Unifying Algorithm
Unifying seemingly disparate algorithmic ideas to produce better performing
algorithms has been a longstanding goal in reinforcement learning. As a primary
example, TD(λ) elegantly unifies one-step TD prediction with Monte
Carlo methods through the use of eligibility traces and the trace-decay
parameter λ. Currently, there are a multitude of algorithms that can be
used to perform TD control, including Sarsa, Q-learning, and Expected Sarsa.
These methods are often studied in the one-step case, but they can be extended
across multiple time steps to achieve better performance. Each of these
algorithms is seemingly distinct, and no one dominates the others for all
problems. In this paper, we study a new multi-step action-value algorithm
called Q(σ), which unifies and generalizes these existing algorithms,
while subsuming them as special cases. A new parameter, σ, is introduced
to allow the degree of sampling performed by the algorithm at each step during
its backup to be continuously varied, with Sarsa existing at one extreme (full
sampling), and Expected Sarsa existing at the other (pure expectation).
Q(σ) is generally applicable to both on- and off-policy learning, but in
this work we focus on experiments in the on-policy case. Our results show that
an intermediate value of σ, which results in a mixture of the existing
algorithms, performs better than either extreme. The mixture can also be varied
dynamically, which can result in even greater performance.
Comment: Appeared at the Thirty-Second AAAI Conference on Artificial
Intelligence (AAAI-18)
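The role of the sampling parameter described in this abstract can be illustrated with a minimal sketch of a one-step backup target, assuming the target blends the sampled next action value (as in Sarsa) with the expectation over the policy (as in Expected Sarsa). The abstract concerns a multi-step algorithm and gives no formulas, so the function name q_sigma_target, its arguments, and the numeric values below are all hypothetical and chosen only for illustration.

```python
def q_sigma_target(reward, gamma, sigma, q_next, pi_next, next_action):
    """One-step backup target mixing sampling and expectation (illustrative).

    sigma = 1.0 gives the Sarsa target (full sampling);
    sigma = 0.0 gives the Expected Sarsa target (pure expectation).
    """
    if not 0.0 <= sigma <= 1.0:
        raise ValueError("sigma must lie in [0, 1]")
    # Sampled part: value of the action actually taken in the next state.
    sampled = q_next[next_action]
    # Expected part: policy-weighted average over all next actions.
    expected = sum(p * q for p, q in zip(pi_next, q_next))
    return reward + gamma * (sigma * sampled + (1.0 - sigma) * expected)


# Hypothetical next-state action values and policy probabilities.
q_next = [1.0, 3.0]
pi_next = [0.5, 0.5]

sarsa_target = q_sigma_target(1.0, 0.5, 1.0, q_next, pi_next, next_action=1)
expected_sarsa_target = q_sigma_target(1.0, 0.5, 0.0, q_next, pi_next, next_action=1)
mixed_target = q_sigma_target(1.0, 0.5, 0.5, q_next, pi_next, next_action=1)
```

With these made-up inputs the intermediate setting yields a target lying between the two extremes, which is the sense in which an intermediate σ mixes the existing algorithms.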
Impurity segregation in graphene nanoribbons
The electronic properties of low-dimensional materials can be engineered by
doping, but in the case of graphene nanoribbons (GNR) the proximity of two
symmetry-breaking edges introduces an additional dependence on the location of
an impurity across the width of the ribbon. This introduces energetically
favorable locations for impurities, leading to a degree of spatial segregation
in the impurity concentration. We develop a simple model to calculate the
change in energy of a GNR system with an arbitrary impurity as that impurity is
moved across the ribbon and validate its findings by comparison with ab initio
calculations. Although our results agree with previous works predicting the
dominance of edge disorder in GNR, we argue that the distribution of adsorbed
impurities across a ribbon may be controllable by external factors, namely an
applied electric field. We propose that this control over impurity segregation
may allow manipulation and fine-tuning of the magnetic and transport properties
of GNRs.
Comment: 5 pages, 4 figures, submitte
The silicon stable isotope distribution along the GEOVIDE section (GEOTRACES GA-01) of the North Atlantic Ocean
The stable isotope composition of dissolved silicon in seawater (δ³⁰Si_DSi) was examined at 10 stations along the GEOVIDE section (GEOTRACES GA-01), spanning the North Atlantic Ocean (40–60° N) and Labrador Sea. Variations in δ³⁰Si_DSi below 500 m were closely tied to the distribution of water masses. Higher δ³⁰Si_DSi values are associated with intermediate and deep water masses of northern Atlantic or Arctic Ocean origin, whilst lower δ³⁰Si_DSi values are associated with DSi-rich waters sourced ultimately from the Southern Ocean. Correspondingly, the lowest δ³⁰Si_DSi values were observed in the deep and abyssal eastern North Atlantic, where dense southern-sourced waters dominate. The extent to which the spreading of water masses influences the δ³⁰Si_DSi distribution is marked clearly by Labrador Sea Water (LSW), whose high δ³⁰Si_DSi signature is visible not only within its region of formation within the Labrador and Irminger seas, but also throughout the mid-depth western and eastern North Atlantic Ocean. Both δ³⁰Si_DSi and hydrographic parameters document the circulation of LSW into the eastern North Atlantic, where it overlies southern-sourced Lower Deep Water. The GEOVIDE δ³⁰Si_DSi distribution thus provides a clear view of the direct interaction between subpolar/polar water masses of northern and southern origin, and allows examination of the extent to which these far-field signals influence the local δ³⁰Si_DSi distribution.
Re-estimation of argon isotope ratios leading to a revised estimate of the Boltzmann constant
In 2013, NPL, SUERC and Cranfield University published an estimate for the Boltzmann constant [1] based on a measurement of the limiting low-pressure speed of sound in argon gas. Subsequently, an extensive investigation by Yang et al. [2] revealed that there was likely to have been an error in the estimate of the molar mass of the argon used in the experiment. Responding to [2], de Podesta et al. revised their estimate of the molar mass [3]. The shift in the estimated molar mass, and of the estimate of k_B, was large: -2.7 parts in 10⁶, nearly four times the original uncertainty estimate. The work described here was undertaken to understand the cause of this shift, and our conclusion is that the original samples were probably contaminated with argon from atmospheric air. In this work we have repeated the measurement reported in [1] on the same gas sample that was examined in [2, 3]. However, in this work we have used a different technique for sampling the gas that has allowed us to eliminate the possibility of contamination of the argon samples. We have repeated the sampling procedure three times, and examined samples on two mass spectrometers. This procedure confirms the isotopic ratio estimates of Yang et al. [2] but with lower uncertainty, particularly in the relative abundance ratio R_38:36. Our new estimate of the molar mass of the argon used in Isotherm 5 in [1] is 39.947 727(15) g mol⁻¹, which differs by +0.50 parts in 10⁶ from the estimate 39.947 707(28) g mol⁻¹ made in [3]. This new estimate of the molar mass leads to a revised estimate of the Boltzmann constant of k_B = 1.380 648 60 (97) × 10⁻²³ J K⁻¹, which differs from the 2014 CODATA value by +0.05 parts in 10⁶.
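The "parts in 10⁶" figures in this abstract are relative differences, and the quoted molar-mass shift can be checked with a short calculation. The sketch below is only an arithmetic illustration using the two molar-mass values and the Boltzmann constant estimate stated in the abstract; the helper names relative_difference_ppm and relative_uncertainty_ppm are hypothetical, and the parenthesised digits are assumed to be the uncertainty in the last quoted digits.

```python
def relative_difference_ppm(new, old):
    """Relative difference of new from old, in parts per million."""
    return (new - old) / old * 1e6


def relative_uncertainty_ppm(value, uncertainty):
    """Uncertainty as a fraction of the value, in parts per million."""
    return uncertainty / value * 1e6


# Molar mass of the argon sample in g/mol: new estimate vs. the estimate in [3].
molar_mass_shift = relative_difference_ppm(39.947727, 39.947707)

# Revised Boltzmann constant in J/K, reading "(97)" as the uncertainty
# in the last two quoted digits (an assumption about the notation).
kb_relative_uncertainty = relative_uncertainty_ppm(1.38064860e-23, 0.00000097e-23)

print(f"molar mass shift: {molar_mass_shift:+.2f} ppm")
print(f"k_B relative uncertainty: {kb_relative_uncertainty:.2f} ppm")
```

The first result reproduces the +0.50 parts in 10⁶ stated in the abstract.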
Early and late systolic wall stress differentially relate to myocardial contraction and relaxation in middle-aged adults: the Asklepios study
Experimental studies implicate late systolic load as a determinant of impaired left ventricular (LV) relaxation. We aimed to assess the relationship between the myocardial loading sequence and LV contraction and relaxation. Time-resolved central pressure and time-resolved LV geometry were measured with carotid tonometry and speckle-tracking echocardiography, respectively, for computation of time-resolved ejection-phase myocardial wall stress (EP-MWS) among 1,214 middle-aged adults without manifest cardiovascular disease from the general population. Early diastolic and systolic annular velocities were measured with tissue Doppler imaging, and segment-averaged longitudinal strain was measured with speckle-tracking echocardiography. After adjustment for age, gender and potential confounders, late EP-MWS was negatively associated with early diastolic mitral annular velocity (e', standardized β=-0.25; P<0.0001) and mitral inflow propagation velocity (Vpe, standardized β=-0.13; P=0.02). In contrast, early EP-MWS was positively associated with e' (standardized β=0.18; P<0.0001) and Vpe (standardized β=0.22; P<0.0001). A higher late EP-MWS predicted a lower systolic mitral annular velocity (S', standardized β=-0.31; P<0.0001) and lesser myocardial longitudinal strain (standardized β=0.32; P<0.0001), whereas a higher early EP-MWS was associated with a higher S' (standardized β=0.16; P=0.002) and greater longitudinal strain (standardized β=-0.24; P=0.002). The loading sequence remained independently associated with e' after adjustment for S' or systolic longitudinal strain. In the context of available experimental data, our findings support the role of the myocardial loading sequence as a determinant of LV systolic and diastolic function. A loading sequence characterized by prominent late systolic wall stress was associated with lower longitudinal systolic function and diastolic relaxation.
Opinions on the use of technology to improve tablet taking in >65-year-old patients on cardiovascular medications.
Objective: This study was performed to evaluate the perceptions of the use of technology to improve cardiovascular medicine taking among patients aged >65 years. Methods: This qualitative study used focus groups with people aged >65 years taking cardiovascular medications from two East London community centres. Thematic analysis was informed by the Perceptions and Practicalities Approach framework. Results: Participants welcomed technologies they considered familiar, accessible, and easy to use. They valued the opportunity to receive alerts to help with forgetting and monitoring their treatment. More advanced technologies such as ingestible sensor systems were considered helpful for elderly people with significant cognitive impairments still living in the community because of improved monitoring by caregivers and clinicians and prolonging independence. Although generally adapting to the increase in technology in everyday life, participants raised a number of concerns that included potential reduction in face-to-face communication, data security, becoming dependent on technology, and worrying about the consequences of technological failure. Conclusions: Participants raised a number of concerns and practical barriers that would need to be addressed for technologies to be accepted and adopted in this patient group.