Regret Bounds for Reinforcement Learning with Policy Advice

Author: C. Tekin
M.L. Puterman
N. Cesa-Bianchi
R. Ortner
R.S. Sutton
T. Jaksch
Publication date: 01/01/2013

In some reinforcement learning problems an agent may be provided with a set of input policies, perhaps learned from prior experience or provided by advisors. We present a reinforcement learning with policy advice (RLPA) algorithm which leverages this input set and learns to use the best policy in the set for the reinforcement learning task at hand. We prove that RLPA has a sub-linear regret of \tilde O(\sqrt{T}) relative to the best input policy, and that both this regret and its computational complexity are independent of the size of the state and action spaces. Our empirical simulations support our theoretical analysis. This suggests RLPA may offer significant advantages in large domains where some good prior policies are provided.

arXiv.org e-Print Archive

HAL - Lille 3

Crossref

INRIA a CCSD electronic archive server

Reconstruction of deglacial sea surface temperatures in the tropical Pacific from selective analysis of a fossil coral

Author: Allison N.
Ellam R. M.
Finch A. A.
Newville M.
Sutton S. R.
Tudhope A. W.
Publication date: 01/01/2005

The Sr/Ca of coral skeletons demonstrates potential as an indicator of sea surface temperatures (SSTs). However, the glacial-interglacial SST ranges predicted from Sr/Ca of fossil corals are usually higher than from other marine proxies. We observed infilling of secondary aragonite, characterised by high Sr/Ca ratios, along intraskeletal pores of a fossil coral from Papua New Guinea that grew during the penultimate deglaciation (130 ± 2 ka). Selective microanalysis of unaltered areas of the fossil coral indicates that SSTs at ~130 ka were ≤1 °C cooler than at present, in contrast with bulk measurements (combining infilled and unaltered areas), which indicate a difference of 6-7 °C. The analysis of unaltered areas of fossil skeletons by microprobe techniques may offer a route to more accurate reconstruction of past SSTs.

Edinburgh Research Explorer

Enlighten

University of St. Andrews - Pure

Validity and practical utility of accelerometry for the measurement of in-hand physical activity in horses

Author: Carnwath J.
Horsfield E.
Hunter-Blair N.
Morrison R.
Ramsoy C.
Sutton D. G. M.
Yam P. S.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 11/09/2015

Background: Accelerometers are valid, practical and reliable tools for the measurement of habitual physical activity (PA). Quantification of PA in horses is desirable for use in research and clinical settings. The objective of this study was to evaluate a triaxial accelerometer for objective measurement of PA in the horse by assessment of its practical utility and validity. Horses were recruited to establish the optimal site of accelerometer attachment, and a questionnaire was designed to explore owner acceptance. Validity and cut-off values were obtained by assessing PA at various gaits. Validation study: 20 horses wore the accelerometer while being filmed for 10 min each of rest, walking and trotting and 5 min of canter work. Practical utility study: five horses wore accelerometers on polls and withers for 18 h; compliance and relative data losses were quantified. Results: Accelerometry output differed significantly between the four PA levels (P < 0.001) for both withers and poll placement. For withers placement, ROC analyses found optimal sensitivity and specificity at a cut-off of <47 counts per minute (cpm) for rest (sensitivity 99.5 %, specificity 100 %), 967–2424 cpm for trotting (sensitivity 96.7 %, specificity 100 %) and ≥2425 cpm for cantering (sensitivity 96.0 %, specificity 97.0 %). Attachment at the poll resulted in optimal sensitivity and specificity at a cut-off of <707 cpm for rest (sensitivity 97.5 %, specificity 99.6 %), 1546–2609 cpm for trotting (sensitivity 90.33 %, specificity 79.25 %) and ≥2610 cpm for cantering (sensitivity 100 %, specificity 100 %). In terms of practical utility, accelerometry was well tolerated and owner acceptance was high. Conclusion: Accelerometry data correlated well with varying levels of in-hand equine activity. The use of accelerometers is a valid method for objective measurement of controlled PA in the horse.
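As an illustration only, the withers-placement cut-offs quoted in the abstract above can be read as a simple threshold classifier. The sketch below is not the authors' code; the function name `classify_withers_cpm` and the label "unclassified" are hypothetical. The abstract gives no cut-off for walking, so counts between the rest and trot thresholds are deliberately left unlabelled rather than assumed to be walking.

```python
# Withers-placement cut-offs in counts per minute (cpm), as quoted in the abstract.
REST_BELOW = 47      # rest: < 47 cpm
TROT_MIN = 967       # trotting: 967-2424 cpm
CANTER_MIN = 2425    # cantering: >= 2425 cpm


def classify_withers_cpm(cpm: float) -> str:
    """Map an accelerometer count rate (withers placement) to an activity label.

    Hypothetical helper for illustration. Values between the rest and trot
    cut-offs are returned as "unclassified" because the abstract does not
    state a walking range.
    """
    if cpm < 0:
        raise ValueError("counts per minute cannot be negative")
    if cpm < REST_BELOW:
        return "rest"
    if cpm >= CANTER_MIN:
        return "canter"
    if cpm >= TROT_MIN:
        # Covers 967 up to (but not including) 2425, so fractional
        # values between 2424 and 2425 are not left in a gap.
        return "trot"
    return "unclassified"


if __name__ == "__main__":
    for value in (10, 500, 1200, 3000):
        print(value, classify_withers_cpm(value))
```

The poll-placement cut-offs (<707, 1546–2609 and ≥2610 cpm) could be substituted for the three constants in the same way.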

Crossref

Springer - Publisher Connector

PubMed Central

Enlighten

Modelling the hepatitis B vaccination programme in prisons

Author: Andrews Nicholas J.
Edmunds W. John
Gay N. J.
Gilbert R. L.
Gill O. N.
Hope V. D.
Piper M.
Sutton A. J.
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/01/2005

A vaccination programme offering hepatitis B (HBV) vaccine at reception into prison has been introduced into selected prisons in England and Wales. Over the coming years it is anticipated that this vaccination programme will be extended. A model has been developed to assess the potential impact of the programme on the vaccination coverage of prisoners, ex-prisoners, and injecting drug users (IDUs). Under a range of coverage scenarios, the model predicts the change over time in the vaccination status of new entrants to prison, current prisoners and IDUs in the community. The model predicts that, at baseline, 57% of the IDU population will be vaccinated in 2012, with up to 72% being vaccinated depending on the vaccination scenario implemented. These results are sensitive to the size of the IDU population in England and Wales and the average time served by an IDU during each prison visit. IDUs who do not receive HBV vaccine in the community are at increased risk from HBV infection. The HBV vaccination programme in prisons is an effective way of vaccinating this hard-to-reach population, although vaccination coverage on prison reception must be increased to achieve this.

CiteSeerX

Warwick Research Archives Portal Repository