4,725 research outputs found

Perseus: Randomized Point-based Value Iteration for POMDPs

Author: Spaan M. T. J.
Vlassis N.
Publication venue: 'AI Access Foundation'
Publication date: 09/09/2011

Partially observable Markov decision processes (POMDPs) form an attractive and principled framework for agent planning under uncertainty. Point-based approximate techniques for POMDPs compute a policy based on a finite set of points collected in advance from the agent's belief space. We present a randomized point-based value iteration algorithm called Perseus. The algorithm performs approximate value backup stages, ensuring that in each backup stage the value of each point in the belief set is improved; the key observation is that a single backup may improve the value of many belief points. Contrary to other point-based methods, Perseus backs up only a (randomly selected) subset of points in the belief set, sufficient for improving the value of each belief point in the set. We show how the same idea can be extended to deal with continuous action spaces. Experimental results show the potential of Perseus in large-scale POMDP problems.
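The backup stage described in this abstract can be sketched roughly as follows. This is a minimal illustration written for this listing, not the authors' code: the function names and the nested-list layout of the model (`T`, `O`, `R`, `gamma`) are assumptions, and a discrete POMDP with a small state space is assumed.

```python
import random

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def value(b, V):
    """Value of belief b under a set of alpha vectors V."""
    return max(dot(b, alpha) for alpha in V)

def backup(b, V, T, O, R, gamma):
    """Point-based Bellman backup at belief b.

    Assumed layout: T[a][s][s2] = P(s2 | s, a), O[a][s2][o] = P(o | s2, a),
    R[a][s] = immediate reward.
    """
    n_s, n_o = len(b), len(O[0][0])
    best = None
    for a in range(len(T)):
        g_a = list(R[a])
        for o in range(n_o):
            # Project every alpha vector back through (a, o); keep the best one at b.
            cands = [[sum(T[a][s][s2] * O[a][s2][o] * alpha[s2] for s2 in range(n_s))
                      for s in range(n_s)] for alpha in V]
            g_ao = max(cands, key=lambda g: dot(b, g))
            g_a = [x + gamma * y for x, y in zip(g_a, g_ao)]
        if best is None or dot(b, g_a) > dot(b, best):
            best = g_a
    return best

def perseus_stage(B, V, T, O, R, gamma, rng):
    """One backup stage: back up randomly chosen beliefs until no point in B got worse."""
    V_new, todo = [], list(range(len(B)))
    while todo:
        b = B[rng.choice(todo)]          # sample among not-yet-improved points
        alpha = backup(b, V, T, O, R, gamma)
        if dot(b, alpha) < value(b, V):  # backup did not help: keep the old best vector
            alpha = max(V, key=lambda v: dot(b, v))
        V_new.append(alpha)
        # Only points whose value is still below the previous stage remain.
        todo = [i for i in todo if value(B[i], V_new) < value(B[i], V)]
    return V_new
```

Because one new vector may raise the value of many beliefs at once, the loop typically ends after far fewer than `len(B)` backups, which is the point the abstract makes.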

arXiv.org e-Print Archive

Crossref

Scalable Safe Policy Improvement via Monte Carlo Tree Search

Author: A. Castellini
A. Farinelli
E. Zorzi
F. Bianchi
M. T. J. Spaan
T. D. Simao
Publication venue: PMLR
Publication date: 01/01/2023

Algorithms for safely improving policies are important for deploying reinforcement learning approaches in real-world scenarios. In this work, we propose an algorithm, called MCTS-SPIBB, that computes safe policy improvement online using a Monte Carlo Tree Search based strategy. We theoretically prove that the policy generated by MCTS-SPIBB converges, as the number of simulations grows, to the optimal safely improved policy generated by Safe Policy Improvement with Baseline Bootstrapping (SPIBB), a popular algorithm based on policy iteration. Moreover, our empirical analysis performed on three standard benchmark domains shows that MCTS-SPIBB scales to significantly larger problems than SPIBB because it computes the policy online and locally, i.e., only in the states actually visited by the agent.

Catalogo dei prodotti della ricerca

The LHCb Outer Tracker is a gaseous detector covering an area of $5\times 6$ m$^2$ with 12 double layers of straw tubes. The performance of the detector is presented based on data of the LHC Run 2 running period from 2015 and 2016. Occupancies and operational experience for data collected in $pp$, pPb and PbPb collisions are described. An updated study of the ageing effects is presented showing no signs of gain deterioration or other radiation damage effects. In addition several improvements with respect to LHC Run 1 data taking are introduced. A novel real-time calibration of the time-alignment of the detector and the alignment of the single monolayers composing detector modules are presented, improving the drift-time and position resolution of the detector by 20%. Finally, a potential use of the improved resolution for the timing of charged tracks is described, showing the possibility to identify low-momentum hadrons with their time-of-flight. Comment: 29 pages, 20 figures, minor changes to match the published version

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università di Cagliari

CERN Document Server

Parameter-Independent Strategies for pMDPs via POMDPs

Author: A Lukina
C Baier
C Daws
C Dehnert
D Beyer
E Bartocci
E Polgreen
EM Hahn
J Aspnes
K Chatterjee
LI Sennott
M Baldi
M Cubuktepe
M Kwiatkowska
MTJ Spaan
N Jansen
O Madani
PR Halmos
R Lanotte
S Pathak
S Russell
T Quatmann
V Kreinovich
Publication date: 01/01/2018

Markov Decision Processes (MDPs) are a popular class of models suitable for solving control decision problems in probabilistic reactive systems. We consider parametric MDPs (pMDPs) that include parameters in some of the transition probabilities to account for stochastic uncertainties of the environment such as noise or input disturbances. We study pMDPs with reachability objectives where the parameter values are unknown and impossible to measure directly during execution, but a probability distribution over the parameter values is known. We study, for the first time, the computation of parameter-independent strategies that are expectation optimal, i.e., that optimize the expected reachability probability under the probability distribution over the parameters. We present an encoding of our problem to partially observable MDPs (POMDPs), i.e., a reduction of our problem to computing optimal strategies in POMDPs. We evaluate our method experimentally on several benchmarks: a motivating (repeated) learner model; a series of benchmarks of varying configurations of a robot moving on a grid; and a consensus protocol. Comment: Extended version of a QEST 2018 paper
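The notion of an expectation-optimal, parameter-independent strategy can be illustrated with a toy calculation. This is only a sketch under strong assumptions that are not from the abstract: the parameter takes finitely many values, and the reachability probability of each candidate strategy under each parameter value is already known. The names `expected_reach`, `expectation_optimal`, `reach` and `prior` are hypothetical, and this does not reproduce the paper's POMDP encoding.

```python
def expected_reach(reach_by_param, prior):
    """Prior-weighted reachability probability of one fixed strategy.

    reach_by_param[theta] = reachability probability under parameter value theta,
    prior[theta]          = probability of that parameter value.
    """
    return sum(prior[theta] * p for theta, p in reach_by_param.items())

def expectation_optimal(reach, prior):
    """Pick the strategy with the highest expected reachability.

    reach[strategy][theta] holds the per-parameter reachability probabilities.
    The chosen strategy does not depend on the (unobserved) parameter value.
    """
    return max(reach, key=lambda s: expected_reach(reach[s], prior))
```

A strategy that is best for one particular parameter value need not be the one selected here, since the choice is made against the whole distribution.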

arXiv.org e-Print Archive

Crossref

Publikationsserver der RWTH Aachen University

IST Austria: PubRep (Institute of Science and Technology)

Dependence of Intramyocardial Pressure and Coronary Flow on Ventricular Loading and Contractility: A Model Study

Author: D. S. Fokkema
D. Zinemanas
E. M. Downey
Ee Kouwenhoven
F. C. P. Yin
Frans N. van De Vosse
J. A. E. Spaan
J. A. Spaan
J. B. Bassingthwaighte
J. M. Huyghe
J. Rijcken
L. S. Mihailescu
M. A. Vis
M. J. Giezeman
M. Vendelin
P. Bruinsma
P. M. Janssen
P. P. Tombe de
P. Pagliaro
Peter H. M. Bovendeerd
Petra Borsje
R. Beyar
R. Krams
S. Nikolić
T. Arts
Theo Arts
W. M. Chilian
Publication venue: Kluwer Academic Publishers-Plenum Publishers
Publication date: 01/01/2006

The phasic coronary arterial inflow during the normal cardiac cycle has been explained with simple (waterfall, intramyocardial pump) models, emphasizing the role of ventricular pressure. To explain changes in isovolumic and low afterload beats, these models were extended with the effect of three-dimensional wall stress, nonlinear characteristics of the coronary bed, and extravascular fluid exchange. With the associated increase in the number of model parameters, a detailed parameter sensitivity analysis has become difficult. Therefore, we investigated the primary relations between ventricular pressure and volume, wall stress, intramyocardial pressure and coronary blood flow, using a mathematical model with a limited number of parameters. The model replicates several experimental observations: the phasic character of coronary inflow is virtually independent of maximum ventricular pressure, the amplitude of the coronary flow signal varies approximately in proportion to cardiac contractility, and intramyocardial pressure in the ventricular wall may exceed ventricular pressure. A parameter sensitivity analysis shows that the normalized amplitude of coronary inflow is mainly determined by contractility, reflected in ventricular pressure and, at low ventricular volumes, radial wall stress. Normalized flow amplitude is less sensitive to myocardial coronary compliance and resistance, and to the relation between active fiber stress, time, and sarcomere shortening velocity.

Repository TU/e

Crossref

Springer - Publisher Connector

Pure OAI Repository

PubMed Central

Determination of the Michel Parameters rho, xi, and delta in tau-Lepton Decays with tau --> rho nu Tags

Author: Albrecht H.
ARGUS Collaboration
Balagura V.
Barsuk S.
Belyaev I.
Bracko M.
Chistov R.
Danilov M.
Eckmann R.
Ehret K.
Eiges V.
Frankl C.
Gershtein L.
Gershtein Yu.
Golutvin A.
Graf J.
Hamacher T.
Hast C.
Hofmann R. P.
Hofmann W.
Hupper A.
Igonkina O.
Kapitza H.
Kernel G.
Kirchhoff T.
Knopfle K. T.
Kolanoski H.
Korolko I.
Kosche A.
Kostina G.
Krieger P.
Krizan P.
Kriznic E.
Kuipers H.
Lange Arnd
Lindner A.
Litvintsev D.
MacFarlane D. B.
Mai O.
Mankel R.
Medin G.
Mundt R.
Nau A.
Nowak S.
Oest T.
Pakhlov P.
Podobnik T.
Prentice J. D.
Reim K.
Reiner R.
Ressing D.
Rohde A.
Saull P. R. B.
Schieber M.
Schmidt-Parzefall W.
Schmidtler M.
Schneider M.
Schramm M.
Schroder H.
Schubert Klaus R.
Schulz H. D.
Schwierz R.
Semenov S.
Siegmund T.
Snizhko A.
Spaan B.
Spengler J.
Stiewe J.
Thurn H.
Tichomirov I.
Topfer D.
Tzamariudaki K.
Van de Water Richard George
Waldi R.
Walter M.
Wegener D.
Wegener H.
Werner S.
Weseler S.
Wurth R.
Yoon T. S.
Zaitsev Yu.
Zivko T.
Publication venue: 'Elsevier BV'
Publication date: 27/11/1997

Using the ARGUS detector at the $e^+ e^-$ storage ring DORIS II, we have measured the Michel parameters $\rho$, $\xi$, and $\xi\delta$ for $\tau^{\pm}\to l^{\pm} \nu\bar\nu$ decays in $\tau$-pair events produced at center of mass energies in the region of the $\Upsilon$ resonances. Using $\tau^\mp \to \rho^\mp \nu$ as spin analyzing tags, we find $\rho_{e}=0.68\pm 0.04 \pm 0.08$, $\xi_{e}= 1.12 \pm 0.20 \pm 0.09$, $\xi\delta_{e}= 0.57 \pm 0.14 \pm 0.07$, $\rho_{\mu}= 0.69 \pm 0.06 \pm 0.08$, $\xi_{\mu}= 1.25 \pm 0.27 \pm 0.14$, and $\xi\delta_{\mu}= 0.72 \pm 0.18 \pm 0.10$. In addition, we report the combined ARGUS results on $\rho$, $\xi$, and $\xi\delta$ using this work and previous measurements. Comment: 10 pages, well formatted postscript can be found at http://pktw06.phy.tu-dresden.de/iktp/pub/desy97-194.p

arXiv.org e-Print Archive

DESY

Bounded approximations for linear multi-objective planning under uncertainty.

Author: Diederik M Roijers
Frans A Oliehoek
Joris Scharpff
Mathijs M De Weerdt
Matthijs T J Spaan
Shimon Whiteson
Publication date: 01/01/2014

Planning under uncertainty poses a complex problem in which multiple objectives often need to be balanced. When dealing with multiple objectives, it is often assumed that the relative importance of the objectives is known a priori. However, in practice human decision makers often find it hard to specify such preferences exactly, and would prefer a decision support system that presents a range of possible alternatives. We propose two algorithms for computing these alternatives for the case of linearly weighted objectives. First, we propose an anytime method, approximate optimistic linear support (AOLS), that incrementally builds up a complete set of $\epsilon$-optimal plans, exploiting the piecewise-linear and convex shape of the value function. Second, we propose an approximate anytime method, scalarised sample incremental improvement (SSII), that employs weight sampling to focus on the most interesting regions in weight space, as suggested by a prior over preferences. We show empirically that our methods are able to produce (near-)optimal alternative sets orders of magnitude faster than existing techniques, thereby demonstrating that our methods provide sensible approximations in stochastic multi-objective domains.
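The piecewise-linear and convex shape mentioned in this abstract follows from linear scalarisation: for a weight vector w, the scalarised value is the maximum over plans of the inner product of w with each plan's value vector. The sketch below only illustrates that property. It is not AOLS or SSII, and the names `scalarised_value`, `best_plan` and the example value vectors are assumptions made for this illustration.

```python
def scalarised_value(w, value_vector):
    """Linearly weighted value of one plan: w . V."""
    return sum(wi * vi for wi, vi in zip(w, value_vector))

def best_plan(w, plans):
    """Return (name, value) of the plan maximising the scalarised value at weight w.

    plans maps a plan name to its multi-objective value vector. The resulting
    function of w is a maximum of linear functions, hence piecewise linear and convex.
    """
    name = max(plans, key=lambda p: scalarised_value(w, plans[p]))
    return name, scalarised_value(w, plans[name])

def optimal_set(weights, plans):
    """Names of the plans that are optimal for at least one of the sampled weights."""
    return {best_plan(w, plans)[0] for w in weights}
```

Sweeping `w` over a grid or a sample from a prior, as `optimal_set` does, shows which plans would be offered to a decision maker as alternatives and which are never optimal under any sampled weighting.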

CiteSeerX

Observation of the Isospin-Violating Decay $D_s^{*+}\to D_s^+\pi^0$

Author: Alam M.
Alexander J.
Ammar R.
Artuso M.
Asner D.
Athanas M.
Avery P.
Balest R.
Baringer P.
Barish B.
Bartelt J.
Bean A.
Bebek C.
Bellerive A.
Berger B.
Bergfeld T.
Berkelman K.
Besson D.
Bishai M.
Bliss D.
Bloom K.
Brandenburg G.
Britton D.
Browder T.
Brower W.
Cassel D.
Chadha M.
Chan S.
Cho H.
Cho K.
Cinabro D.
Coan T.
Coffman D.
Coppage D.
Copty N.
Cowen D.
Crawford G.
Crowcroft D.
Csorna S.
Davis R.
Dickson M.
Dominick J.
Drell P.
Dumas D.
Edwards K.
Ehrlich R.
Eigen G.
Eisenstein B.
Elia R.
Ernst J.
Fadeyev V.
Fast J.
Ford W.
Freyberger A.
Fu X.
Fujino D.
Fulton R.
Gaidarev P.
Gan K.
Gao M.
Garcia-Sciveres M.
Gerndt E.
Gibaut D.
Gibbons L.
Gittelman B.
Gladding G.
Goldberg M.
Gollin G.
Gray S.
Gronberg J.
Hancock N.
Hartill D.
He D.
Heltsley B.
Henderson S.
Hinson J.
Honscheid K.
Horwitz N.
Hyatt E.
Jain V.
Janicek R.
Johnson S.
Jones C.
Jones S.
Kagan H.
Kandaswamy J.
Kass R.
Katayama N.
Kim I.
Kim P.
Kinoshita K.
Kopp S.
Korolkov I.
Korte C.
Kotov S.
Kravchenko I.
Kreinick D.
Kubota Y.
Kutschke R.
Kwak N.
Kwon Y.
Lambrecht M.
Lattery M.
Lee J.
Lee T.
Ling Z.
Lingel K.
Liu T.
Liu Y.
Lohner M.
Ludwig G.
MacFarlane D.
Mahmood A.
Marka S.
Masek G.
Masui J.
McLean K.
Menary S.
Mevissen J.
Miao T.
Miller D.
Miller J.
Mistry N.
Modesitt M.
Momayezi M.
Moneti G.
Morrison R.
Mountain R.
Muheim F.
Mukhin Y.
Nakanishi S.
Nelson H.
Nelson J.
Nelson T.
Nemati B.
Ng C.
Nordberg E.
O'Grady C.
O'Neill J.
Ogg M.
Paar H.
Palmer M.
Park H.
Patel P.
Patterson J.
Patton S.
Peterson D.
Playfer S.
Poling R.
Pomianowski P.
Prescott C.
Qiao C.
Rankin P.
Richman J.
Riley D.
Roberts D.
Roberts S.
Rodriguez J.
Ross W.
Ryd A.
Sadoff A.
Sanghera S.
Saulnier M.
Savinov V.
Schrenk S.
Selen M.
Severini H.
Shelkov V.
Shibata E.
Shipsey I.
Skubic P.
Skwarnicki T.
Smith J.
Soffer A.
Spaan B.
Stone S.
Stroynowski R.
Sun C.
Sung M.
Tajima H.
Thaler J.
Thorndike E.
Urheim J.
Volobouev I.
Wang P.
Wang R.
Wappler F.
Wei G.
Weinstein A.
White C.
Wilson R.
Witherell M.
Wolf A.
Wood M.
Würthwein F.
Xing X.
Yamamoto H.
Yang S.
Yelton J.
Zoeller M.
Publication venue: 'American Physical Society (APS)'
Publication date: 01/08/1995

Using data collected with the CLEO II detector, we have observed the isospin-violating decay $D_s^{*+}\to D_s^+\pi^0$. The decay rate for this mode, relative to the dominant radiative decay, is found to be $\Gamma(D_s^{*+}\to D_s^+\pi^0)/\Gamma(D_s^{*+}\to D_s^+\gamma)= 0.062^{+0.020}_{-0.018}\pm0.022$. Comment: 8 page uuencoded postscript file, also available through http://w4.lns.cornell.edu/public/CLN

arXiv.org e-Print Archive

Crossref

Enlighten