Search CORE

1,065 research outputs found

Thompson Sampling: An Asymptotically Optimal Finite Time Analysis

Author: A. Salomon
B.C. May
J.-Y. Audibert
O.C. Granmo
P. Auer
T.L. Lai
W.R. Thompson
Publication date: 01/01/2012

The question of the optimality of Thompson Sampling for solving the stochastic multi-armed bandit problem had been open since 1933. In this paper we answer it positively for the case of Bernoulli rewards by providing the first finite-time analysis that matches the asymptotic rate given in the Lai and Robbins lower bound for the cumulative regret. The proof is accompanied by a numerical comparison with other optimal policies, experiments that have been lacking in the literature until now for the Bernoulli case. Comment: 15 pages, 2 figures, submitted to ALT (Algorithmic Learning Theory)
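
For readers unfamiliar with the policy named in this abstract, the following is a minimal sketch of Thompson Sampling for Bernoulli rewards. It is not taken from the paper: the uniform Beta(1, 1) prior, the function name `thompson_sampling`, and the simulated arm means are all assumptions made for illustration.

```python
import random


def thompson_sampling(true_means, horizon, seed=0):
    """Simulate Beta-Bernoulli Thompson Sampling.

    Returns (pull counts per arm, cumulative expected regret).
    true_means are hypothetical arm success probabilities used only
    to simulate rewards; the policy itself never reads them.
    """
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k  # number of rewards equal to 1, per arm
    failures = [0] * k   # number of rewards equal to 0, per arm
    pulls = [0] * k
    best = max(true_means)
    regret = 0.0
    for _ in range(horizon):
        # Draw one sample per arm from its Beta(1 + s, 1 + f) posterior
        # (assumed uniform prior) and play the arm with the largest draw.
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a]) for a in range(k)
        ]
        arm = max(range(k), key=samples.__getitem__)
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
        regret += best - true_means[arm]
    return pulls, regret
```

Calling `thompson_sampling([0.2, 0.8], 2000)` returns the number of pulls of each arm and the accumulated expected regret, which can be compared against other policies in the way the abstract describes.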

arXiv.org e-Print Archive

HAL - Lille 3

Crossref

INRIA a CCSD electronic archive server

The Innermost Ejecta of Core Collapse Supernovae

Author: Beloborodov
Buras
C. Fröhlich
E. Bravo
F.-K. Thielemann
Fröhlich
G. Martínez-Pinedo
Janka
Liebendörfer
M. Liebendörfer
Mezzacappa
N.T. Zinner
Nomoto
P. Hauser
Qian
Rampp
Thielemann
Thompson
W.R. Hix
Woosley
Publication venue: 'Elsevier BV'
Publication date: 01/01/2004

We ensure successful explosions (of otherwise non-explosive models) by enhancing the neutrino luminosity via reducing the neutrino scattering cross sections or by increasing the heating efficiency via enhancing the neutrino absorption cross sections in the heating region. Our investigations show that the resulting electron fraction Ye in the innermost ejecta is close to 0.5, in some areas even exceeding 0.5. We present the effects of the resulting values for Ye on the nucleosynthesis yields of the innermost zones of core collapse supernovae. Comment: 4 pages, 2 figures; contribution to Nuclei In The Cosmos VIII, to appear in Nucl. Phys.

arXiv.org e-Print Archive

CiteSeerX

Crossref

UPCommons. Portal del coneixement obert de la UPC

CERN Document Server

What makes medical students better listeners?

Author: Clarke S.
De Meo R.
Knebel J.F.
Matusz P.J.
Murray M.M.
Thompson W.R.
Publication venue: 'Elsevier BV'
Publication date: 01/07/2016

Diagnosing heart conditions by auscultation is an important clinical skill commonly learnt by medical students. Clinical proficiency for this skill is in decline [1], and new teaching methods are needed. Successful discrimination of heartbeat sounds is believed to benefit mainly from acoustical training [2]. From recent studies of auditory training [3,4] we hypothesized that semantic representations outside the auditory cortex contribute to diagnostic accuracy in cardiac auscultation. To test this hypothesis, we analysed auditory evoked potentials (AEPs) which were recorded from medical students while they diagnosed quadruplets of heartbeat cycles. The comparison of trials with correct (Hits) versus incorrect diagnosis (Misses) revealed a significant difference in brain activity at 280-310 ms after the onset of the second cycle within the left middle frontal gyrus (MFG) and the right prefrontal cortex. This timing and locus suggest that semantic rather than acoustic representations contribute critically to auscultation skills. Thus, teaching auscultation should emphasize the link between the heartbeat sound and its meaning. Beyond cardiac auscultation, this issue is of interest for all fields where subtle but complex perceptual differences identify items in a well-known semantic context.

Crossref

Serveur académique lausannois

A two-armed bandit based scheme for accelerated decentralized learning

Author: K.S. Narendra
M.L. Tsetlin
O.C. Granmo
S.L. Scott
W.R. Thompson
Y.U. Cao
Publication venue: Springer Berlin / Heidelberg
Publication date: 01/01/2011

The two-armed bandit problem is a classical optimization problem where a decision maker sequentially pulls one of two arms attached to a gambling machine, with each pull resulting in a random reward. The reward distributions are unknown, and thus, one must balance between exploiting existing knowledge about the arms, and obtaining new information. Bandit problems are particularly fascinating because a large class of real world problems, including routing, QoS control, game playing, and resource allocation, can be solved in a decentralized manner when modeled as a system of interacting gambling machines. Although computationally intractable in many cases, Bayesian methods provide a standard for optimal decision making. This paper proposes a novel scheme for decentralized decision making based on the Goore Game in which each decision maker is inherently Bayesian in nature, yet avoids computational intractability by relying simply on updating the hyper parameters of sibling conjugate priors, and on random sampling from these posteriors. We further report theoretical results on the variance of the random rewards experienced by each individual decision maker. Based on these theoretical results, each decision maker is able to accelerate its own learning by taking advantage of the increasingly more reliable feedback that is obtained as exploration gradually turns into exploitation in bandit problem based learning. Extensive experiments demonstrate that the accelerated learning allows us to combine the benefits of conservative learning, which is high accuracy, with the benefits of hurried learning, which is fast convergence. In this manner, our scheme outperforms recently proposed Goore Game solution schemes, where one has to trade off accuracy with speed. We thus believe that our methodology opens avenues for improved performance in a number of applications of bandit based decentralized decision making.

Crossref

NORA - Norwegian Open Research Archives

Agder University Research Archive

The Neutrino Signal in Stellar Core Collapse and Postbounce Evolution

Author: A. Mezzacappa
Bethe
Bruenn
Burrows
F.-K. Thielemann
Fryer
G. Martínez-Pinedo
Herant
Horowitz
Janka
Lattimer
Liebendörfer
M. Liebendörfer
MacFadyen
Messer
Mezzacappa
Myra
Nomoto
O.E.B. Messer
Qian
Rampp
Reddy
Thompson
W.R. Hix
Wheeler
Wilson
Woosley
Publication venue: 'Elsevier BV'
Publication date: 14/11/2002

General relativistic multi-group and multi-flavor Boltzmann neutrino transport in spherical symmetry adds a new level of detail to the numerical bridge between microscopic nuclear and weak interaction physics and the macroscopic evolution of the astrophysical object. Although no supernova explosions are obtained, we investigate the neutrino luminosities in various phases of the postbounce evolution for a wide range of progenitor stars between 13 and 40 solar masses. The signal probes the dynamics of material layered in and around the protoneutron star and is, within narrow limits, sensitive to improvements in the weak interaction physics. Only changes that dramatically exceed physical limitations allow experiments with exploding models. We discuss the differences in the neutrino signal and find the electron fraction in the innermost ejecta to exceed 0.5 as a consequence of thermal balance and weak equilibrium at the masscut. Comment: 8 pages, 4 figures. Proceedings of the Nuclear Physics in Astrophysics Conference, Debrecen, Hungary, 2002, to appear in Nuc. Phys. A. Color figures added and references actualized

arXiv.org e-Print Archive

Crossref

CERN Document Server

Solving Non-Stationary Bandit Problems by Random Sampling from Sibling Kalman Filters

Author: C. Dimitrakakis
J. Vermorel
K.S. Narendra
O.C. Granmo
P. Auer
R. Dearden
R.S. Sutton
S. Russell
T.M. Mitchell
W.R. Thompson
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2010

The multi-armed bandit problem is a classical optimization problem where an agent sequentially pulls one of multiple arms attached to a gambling machine, with each pull resulting in a random reward. The reward distributions are unknown, and thus, one must balance between exploiting existing knowledge about the arms, and obtaining new information. Dynamically changing (non-stationary) bandit problems are particularly challenging because each change of the reward distributions may progressively degrade the performance of any fixed strategy. Although computationally intractable in many cases, Bayesian methods provide a standard for optimal decision making. This paper proposes a novel solution scheme for bandit problems with non-stationary normally distributed rewards. The scheme is inherently Bayesian in nature, yet avoids computational intractability by relying simply on updating the hyper parameters of sibling Kalman Filters, and on random sampling from these posteriors. Furthermore, it is able to track the better actions, thus supporting non-stationary bandit problems. Extensive experiments demonstrate that our scheme outperforms recently proposed bandit playing algorithms, not only in non-stationary environments, but in stationary environments also. Furthermore, our scheme is robust to inexact parameter settings. We thus believe that our methodology opens avenues for obtaining improved novel solutions.
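
The idea of one Kalman filter per arm combined with posterior sampling can be sketched as below. This is only an illustration under stated assumptions, not the authors' published update rules: it assumes each arm's mean reward follows a Gaussian random walk with variance `transition_var` and is observed with Gaussian noise of variance `observation_var`. The class `KalmanArm`, the function `choose_arm`, and the prior values are hypothetical names and defaults.

```python
import math
import random


class KalmanArm:
    """Scalar Kalman filter tracking one arm's drifting mean reward."""

    def __init__(self, prior_mean=0.0, prior_var=100.0):
        self.mean = prior_mean  # posterior mean of the arm's reward
        self.var = prior_var    # posterior variance (uncertainty)

    def predict(self, transition_var):
        # Assumed random-walk drift: uncertainty grows at every time
        # step, whether or not the arm was pulled.
        self.var += transition_var

    def sample(self, rng):
        # Draw a plausible mean reward from the Gaussian posterior.
        return rng.gauss(self.mean, math.sqrt(self.var))

    def update(self, reward, observation_var):
        # Standard scalar Kalman update for the arm that was pulled.
        gain = self.var / (self.var + observation_var)
        self.mean += gain * (reward - self.mean)
        self.var *= 1.0 - gain


def choose_arm(arms, rng):
    """Sample once from every arm's posterior and pick the largest draw."""
    samples = [arm.sample(rng) for arm in arms]
    return max(range(len(arms)), key=samples.__getitem__)
```

In a full loop one would call `predict` on every arm each round, pick an arm with `choose_arm`, and call `update` only on the arm that was pulled. Because unpulled arms keep gaining variance, they are eventually sampled again, which is what lets a scheme of this kind track changing rewards.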

Crossref

NORA - Norwegian Open Research Archives

Agder University Research Archive

Discretized Bayesian pursuit – A new scheme for reinforcement learning

Author: B.J. Oommen
C. Unsal
K.S. Narendra
O. Granmo
S. Lakshmivarahan
T. Dean
W.R. Thompson
X. Zhang
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2012

The success of Learning Automata (LA)-based estimator algorithms over the classical, Linear Reward-Inaction (L_RI)-like schemes, can be explained by their ability to pursue the actions with the highest reward probability estimates. Without access to reward probability estimates, it makes sense for schemes like the L_RI to first make large exploring steps, and then to gradually turn exploration into exploitation by making progressively smaller learning steps. However, this behavior becomes counter-intuitive when pursuing actions based on their estimated reward probabilities. Learning should then ideally proceed in progressively larger steps, as the reward probability estimates turn more accurate. This paper introduces a new estimator algorithm, the Discretized Bayesian Pursuit Algorithm (DBPA), that achieves this. The DBPA is implemented by linearly discretizing the action probability space of the Bayesian Pursuit Algorithm (BPA) [1]. The key innovation is that the linear discrete updating rules mitigate the counter-intuitive behavior of the corresponding linear continuous updating rules, by augmenting them with the reward probability estimates. Extensive experimental results show the superiority of DBPA over previous estimator algorithms. Indeed, the DBPA is probably the fastest reported LA to date.

Crossref

Carleton University's Institutional Repository

NORA - Norwegian Open Research Archives

Agder University Research Archive

Differential iron requirements for osteoblast and adipocyte differentiation

Author: Atkins G.J.
Clinkenbeard E.L.
Edwards D.F.
Miller C.J.
Prideaux M.
Quintana‐Martinez A.
Thompson W.R.
Wright C.S.
Publication venue: 'Wiley'
Publication date: 01/01/2021

Bone marrow mesenchymal progenitor cells are precursors for various cell types including osteoblasts, adipocytes, and chondrocytes. The external environment and signals act to direct the pathway of differentiation. Importantly, situations such as aging and chronic kidney disease display alterations in the balance of osteoblast and adipocyte differentiation, adversely affecting bone integrity. Iron deficiency, which can often occur during aging and chronic kidney disease, is associated with reduced bone density. The purpose of this study was to assess the effects of iron deficiency on the capacity of progenitor cell differentiation pathways. Mouse and human progenitor cells, differentiated under standard osteoblast and adipocyte protocols in the presence of the iron chelator deferoxamine (DFO), were used. Under osteogenic conditions, 5 μM DFO significantly impaired expression of critical osteoblast genes, including osteocalcin, type 1 collagen, and dentin matrix protein 1. This led to a reduction in alkaline phosphatase activity and impaired mineralization. Despite prolonged exposure to chronic iron deficiency, cells retained viability as well as normal hypoxic responses with significant increases in transferrin receptor and protein accumulation of hypoxia inducible factor 1α. Similar concentrations of DFO were used when cells were maintained in adipogenic conditions. In contrast to osteoblast differentiation, DFO modestly suppressed adipocyte gene expression of peroxisome-proliferating activated receptor gamma, lipoprotein lipase, and adiponectin at earlier time points with normalization at later stages. Lipid accumulation was also similar in all conditions. These data suggest the critical importance of iron in osteoblast differentiation, and as long as the external stimuli are present, iron deficiency does not impede adipogenesis. © 2021 The Authors. JBMR Plus published by Wiley Periodicals LLC on behalf of American Society for Bone and Mineral Research. Daniel F. Edwards III, Christopher J. Miller, Arelis Quintana-Martinez, Christian S. Wright, Matthew Prideaux, Gerald J. Atkins, William R. Thompson, and Erica L. Clinkenbeard

Adelaide Research & Scholarship

Vortex dynamics and states of artificially layered superconducting films with correlated defects

Linear resistances and $IV$-characteristics have been measured over a wide range in the parameter space of the mixed phase of multilayered a-TaGe/Ge films. Three films with varying interlayer coupling and correlated defects oriented at an angle $\approx 25$ from the film normal were investigated. Experimental data were analyzed within vortex glass models and a second order phase transition from a resistive vortex liquid to a pinned glass phase. Various vortex phases including changes from three to two dimensional behavior depending on anisotropy have been identified. Careful analysis of $IV$-characteristics in the glass phases revealed a distinctive $T$ and $H$-dependence of the glass exponent $\mu$. The vortex dynamics in the Bose-glass phase does not follow the predicted behavior for excitations of vortex kinks or loops. Comment: 16 pages, 10 figures, 3 tables

arXiv.org e-Print Archive

Institute of Transport Research:Publications

Crossref

Nucleosynthesis in Neutrino-Driven Supernovae

Author: Aufderheide
Beloborodov
Bruenn
Buras
C. Fröhlich
Cayrel
E. Bravo
F.-K. Thielemann
Fröhlich
G. Martínez-Pinedo
Gratton
Herant
Hoffman
Janka
K. Langanke
Liebendörfer
Limongi
M. Liebendörfer
McLaughlin
N.T. Zinner
Nagataki
Pruet
Rampp
Rauscher
Thielemann
Thompson
Travaglio
Trimble
W.R. Hix
Woosley
Publication venue: 'Elsevier BV'
Publication date: 18/11/2005

Core collapse supernovae are the leading actor in the story of the cosmic origin of the chemical elements. Existing models, which generally assume spherical symmetry and parameterize the explosion, have been able to broadly replicate the observed elemental pattern. However, inclusion of neutrino interactions produces noticeable improvement in the composition of the ejecta when compared to observations. Neutrino interactions may also provide a supernova source for light p-process nuclei. Comment: 7 pages, 2 figures, in proceedings of Astronomy with Radioactivities V, Clemson University, September 5-9, 2005, to appear in New Astronomy Reviews

arXiv.org e-Print Archive

Crossref

UPCommons. Portal del coneixement obert de la UPC

edoc

CERN Document Server

GSI Repository