38 research outputs found

    Steps toward self-aware networks


    Feed-Forward Learning: Fast Reinforcement Learning of Controllers


    Unified Inter and Intra Options Learning Using Policy Gradient Methods

    Abstract. Temporally extended actions (or macro-actions) have proven useful for speeding up planning and learning, adding robustness, and building prior knowledge into AI systems. The options framework, as introduced in Sutton, Precup and Singh (1999), provides a natural way to incorporate macro-actions into reinforcement learning. In the subgoals approach, learning is divided into two phases: first learning each option with a prescribed subgoal, and then learning to compose the learned options. In this paper we offer a unified framework for concurrent inter- and intra-option learning. To that end, we propose a modular parameterization of intra-option policies together with option termination conditions and the option selection policy (inter options), and show that these three decision components may be viewed as a unified policy over an augmented state-action space, to which standard policy gradient algorithms may be applied. We identify the basis functions that apply to each of these decision components, and show that they possess a useful orthogonality property that allows the natural gradient to be computed independently for each component. We further outline the extension of the suggested framework to several levels of options hierarchy, and conclude with a brief illustrative example.
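
    The sketch below, in Python, illustrates the core idea of the abstract: option selection (inter-option), intra-option action choice, and option termination are treated as one policy over an augmented state-option space and trained with a plain REINFORCE-style gradient. It is not the paper's implementation: the toy chain environment, tabular softmax/sigmoid parameterizations, reward values, and learning rate are illustrative assumptions, and the paper works with general basis functions and natural gradients rather than the vanilla gradient used here.

        import numpy as np

        rng = np.random.default_rng(0)
        n_states, n_actions, n_options = 5, 2, 2

        # Three parameter blocks, one per decision component.  Their score
        # functions touch disjoint parameters, which is what allows each
        # component's gradient to be handled separately.
        theta_mu   = np.zeros((n_states, n_options))             # option selection (inter)
        theta_pi   = np.zeros((n_options, n_states, n_actions))  # intra-option policies
        theta_beta = np.zeros((n_options, n_states))             # termination conditions

        def softmax(x):
            z = np.exp(x - x.max())
            return z / z.sum()

        # Toy chain MDP (assumed): action 1 moves right; reaching the last state ends the episode.
        def env_reset():
            return 0

        def env_step(s, a):
            s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            done = (s_next == n_states - 1)
            return s_next, (1.0 if done else -0.01), done

        def sample_episode(horizon=50):
            """Roll out the augmented policy, accumulating score-function terms."""
            s = env_reset()
            p_o = softmax(theta_mu[s])
            o = rng.choice(n_options, p=p_o)
            g = {"mu": np.zeros_like(theta_mu),
                 "pi": np.zeros_like(theta_pi),
                 "beta": np.zeros_like(theta_beta)}
            g["mu"][s] -= p_o
            g["mu"][s, o] += 1.0
            ret = 0.0
            for _ in range(horizon):
                p_a = softmax(theta_pi[o, s])                    # intra-option action choice
                a = rng.choice(n_actions, p=p_a)
                g["pi"][o, s] -= p_a
                g["pi"][o, s, a] += 1.0
                s, r, done = env_step(s, a)
                ret += r
                if done:
                    break
                p_term = 1.0 / (1.0 + np.exp(-theta_beta[o, s])) # termination decision
                terminate = rng.random() < p_term
                g["beta"][o, s] += (1.0 - p_term) if terminate else -p_term
                if terminate:                                    # pick a new option (inter)
                    p_o = softmax(theta_mu[s])
                    o = rng.choice(n_options, p=p_o)
                    g["mu"][s] -= p_o
                    g["mu"][s, o] += 1.0
            return ret, g

        def train(episodes=300, lr=0.05):
            global theta_mu, theta_pi, theta_beta
            for _ in range(episodes):
                ret, g = sample_episode()
                theta_mu   += lr * ret * g["mu"]
                theta_pi   += lr * ret * g["pi"]
                theta_beta += lr * ret * g["beta"]

        train()
        print(softmax(theta_mu[0]))   # learned option preferences in the start state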

    An optimal stopping strategy for online calibration in local search

    This paper formalizes the problem of choosing online the number of explorations in a local search algorithm as a last-success problem. In this family of stochastic problems the events of interest belong to two categories (success or failure) and the objective is to predict when the last success will take place. The application to a local search setting is immediate if we identify a success with the detection of a new local optimum. Being able to predict when the last optimum will be found yields a computational gain by reducing the number of iterations carried out in the neighborhood of the current solution. The paper proposes a new algorithm for online calibration of the number of iterations during exploration and assesses it on a set of continuous optimisation tasks.
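
    As a point of reference for the last-success formulation, the sketch below implements the classical odds-algorithm stopping rule (Bruss' odds theorem): sum the odds p_k/(1-p_k) backwards from the final iteration until they first reach 1, then stop at the first success observed at or after that index. The per-iteration success probabilities, the neighbor/cost functions, and the local-search wrapper are illustrative assumptions; the paper's own online calibration procedure may differ.

        def odds_threshold(p):
            """1-based index s: accumulate odds p_k/(1-p_k) backwards until the sum reaches 1."""
            total, s = 0.0, 1
            for k in range(len(p), 0, -1):
                pk = p[k - 1]
                total += pk / (1.0 - pk)
                if total >= 1.0:
                    s = k
                    break
            return s

        def explore_with_stopping(neighbor_fn, cost_fn, x0, p, rng):
            """Neighborhood exploration that stops at the first improvement found
            at or beyond the odds threshold (a proxy for the last new optimum)."""
            s = odds_threshold(p)
            best, best_cost = x0, cost_fn(x0)
            for k in range(1, len(p) + 1):
                cand = neighbor_fn(best, rng)
                c = cost_fn(cand)
                if c < best_cost:                 # "success": a new local optimum
                    best, best_cost = cand, c
                    if k >= s:                    # first success past the threshold: stop
                        break
            return best, best_cost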

    TCP Modification Robust to Packet Reordering in Ant Routing Networks


    Recursive Least-Squares Learning with Eligibility Traces

    In the framework of Markov Decision Processes, we consider the problem of learning a linear approximation of the value function of some fixed policy from one trajectory, possibly generated by some other policy. We describe a systematic approach for adapting the on-policy least-squares learning algorithms of the literature (LSTD [5], LSPE [15], FPKF [7] and GPTD [8]/KTD [10]) to off-policy learning with eligibility traces. This leads to two known algorithms, LSTD(λ) and LSPE(λ) [21], and suggests new extensions of FPKF and GPTD/KTD. We describe their recursive implementation, discuss their convergence properties, and illustrate their behavior experimentally. Overall, our study suggests that the state-of-the-art LSTD(λ) [21] remains the best least-squares algorithm.
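
    A minimal sketch of what a recursive implementation of off-policy LSTD(λ) can look like: the running matrix inverse is maintained with a Sherman-Morrison update and the eligibility trace is scaled by the importance-sampling ratio. The trace-update convention, the initialization constant, and the default hyperparameters below are assumptions for illustration; the paper's exact formulation may differ.

        import numpy as np

        class RecursiveLSTDLambda:
            def __init__(self, n_features, lam=0.9, gamma=0.99, epsilon=1e-2):
                self.lam, self.gamma = lam, gamma
                self.C = np.eye(n_features) / epsilon   # running inverse of the A matrix
                self.b = np.zeros(n_features)
                self.z = np.zeros(n_features)           # eligibility trace

            def update(self, phi, reward, phi_next, rho=1.0):
                """One transition: features phi -> phi_next, importance ratio rho."""
                self.z = rho * (self.gamma * self.lam * self.z + phi)
                d = phi - self.gamma * phi_next         # TD feature difference
                Cz = self.C @ self.z
                denom = 1.0 + d @ Cz
                self.C -= np.outer(Cz, d @ self.C) / denom   # Sherman-Morrison step
                self.b += self.z * reward

            def theta(self):
                """Current weight vector of the linear value approximation."""
                return self.C @ self.b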

    Tracking in reinforcement learning

    Abstract. Reinforcement learning induces non-stationarity at several levels. Adaptation to non-stationary environments is of course a desirable feature of any practical RL algorithm. Yet, even if the environment of the learning agent can be considered stationary, generalized policy iteration frameworks, because of the interleaving of learning and control, produce non-stationarity of the evaluated policy and thus of its value function. Tracking the optimal solution instead of trying to converge to it is therefore preferable. In this paper, we propose to handle this tracking issue with a Kalman-based temporal difference framework. Complexity and convergence analyses are studied. Empirical investigations of its ability to handle non-stationarity are finally provided.
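
    The sketch below illustrates the tracking idea in the simplest linear setting: the value-function weights are modeled as a hidden state following a random walk, and each reward is a noisy linear observation through the TD feature difference, so a standard Kalman filter keeps tracking the weights as they drift instead of converging to a fixed point. The paper's Kalman-based TD framework is more general; the observation model and noise levels here are illustrative assumptions.

        import numpy as np

        class KalmanTD:
            def __init__(self, n_features, gamma=0.95, process_var=1e-4, obs_var=1.0):
                self.gamma = gamma
                self.theta = np.zeros(n_features)            # weight estimate
                self.P = np.eye(n_features)                  # weight covariance
                self.Q = process_var * np.eye(n_features)    # random-walk (tracking) noise
                self.R = obs_var                             # observation noise

            def update(self, phi, reward, phi_next):
                # prediction step: random-walk evolution of the weights
                self.P = self.P + self.Q
                # observation model: reward ~ h^T theta + noise, h the TD feature difference
                h = phi - self.gamma * phi_next
                innovation = reward - h @ self.theta
                s = h @ self.P @ h + self.R                  # innovation variance
                k = self.P @ h / s                           # Kalman gain
                self.theta = self.theta + k * innovation
                self.P = self.P - np.outer(k, h @ self.P)
                return innovation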

    Rural township of Toodyay, Western Australia, February 1919 /

    Title devised by cataloguer from accompanying information. Part of the collection: Michael Terry collection of negatives of his expeditions and travels, 1918-1971. Condition: spotting. Also available as a photograph: PIC Album 367. Also available online at: http://nla.gov.au/nla.pic-vn6248152

    Decision-theoretic control of planetary rovers

    Planetary rovers are small unmanned vehicles equipped with cameras and a variety of sensors used for scientific experiments. They must operate under tight constraints on resources such as operation time, power, storage capacity, and communication bandwidth. Moreover, the rover's limited computational resources restrict the complexity of on-line planning and scheduling. We describe two decision-theoretic approaches to maximizing the productivity of planetary rovers: one based on adaptive planning and the other on hierarchical reinforcement learning.
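
    As a toy illustration of decision-theoretic control under a resource budget (not the approaches developed in the paper), the sketch below solves a tiny MDP whose state is the remaining energy: each candidate activity has an assumed success probability, science reward, and energy cost, and value iteration yields the activity selection that maximizes expected science return.

        import numpy as np

        ENERGY_LEVELS = 11                     # 0..10 units of remaining energy (assumed)
        # name -> (success probability, science reward on success, energy cost); all illustrative
        TASKS = {"image": (0.95, 1.0, 1),
                 "spectrometry": (0.8, 3.0, 3),
                 "drive_and_sample": (0.6, 6.0, 5)}

        def value_iteration(tol=1e-9):
            """Expected science return achievable from each energy level."""
            V = np.zeros(ENERGY_LEVELS)
            while True:
                V_new = np.zeros_like(V)
                for e in range(ENERGY_LEVELS):
                    best = 0.0                              # idling ends the plan
                    for p, reward, cost in TASKS.values():
                        if cost <= e:                       # energy is spent whether or not the task succeeds
                            best = max(best, p * reward + V[e - cost])
                    V_new[e] = best
                if np.max(np.abs(V_new - V)) < tol:
                    return V_new
                V = V_new

        def best_task(V, e):
            """Greedy activity choice given the value of the remaining energy."""
            options = {name: p * r + V[e - c]
                       for name, (p, r, c) in TASKS.items() if c <= e}
            return max(options, key=options.get) if options else "idle"

        V = value_iteration()
        print(best_task(V, 10), V[10])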