38 research outputs found

Parallel reward and punishment control in humans and robots: Safe reinforcement learning using the MaxPain algorithm

Author: Elfwing S
Seymour B
Publication venue: 7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017
Publication date: 01/01/2017

An important issue in reinforcement learning systems for autonomous agents is whether it makes sense to have separate systems for predicting rewards and punishments. In robotics, learning and control are typically achieved by a single controller, with punishments coded as negative rewards. However, in biological systems, some evidence suggests that the brain has a separate system for punishment. Although this may in part be due to biological constraints of implementing negative quantities, it raises the question as to whether there is any computational rationale for keeping reward and punishment prediction operationally distinct. Here we outline a basic argument supporting this idea, based on the proposition that learning best-case predictions (as in Q-learning) does not always achieve the safest behaviour. We introduce a modified RL scheme involving a new algorithm, which we call 'MaxPain', which backs up worst-case predictions in parallel and then scales the two predictions in a multi-attribute RL policy, i.e. independently learning 'what to do' as well as 'what not to do' and then combining this information. We show how this scheme can improve performance in benchmark RL environments, including a grid-world experiment and a delayed version of the mountain car experiment. In particular, we demonstrate how early exploration and learning are substantially improved, leading to much 'safer' behaviour. In conclusion, the results illustrate the importance of independent punishment prediction in RL, and provide a testable framework for better understanding punishment (such as pain) and avoidance in humans, in both health and disease.

Crossref

Apollo (Cambridge)

Evidence for a bimodal distribution of Escherichia coli doubling times below a threshold initial cell concentration

Author: A Elfwing
A Metris
C-Y Chen
Chin-Yi Chen
E Kussell
F Balagadde
G Niven
George C Paoli
J Brewster
L Guillier
Ly-Huong T Nguyen
N Balaban
P Irwin
Peter L Irwin
S Lopez
T Oscar
V Valiunas
Z Kutalik
Publication venue: BioMed Central
Publication date: 01/08/2010

Background: In the process of developing a microplate-based growth assay, we discovered that our test organism, a native E. coli isolate, displayed very uniform doubling times (τ) only up to a certain threshold cell density. Below this cell concentration (≤ 100-1,000 CFU mL⁻¹; ≤ 27-270 CFU well⁻¹) we observed an obvious increase in the τ scatter. Results: Working with a food-borne E. coli isolate we found that τ values derived from two different microtiter platereader-based techniques (i.e., optical density with growth time {=OD[t]} fit to the sigmoidal Boltzmann equation or time to calculated 1/2-maximal OD {=tm} as a function of initial cell density {=tm[CI]}) were in excellent agreement with the same parameter acquired from total aerobic plate counting. Thus, using either Luria-Bertani (LB) or defined (MM) media at 37°C, τ ranged between 17-18 (LB) or 51-54 (MM) min. Making use of such OD[t] data we collected many observations of τ as a function of manifold initial or starting cell concentrations (CI). We noticed that τ appeared to be distributed in two populations (bimodal) at low CI. When CI ≤ 100 CFU mL⁻¹ (stationary phase cells in LB), we found that about 48% of the observed τ values were normally distributed around a mean (μτ1) of 18 ± 0.68 min (± στ1) and 52% with μτ2 = 20 ± 2.5 min (n = 479). However, at higher starting cell densities (CI > 100 CFU mL⁻¹), the τ values were distributed unimodally (μτ = 18 ± 0.71 min; n = 174). Inclusion of a small amount of ethyl acetate to the LB caused a collapse of the bimodal to a unimodal form. Comparable bimodal τ distribution results were also observed using E. coli cells diluted from mid-log phase cultures. Similar results were also obtained when using either an E. coli O157:H7 or a Citrobacter strain. When sterile-filtered LB supernatants, which formerly contained relatively low concentrations of bacteria (1,000-10,000 CFU mL⁻¹), were employed as a diluent, there was an evident shift of the two populations towards each other but the bimodal effect was still apparent using either stationary or log phase cells. Conclusion: These data argue that there is a dependence of growth rate on starting cell density.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Modelling interactions of acid–base balance and respiratory status in the toxicity of metal mixtures in the American oyster Crassostrea virginica

Author: A. Fred Holland
Almeida
Annalaura Mancia
Bishop
Booth
Brett M. Macey
Burnett
Cannon
Carpenter
Chapman
Charles Cunningham
Dailianis
Dankbar
Dorta
Dovzhenko
Elfwing
Engel
Erin J. Burge
Flipič
Geret
Gregory W. Warr
Hashem
Heidi R. Williams
Jenny
Jonas S. Almeida
Karen G. Burnett
Khan
Linder
Lindy K. Thibodeaux
Louis Burnett
Macey
Marigómez
Marion Beal
Massabuau
Matthew J. Jenny
Paul S. Gross
Quig
Ringwood
Robert W. Chapman
Roesijadi
Roméo
Sanger
Sexton
Sokolova
Sonomi Hikima
Stohs
Valko
Viarengo
Wang
Yang
Publication venue: 'Elsevier BV'
Publication date: 12/11/2009

Author Posting. © The Author(s), 2009. This is the author's version of the work. It is posted here by permission of Elsevier B.V. for personal use, not for redistribution. The definitive version was published in Comparative Biochemistry and Physiology - Part A: Molecular & Integrative Physiology 155 (2010): 341-349, doi:10.1016/j.cbpa.2009.11.019. Heavy metals, such as copper, zinc and cadmium, represent some of the most common and serious pollutants in coastal estuaries. In the present study, we used a combination of linear and artificial neural network (ANN) modelling to detect and explore interactions among low-dose mixtures of these heavy metals and their impacts on fundamental physiological processes in tissues of the Eastern oyster, Crassostrea virginica. Animals were exposed to Cd (0.001 – 0.400 μM), Zn (0.001 – 3.059 μM) or Cu (0.002 – 0.787 μM), either alone or in combination for 1 to 27 days. We measured indicators of acid-base balance (hemolymph pH and total CO2), gas exchange (Po2), immunocompetence (total hemocyte counts, numbers of invasive bacteria), antioxidant status (glutathione, GSH), oxidative damage (lipid peroxidation; LPx), and metal accumulation in the gill and the hepatopancreas. Linear analysis showed that oxidative membrane damage from tissue accumulation of environmental metals was correlated with impaired acid-base balance in oysters. ANN analysis revealed interactions of metals with hemolymph acid-base chemistry in predicting oxidative damage that were not evident from linear analyses. These results highlight the usefulness of machine learning approaches, such as ANNs, for improving our ability to recognize and understand the effects of sub-acute exposure to contaminant mixtures. This study was supported by NOAA’s Center of Excellence in Oceans and Human Health at HML and the National Science Foundation.

Crossref

Woods Hole Open Access Server

PubMed Central

Archivio istituzionale della ricerca - Università di Ferrara

The behaviour of giant clams (Bivalvia: Cardiidae: Tridacninae)

Author: A Comfort
AC Alcala
AD Ansell
AM Hart
AS Othman
AS-H Tan
AYM Lin
B Morton
BL Bayne
BP Lyons
C Romano
C Wabnitz
CA Richardson
CC Shelley
CM Crawford
CM Yonge
CP Raven
CR Stasek
CS Rogers
D Huang
D Petersen
DF McMichael
DJW Lane
DL Waller
DW Klumpp
E Blidberg
E Hviding
EC Peters
ED Gomez
FJ Hester
FT Te
G Accordi
G Thorson
GA Heslinga
H Ling
HJ Cranfield
HP Calumpong
I Svane
ID Bell
J Gwyther
J Krause
J Rosewater
JA Downing
JA Idjadi
JC Lang
JE Morton
JH Norton
JL Culliney
JR Guest
JS Lucas
JT Hardy
K Bonham
K Fujikura
K Vicentuan-Cabaitan
KR Chan
L Kirkendale
L Vaillant
LA Wilkens
LC Lund-Hansen
LL Hollingsworth
LS Peck
M LaBarbera
M Yamaguchi
MC Hartman
MC Watzin
MD Bertness
MF Land
ML Neo
MR Carriker
N Beckvar
NM Husin
O Reimer
P Dumas
P Soo
PA Todd
PA Zahl
Pamela Soo
PC Cabaitan
PE Munro
Peter A. Todd
PV Fankboner
PW Glynn
R Bustard
R Marois
R Seed
RD Braley
RD Purchon
RGB Reid
RJ Toonen
RK Trench
RW Hickman
S Ellis
S Kawaguti
SC Jameson
SK Malham
SK Wada
SN Alcazar
SR Rodríguez
SS Mingoa-Licuanan
T Elfwing
T Huelsken
THJ Gilmour
TJH Adams
VL Loosanoff
W Eckman
WK Fitt
Y Su
Y Suzuki
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Giant clams, the largest living bivalves, live in close association with coral reefs throughout the Indo-Pacific. These iconic invertebrates perform numerous important ecological roles as well as serve as flagship species—drawing attention to the ongoing destruction of coral reefs and their associated biodiversity. To date, no review of giant clams has focussed on their behaviour, yet this component of their autecology is critical to their life history and hence conservation. Almost 100 articles published between 1865 and 2014 include behavioural observations, and these have been collated and synthesised into five sections: spawning, locomotion, feeding, anti-predation, and stress responses. Even though the exact cues for spawning in the wild have yet to be elucidated, giant clams appear to display diel and lunar periodicities in reproduction, and for some species, peak breeding seasons have been established. Perhaps surprisingly, giant clams have considerable mobility, ranging from swimming and gliding as larvae to crawling in juveniles and adults. Chemotaxis and geotaxis have been established, but giant clams are not phototactic. At least one species exhibits clumping behaviour, which may enhance physical stabilisation, facilitate reproduction, or provide protection from predators. Giant clams undergo several shifts in their mode of acquiring nutrition; starting with a lecithotrophic and planktotrophic diet as larvae, switching to pedal feeding after metamorphosis followed by the transition to a dual mode of filter feeding and phototrophy once symbiosis with zooxanthellae (Symbiodinium spp.) is established. Because of their shell weight and/or byssal attachment, adult giant clams are unable to escape rapidly from threats using locomotion. Instead, they exhibit a suite of visually mediated anti-predation behaviours that include sudden contraction of the mantle, valve adduction, and squirting of water. 
Knowledge on the behaviour of giant clams will benefit conservation and restocking efforts and help fine-tune mariculture techniques. Understanding the repertoire of giant clam behaviours will also facilitate the prediction of threshold levels for sustainable exploitation as well as recovery rates of depleted clam populations.

Crossref

Springer - Publisher Connector

PubMed Central

ScholarBank@NUS

Cardio-respiratory development in bird embryos: new insights from a venerable animal model

Author: Acosta E.
Adair T. H.
Adkins-Regan E.
Albers P.
Albers P. H.
Alexander B. T.
Altimiras J.
Andrewartha S.
Asson-Batres M. A.
Azzam M. A.
Baccarelli A.
Barker D. J.
Barott H. G.
Bellairs R.
Bergmann R. L.
Bertossi M.
Bjorklund D. F.
Bjornstad S.
Blacker H. A.
Blom J.
Boehm C.
Borkowf A.
Brand M.
Brand Z.
Branum S. R.
Broekhuizen M. L.
Bruggeman V.
Burggren W. W.
Burness G.
Burton G. J.
Buzala M.
Canga L.
Chan T.
Chiba Y.
Chin E. H.
Chiossi G.
Cohen E.
Collin A.
Collotta M.
Copeland J.
Corona T. B.
Crews D.
Criscuolo F.
Crossley D. A
Crossley D. A.
Crossley D. A. II.
Danchin E.
Datar S.
Deans C.
Domyan E. T.
Druyan S
Druyan S.
Duncan E. J.
Dunn B. E.
Dzialowski E. M.
Elfwing M.
Ellis H. L.
Everaert N.
Felsenfeld G.
Fernandez-Twinn D. S.
Ferner K.
Filas B. A.
Finkler M. S.
Fisher S. A.
Flores Santin J.
Ford T. N.
Fournier A.
Franks M. E.
Gabrielli M. G.
Gheorghescu A. K.
Gonzalez-Bulnes A.
Goodrich E. S.
Grabowski C. T.
Grishkevich V.
Gryzinska M.
Hala D.
Hamburger V.
Hasselquist D.
Hector K. L.
Herrera E. A.
Herrington J.
Hill W. L.
Hirst C. E.
Ho D. H
Ho D. H.
Holland M. L.
Honarmand M.
Hutchings J. A.
Huth J. C.
Itani N.
Iversen N. K.
Jonker S. S.
Josele Flores Santin
Jourdeuil K. A.
Kain K. H.
Karadas F.
Keller M. C.
Keyte A. L.
Khorrami S.
Kilvitis H. J.
Kim M.
Kishore A. S.
Knobloch J.
Kopf P. G.
Kormos E.
Koutsos E. A.
Kowalski W. J.
Kue C. S.
Kuzawa C. W.
Labaque M. C.
le Noble F. A.
Lee J. Y.
Lee S. J.
Lewallen M. A.
Li C.
Li X. G.
Lindgren I.
Loepke A. W.
Lourens A.
Lucitti J. L.
Luo Z. C.
Maina J. N
Maina J. N.
Malir F.
Maria Rojas Antich
Mattsson A.
Mendez-Sanchez J. F.
Mendizabal I.
Mortola J. P
Mortola J. P.
Mroczek-Sosnowska N.
Mueller C. A.
Nangsuay A.
Natarajan A.
Newkirk C. E.
Nijland M. J.
Nowak-Sliwinska P.
Näf C.
Oliveira T. F.
Olson C. R.
Palo P. E.
Rahn H.
Reed W. L.
Reeves S. R.
Reyna K. S.
Richards M. P.
Riddle O.
Roberts V. H.
Roig B.
Romanoff A.
Romanoff A. L.
Rosenbruch M.
Rouwet E. V.
Ruijtenbeek K.
Rundle S. D.
Rymer T. L.
Sadler W. W.
Sarre A.
Sato M.
Saw C. L.
Schneider W. J
Sedaghat K.
Shao Z. H.
Sharma S.
Shell L.
Simsek H.
Sirsat S. K.
Smith S. M.
Spicer J. I.
Stangenberg S.
Stekelenburg-de Vos S.
Stern C. D.
Stock M. K.
Stoleson S. H.
Stow C. A.
Streit A.
Strick D. M.
Taylor L. W.
Tazawa H.
Thornburg K. L.
Tintu A.
Tissier M. L.
Tzschentke B.
Varela-Lasheras I.
Varnagy L.
Varriale A
Villamor E.
Wada H.
Walter I.
Warkentin K. M.
Warren W. Burggren
West-Eberhard M. J.
Westman O.
Willems E.
Williams T. D.
Xu L. J.
Yu M. H.
Yuan Y. J.
Zhang H.
Zhang X.
Publication venue: 'FapUNIFESP (SciELO)'

Crossref

Parallel reward and punishment control in humans and robots: Safe reinforcement learning using the MaxPain algorithm

Author: Elfwing S
Seymour B
Publication date: 02/04/2018

An important issue in reinforcement learning systems for autonomous agents is whether it makes sense to have separate systems for predicting rewards and punishments. In robotics, learning and control are typically achieved by a single controller, with punishments coded as negative rewards. However, in biological systems, some evidence suggests that the brain has a separate system for punishment. Although this may in part be due to biological constraints of implementing negative quantities, it raises the question as to whether there is any computational rationale for keeping reward and punishment prediction operationally distinct. Here we outline a basic argument supporting this idea, based on the proposition that learning best-case predictions (as in Q-learning) does not always achieve the safest behaviour. We introduce a modified RL scheme involving a new algorithm, which we call 'MaxPain', which backs up worst-case predictions in parallel and then scales the two predictions in a multi-attribute RL policy, i.e. independently learning 'what to do' as well as 'what not to do' and then combining this information. We show how this scheme can improve performance in benchmark RL environments, including a grid-world experiment and a delayed version of the mountain car experiment. In particular, we demonstrate how early exploration and learning are substantially improved, leading to much 'safer' behaviour. In conclusion, the results illustrate the importance of independent punishment prediction in RL, and provide a testable framework for better understanding punishment (such as pain) and avoidance in humans, in both health and disease.

CUED - Cambridge University Engineering Department

Multi-Task Reinforcement Learning: Shaping and Feature Selection

Author: H. Hachiya
L. Breiman
M.E. Taylor
S. Elfwing
Publication date: 01/01/2012

Shaping functions can be used in multi-task reinforcement learning (RL) to incorporate knowledge from previously experienced source tasks to speed up learning on a new target task. Earlier work has not clearly motivated choices for the shaping function. This paper discusses and empirically compares several alternatives, and demonstrates that the most intuitive one may not always be the best option. In addition, we extend previous work on identifying good representations for the value and shaping functions, and show that selecting the right representation results in improved generalization over tasks.

CiteSeerX

Crossref

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Predictive uncertainty estimation for out-of-distribution detection in digital pathology.

Author: Elfwing S.
Laak J.A.W.M. van der
Linmans J.H.J.
Litjens G.J.S.
Publication date: 01/01/2023

Machine learning model deployment in clinical practice demands real-time risk assessment to identify situations in which the model is uncertain. Once deployed, models should be accurate for classes seen during training while providing informative estimates of uncertainty to flag abnormalities and unseen classes for further analysis. Although recent developments in uncertainty estimation have resulted in an increasing number of methods, a rigorous empirical evaluation of their performance on large-scale digital pathology datasets is lacking. This work provides a benchmark for evaluating prevalent methods on multiple datasets by comparing the uncertainty estimates on both in-distribution and realistic near and far out-of-distribution (OOD) data on a whole-slide level. To this end, we aggregate uncertainty values from patch-based classifiers to whole-slide level uncertainty scores. We show that results found in classical computer vision benchmarks do not always translate to the medical imaging setting. Specifically, we demonstrate that deep ensembles perform best at detecting far-OOD data but can be outperformed on a more challenging near-OOD detection task by multi-head ensembles trained for optimal ensemble diversity. Furthermore, we demonstrate the harmful impact OOD data can have on the performance of deployed machine learning models. Overall, we show that uncertainty estimates can be used to discriminate in-distribution from OOD data with high AUC scores. Still, model deployment might require careful tuning based on prior knowledge of prospective OOD data.

Expected energy-based restricted Boltzmann machine for classification

Author: Bengio
Cireşan
Decoste
E. Uchibe
Elfwing
Freund
Hinton
K. Doya
Larochelle
LeCun
S. Elfwing
Salakhutdinov
Sallans
Smolensky
Sutton
Publication venue: 'Elsevier BV'

Crossref

Emergence of Different Mating Strategies in Artificial Embodied Evolution

Author: C. Bleay
D. Floreano
J.H. Brockmann
K. Doya
M. Sandell
R. Watson
S. Elfwing
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

Crossref