
2,823 research outputs found

Deep reinforcement learning from human preferences

Author: Amodei Dario
Brown Tom B.
Christiano Paul
Legg Shane
Leike Jan
Martic Miljan
Publication date: 13/07/2017

For sophisticated reinforcement learning (RL) systems to interact usefully with real-world environments, we need to communicate complex goals to these systems. In this work, we explore goals defined in terms of (non-expert) human preferences between pairs of trajectory segments. We show that this approach can effectively solve complex RL tasks without access to the reward function, including Atari games and simulated robot locomotion, while providing feedback on less than one percent of our agent's interactions with the environment. This reduces the cost of human oversight far enough that it can be practically applied to state-of-the-art RL systems. To demonstrate the flexibility of our approach, we show that we can successfully train complex novel behaviors with about an hour of human time. These behaviors and environments are considerably more complex than any that have been previously learned from human feedback.

arXiv.org e-Print Archive

Defensive Tail-curling and Head-mimicking Behavior in a Variable Coralsnake, Micrurus diastema (Squamata: Elapidae) in Cusuco National Park, Honduras

Author: Barazowski Mitchell B.
Brown Tom W.
Publication venue: The University of Kansas
Publication date: 19/07/2020

The University of Kansas: Journals@KU

Biodiversity Informatics

Three ways to compute multiport inertance

Author: Brown B
Gustafsson Tom
Mallinson S
McBain George
Publication venue: HAL CCSD
Publication date: 01/01/2019

The immediate impulse-response of a confined incompressible fluid is characterized by inertance. For a vessel with inlet and outlet, this is a single quantity; for multiple ports the generalization is a singular reciprocal inertance matrix, acting on the port-impulses to give the corresponding inflows. The coefficients are defined by the boundary-fluxes of potential flows. Green's identity converts these to domain integrals of kinetic energy. If the system is discretized with finite elements, a third method is proposed which requires only the stiffness matrix and the solution vectors and no numerical differentiation.

Australian Mathematical Society (AustMS): E-Journals

Seeing with sound? Exploring different characteristics of a visual-to-auditory sensory substitution device

Author: Bach-Y-Rita P
David Brown
Durette B
Evans K K
Jamie Ward
Marks L E
McKelvie S J
Tom Macpherson
Publication venue: Pion Ltd
Publication date: 01/01/2011

Sensory substitution devices convert live visual images into auditory signals, for example with a web camera (to record the images), a computer (to perform the conversion) and headphones (to listen to the sounds). In a series of three experiments, the performance of one such device (‘The vOICe’) was assessed under various conditions on blindfolded sighted participants. The main task that we used involved identifying and locating objects placed on a table by holding a webcam (like a flashlight) or wearing it on the head (like a miner’s light). Identifying objects on a table was easier with a hand-held device, but locating the objects was easier with a head-mounted device. Brightness converted into loudness was less effective than the reverse contrast (dark being loud), suggesting that performance under these conditions (natural indoor lighting, novice users) is related more to the properties of the auditory signal (i.e., the amount of noise in it) than the cross-modal association between loudness and brightness. Individual differences in musical memory (detecting pitch changes in two sequences of notes) were related to the time taken to identify or recognise objects, but individual differences in self-reported vividness of visual imagery did not reliably predict performance across the experiments. In general, the results suggest that the auditory characteristics of the device may be more important for initial learning than visual associations.

Crossref

Sussex Research Online

Early decarbonisation of the European energy system pays off

Author: Andresen Gorm B.
Brown Tom
Greiner Martin
Victoria Marta
Zhu Kun
Publication venue: Nature Research
Publication date: 04/12/2020

KITopen