Continual Reinforcement Learning in 3D Non-stationary Environments
High-dimensional always-changing environments constitute a hard challenge for
current reinforcement learning techniques. Artificial agents, nowadays, are
often trained off-line in very static and controlled conditions in simulation
such that training observations can be thought of as sampled i.i.d. from the
entire observation space. However, in real-world settings, the environment is
often non-stationary and subject to unpredictable, frequent changes. In this
paper we propose and openly release CRLMaze, a new benchmark for learning
continually through reinforcement in a complex 3D non-stationary task based on
ViZDoom and subject to several environmental changes. Then, we introduce an
end-to-end model-free continual reinforcement learning strategy showing
competitive results with respect to four different baselines and not requiring
any access to additional supervised signals, previously encountered
environmental conditions or observations. Comment: Accepted in the CLVision Workshop at CVPR2020: 13 pages, 4 figures, 5
tables
The genesis of organisational aesthetics
Organisational aesthetics is a burgeoning field with a growing community of scholars engaged in arts-based approaches to research. Recent developments in this field have their origins in the works of early Enlightenment writers such as Vico, Baumgarten and Kant. This paper examines the contributions of these three philosophers and in particular focuses on Vico’s awareness of history and myth; Baumgarten’s notion of sensation and its relationship to rationality; and Kant’s investigations into form and content. By drawing on these ideas, the contemporary aesthetic researcher is informed by qualities such as an alert imagination, comfort with the chaotic, backward thinking, and attention to inner sensations and perceptions, which all work together to provide a coherent view of the organisation as a gestalt.
Swift-UVOT Observations of the X-Ray Flash 050406
We present Swift-UVOT data on the optical afterglow of the X-ray flash of
2005 April 6 (XRF 050406) from 88s to \sim 10^5s after the initial prompt
gamma-ray emission. Our observations in the V, B and U bands are the earliest
that have been taken of an XRF optical counterpart. Combining the early-time
optical temporal and spectral properties with \gamma- and simultaneous X-ray
data taken with the BAT and XRT telescopes on-board Swift, we are able to
constrain possible origins of the XRF. The prompt emission had a FRED profile
(fast-rise, exponential decay) with a duration of T_90 = 5.7\pm 0.2s, putting
it at the short end of the long-burst duration distribution. The absence of
photoelectric absorption red-ward of 4000 \AA in the UV/optical spectrum
provides a firm upper limit of z\leq 3.1 on the redshift, thus excluding a high
redshift as the sole reason for the soft spectrum. The optical light curve is
consistent with a power-law decay with slope \alpha = -0.75\pm 0.26
(F_{\nu}\propto t^{\alpha}), and a maximum occurring in the first 200s after
the initial gamma-ray emission. The softness of the prompt emission is well
described by an off-axis structured jet model, which is able to account for the
early peak flux and shallow decay observed in the optical and X-ray bands. Comment: 14 pages, 4 figures, accepted for publication in ApJ; typos corrected
and upper limits in table 1 changed from background subtracted count rate in
extraction region to the error associated with thi
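
The power-law decay quoted in the abstract above, F_{\nu}\propto t^{\alpha}, becomes a straight line in log-log space, so the slope can be estimated with an ordinary least-squares fit. The following is a minimal sketch of that idea only: the function name `fit_power_law`, the time grid, and the normalisation are assumptions made for illustration, the light curve is synthetic and noiseless, and this is not the analysis pipeline used by the authors.

```python
import math


def fit_power_law(times, fluxes):
    """Fit F = A * t**alpha by linear least squares on (log10 t, log10 F).

    Returns (A, alpha). Hypothetical helper, not taken from the paper.
    """
    xs = [math.log10(t) for t in times]
    ys = [math.log10(f) for f in fluxes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope of the best-fit line in log-log space is the decay index alpha.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    log_amplitude = mean_y - alpha * mean_x
    return 10.0 ** log_amplitude, alpha


# Synthetic light curve: 19 epochs from 200 s to roughly 1e5 s after the
# trigger, with an assumed normalisation of 3.0 (arbitrary units).
times = [200.0 * 10 ** (0.15 * i) for i in range(19)]
fluxes = [3.0 * t ** -0.75 for t in times]

amplitude, alpha = fit_power_law(times, fluxes)
print(f"alpha = {alpha:.3f}, amplitude = {amplitude:.3f}")
```

With real photometry the fit would also need per-point uncertainties and upper limits, which this sketch ignores.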
Attentive Single-Tasking of Multiple Tasks
In this work we address task interference in universal networks by
considering that a network is trained on multiple tasks, but performs one task
at a time, an approach we refer to as "single-tasking multiple tasks". The
network thus modifies its behaviour through task-dependent feature adaptation,
or task attention. This gives the network the ability to accentuate the
features that are adapted to a task, while shunning irrelevant ones. We further
reduce task interference by forcing the task gradients to be statistically
indistinguishable through adversarial training, ensuring that the common
backbone architecture serving all tasks is not dominated by any of the
task-specific gradients. Results in three multi-task dense labelling problems
consistently show: (i) a large reduction in the number of parameters while
preserving, or even improving performance and (ii) a smooth trade-off between
computation and multi-task accuracy. We provide our system's code and
pre-trained models at http://vision.ee.ethz.ch/~kmaninis/astmt/. Comment: CVPR 2019 Camera Read
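
One simple way to picture the task-dependent feature adaptation described in the abstract above is a per-task gate that rescales each channel of a shared feature vector, so that the same backbone output is emphasised differently depending on which task is being run. The sketch below is an illustrative assumption only: the function `apply_task_gate`, the task names, and the gate values are hypothetical and do not reproduce the modules in the authors' released code.

```python
def apply_task_gate(features, gate):
    """Scale each channel of a shared feature vector by a per-task weight.

    A weight near 1 keeps the channel; a weight near 0 suppresses it.
    Hypothetical helper for illustration.
    """
    if len(features) != len(gate):
        raise ValueError("features and gate must have the same length")
    return [f * g for f, g in zip(features, gate)]


# Shared backbone output for one location (made-up values).
shared_features = [0.5, -1.2, 2.0, 0.1]

# One gate per task; in a trained network these weights would be learned.
task_gates = {
    "segmentation": [1.0, 0.0, 0.5, 1.0],
    "depth": [0.0, 1.0, 1.0, 0.2],
}

# The network runs one task at a time, selecting the matching gate.
segmentation_features = apply_task_gate(shared_features, task_gates["segmentation"])
depth_features = apply_task_gate(shared_features, task_gates["depth"])
```

Because only the small gate differs between tasks, the bulk of the parameters stays shared, which is the kind of saving the abstract reports.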