693 research outputs found
Mirrors with Regular Hexagonal Segments
The point-spread function and emissivity are calculated for a mirror made from regular hexagonal segments of just a few different sizes. A mirror of this type has many similar segments, which is an advantage for manufacturing, and for an ~f/1 mirror with ≥1000 segments and ≥4 sizes of regular hexagons the increase in intersegment gap area is negligible. This result raises the possibility of making a mirror from very large numbers of identical small segments that are warped to the required figure.
A Policy Search Method For Temporal Logic Specified Reinforcement Learning Tasks
Reward engineering is an important aspect of reinforcement learning. Whether
or not the user's intentions can be correctly encapsulated in the reward
function can significantly impact the learning outcome. Current methods rely on
manually crafted reward functions that often require parameter tuning to obtain
the desired behavior. This operation can be expensive when exploration requires
systems to interact with the physical world. In this paper, we explore the use
of temporal logic (TL) to specify tasks in reinforcement learning. A TL formula
can be translated to a real-valued function that measures its level of
satisfaction against a trajectory. We take advantage of this function and
propose temporal logic policy search (TLPS), a model-free learning technique
that finds a policy that satisfies the TL specification. A set of simulated
experiments is conducted to evaluate the proposed approach.
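The idea of turning a TL formula into a real-valued satisfaction score can be sketched as follows. This is a minimal illustration using the max/min quantitative semantics commonly applied to signal temporal logic, not necessarily the exact formulation in the paper; the helper names (`eventually_greater`, `always_less`, `conjunction`) and the one-dimensional trajectories are assumptions made for the example.

```python
from typing import Callable, Sequence

Trajectory = Sequence[float]
Robustness = Callable[[Trajectory], float]


def eventually_greater(threshold: float) -> Robustness:
    """'Eventually x > threshold': the best margin reached along the trajectory."""
    return lambda traj: max(x - threshold for x in traj)


def always_less(threshold: float) -> Robustness:
    """'Always x < threshold': the worst margin along the trajectory."""
    return lambda traj: min(threshold - x for x in traj)


def conjunction(*specs: Robustness) -> Robustness:
    """'A and B': satisfied only as well as the weakest conjunct."""
    return lambda traj: min(spec(traj) for spec in specs)


# Hypothetical task: reach above 1.0 at some point while never exceeding 2.0.
spec = conjunction(eventually_greater(1.0), always_less(2.0))

good = [0.0, 0.5, 1.5, 1.2]  # reaches the goal and stays within the bound
bad = [0.0, 0.5, 2.5, 1.2]   # reaches the goal but violates the bound

print(spec(good))  # positive score: specification satisfied
print(spec(bad))   # negative score: specification violated
```

A positive score means the trajectory satisfies the formula and a negative score means it violates it, so the score can stand in for a hand-crafted reward when searching over policies.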
Searching for collective behavior in a network of real neurons
Maximum entropy models are the least structured probability distributions
that exactly reproduce a chosen set of statistics measured in an interacting
network. Here we use this principle to construct probabilistic models which
describe the correlated spiking activity of populations of up to 120 neurons in
the salamander retina as it responds to natural movies. Already in groups as
small as 10 neurons, interactions between spikes can no longer be regarded as
small perturbations in an otherwise independent system; for 40 or more neurons
pairwise interactions need to be supplemented by a global interaction that
controls the distribution of synchrony in the population. Here we show that
such "K-pairwise" models--being systematic extensions of the previously used
pairwise Ising models--provide an excellent account of the data. We explore the
properties of the neural vocabulary by: 1) estimating its entropy, which
constrains the population's capacity to represent visual information; 2)
classifying activity patterns into a small set of metastable collective modes;
3) showing that the neural codeword ensembles are extremely inhomogeneous; 4)
demonstrating that the state of individual neurons is highly predictable from
the rest of the population, allowing the capacity for error correction.
Comment: 24 pages, 19 figures
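For a handful of neurons, a "K-pairwise" distribution of the kind described above can be written down and normalized by brute-force enumeration. The sketch below assumes binary states (1 for a spike, 0 for silence), an energy with single-neuron terms `h`, pairwise couplings `J`, and a global potential `V` indexed by the number of simultaneously active neurons; the sign convention and all parameter values are illustrative assumptions, not fitted values from the paper.

```python
import itertools
import math


def k_pairwise_energy(state, h, J, V):
    """Energy of one binary pattern under an assumed K-pairwise form."""
    n = len(state)
    field = sum(h[i] * state[i] for i in range(n))
    pair = sum(J[i][j] * state[i] * state[j]
               for i in range(n) for j in range(i + 1, n))
    k = sum(state)  # number of neurons spiking together
    return -(field + pair + V[k])


def distribution(h, J, V):
    """Normalized probability of every pattern (feasible only for small n)."""
    n = len(h)
    states = list(itertools.product((0, 1), repeat=n))
    weights = [math.exp(-k_pairwise_energy(s, h, J, V)) for s in states]
    z = sum(weights)  # partition function
    return {s: w / z for s, w in zip(states, weights)}


def entropy_bits(p):
    """Shannon entropy of the pattern distribution, in bits."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)


# Hypothetical parameters for three sparsely firing, weakly coupled neurons.
h = [-2.0, -2.0, -2.0]
J = [[0.0, 0.5, 0.5],
     [0.0, 0.0, 0.5],
     [0.0, 0.0, 0.0]]
V = [0.0, 0.0, 0.0, 1.0]  # extra weight on fully synchronous firing

p = distribution(h, J, V)
print(max(p, key=p.get))  # most probable pattern
print(entropy_bits(p))    # entropy of the three-neuron vocabulary
```

Enumeration scales as 2^n, so it is useful only as a check on small groups; populations of the size studied in the abstract require sampling-based methods instead.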
Deep reinforcement learning from human preferences
For sophisticated reinforcement learning (RL) systems to interact usefully
with real-world environments, we need to communicate complex goals to these
systems. In this work, we explore goals defined in terms of (non-expert) human
preferences between pairs of trajectory segments. We show that this approach
can effectively solve complex RL tasks without access to the reward function,
including Atari games and simulated robot locomotion, while providing feedback
on less than one percent of our agent's interactions with the environment. This
reduces the cost of human oversight far enough that it can be practically
applied to state-of-the-art RL systems. To demonstrate the flexibility of our
approach, we show that we can successfully train complex novel behaviors with
about an hour of human time. These behaviors and environments are considerably
more complex than any that have been previously learned from human feedback.
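One common way to learn from pairwise comparisons of trajectory segments is to fit a reward estimate under a Bradley-Terry style model, in which the probability that a person prefers one segment is a logistic function of the difference in summed predicted rewards. The sketch below illustrates that model and its cross-entropy loss; the function names and the example reward values are assumptions, and the per-step rewards would in practice come from a learned reward predictor.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(1 / (1 + exp(-x)))."""
    return -(max(-x, 0.0) + math.log1p(math.exp(-abs(x))))


def preference_probability(rewards_a, rewards_b) -> float:
    """Modelled probability that segment A is preferred over segment B."""
    return math.exp(log_sigmoid(sum(rewards_a) - sum(rewards_b)))


def preference_loss(rewards_a, rewards_b, label: float) -> float:
    """Cross-entropy against a human label.

    label = 1.0 if A was preferred, 0.0 if B was preferred,
    and 0.5 if the two segments were judged equally good.
    """
    diff = sum(rewards_a) - sum(rewards_b)
    return -(label * log_sigmoid(diff) + (1.0 - label) * log_sigmoid(-diff))


# Hypothetical predicted per-step rewards for two short segments.
segment_a = [0.2, 0.9, 0.4]
segment_b = [0.1, 0.3, 0.2]

print(preference_probability(segment_a, segment_b))
print(preference_loss(segment_a, segment_b, label=1.0))
```

Minimizing this loss over many labelled pairs pushes the reward estimate to rank segments the way the human did, which is what lets a small number of comparisons substitute for a hand-written reward function.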
The SAO and Kelvin waves in the EuroGRIPS GCMs and the UK Met. Office analyses
We compare the tropical oscillations and planetary-scale Kelvin waves in four troposphere-stratosphere climate models and the assimilated dataset produced by the United Kingdom Meteorological Office (UKMO). The comparison has been made in the GRIPS framework "GCM-Reality Intercomparison Project for SPARC", where SPARC is Stratospheric Processes and their Role in Climate, a project of the World Climate Research Program. The four models evaluated are European members of GRIPS: the UKMO Unified Model (UM), the model of the Free University in Berlin (FUB–GCM), the ARPEGE-climat model of the French National Centre for Meteorological Research (CNRM), and the Extended UGAMP GCM (EUGCM) of the Centre for Global Atmospheric Modelling (CGAM). The integrations were performed with different but annually periodic external conditions (e.g., sea-surface temperature, sea ice, and incoming solar radiation). The structure of the tropical winds and the strengths of the Kelvin waves are examined. In the analyses where the SAO (Semi-Annual Oscillation) and the QBO (Quasi-Biennial Oscillation) are reasonably well captured, the amplitude of these analysed Kelvin waves is close to that observed in independent data from UARS (Upper Atmosphere Research Satellite). In agreement with observations, the Kelvin waves generated in the models propagate into the middle atmosphere as wave packets, consistent with a convective forcing origin. In three of the models, slow Kelvin waves propagate too high and their amplitudes are overestimated in the upper stratosphere and in the mesosphere; the exception is the UM, which has weaker waves. None of the modelled waves are sufficient to force realistic eastward phases of the QBO or SAO. Although the SAO is represented by all models, only two of them are able to generate westerlies between 10 hPa and 50 hPa. The importance of the role played in the SAO by unresolved gravity waves is emphasized.
Although it exhibits some unrealistic features, the EUGCM, which includes a parametrization of gravity waves with a non-zero phase speed, is able to simulate clear easterly to westerly transitions as well as westerlies with downward propagation. Thermal damping is also important for the westerly forcing in the stratosphere.
