Mirrors with Regular Hexagonal Segments
The point-spread function and emissivity are calculated for a mirror made from regular hexagonal segments of just a few different sizes. A mirror of this type has many similar segments, which is an advantage for manufacturing, and for an ~f/1 mirror with ≥1000 segments and ≥4 sizes of regular hexagons the increase in intersegment gap area is negligible. This result raises the possibility of making a mirror from very large numbers of identical small segments that are warped to the required figure.
A Policy Search Method For Temporal Logic Specified Reinforcement Learning Tasks
Reward engineering is an important aspect of reinforcement learning. Whether
or not the user's intentions can be correctly encapsulated in the reward
function can significantly impact the learning outcome. Current methods rely on
manually crafted reward functions that often require parameter tuning to obtain
the desired behavior. This operation can be expensive when exploration requires
systems to interact with the physical world. In this paper, we explore the use
of temporal logic (TL) to specify tasks in reinforcement learning. A TL formula
can be translated to a real-valued function that measures its level of
satisfaction against a trajectory. We take advantage of this function and
propose temporal logic policy search (TLPS), a model-free learning technique
that finds a policy that satisfies the TL specification. A set of simulated
experiments is conducted to evaluate the proposed approach.
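To illustrate how a TL formula can be scored against a trajectory, here is a minimal sketch. It assumes the min/max quantitative semantics commonly used for signal temporal logic; the abstract does not say which logic or semantics TLPS uses. The predicates `near_goal` and `in_bounds`, their thresholds, and the one-dimensional trajectory are hypothetical.

```python
from typing import Callable, Sequence

Predicate = Callable[[float], float]


def eventually(trajectory: Sequence[float], predicate: Predicate) -> float:
    """Score of 'predicate holds at some step': best margin over the trajectory."""
    return max(predicate(state) for state in trajectory)


def always(trajectory: Sequence[float], predicate: Predicate) -> float:
    """Score of 'predicate holds at every step': worst margin over the trajectory."""
    return min(predicate(state) for state in trajectory)


def near_goal(x: float, goal: float = 1.0, tol: float = 0.1) -> float:
    """Hypothetical predicate: positive when x is within tol of goal."""
    return tol - abs(x - goal)


def in_bounds(x: float, limit: float = 2.0) -> float:
    """Hypothetical predicate: positive when |x| stays below limit."""
    return limit - abs(x)


def robustness(trajectory: Sequence[float]) -> float:
    """Score of '(eventually near_goal) AND (always in_bounds)'.

    Conjunction is taken as the minimum of the two sub-scores (assumed semantics).
    A positive value means the trajectory satisfies the formula.
    """
    return min(eventually(trajectory, near_goal), always(trajectory, in_bounds))
```

Under these assumptions, a trajectory that reaches the goal while staying in bounds gets a positive score, one that leaves the bounds gets a negative score, and a policy search method could use the score as its optimization target.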
Deep reinforcement learning from human preferences
For sophisticated reinforcement learning (RL) systems to interact usefully
with real-world environments, we need to communicate complex goals to these
systems. In this work, we explore goals defined in terms of (non-expert) human
preferences between pairs of trajectory segments. We show that this approach
can effectively solve complex RL tasks without access to the reward function,
including Atari games and simulated robot locomotion, while providing feedback
on less than one percent of our agent's interactions with the environment. This
reduces the cost of human oversight far enough that it can be practically
applied to state-of-the-art RL systems. To demonstrate the flexibility of our
approach, we show that we can successfully train complex novel behaviors with
about an hour of human time. These behaviors and environments are considerably
more complex than any that have been previously learned from human feedback.
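As a rough illustration of learning from preferences between pairs of trajectory segments, the sketch below scores a pair with a Bradley-Terry style model: the probability that segment A is preferred is a logistic function of the difference in summed predicted rewards. The abstract does not state the model used, so this form is an assumption, and the per-step reward lists stand in for the outputs of a hypothetical learned reward predictor.

```python
import math
from typing import Sequence


def _softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def preference_probability(rewards_a: Sequence[float],
                           rewards_b: Sequence[float]) -> float:
    """Probability that segment A is preferred over segment B (assumed model)."""
    diff = sum(rewards_a) - sum(rewards_b)
    # Logistic function of the return difference, written via softplus
    # so that large differences do not overflow.
    return math.exp(-_softplus(-diff))


def preference_loss(rewards_a: Sequence[float],
                    rewards_b: Sequence[float],
                    label: float) -> float:
    """Cross-entropy between the predicted preference and a human label.

    label is 1.0 if the human preferred A, 0.0 if B, and 0.5 for a tie.
    """
    diff = sum(rewards_a) - sum(rewards_b)
    # -log p(A) = softplus(-diff) and -log p(B) = softplus(diff)
    return label * _softplus(-diff) + (1.0 - label) * _softplus(diff)
```

Minimizing this loss over many labeled pairs would push the reward predictor to assign higher total reward to the segments that people prefer.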
Implicit Gender Bias in the Classroom: Memories from K-12 Education
Implicit biases affect everyone in society, including within the K-12 education system. This study investigated what memories of implicit gender bias preservice teachers (PSTs) recalled from their K-12 education. These memories may be connected to the PSTs’ embedded implicit biases and indicate the long-term impact of teachers’ biases on students. A total of 141 undergraduate PSTs from two universities were surveyed regarding gender expectations and recognition of LGBTQ+ people. Results indicated an inconsistency between espoused beliefs and practices within the classrooms. Because schools often reflect society’s norms and perpetuate them through implicit bias, understanding what biases are currently accepted and reinforced in schools allows teacher education programs to unpack these specific biases with their preservice teachers to promote greater equality and ultimately reduce sexism in our society.
Searching for collective behavior in a network of real neurons
Maximum entropy models are the least structured probability distributions
that exactly reproduce a chosen set of statistics measured in an interacting
network. Here we use this principle to construct probabilistic models which
describe the correlated spiking activity of populations of up to 120 neurons in
the salamander retina as it responds to natural movies. Already in groups as
small as 10 neurons, interactions between spikes can no longer be regarded as
small perturbations in an otherwise independent system; for 40 or more neurons
pairwise interactions need to be supplemented by a global interaction that
controls the distribution of synchrony in the population. Here we show that
such "K-pairwise" models--being systematic extensions of the previously used
pairwise Ising models--provide an excellent account of the data. We explore the
properties of the neural vocabulary by: 1) estimating its entropy, which
constrains the population's capacity to represent visual information; 2)
classifying activity patterns into a small set of metastable collective modes;
3) showing that the neural codeword ensembles are extremely inhomogeneous; 4)
demonstrating that the state of individual neurons is highly predictable from
the rest of the population, allowing the capacity for error correction.
Comment: 24 pages, 19 figures
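A small sketch of what a "K-pairwise" distribution could look like for a handful of neurons follows. It assumes binary states (1 for a spike, 0 for silence) and a log-probability made of single-neuron terms `h`, pairwise couplings `J`, and a global term `V` indexed by the number of simultaneously active neurons K. The sign and parameterization conventions here are assumptions, and all parameter values are hypothetical. Exhaustive enumeration only works for small populations; it would not be feasible for the 120 neurons mentioned above.

```python
import itertools
import math
from typing import Dict, Sequence, Tuple

State = Tuple[int, ...]


def log_weight(sigma: State,
               h: Sequence[float],
               J: Sequence[Sequence[float]],
               V: Sequence[float]) -> float:
    """Unnormalized log-probability of one activity pattern (assumed convention)."""
    n = len(sigma)
    field = sum(h[i] * sigma[i] for i in range(n))
    pair = sum(J[i][j] * sigma[i] * sigma[j]
               for i in range(n) for j in range(i + 1, n))
    k = sum(sigma)  # number of neurons spiking together
    return field + pair + V[k]


def distribution(h: Sequence[float],
                 J: Sequence[Sequence[float]],
                 V: Sequence[float]) -> Dict[State, float]:
    """Normalize over all 2**n patterns by brute-force enumeration."""
    states = list(itertools.product((0, 1), repeat=len(h)))
    weights = [math.exp(log_weight(s, h, J, V)) for s in states]
    z = sum(weights)  # partition function
    return {s: w / z for s, w in zip(states, weights)}


def entropy_bits(dist: Dict[State, float]) -> float:
    """Shannon entropy of the pattern distribution, in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0.0)
```

Setting `V` to all zeros reduces this sketch to a plain pairwise model, which shows how the global term extends it.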