CrossNorm: Normalization for Off-Policy TD Reinforcement Learning
Off-policy temporal difference (TD) methods are a powerful class of
reinforcement learning (RL) algorithms. Intriguingly, deep off-policy TD
algorithms are not commonly used in combination with feature normalization
techniques, despite positive effects of normalization in other domains. We show
that naive application of existing normalization techniques is indeed not
effective, but that well-designed normalization improves optimization stability
and removes the necessity of target networks. In particular, we introduce a
normalization based on a mixture of on- and off-policy transitions, which we
call cross-normalization. It can be regarded as an extension of batch
normalization that re-centers data for two different distributions, as present
in off-policy learning. Applied to DDPG and TD3, cross-normalization improves
over the state of the art across a range of MuJoCo benchmark tasks.
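The idea of re-centering two distributions with shared statistics can be sketched in a few lines. This is only one plausible reading of the abstract, not the authors' implementation: the function name `cross_normalize`, the equal weighting of the two batches, and the use of plain NumPy arrays in place of network activations are all assumptions made for illustration.

```python
import numpy as np

def cross_normalize(off_policy, on_policy, eps=1e-5):
    """Normalize two feature batches with statistics shared across both.

    Hypothetical helper: both inputs have shape (batch, features).
    Assumes the mean and variance are taken over the concatenation of
    the two batches, so each batch is re-centered by the mixture.
    """
    mixed = np.concatenate([off_policy, on_policy], axis=0)
    mean = mixed.mean(axis=0, keepdims=True)
    var = mixed.var(axis=0, keepdims=True)
    scale = 1.0 / np.sqrt(var + eps)
    return (off_policy - mean) * scale, (on_policy - mean) * scale

# Synthetic stand-ins for features of off-policy and on-policy transitions.
rng = np.random.default_rng(0)
off = rng.normal(loc=2.0, scale=1.0, size=(64, 8))
on = rng.normal(loc=-1.0, scale=3.0, size=(64, 8))
off_n, on_n = cross_normalize(off, on)
```

Under this sketch, the concatenated output has approximately zero mean and unit variance per feature, while each batch on its own generally does not. That is the difference from normalizing the two batches separately.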
Local-Global Temporal Difference Learning for Satellite Video Super-Resolution
Optical-flow-based and kernel-based approaches have been widely explored for
temporal compensation in satellite video super-resolution (VSR). However, these
techniques incur high computational cost and are prone to fail under
complex motions. In this paper, we propose to exploit the well-defined
temporal difference for efficient and robust temporal compensation. To fully
utilize the temporal information within frames, we separately model the
short-term and long-term temporal discrepancies, since they provide distinct
complementary properties. Specifically, a short-term temporal difference module
is designed to extract local motion representations from residual maps between
adjacent frames, which provides more clues for accurate texture representation.
Meanwhile, the global dependency in the entire frame sequence is explored via
long-term difference learning. The differences between forward and backward
segments are incorporated and activated to modulate the temporal feature,
resulting in holistic global compensation. In addition, we propose a
difference compensation unit to enrich the interaction between the spatial
distribution of the target frame and compensated results, which helps maintain
spatial consistency while refining the features to avoid misalignment.
Extensive objective and subjective evaluation on five mainstream satellite
videos demonstrates that the proposed method performs favorably for satellite
VSR. Code will be available at https://github.com/XY-boy/TDMVSR.
Comment: Submitted to IEEE TCSV
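The two kinds of temporal difference described in the abstract can be illustrated on raw frames. This is a rough sketch under stated assumptions, not the paper's modules: the function names are hypothetical, the differences are taken on pixel values instead of learned features, and the "forward" and "backward" segments are assumed to be the first and last halves of the sequence, each averaged over time.

```python
import numpy as np

def short_term_differences(frames):
    """Residual maps between adjacent frames.

    frames: array of shape (T, H, W); returns shape (T - 1, H, W).
    """
    return frames[1:] - frames[:-1]

def long_term_difference(frames):
    """Difference between the backward and forward segment averages.

    Assumption: the forward segment is the first T // 2 frames and the
    backward segment is the last T // 2 frames.
    """
    t = frames.shape[0]
    half = t // 2
    forward = frames[:half].mean(axis=0)
    backward = frames[t - half:].mean(axis=0)
    return backward - forward

# Toy sequence: five 4x4 frames whose intensity grows by 1 per frame.
frames = np.arange(5, dtype=float)[:, None, None] * np.ones((1, 4, 4))
local_maps = short_term_differences(frames)   # every entry is 1.0
global_map = long_term_difference(frames)     # every entry is 3.0
```

In the toy sequence, each adjacent-frame residual is 1.0, and the segment-level difference is 3.0 (mean of frames 3 and 4 minus mean of frames 0 and 1).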
Neural correlates of implicit knowledge about statistical regularities
In this study, we examined the neural correlates of implicit knowledge about statistical regularities of temporal order and item chunks using functional magnetic resonance imaging (fMRI). In a familiarization scan, participants viewed a stream of scenes consisting of structured triplets (i.e., three scenes always presented in the same order) and random triplets. In the subsequent test scan, participants were required to detect a target scene. Test sequences included both the forward order of scenes presented during the familiarization scan and the backward order (i.e., the reverse of the forward order). Behavioral results showed a learning effect for temporal order in the forward condition and for scene chunks in the backward condition. fMRI data from the familiarization scan showed a difference in activation between the structured and random blocks in the left posterior cingulate cortex, including the retrosplenial cortex. More importantly, in the test scan, we observed activity in the left parietal lobe when participants detected target scenes based on temporal order information. In contrast, the left precuneus was activated when participants detected target scenes based on scene chunks. Our findings help clarify the brain mechanisms of implicit knowledge about acquired regularities.