Sustained Negative BOLD Response in Human fMRI Finger Tapping Task
In this work, we investigated the sustained negative blood oxygen level-dependent (BOLD) response (sNBR) using functional magnetic resonance imaging during a finger tapping task. We observed that the sNBR for this task was more extensive than previously reported. The cortical regions involved in the sNBR fall into three groups: frontal, somatosensory, and occipital. By investigating the spatial structure, area, amplitude, and dynamics of the sNBR in comparison with those of its positive BOLD response (PBR) counterpart, we made the following observations. First, among the three groups, the somatosensory group contained the greatest number of activated voxels and the fewest deactivated voxels; in addition, the amplitude of the sNBR in this group was the smallest of the three. Second, the onset and peak times of the sNBR are both later than those of the PBR, whereas its falling edge is shorter than that of the PBR. Third, the long distance between most sNBR foci and their corresponding PBR foci makes it unlikely that they share the same supplying artery. Fourth, the couplings between the sNBR and its PBR counterpart differ across regions and thus should be investigated separately. These findings imply that the origin of most sNBR foci in the finger-tapping task is much more likely to be neuronal activity suppression rather than "blood steal".
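The timing comparison above (later onset and peak for the sNBR, a shorter falling edge) can be made concrete with a small helper that extracts such metrics from a trial-averaged BOLD time course. The sketch below is purely illustrative and not the authors' analysis pipeline; the threshold-fraction onset criterion and all names are assumptions.

```python
import numpy as np

def bold_timing_metrics(signal, t, threshold_frac=0.1):
    """Estimate onset, peak, and falling-edge times of a BOLD response.

    `signal` is a 1-D trial-averaged time course (positive-going; negate
    an sNBR time course before calling), `t` the matching time axis in
    seconds.  Onset is the first crossing of a fraction of the peak
    amplitude; the falling edge is the time from the peak back below
    that fraction.
    """
    peak_idx = int(np.argmax(signal))
    thresh = threshold_frac * signal[peak_idx]

    # Onset: first sample at or above threshold, at or before the peak.
    onset_idx = int(np.argmax(signal[: peak_idx + 1] >= thresh))

    # Falling edge: first sample back below threshold after the peak.
    post = signal[peak_idx:] < thresh
    fall_idx = peak_idx + (int(np.argmax(post)) if post.any() else len(post) - 1)

    return {
        "onset_time": t[onset_idx],
        "peak_time": t[peak_idx],
        "falling_edge_time": t[fall_idx] - t[peak_idx],
    }
```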
Visualizing Neural Network Developing Perturbation Theory
In this letter, motivated by the question of whether the empirical fitting of data by a neural network can yield the same structure as physical laws, we apply a neural network to a simple quantum mechanical two-body scattering problem with short-range potentials, a problem which by itself plays an important role in many branches of physics. We train a neural network to accurately predict the s-wave scattering length, which governs the low-energy scattering physics, directly from the scattering potential, without solving the Schrödinger equation or obtaining the wavefunction. Analysis of the trained network shows that it develops perturbation theory order by order as the potential strength increases. This provides an important benchmark for machine-assisted physics research, and even for the automated machine learning of physical laws.
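As an illustration of the setup this abstract describes, the hedged sketch below trains a small network to regress a scattering length directly from a discretized potential. It is not the authors' code: the grid size, architecture, and random Gaussian wells are assumptions, and the training labels use the first Born approximation a ≈ (2m/ħ²)∫r²V(r)dr (in units where 2m/ħ² = 1) purely to keep the example self-contained, whereas the paper predicts exact scattering lengths.

```python
import torch
import torch.nn as nn

# Discretize short-range potentials V(r) on a radial grid and train an
# MLP to regress the s-wave scattering length.  Labels here come from
# the first Born approximation (valid for weak potentials), not from an
# exact solve of the Schrödinger equation.
N_GRID, R_MAX = 64, 10.0
r = torch.linspace(0.0, R_MAX, N_GRID)
dr = r[1] - r[0]

def random_potentials(n):
    """Random attractive Gaussian wells: V(r) = -V0 * exp(-(r/w)**2)."""
    v0 = torch.rand(n, 1) * 0.5           # keep potentials weak (perturbative)
    w = 0.5 + torch.rand(n, 1) * 2.0
    return -v0 * torch.exp(-(r / w) ** 2)

def born_scattering_length(V):
    # a ≈ ∫ r² V(r) dr in units with 2m/ħ² = 1 (rectangle rule, for a sketch).
    return (r ** 2 * V).sum(dim=1) * dr

model = nn.Sequential(
    nn.Linear(N_GRID, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 1),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(2000):
    V = random_potentials(256)
    target = born_scattering_length(V).unsqueeze(1)
    loss = nn.functional.mse_loss(model(V), target)
    opt.zero_grad(); loss.backward(); opt.step()
```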
Efficient RLHF: Reducing the Memory Usage of PPO
Reinforcement Learning with Human Feedback (RLHF) has revolutionized language modeling by aligning models with human preferences. However, the RL stage, Proximal Policy Optimization (PPO), requires over 3x the memory of Supervised Fine-Tuning (SFT), making it infeasible for most practitioners. To address this issue, we present a comprehensive analysis of the memory usage, performance, and training time of memory-saving techniques for PPO. We introduce Hydra-RLHF by first integrating the SFT and reward models and then dynamically turning LoRA "off" during training. Our experiments show that (1) using LoRA during PPO reduces its memory usage below that of SFT while improving alignment across four public benchmarks, and (2) Hydra-PPO reduces the per-sample latency of LoRA-PPO by up to 65% while maintaining its performance. Our results demonstrate that Hydra-PPO is a simple and promising solution for enabling more widespread use of RLHF.
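The core memory trick here, dynamically turning LoRA "off", can be sketched with the PEFT library's disable_adapter() context manager: with adapters disabled, the same network stands in as the frozen reference model for PPO's KL penalty, so no separate full copy of the base weights is kept. This is a minimal sketch under that reading of the abstract, not the paper's implementation; the model choice and hyperparameters are placeholders.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")            # placeholder model
policy = get_peft_model(base, LoraConfig(r=8, lora_alpha=16,
                                         target_modules=["c_attn"]))

def logprobs(model, input_ids):
    """Per-token log-probabilities of the observed next tokens."""
    logits = model(input_ids).logits[:, :-1]
    return torch.log_softmax(logits, dim=-1).gather(
        -1, input_ids[:, 1:, None]).squeeze(-1)

input_ids = torch.randint(0, base.config.vocab_size, (2, 16))  # dummy batch

policy_lp = logprobs(policy, input_ids)          # LoRA "on": the PPO policy
with policy.disable_adapter():                   # LoRA "off": frozen reference
    with torch.no_grad():
        ref_lp = logprobs(policy, input_ids)

kl_per_token = policy_lp - ref_lp                # KL penalty term for PPO
```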
Comparison of the Long-Range Climate Memory in Outgoing Longwave Radiation over the Tibetan Plateau and the Indian Monsoon Region
Based on the detrended fluctuation analysis (DFA) method, the scaling behaviors of daily outgoing longwave radiation (OLR) from 1979 to 2015 over the Tibetan Plateau (TP) and the Indian Monsoon Region (IMR) are analyzed. The results show long-term memory in the OLR time series over both the TP and the IMR, with stronger long-range memory over the TP than over the IMR. The average scaling exponents over the TP and IMR are 0.71 and 0.64, the maximum values in the two regions are 0.81 and 0.75, and the minimum values are 0.59 and 0.58. In both regions, the most frequent scaling exponents fall between 0.625 and 0.675. The spatial distribution of the scaling exponents of the OLR series is closely related to the climatological high cloud cover in the two areas; high cloud cover over the TP is markedly less than over the IMR. In addition, the scaling behaviors of OLR over the TP and IMR arise from the fractal characteristics of the time series, as further verified by randomly shuffling the series to destroy its trends and correlations.
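DFA itself is a standard algorithm, so a compact reference implementation may help in reproducing scaling exponents like those quoted above. The sketch below is a generic order-1 DFA, not the authors' exact configuration; the choice of window scales is an assumption.

```python
import numpy as np

def dfa_exponent(x, scales=None):
    """Order-1 detrended fluctuation analysis.

    Returns the scaling exponent alpha from the fit
    log F(s) ~ alpha * log s.  Alpha > 0.5 indicates long-range
    memory, as reported for the OLR series.
    """
    x = np.asarray(x, dtype=float)
    y = np.cumsum(x - x.mean())                  # integrated profile
    if scales is None:
        scales = np.unique(
            np.logspace(1, np.log10(len(x) // 4), 20).astype(int))
    F = []
    for s in scales:
        n = len(y) // s
        segs = y[: n * s].reshape(n, s)
        t = np.arange(s)
        # Least-squares linear detrend within each segment.
        rms = [np.mean((seg - np.polyval(np.polyfit(t, seg, 1), t)) ** 2)
               for seg in segs]
        F.append(np.sqrt(np.mean(rms)))
    return np.polyfit(np.log(scales), np.log(F), 1)[0]

# White noise has no memory, so alpha should come out near 0.5.
print(dfa_exponent(np.random.randn(10000)))
```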
Adapting LLM Agents Through Communication
Recent advancements in large language models (LLMs) have shown potential for human-like agents. To help these agents adapt to new tasks without extensive human supervision, we propose the Learning through Communication (LTC) paradigm, a novel training approach enabling LLM agents to improve continuously through interactions with their environments and other agents. Through iterative exploration and PPO training, LTC empowers the agent to assimilate short-term experiences into long-term memory. To optimize agent interactions for task-specific learning, we introduce three structured communication patterns, Monologue, Dialogue, and Analogue, tailored to common tasks such as decision-making, knowledge-intensive reasoning, and numerical reasoning; a schematic of the training loop follows this abstract. We evaluated LTC on three datasets: ALFWorld (decision-making), HotpotQA (knowledge-intensive reasoning), and GSM8k (numerical reasoning). On ALFWorld, it exceeds the instruction-tuning baseline by 12% in success rate. On HotpotQA, LTC surpasses the instruction-tuned LLaMA-7B agent by 5.1% in EM score, and it outperforms the 9x larger instruction-tuned PaLM-62B agent by 0.6%. On GSM8k, LTC outperforms the CoT-Tuning baseline by 3.6% in accuracy. These results showcase the versatility and efficiency of the LTC approach across diverse domains. We will open-source our code to promote further development by the community.
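The exploration-then-PPO loop described above can be outlined schematically. The skeleton below only illustrates the control flow of assimilating short-term trajectories into a long-term buffer; the environment, agent, and ppo_update are stubs, and none of these names come from the paper's code.

```python
import random

class StubEnv:
    """Toy stand-in for ALFWorld/HotpotQA/GSM8k-style tasks."""
    def reset(self): return "task description"
    def step(self, action): return "observation", random.random(), True

def agent_act(observation, memory):
    # Placeholder for the LLM agent, possibly using one of the
    # Monologue/Dialogue/Analogue communication patterns.
    return f"action given {observation} and {len(memory)} stored episodes"

def ppo_update(memory_batch):
    pass  # placeholder for the PPO policy/value update

long_term_memory = []            # replay buffer of past trajectories
env = StubEnv()

for iteration in range(10):
    # Exploration phase: gather short-term experience.
    trajectory, obs, done = [], env.reset(), False
    while not done:
        action = agent_act(obs, long_term_memory)
        obs, reward, done = env.step(action)
        trajectory.append((obs, action, reward))
    long_term_memory.append(trajectory)

    # Training phase: PPO on a sampled batch of stored trajectories.
    batch = random.sample(long_term_memory, k=min(4, len(long_term_memory)))
    ppo_update(batch)
```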
Weakly-Supervised Hashing in Kernel Space
Poster presentation, 8 pages.
In-Context Learning Unlocked for Diffusion Models
We present Prompt Diffusion, a framework for enabling in-context learning in
diffusion-based generative models. Given a pair of task-specific example images, such as depth from/to image or scribble from/to image, together with text guidance, our model automatically understands the underlying task and performs the same task on a new query image following that guidance. To achieve
this, we propose a vision-language prompt that can model a wide range of
vision-language tasks and a diffusion model that takes it as input. The
diffusion model is trained jointly over six different tasks using these
prompts. The resulting Prompt Diffusion model is the first diffusion-based
vision-language foundation model capable of in-context learning. It
demonstrates high-quality in-context generation on the trained tasks and
generalizes effectively to new, unseen vision tasks with their respective
prompts. Our model also shows compelling text-guided image editing results. Our
framework, with code publicly available at
https://github.com/Zhendong-Wang/Prompt-Diffusion, aims to facilitate research
into in-context learning for computer vision.
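As a mental model of the vision-language prompt described above, the sketch below shows one plausible way to bundle the example pair, text guidance, and query as conditioning for a diffusion model. The dataclass and sample call are hypothetical; the actual interface lives in the linked repository.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class VisionLanguagePrompt:
    example_source: Any   # e.g. a depth map
    example_target: Any   # the corresponding image
    text: str             # task instruction / caption
    query: Any            # new input to apply the inferred task to

def generate(model: Any, prompt: VisionLanguagePrompt) -> Any:
    """Stub: condition the diffusion model on the full prompt and sample."""
    conditioning = (prompt.example_source, prompt.example_target,
                    prompt.text, prompt.query)
    return model.sample(conditioning)   # hypothetical model API
```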