382 research outputs found

Habits and goals in synergy: a variational Bayesian framework for behavior

Author: Doya Kenji
Han Dongqi
Li Dongsheng
Tani Jun
Publication date: 11/04/2023

How to behave efficiently and flexibly is a central problem for understanding biological agents and creating intelligent embodied AI. It is well known that behavior can be classified into two types: reward-maximizing habitual behavior, which is fast but inflexible; and goal-directed behavior, which is flexible but slow. Conventionally, habitual and goal-directed behaviors are considered to be handled by two distinct systems in the brain. Here, we propose to bridge the gap between the two behaviors, drawing on the principles of variational Bayesian theory. We incorporate both behaviors in one framework by introducing a Bayesian latent variable called "intention". The habitual behavior is generated using the prior distribution of intention, which is goal-less; and the goal-directed behavior is generated by the posterior distribution of intention, which is conditioned on the goal. Building on this idea, we present a novel Bayesian framework for modeling behaviors. Our proposed framework enables skill sharing between the two kinds of behaviors, and by leveraging the idea of predictive coding, it enables an agent to seamlessly generalize from habitual to goal-directed behavior without requiring additional training. The proposed framework suggests a fresh perspective for cognitive science and embodied AI, highlighting the potential for greater integration between habitual and goal-directed behaviors.

arXiv.org e-Print Archive

FwdLLM: Efficient FedLLM using Forward Gradient

Author: Cai Dongqi
Li Xiang
Wang Shangguang
Wu Yaozong
Xu Mengwei
Publication date: 20/01/2024

Large Language Models (LLMs) are transforming the landscape of mobile intelligence. Federated Learning (FL), a method to preserve user data privacy, is often employed in fine-tuning LLMs to downstream mobile tasks, an approach known as FedLLM. Though recent efforts have addressed the network issue induced by the vast model size, they have not practically mitigated vital challenges concerning integration with mobile devices, such as significant memory consumption and sluggish model convergence. In response to these challenges, this work introduces FwdLLM, an innovative FL protocol designed to enhance FedLLM efficiency. The key idea of FwdLLM is to employ backpropagation (BP)-free training methods, requiring devices only to execute "perturbed inferences". Consequently, FwdLLM delivers much better memory efficiency and time efficiency (expedited by mobile NPUs and an expanded array of participant devices). FwdLLM centers around three key designs: (1) it combines BP-free training with parameter-efficient training methods, an essential way to scale the approach to the LLM era; (2) it systematically and adaptively allocates computational loads across devices, striking a careful balance between convergence speed and accuracy; (3) it discriminatively samples perturbed predictions that are more valuable to model convergence. Comprehensive experiments with five LLMs and three NLP tasks illustrate FwdLLM's significant advantages over conventional methods, including up to three orders of magnitude faster convergence and a 14.6x reduction in memory footprint. Uniquely, FwdLLM paves the way for federated learning of billion-parameter LLMs such as LLaMA on COTS mobile devices, a feat previously unattained.

Comment: under review

arXiv.org e-Print Archive

Magneto-optical and photoemission studies of ultrathin wedges

Author: Bader S. D.
Li Dongqi
Publication venue: Argonne National Laboratory
Publication date: 01/12/1995

Magnetic phase transitions of Fe wedges grown epitaxially on Cu(100) are detected via the surface magneto-optical Kerr effect and used to construct a phase diagram for face-centered Fe. Also, the confinement of Cu sp- and d-quantum-well states is studied for Cu/Co(wedge)/Cu(100) utilizing undulator-based photoemission experiments.

UNT Digital Library

AdS/BCFT and Island for curvature-squared gravity

Author: Hu Qi-Lin
Li Dongqi
Miao Rong-Xin
Zeng Yu-Qian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 18/08/2022

In this paper, we investigate AdS/BCFT for curvature-squared gravity. To warm up, we start with Gauss-Bonnet gravity. We derive the one point function of stress tensor and show that the central charge related to the norm of displacement operator is positive for the couplings obeying causality constraints. Furthermore, by imposing the null energy condition on the end-of-the-world brane, we prove the holographic g-theorem for Gauss-Bonnet gravity. This corrects a wrong point of view in the literature, which claims that the holographic g-theorem is violated for Gauss-Bonnet gravity. As a by-product, we obtain the boundary entropy and A-type boundary central charges in general dimensions. We also study AdS/BCFT for general curvature-squared gravity. We find that it is too restrictive for the shape of the brane and the dual BCFT is trivial if one imposes Neumann boundary conditions for all of the gravitational modes. Instead, we propose to impose Dirichlet boundary condition for the massive graviton, while imposing Neumann boundary condition for the massless graviton. In this way, we obtain non-trivial shape dependence of stress tensor and well-defined central charges. In particular, the holographic g-theorem is satisfied by general curvature-squared gravity. Finally, we discuss the island and show that the Page curve can be recovered for Gauss-Bonnet gravity. Interestingly, there are zeroth-order phase transitions for the Page curve within one range of couplings obeying causality constraints. 
Generalizing the discussions to holographic entanglement entropy and holographic complexity in AdS/CFT, we get new constraints for the Gauss-Bonnet coupling, which are stronger than the causality constraint.

Comment: 49 pages, 29 figures, revision accepted for publication in JHEP, main improvements: prove that our g-function can recover the universal term of boundary entropy in general dimensions; add a toy model to explain the novel zeroth-order phase transition of the Page curve analytically

arXiv.org e-Print Archive

Spin polarization of the conduction bands and secondary electrons of Gd(0001)

Author: Bader S.D.
Li Dongqi
Pearson J.
Publication venue: Argonne National Laboratory
Publication date: 31/12/1995

Angle- and spin-resolved photoemission was utilized to investigate the 5d bulk bands and the surface state of Gd(0001) in the temperature range of 130-350 K. The bulk bands at 1-2 eV below the Fermi energy E_F show Stoner-like behavior, while the temperature dependence of the surface state near E_F indicates spin-mixing behavior due to fluctuating local 5d moments. The secondary electron spectra of the Gd surfaces both before and after initial oxygen adsorption show a polarization dip at low kinetic energies due to the extra scattering channel for minority electrons via the unoccupied 4f level. The temperature dependences of the surface and bulk magnetization are separated using the spin polarization of the surface state and the bulk exchange splitting.

UNT Digital Library

Superparamagnetic behavior of ultrathin Fe films grown on Al₂O₃(0001) substrates

Author: Bader S. D.
Endo Yasushi
Li Dongqi
Shiratsuchi Yu
Yamamoto Masahiko
Publication venue: 'AIP Publishing'

The superparamagnetic behavior of ultrathin Fe films at various growth temperatures was studied. The films were grown on an Al₂O₃(0001) substrate by molecular beam epitaxy (MBE). The blocking temperature was strongly dependent on the growth temperature and the 1-nm-thick Fe films were in the superparamagnetic state. The results show that for growth at 673 and 773 K, Fe forms large particles and the magnetic properties are dominated by the individual particles.

Yu Shiratsuchi, Masahiko Yamamoto, Yasushi Endo, Dongqi Li, and S. D. Bader, Journal of Applied Physics 94, 7675 (2003); https://doi.org/10.1063/1.1628408

Osaka University Knowledge Archive

Magnetic phase transition and anisotropy of ultrathin Fe films grown on inclined Al₂O₃(0001) substrates

Author: Bader S. D.
Endo Yasushi
Li Dongqi
Shiratsuchi Yu
Yamamoto Masahiko
Publication venue: 'AIP Publishing'

We investigated the magnetic properties of ultrathin Fe films grown on inclined Al₂O₃(0001) substrates at various growth temperatures. We report the evolution of the magnetism with Fe thickness tFe and growth temperature, and the effect of the inclination of the substrate orientation on the magnetic anisotropy. The films are superparamagnetic (tFe≈5 monolayer, ML), ferromagnetic (tFe>15 ML), or coexistent (tFe≈10 ML). The effect of inclination of the substrate is small in the superparamagnetic region and substantial in the ferromagnetic region. Fe thin films grown on the inclined substrate have a uniaxial magnetic anisotropy with the magnetic easy axis parallel to the step edge. This uniaxial magnetic anisotropy might be derived from the effective demagnetizing field due to the magnetic charge distribution at the corrugated surface. The strength of the uniaxial magnetic anisotropy decreases as the growth temperature increases. The dependence of the uniaxial magnetic anisotropy on growth temperature is caused by the change of growth mechanism, from smooth to rough, with increasing growth temperature.

Yu Shiratsuchi, Yasushi Endo, Masahiko Yamamoto, Dongqi Li, and S. D. Bader, Journal of Applied Physics 95, 6897 (2004); https://doi.org/10.1063/1.1667432

Osaka University Knowledge Archive

Probing the metal-nonmetal transition in thin metal overlayers using resonant photoemission

Author: Dottl L.
Dowben Peter A.
LaGraffe D.
Li Dongqi
Onellion M.
Vidali G.
Zhang L.
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 01/05/1991

We have studied one and two monolayers of barium on Ni(111) and of mercury on Cu(100). Using resonant photoemission, we have found that core-excited electrons become delocalized with increasing barium coverage. Similarly, upon formation of the mercury bilayer (as determined by low-energy electron diffraction and by atom-beam scattering), there is a substantial increase in the screening of the photohole. A transition of the electronic structure akin to a metal-nonmetal (metal-insulator) transition is apparent in these final-state effects. The band structure for Hg is similar to the band structure expected for a free-standing film with a free-electron sd band. The delocalization of the core-excited electrons resembles the exciton unbinding that occurs at the metal-nonmetal Mott transition.

DigitalCommons@University of Nebraska