Search CORE

68 research outputs found

ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting

Author: Pan Rui
Pan Xingyuan
Pi Renjie
Wang Xiaoyu
Zhang Jipeng
Zhang Tong
Publication date: 28/06/2024

Bilevel optimization has shown its utility across various machine learning settings, yet most algorithms in practice require second-order information, making it challenging to scale them up. Only recently has a paradigm of first-order algorithms emerged that can effectively address bilevel optimization problems. Nevertheless, the practical efficiency of this paradigm remains unverified, particularly in the context of large language models (LLMs). This paper introduces the first scalable instantiation of this paradigm, called ScaleBiO, focusing on bilevel optimization for large-scale LLM data reweighting. Combined with LISA, a recently proposed memory-efficient training technique, our novel algorithm allows the paradigm to scale to 34-billion-parameter LLMs on eight A40 GPUs, marking the first successful application of bilevel optimization under practical scenarios for large-sized LLMs. Empirically, extensive experiments on data reweighting verify the effectiveness of ScaleBiO for models of different scales, including GPT-2, LLaMA-3-8B, GPT-NeoX-20B, and Yi-34B, where bilevel optimization succeeds in filtering irrelevant data samples and selecting informative samples. Theoretically, ScaleBiO ensures the optimality of the learned data weights, along with a convergence guarantee matching the conventional first-order bilevel optimization paradigm on smooth and strongly convex objectives.
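
For orientation, data reweighting is often written as a bilevel problem of the following generic shape. This is a sketch under stated assumptions, not the formulation from the paper: the abstract does not give one, and the symbols here (source weights w, a softmax parameterization σ, per-source training losses L_i, a validation loss L_val, and m data sources) are illustrative choices.

```latex
% Generic bilevel data reweighting (illustrative; all symbols are assumptions)
\min_{w}\; L_{\mathrm{val}}\bigl(\theta^{*}(w)\bigr)
\quad \text{s.t.} \quad
\theta^{*}(w) \in \arg\min_{\theta} \sum_{i=1}^{m} \sigma_i(w)\, L_i(\theta),
\qquad
\sigma_i(w) = \frac{e^{w_i}}{\sum_{j=1}^{m} e^{w_j}}
```

The outer problem picks the weights, and the inner problem trains the model on the reweighted data; a source whose weight is driven toward zero is effectively filtered out.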

arXiv.org e-Print Archive

DYNAFED: Tackling Client Data Heterogeneity with Global Dynamics

Author: Chen Qifeng
Gao Jiahui
Kim Sunghun
Pi Renjie
Wang Xiaoyu
Xie Yueqi
Zhang Weizhong
Publication date: 20/11/2022

The Federated Learning (FL) paradigm is known to face challenges under heterogeneous client data. Local training on non-iid distributed data results in deflected local optima, which causes the client models to drift further away from each other and degrades the aggregated global model's performance. A natural solution is to gather all client data onto the server, such that the server has a global view of the entire data distribution. Unfortunately, this reduces to regular training, which compromises clients' privacy and conflicts with the purpose of FL. In this paper, we put forth an idea to collect and leverage global knowledge on the server without hindering data privacy. We unearth such knowledge from the dynamics of the global model's trajectory. Specifically, we first reserve a short trajectory of global model snapshots on the server. Then, we synthesize a small pseudo dataset such that the model trained on it mimics the dynamics of the reserved global model trajectory. Afterward, the synthesized data is used to help aggregate the deflected clients into the global model. We name our method Dynafed, which enjoys the following advantages: 1) we do not rely on any external on-server dataset, which requires no additional cost for data collection; 2) the pseudo data can be synthesized in early communication rounds, which enables Dynafed to take effect early for boosting the convergence and stabilizing training; 3) the pseudo data only needs to be synthesized once and can be directly utilized on the server to help aggregation in subsequent rounds. Experiments across extensive benchmarks are conducted to showcase the effectiveness of Dynafed. We also provide insights and understanding of the underlying mechanism of our method.

arXiv.org e-Print Archive

Li2NiO2F a new oxyfluoride disordered rocksalt cathode material

Author: Bruce Peter G.
Gong Chen
House Robert A.
Marie John-Joseph
Pi Liquan
Pu Shengda
Rees Gregory J.
Robertson Alex W.
Xu Xiaoyu
Publication venue: 'The Electrochemical Society'
Publication date: 01/01/2021

Lithium-rich disordered rocksalts such as Li1.3Nb0.3Mn0.4O2 and Li2MnO2F are being investigated as high energy density cathodes for next generation Li-ion batteries. They can support the (de)lithiation of lithium ions over large compositional ranges while preserving the same overall structure. Here, we present a new Ni-rich oxyfluoride cathode, Li2NiO2F, with a disordered rocksalt structure. Li2NiO2F can deliver a discharge capacity of 200 mAh g−1 at an average voltage of 3.2 V.
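
As a quick worked check of what those two figures imply, specific energy is capacity times average voltage. The sketch below assumes the capacity is quoted per gram of active cathode material (the abstract does not say otherwise); the function name is hypothetical.

```python
def specific_energy_wh_per_kg(capacity_mah_per_g: float, avg_voltage_v: float) -> float:
    """Return specific energy in Wh/kg.

    mAh/g multiplied by V gives mWh/g, which is numerically equal to Wh/kg,
    so no further unit conversion factor is needed.
    """
    return capacity_mah_per_g * avg_voltage_v


# Figures quoted in the abstract: 200 mAh/g at an average of 3.2 V.
energy = specific_energy_wh_per_kg(200.0, 3.2)
print(f"{energy:.0f} Wh/kg")
```

This gives about 640 Wh/kg at the material level; a full cell would be lower once the anode, electrolyte and packaging mass are counted.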

Warwick Research Archives Portal Repository

Oxford University Research Archive

Corrigendum to: The TianQin project: current progress on science and technology

Author: Bai Yan-Zheng
Bao Jiahui
Barausse Enrico
Cai Lin
Canuto Enrico
Cao Bin
Chen Wei-Ming
Chen Yu
Ding Yan-Wei
Duan Hui-Zong
Fan Huimin
Feng Wen-Fan
Fu Honglin
Gao Qing
Gao TianQuan
Gong Yungui
Gou Xingyu
Gu Chao-Zheng
Gu De-Feng
He Zi-Qi
Hendry Martin
Hong Wei
Hu Xin-Chun
Hu Yi-Ming
Hu Yuexin
Huang Shun-Jia
Huang Xiang-Qing
Jiang Qinghua
Jiang Yuan-Ze
Jiang Yun
Jiang Zhen
Jin Hong-Ming
Korol Valeriya
Li Hong-Yin
Li Ming
Li Ming
Li Pengcheng
Li Rongwang
Li Yuqiang
Li Zhu
Li Zhu-Xi
Li Zhulian
Liang Yu-Rong
Liang Zheng-Cheng
Liao Fang-Jie
Liu Li
Liu Pei-Bo
Liu Qi
Liu Shuai
Liu Xuhui
Liu Yan-Chong
Liu Yuan
Lu Xiong-Fei
Lu Yang
Lu Ze-Huang
Luo Jun
Luo Yan
Luo Zhi-Cai
Mei Jianwei
Milyukov Vadim
Ming Min
Pi Xiaoyu
Qin Chenggang
Qu Shao-Bo
Sesana Alberto
Shao Chenggang
Shi Changfu
Su Wei
Tan Ding-Yin
Tan Yujie
Tan Zhuangbin
Tu Liang-Cheng
Wang Bin
Wang Cheng-Rui
Wang Fengbin
Wang Guan-Fang
Wang Haitian
Wang Jian
Wang Lijiao
Wang Panpan
Wang Xudong
Wang Yan
Wang Yi-Fan
Wei Ran
Wu Shu-Chao
Xiao Chun-Yu
Xu Xiao-Shi
Xue Chao
Yang Fang-Chao
Yang Liang
Yang Ming-Lin
Yang Shan-Qing
Ye Bobing
Yeh Hsien-Chi
Yu Shenghua
Zhai Dongsheng
Zhang Caishi
Zhang Haitao
Zhang Jian-dong
Zhang Jie
Zhang Lihua
Zhang Xin
Zhang Xuefeng
Zhou Hao
Zhou Ming-Yue
Zhou Ze-Bing
Zhu Dong-Dong
Zi Tie-Guang
Publication venue: Oxford : Oxford Univ. Press
Publication date: 01/01/2021

In the originally published version, this manuscript included an error in how the corresponding author was indicated in the author list. This has now been corrected online to reflect the fact that author Jun Luo is the corresponding author of the article.

Institutional Repository of Leibniz Universität Hannover

High-strength, highly conductive and woven organic hydrogel fibers for flexible electronics

Author: Menghan Pi
Rong Ran
Xiangdong Wang
Xiaoyu Wang
Publication venue: Elsevier BV
Publication date: 01/01/2022

Crossref

Sustainable MXene/PDA hydrogel with core-shell structure tailored for highly efficient solar evaporation and long-term desalination

Author: Menghan Pi
Rong Ran
Xiaoyu Wang
Zhisen Wang
Publication venue: Elsevier BV
Publication date: 01/09/2021

Crossref

Effective Bilevel Optimization via Minimax Reformulation

Author: Pan Rui
Pi Renjie
Wang Xiaoyu
Zhang Tong
Publication date: 19/11/2023

Bilevel optimization has found successful applications in various machine learning problems, including hyper-parameter optimization, data cleaning, and meta-learning. However, its huge computational cost presents a significant challenge for its utilization in large-scale problems. This challenge arises due to the nested structure of the bilevel formulation, where each hyper-gradient computation necessitates a costly inner optimization procedure. To address this issue, we propose a reformulation of bilevel optimization as a minimax problem, effectively decoupling the outer-inner dependency. Under mild conditions, we show these two problems are equivalent. Furthermore, we introduce a multi-stage gradient descent and ascent (GDA) algorithm to solve the resulting minimax problem with convergence guarantees. Extensive experimental results demonstrate that our method outperforms state-of-the-art bilevel methods while significantly reducing the computational cost.
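
To make the idea of replacing a nested problem with a minimax one concrete, here is a minimal sketch on a toy problem. Everything in it is an assumption for illustration: the abstract does not state the exact reformulation, so the code uses a penalty-style surrogate, min over (x, y) and max over z of f(x, y) + alpha * (g(x, y) - g(x, z)), and plain simultaneous GDA rather than the multi-stage algorithm the paper describes. The toy losses, `alpha`, the step size and the function name are all hypothetical.

```python
def solve_toy_bilevel(alpha: float = 100.0, lr: float = 0.002, steps: int = 10_000):
    """Simultaneous gradient descent-ascent on a toy penalty-style minimax problem.

    Toy bilevel problem (hypothetical, for illustration only):
        outer loss: f(x, y) = (y - 1)^2 + x^2
        inner loss: g(x, y) = (y - x)^2, so the inner solution is y*(x) = x
    Substituting y = x gives (x - 1)^2 + x^2, minimized at x = y = 0.5.

    Minimax surrogate solved here:
        min_{x, y} max_z  f(x, y) + alpha * (g(x, y) - g(x, z))
    """
    x, y, z = 0.0, 0.0, 0.0
    for _ in range(steps):
        # Partial derivatives of the surrogate objective.
        grad_x = 2 * x - 2 * alpha * (y - x) + 2 * alpha * (z - x)
        grad_y = 2 * (y - 1) + 2 * alpha * (y - x)
        grad_z = -2 * alpha * (z - x)
        # Descend in (x, y), ascend in z, all in one simultaneous step.
        x, y, z = x - lr * grad_x, y - lr * grad_y, z + lr * grad_z
    return x, y, z


x, y, z = solve_toy_bilevel()
print(f"x={x:.4f}, y={y:.4f}, z={z:.4f}")
```

No inner optimization loop is run per outer step: the auxiliary variable z tracks the inner minimizer while x and y are updated. For this toy problem the stationary point is x = alpha / (1 + 2 * alpha), which approaches the bilevel solution 0.5 as alpha grows.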

arXiv.org e-Print Archive

Vitamin D alleviates hypoxia/reoxygenation-induced injury of human trophoblast HTR-8 cells by activating autophagy

Author: Huifeng Zhang
Jing Ma
Xianghua Huang
Xiaoyu Tian
Yalei Pi
Publication venue: Elsevier BV
Publication date: 01/08/2021

Crossref

Effect of Glycerol on an N-Vinylpyrrolidone-Based Photopolymer for Transmission Holography

Author: Haining Chen
Huishi Pi
Weiping Li
Xiaoyu Jiang
Zhiwei Shi
Publication venue: MDPI AG
Publication date: 27/05/2021

N-vinylpyrrolidone (NVP) has a large molecular structure, so it diffuses with difficulty during holographic recording, especially at low spatial frequencies. We used glycerol to promote the diffusion of NVP, and successfully improved the holographic performance of the photopolymer at low spatial frequencies. As the concentration of glycerol increases, the holographic performance first increases and then remains stable. The optimal concentration of glycerol is 0.21 mol/L. At this concentration, the maximum diffraction efficiency of the photopolymer is 84%, the refractive index modulation is 1.95 × 10⁻³, and the photosensitivity is 7.91 × 10⁻⁴ cm²/mJ. Compared with the control group, the maximum diffraction efficiency, maximum refractive index modulation and photosensitivity at low spatial frequencies (800 lp/mm) have increased by 11.19 times, 4.69 times and 1.71 times, respectively. Using the optimized photopolymer for transmission holographic recording and reproduction, we have obtained a clear and bright transmission hologram. The photopolymer modified with glycerol is expected to be applied to the fields of holography, diffractive optics, and so on.

Crossref

Effect of continuous equal channel angular pressing on microstructure and properties of Al-Ti-C alloy

Author: LI Yinglong
PI Zongli
SHAO Qi
WU Xiaoyu
ZHANG Ling
Publication venue: Journal of Materials Engineering
Publication date: 01/03/2022

The Al-Ti-C alloy was extruded continuously in multiple passes by the continuous equal channel angular pressing process. Through observation of the microstructure evolution, the mechanism of grain refinement and changes in mechanical properties were discussed. The results show that the continuous equal channel angular pressing process can effectively refine the microstructure of Al-Ti-C alloy, and the grain size is reduced to about 1 μm. Deformation-induced refinement is the most important grain refinement mechanism in the deformation process. The accumulation of high density dislocations causes cracks at the interface between the Al matrix and TiAl3 and voids inside the TiAl3. The cracks further propagate through the entire TiAl3 particles, ultimately leading to the refinement of the second phase TiAl3 structure. At the same time, the pinning mechanism and shearing mechanism of the fine second phase TiAl3 structure promote the refinement of the Al matrix. After one pass of continuous equal channel angular pressing, the hardness of the alloy increases most obviously, which is 59.2% higher than that of the original state. With the increase of the number of extrusion passes, the increasing trend of hardness slows down, the plasticity of the alloy decreases, and toughness increases.

Directory of Open Access Journals