229 research outputs found

    Iterative Forward Tuning Boosts In-context Learning in Language Models

    Large language models (LLMs) have exhibited an emergent in-context learning (ICL) ability. However, ICL models that can solve ordinary cases are hard to extend to more complex tasks when they process the demonstration examples only once. This single-turn ICL is at odds with the human decision-making process of learning from analogy. In this paper, we propose an effective and efficient two-stage framework that boosts ICL in LLMs by exploiting a dual form between Transformer attention and gradient descent-based optimization. Concretely, we divide the ICL process into a "Deep-Thinking" stage and an inference stage. The "Deep-Thinking" stage performs iterative forward optimization of the demonstrations, which is expected to boost the reasoning abilities of LLMs at test time by "thinking over" the demonstrations multiple times. It produces accumulated meta-gradients by manipulating the Key-Value matrices in the self-attention modules of the Transformer. The inference stage then takes only the test query as input, without concatenating demonstrations, and applies the learned meta-gradients through attention for output prediction. In this way, demonstrations are not required at inference time, since they have already been learned and stored in the final meta-gradients, so LLMs can be adapted to downstream tasks both effectively and efficiently. Extensive experiments on ten classification and multiple-choice datasets show that our method achieves substantially better performance than standard ICL in terms of both accuracy and efficiency. Comment: 14 pages, 5 figures
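    The dual form the abstract invokes can be illustrated with linear attention: attending over demonstration key/value pairs acts on a query exactly like an implicit weight update ("meta-gradient"). The sketch below is a toy stand-in under that assumption, not the authors' code; all names and the step size are illustrative.

```python
import numpy as np

# Hedged sketch (not the paper's implementation): in linear attention,
# V.T @ (K @ q) == (V.T @ K) @ q, so V.T @ K plays the role of an implicit
# weight update dW -- the "meta-gradient" accumulated from demonstrations.
rng = np.random.default_rng(0)
d = 8
W0 = rng.standard_normal((d, d))       # a pretrained linear map (assumed)
K_demo = rng.standard_normal((4, d))   # keys of 4 demonstration tokens
V_demo = rng.standard_normal((4, d))   # values of the same tokens

def meta_gradient(K, V):
    # Dual form of linear attention over the demonstrations.
    return V.T @ K

# "Deep-Thinking": several forward passes over the demonstrations, each
# accumulating the meta-gradient (a toy analogue of iterative KV updates).
eta = 0.1                              # assumed step size
dW = np.zeros((d, d))
for _ in range(3):
    dW += eta * meta_gradient(K_demo, V_demo)

# Inference: only the test query is needed; the demonstrations live in dW.
q = rng.standard_normal(d)
out = (W0 + dW) @ q
```

Once the accumulated `dW` is stored, the demonstrations never need to be re-encoded at inference, which is the source of the efficiency gain the abstract reports.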

    Matching-based Data Valuation for Generative Model

    Data valuation is critical in machine learning, as it helps enhance model transparency and protect data properties. Existing data valuation methods have primarily focused on discriminative models, neglecting the deep generative models that have recently gained considerable attention. As with discriminative models, there is an urgent need to assess data contributions in deep generative models. However, previous data valuation approaches mainly relied on discriminative performance metrics and required model retraining; consequently, they cannot be applied directly and efficiently in practice to recent deep generative models such as generative adversarial networks and diffusion models. To bridge this gap, we formulate the data valuation problem in generative models from a similarity-matching perspective. Specifically, we introduce Generative Model Valuator (GMValuator), the first model-agnostic approach applicable to any generative model, designed to provide data valuation for generation tasks. We have conducted extensive experiments to demonstrate the effectiveness of the proposed method. To the best of our knowledge, GMValuator is the first work to offer a training-free, post-hoc data valuation strategy for deep generative models.
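    A similarity-matching valuation of the kind the abstract describes can be sketched in a few lines: each training datum is credited whenever it is the closest match to a generated sample, with no retraining involved. The scoring rule and names below are illustrative assumptions, not GMValuator's actual procedure.

```python
import numpy as np

# Hedged sketch of training-free, post-hoc valuation by similarity matching
# (illustrative, not GMValuator's API): credit each training point for the
# generated samples it matches most closely.
rng = np.random.default_rng(1)
train = rng.standard_normal((50, 16))       # training-data embeddings (assumed)
generated = rng.standard_normal((200, 16))  # generated-sample embeddings

def match_values(train, generated):
    # Pairwise squared distances generated -> train.
    d2 = ((generated[:, None, :] - train[None, :, :]) ** 2).sum(-1)
    nearest = d2.argmin(axis=1)             # best-matching training point
    sims = 1.0 / (1.0 + d2[np.arange(len(generated)), nearest])
    values = np.zeros(len(train))
    np.add.at(values, nearest, sims)        # accumulate credit per datum
    return values / values.sum()            # normalise to a value share

values = match_values(train, generated)
```

Because the valuation reads off embeddings of already-generated samples, it is post-hoc by construction, which is the property the abstract emphasizes.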

    ChatLaw: Open-Source Legal Large Language Model with Integrated External Knowledge Bases

    Large Language Models (LLMs) have shown the potential to revolutionize natural language processing tasks in various domains, sparking great interest in vertical-specific large models. However, unlike proprietary models such as BloombergGPT and FinGPT, which have leveraged their unique data accumulations to make strides in the finance domain, there have not been many similar large language models in the Chinese legal domain to facilitate its digital transformation. In this paper, we propose an open-source legal large language model named ChatLaw. Because data quality is paramount, we carefully designed a legal-domain fine-tuning dataset. Additionally, to overcome the problem of model hallucination in legal data screening during reference-data retrieval, we introduce a method that combines vector database retrieval with keyword retrieval, effectively reducing the inaccuracy of relying solely on vector database retrieval. Furthermore, we propose a self-attention method to enhance the ability of large models to overcome errors present in reference data, further mitigating model hallucination at the model level and improving the problem-solving capabilities of large models. We also open-source our model and part of the data at https://github.com/PKU-YuanGroup/ChatLaw
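    The hybrid retrieval idea — blending vector similarity with keyword matching so that lexically exact statute hits are not drowned out by embedding noise — can be sketched generically. The blend weight, documents, and stand-in embeddings below are assumptions for illustration, not ChatLaw's actual pipeline.

```python
import numpy as np

# Hedged sketch of hybrid retrieval (generic vector + keyword blend,
# not ChatLaw's exact method).
docs = [
    "labor contract termination compensation rules",
    "criminal liability for traffic accidents",
    "intellectual property licensing dispute",
]
query = "contract termination compensation"

rng = np.random.default_rng(2)
doc_vecs = rng.standard_normal((len(docs), 8))      # stand-in embeddings
q_vec = doc_vecs[0] + 0.1 * rng.standard_normal(8)  # near doc 0 by construction

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def keyword_score(query, doc):
    # Fraction of query terms that appear verbatim in the document.
    q, d = set(query.split()), set(doc.split())
    return len(q & d) / len(q)

alpha = 0.5  # assumed blend weight between vector and keyword scores
scores = [alpha * cosine(q_vec, v) + (1 - alpha) * keyword_score(query, d)
          for d, v in zip(docs, doc_vecs)]
best = int(np.argmax(scores))
```

The keyword term anchors retrieval on exact legal terminology, which is one plausible way a blend like this reduces the hallucinations that purely vector-based retrieval can induce.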

    Research and Application on Spark Clustering Algorithm in Campus Big Data Analysis

    Big data analysis has penetrated all fields of society and brought about profound changes. However, there is relatively little research on using big data to support student management in colleges and universities. Taking student campus-card records as the research sample and scholarship evaluation as an example, this work applies Spark big data mining technology and the K-Means clustering algorithm to analyze the data. The analysis covers students' daily behavior from multiple dimensions and can prevent unreasonable scholarship evaluation caused by unfair factors such as plagiarism or soliciting votes from teachers and students. At the same time, students' absenteeism, physical health, and psychological status can be predicted in advance, making student management work more proactive, accurate, and effective.
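    The clustering step can be illustrated with a toy K-Means run over card-usage features. This is plain NumPy rather than Spark's MLlib KMeans, and the feature names (daily canteen spend, library check-ins) and synthetic data are assumptions for illustration only.

```python
import numpy as np

# Hedged toy version of the clustering step (NumPy stand-in for Spark MLlib
# KMeans; features and data are illustrative): group students by card usage.
rng = np.random.default_rng(3)
frugal = rng.normal([10.0, 2.0], 1.0, size=(30, 2))  # low spend, more visits
lavish = rng.normal([40.0, 0.5], 1.0, size=(30, 2))  # high spend, few visits
X = np.vstack([frugal, lavish])  # columns: daily spend, library check-ins

def kmeans(X, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each student to the nearest centroid.
        labels = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        # Recompute centroids, keeping the old one if a cluster empties.
        centers = np.array([X[labels == j].mean(0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels, centers

labels, centers = kmeans(X, k=2)
```

In the paper's setting, the resulting behavior clusters, rather than votes or self-reports, would feed the scholarship-evaluation signal.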

    Effective thermal conductivity of wire-woven bulk Kagome sandwich panels

    Thermal transport in a highly porous metallic wire-woven bulk Kagome (WBK) is numerically and analytically modeled. Based on topology similarity and upon introducing an elongation parameter in thermal tortuosity, an idealized Kagome with non-twisted struts is employed. Special focus is placed upon quantifying the effect of topological anisotropy of WBK upon its effective conductivity. It is demonstrated that the effective conductivity reduces linearly as the porosity increases, and the extent of the reduction is significantly dependent on the orientation of WBK. The governing physical mechanism of anisotropic thermal transport in WBK is found to be the anisotropic thermal tortuosity caused by the intrinsic anisotropic topology of WBK.
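    The two reported trends — linear decrease with porosity and orientation dependence through tortuosity — are consistent with a relation of the following form. The symbols are assumed for illustration and are not taken from the paper:

```latex
% Hedged sketch of an effective-conductivity relation consistent with the
% abstract: k_s = solid-phase conductivity, \varepsilon = porosity,
% \tau(\theta) = orientation-dependent thermal tortuosity.
k_{\mathrm{eff}}(\theta) = \frac{k_s \,(1 - \varepsilon)}{\tau(\theta)}
```

Here the factor $(1-\varepsilon)$ gives the linear reduction with porosity, while the anisotropy enters solely through $\tau(\theta)$, matching the stated governing mechanism.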

    Federated Learning Incentive Mechanism under Buyers' Auction Market

    Auction-based Federated Learning (AFL) enables open collaboration among self-interested data consumers and data owners. Existing AFL approaches commonly assume a sellers' market, in which the service clients, as sellers, are treated as scarce resources, so the aggregation servers, as buyers, must compete with their bids. Yet as the technology progresses, an increasing number of qualified clients are now capable of performing federated learning tasks, leading to a shift from a sellers' market to a buyers' market. In this paper, we change the perspective by adapting the procurement auction framework, aiming to explain pricing behavior under a buyers' market. Our modeling starts with a basic setting under complete information, then moves to the scenario where the sellers' information is not fully observable. To select clients with high reliability and data quality, and to guard against external attacks, we utilize a blockchain-based reputation mechanism. The experimental results validate the effectiveness of our approach.
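    A procurement (reverse) auction with reputation weighting can be sketched as the buyer scoring each seller's ask against its tracked reputation. The scoring rule, seller data, and procurement count below are illustrative assumptions, not the paper's exact mechanism.

```python
# Hedged sketch of a reputation-weighted procurement auction in a buyers'
# market (illustrative scoring rule, not the paper's mechanism).
sellers = [
    # (client id, asking price, blockchain-tracked reputation in (0, 1])
    ("A", 10.0, 0.9),
    ("B", 7.0, 0.4),   # cheapest ask, but unreliable
    ("C", 9.0, 0.8),
]

def score(price, reputation):
    # Lower is better: effective cost per unit of expected reliability.
    return price / reputation

budget_k = 2  # the buyer procures the two best-scoring clients (assumed)
selected = sorted(sellers, key=lambda s: score(s[1], s[2]))[:budget_k]
winners = [s[0] for s in selected]
# Client B is excluded despite the lowest ask: its reputation-adjusted
# cost (7.0 / 0.4 = 17.5) exceeds A's (~11.1) and C's (11.25).
```

Dividing price by reputation is one simple way to encode the paper's goal of selecting reliable, high-quality clients rather than merely the cheapest bidders.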

    A matter of time: Using dynamics and theory to uncover mechanisms of transcriptional bursting

    Eukaryotic transcription generally occurs in bursts of activity lasting minutes to hours; however, state-of-the-art measurements have revealed that many of the molecular processes that underlie bursting, such as transcription factor binding to DNA, unfold on timescales of seconds. This temporal disconnect lies at the heart of a broader challenge in physical biology of predicting transcriptional outcomes and cellular decision-making from the dynamics of underlying molecular processes. Here, we review how new dynamical information about the processes underlying transcriptional control can be combined with theoretical models that predict not only averaged transcriptional dynamics, but also their variability, to formulate testable hypotheses about the molecular mechanisms underlying transcriptional bursting and control. Comment: 41 pages, 4 figures, review article
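    The class of theoretical model the review describes — predicting both the mean and the variability of transcription from promoter switching rates — is typified by the two-state ("telegraph") model. The rates below are illustrative, and the Fano-factor expression is the standard steady-state result for this model, hedged here as a sketch rather than a formula from the review itself.

```python
# Hedged sketch: the two-state ("telegraph") model connects molecular
# switching rates to bursting statistics (rate values are illustrative).
def telegraph_stats(k_on, k_off, r, gamma):
    """Steady-state mean and Fano factor of mRNA copy number.

    k_on, k_off: promoter ON/OFF switching rates, r: transcription rate
    while ON, gamma: mRNA degradation rate (all in 1/min).
    """
    p_on = k_on / (k_on + k_off)        # fraction of time the promoter is ON
    mean = r * p_on / gamma
    # Super-Poissonian noise generated by slow promoter switching:
    fano = 1.0 + r * k_off / ((k_on + k_off) * (k_on + k_off + gamma))
    return mean, fano

# Slow, mostly-OFF promoter: strongly bursty (Fano factor well above 1).
mean, fano = telegraph_stats(k_on=0.1, k_off=0.9, r=10.0, gamma=0.5)
```

The Fano factor's excess over 1 is exactly the kind of variability signature that, per the review, lets dynamical measurements discriminate between candidate bursting mechanisms.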