26 research outputs found

Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning

Author: Lin Junyang
Lu Keming
Su Qi
Wu Shengguang
Xu Benfeng
Zhou Chang
Publication date: 14/11/2023

Enhancing the instruction-following ability of Large Language Models (LLMs) primarily demands substantial instruction-tuning datasets. However, the sheer volume of these imposes a considerable computational burden and annotation cost. To investigate a label-efficient instruction tuning method that allows the model itself to actively sample subsets that are equally or even more effective, we introduce a self-evolving mechanism DiverseEvol. In this process, a model iteratively augments its training subset to refine its own performance, without requiring any intervention from humans or more advanced LLMs. The key to our data sampling technique lies in the enhancement of diversity in the chosen subsets, as the model selects new data points most distinct from any existing ones according to its current embedding space. Extensive experiments across three datasets and benchmarks demonstrate the effectiveness of DiverseEvol. Our models, trained on less than 8% of the original dataset, maintain or improve performance compared with finetuning on full data. We also provide empirical evidence to analyze the importance of diversity in instruction data and the iterative scheme as opposed to one-time sampling. Our code is publicly available at https://github.com/OFA-Sys/DiverseEvol.git

arXiv.org e-Print Archive

Qwen Technical Report

Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans. In this work, we introduce Qwen, the first installment of our large language model series. Qwen is a comprehensive language model series that encompasses distinct models with varying parameter counts. It includes Qwen, the base pretrained language models, and Qwen-Chat, the chat models finetuned with human alignment techniques. The base language models consistently demonstrate superior performance across a multitude of downstream tasks, and the chat models, particularly those trained using Reinforcement Learning from Human Feedback (RLHF), are highly competitive. The chat models possess advanced tool-use and planning capabilities for creating agent applications, showcasing impressive performance even when compared to bigger models on complex tasks like utilizing a code interpreter. Furthermore, we have developed coding-specialized models, Code-Qwen and Code-Qwen-Chat, as well as mathematics-focused models, Math-Qwen-Chat, which are built upon base language models. These models demonstrate significantly improved performance in comparison with open-source models, and fall slightly behind the proprietary models.

Comment: 59 pages, 5 figures

arXiv.org e-Print Archive

HOXA7 plays a critical role in metastasis of liver cancer associated with activation of Snail

Author: A Mitra
B Argiropoulos
B Boyer
B Dave
Biao Lei
BJ Scott
Bo Li
Bo Tang
C Cillo
D Hanahan
D Vergara
DG Grier
E Batlle
EB Lewis
ED Hay
Fang Tang
FD Nunes
GJ Miller
Guangying Qi
H Nordenstedt
H Peinado
Jie Liu
JM Llovet
JP Thiery
K Polyak
M Yilmaz
Qi Huang
R Calvo
R Kalluri
R Krumlauf
Run Zhai
Shengguang Yuan
Shuiping Yu
Songqing He
TT Onder
V Raman
WJ Gehring
Xiaoyu Sun
Xingsi Liang
Xinjin Guo
Y Kishida
Y Li
Y Wu
Y Zhang
Yangchao Wei
YS Ang
Zhenran Wang
Publication venue: Springer Science and Business Media LLC

Crossref

Power and Voltage Control for Single-Phase Cascaded H-Bridge Multilevel Converters under Unbalanced Loads

Author: Daliang Yang
Li Yin
Ning Wu
Shengguang Xu
Publication venue: MDPI AG
Publication date: 01/09/2018

The conventional control method for a single-phase cascaded H-bridge (CHB) multilevel converter is vector (dq) control; however, dq control requires complicated calculations and introduces additional time delays. This paper presents a novel power control strategy for the CHB multilevel converter. A power-based dc-link voltage balance control is also proposed for unbalanced load conditions. The new control method is designed in a virtual αβ stationary reference frame without coordinate transformation or a phase-locked loop (PLL), which avoids the potential issues related to computational complexity. Because only imaginary voltage construction is needed in the proposed control method, the time delay from conventional imaginary current construction can be eliminated. The proposed method can obtain a sinusoidal grid current waveform with unity power factor. Compared with the conventional dq control method, the power control strategy has the advantage of a fast dynamic response. The stability of the closed-loop system with the dc-link voltage balance controller is evaluated. Simulation and experimental results are presented to verify the accuracy of the proposed power and voltage control method.

Directory of Open Access Journals

Vehicular Motion Experiment and Data Retrieval of a Compact Floating Lidar System

Author: Hongwei Zhang
Jiaping Yin
Qichao Wang
Shengguang Qin
Songhua Wu
Tong Cui
Publication venue: EDP Sciences
Publication date: 07/07/2020

Accurate and rapid observation of sea surface wind is important for research on ocean dynamic prediction models, offshore wind resource assessment, and air-sea interaction and flux. A compact floating coherent Doppler lidar system named WindMast 350-M was developed by Ocean University of China (OUC) and Leice Transient Technology Co. LTD (LEICE) to observe sea surface wind profiles. The device is intended for installation on buoy platforms; its first vehicular motion experiment was conducted at the Laoshan campus of Ocean University of China (120.49°E, 36.16°N) on 06 and 12 March 2019. During this experiment, the wind profiles measured by the WindMast 350-M were compared with results from a well-calibrated ground-based coherent Doppler lidar, the WindMast WP350. In this contribution, the system design and specifications of the 350-M are presented in detail, and the preliminary results of the vehicular motion experiment are discussed.

EDP Sciences OAI-PMH repository (1.2.0)

Vehicular Motion Experiment and Data Retrieval of a Compact Floating Lidar System

Author: Cui Tong
Qin Shengguang
Wang Qichao
Wu Songhua
Yin Jiaping
Zhang Hongwei
Publication venue: EDP Sciences
Publication date: 01/01/2020

Directory of Open Access Journals

An ultra-low-power area-efficient non-volatile memory in a 0.18 μm single-poly CMOS process for passive RFID tags

Author: Jia Xiaoyun
Feng Peng
Zhang Shengguang
Wu Nanjian
Zhao Baiqin
Liu Su
Publication date: 01/01/2013

Knowledge Repository of SEMI,CAS

Mobile Multiwavelength Polarization Raman Lidar for Water Vapor, Cloud and Aerosol Measurement

Author: Bingyi Liu
Dengxin Hua
Fei Gao
Guangyao Dai
Kailin Zhang
Shengguang Qin
Songhua Wu
Xiaoquan Song
Publication venue: EDP Sciences
Publication date: 07/06/2016

To detect the water vapor mixing ratio, particle linear depolarization ratio, extinction coefficient, and cloud information, the Water vapor, Cloud and Aerosol Lidar (WVCAL) was developed by the lidar group at Ocean University of China. The lidar consists of a transmitting subsystem, a receiving subsystem, a data acquisition and control subsystem, and an auxiliary subsystem, each of which is described in this paper. To measure these physical properties, three channels (a Raman channel, a polarization channel, and an infrared channel) are integrated in the lidar system. The integration and working principle of these channels are introduced in detail. Finally, a measurement example from the coastal area of Qingdao, Shandong province, during 2014 is provided.

EDP Sciences OAI-PMH repository (1.2.0)

Mobile Multiwavelength Polarization Raman Lidar for Water Vapor, Cloud and Aerosol Measurement

Author: Dai Guangyao
Gao Fei
Hua Dengxin
Liu Bingyi
Qin Shengguang
Song Xiaoquan
Wu Songhua
Zhang Kailin
Publication venue: EDP Sciences
Publication date: 01/01/2016

Directory of Open Access Journals

Phytoextraction of Pb and Cu Contaminated Soil with Maize and Microencapsulated EDTA

Author: Chen Nengchang
Li Fangbai
Liu Chengshuai
Wu Longhua
Xie Zhiyi
Xu Shengguang
Xu Yanling
Zheng Yuji
Publication date: 01/01/2012