26 research outputs found
Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning
Enhancing the instruction-following ability of Large Language Models (LLMs)
primarily demands substantial instruction-tuning datasets. However, the sheer
volume of these imposes a considerable computational burden and annotation
cost. To investigate a label-efficient instruction tuning method that allows
the model itself to actively sample subsets that are equally or even more
effective, we introduce a self-evolving mechanism DiverseEvol. In this process,
a model iteratively augments its training subset to refine its own performance,
without requiring any intervention from humans or more advanced LLMs. The key
to our data sampling technique lies in the enhancement of diversity in the
chosen subsets, as the model selects new data points most distinct from any
existing ones according to its current embedding space. Extensive experiments
across three datasets and benchmarks demonstrate the effectiveness of
DiverseEvol. Our models, trained on less than 8% of the original dataset,
maintain or improve performance compared with finetuning on full data. We also
provide empirical evidence to analyze the importance of diversity in
instruction data and the iterative scheme as opposed to one-time sampling. Our
code is publicly available at https://github.com/OFA-Sys/DiverseEvol.git
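The diversity criterion described above (pick new points most distinct from those already chosen, measured in the model's current embedding space) can be illustrated with greedy farthest-point selection. This is a minimal sketch, not the authors' implementation: the function name `farthest_point_selection`, the Euclidean metric, and the toy embeddings are all assumptions made for illustration. In the iterative scheme the abstract describes, the embeddings would be recomputed by the retrained model before each round.

```python
import numpy as np


def farthest_point_selection(embeddings, selected, k):
    """Greedily pick k new indices, each one maximizing its distance
    to the nearest already-selected point (hypothetical helper)."""
    embeddings = np.asarray(embeddings, dtype=float)
    selected = list(selected)
    if not selected:
        raise ValueError("need at least one seed index")
    if k > len(embeddings) - len(selected):
        raise ValueError("k exceeds the number of unselected points")

    # Distance from every point to its nearest already-selected point.
    diffs = embeddings[:, None, :] - embeddings[selected][None, :, :]
    min_dist = np.linalg.norm(diffs, axis=2).min(axis=1)
    min_dist[selected] = -1.0  # never re-pick a selected point

    new_indices = []
    for _ in range(k):
        idx = int(np.argmax(min_dist))  # most distinct remaining point
        new_indices.append(idx)
        # The new point may now be the nearest selected point for others.
        dist_to_new = np.linalg.norm(embeddings - embeddings[idx], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
        min_dist[idx] = -1.0
    return new_indices


# Toy 2-D "embeddings"; index 0 is the initial training subset.
toy = [[0, 0], [1, 0], [10, 0], [0.5, 0], [5, 0]]
picked = farthest_point_selection(toy, selected=[0], k=2)
print(picked)
```

With the toy data, the first pick is the point farthest from the seed and the second is the point farthest from both, which is why near-duplicates of the seed are chosen last.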
Qwen Technical Report
Large language models (LLMs) have revolutionized the field of artificial
intelligence, enabling natural language processing tasks that were previously
thought to be exclusive to humans. In this work, we introduce Qwen, the first
installment of our large language model series. Qwen is a comprehensive
language model series that encompasses distinct models with varying parameter
counts. It includes Qwen, the base pretrained language models, and Qwen-Chat,
the chat models finetuned with human alignment techniques. The base language
models consistently demonstrate superior performance across a multitude of
downstream tasks, and the chat models, particularly those trained using
Reinforcement Learning from Human Feedback (RLHF), are highly competitive. The
chat models possess advanced tool-use and planning capabilities for creating
agent applications, showcasing impressive performance even when compared to
bigger models on complex tasks like utilizing a code interpreter. Furthermore,
we have developed coding-specialized models, Code-Qwen and Code-Qwen-Chat, as
well as mathematics-focused models, Math-Qwen-Chat, which are built upon base
language models. These models demonstrate significantly improved performance in
comparison with open-source models, and slightly fall behind the proprietary
models.
Comment: 59 pages, 5 figures
Power and Voltage Control for Single-Phase Cascaded H-Bridge Multilevel Converters under Unbalanced Loads
The conventional control method for a single-phase cascaded H-bridge (CHB) multilevel converter is vector (dq) control; however, dq control requires complicated calculations and introduces additional time delays. This paper presents a novel power control strategy for the CHB multilevel converter. A power-based dc-link voltage balance control is also proposed for unbalanced load conditions. The new control method is designed in a virtual αβ stationary reference frame without coordinate transformation or phase-locked loop (PLL) to avoid the potential issues related to computational complexity. Because only imaginary voltage construction is needed in the proposed control method, the time delay from conventional imaginary current construction can be eliminated. The proposed method can obtain a sinusoidal grid current waveform with unity power factor. Compared with the conventional dq control method, the power control strategy possesses the advantage of a fast dynamic response. The stability of the closed-loop system with the dc-link voltage balance controller is evaluated. Simulation and experimental results are presented to verify the accuracy of the proposed power and voltage control method.
Vehicular Motion Experiment and Data Retrieval of a Compact Floating Lidar System
Accurate and rapid observation of sea surface wind is important for research on ocean dynamic prediction models, offshore wind resource assessment, and air-sea interaction and flux. A compact floating coherent Doppler lidar system named WindMast 350-M was developed by Ocean University of China (OUC) and Leice Transient Technology Co. LTD (LEICE) for the observation of sea surface wind profiles. For this device, which is intended for installation on buoy platforms, the first vehicular motion experiment was conducted at the Laoshan campus of Ocean University of China (120.49°E, 36.16°N) on 6 and 12 March 2019. During this experiment, the wind profiles measured by the WindMast 350-M were compared with the results from a well-calibrated ground-based coherent Doppler lidar, the WindMast WP350. In this contribution, the system design and the specifications of the 350-M are presented in detail. The preliminary results of the vehicular motion experiment are discussed as well.
Mobile Multiwavelength Polarization Raman Lidar for Water Vapor, Cloud and Aerosol Measurement
To detect the water vapor mixing ratio, particle linear depolarization ratio, extinction coefficient and cloud information, the Water vapor, Cloud and Aerosol Lidar (WVCAL) was developed by the lidar group at Ocean University of China. The lidar consists of a transmitting subsystem, a receiving subsystem, a data acquisition and control subsystem, and an auxiliary subsystem, each of which is described in this paper. To measure these physical properties, three channels (a Raman channel, a polarization channel and an infrared channel) are integrated in the lidar system. The integration and working principles of these channels are introduced in detail. Finally, a measurement example obtained in the coastal area of Qingdao, Shandong Province, during 2014 is provided.