227 research outputs found

A COMPARATIVE STUDY ON COGNITIVE IMPAIRMENT OF FAKE NEWS BETWEEN CHINESE AND KOREAN AUDIENCES FROM THE PERSPECTIVE OF SOCIAL SYSTEM STRUCTURE

Author: Chun Chuke
Jing Changqiang
Kim Hyunjoo
Zhang Shitao
Publication date: 01/01/2021

Hrčak - Portal of scientific journals of Croatia

C-Pack: Packaged Resources To Advance General Chinese Embedding

Author: Liu Zheng
Muennighoff Niklas
Xiao Shitao
Zhang Peitian
Publication date: 14/09/2023

We introduce C-Pack, a package of resources that significantly advances the field of general Chinese embeddings. C-Pack includes three critical resources. 1) C-MTEB is a comprehensive benchmark for Chinese text embeddings covering 6 tasks and 35 datasets. 2) C-MTP is a massive text embedding dataset curated from labeled and unlabeled Chinese corpora for training embedding models. 3) C-TEM is a family of embedding models covering multiple sizes. Our models outperform all prior Chinese text embeddings on C-MTEB by up to +10% at the time of release. We also integrate and optimize the entire suite of training methods for C-TEM. Along with our resources on general Chinese embedding, we release our data and models for English text embeddings. The English models achieve state-of-the-art performance on the MTEB benchmark; meanwhile, our released English data is 2 times larger than the Chinese data. All these resources are made publicly available at https://github.com/FlagOpen/FlagEmbedding

arXiv.org e-Print Archive

LM-Cocktail: Resilient Tuning of Language Models via Model Merging

Author: Liu Zheng
Xiao Shitao
Xing Xingrun
Zhang Peitian
Publication date: 08/12/2023

Pre-trained language models are continually fine-tuned to better support downstream applications. However, this operation may result in significant performance degeneration on general tasks beyond the targeted domain. To overcome this problem, we propose LM-Cocktail, which enables the fine-tuned model to stay resilient in general perspectives. Our method is conducted in the form of model merging, where the fine-tuned language model is merged with the pre-trained base model or the peer models from other domains through weighted average. Despite its simplicity, LM-Cocktail is surprisingly effective: the resulting model is able to achieve a strong empirical performance in the whole scope of general tasks while preserving a superior capacity in its targeted domain. We conduct comprehensive experiments with LLama and BGE model on popular benchmarks, including FLAN, MMLU, MTEB, whose results validate the efficacy of our proposed method. The code and checkpoints are available at https://github.com/FlagOpen/FlagEmbedding/tree/master/LM_Cocktail
Comment: Work is in progress

arXiv.org e-Print Archive
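
The LM-Cocktail abstract above describes merging a fine-tuned model with its base model (or peer models) through a weighted average of parameters. The following is only a minimal sketch of that general idea, not the authors' code or the FlagEmbedding API: the function name `merge_models`, the toy parameter dictionaries, and the equal weights are all hypothetical, and the abstract does not say how the actual weights are chosen.

```python
from typing import Dict, List, Sequence

# Hypothetical toy representation: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def merge_models(models: Sequence[Params], weights: Sequence[float]) -> Params:
    """Return the weighted average of models that share identical parameter names and sizes."""
    if not models or len(models) != len(weights):
        raise ValueError("need at least one model and exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalise so the merged parameters stay on the same scale as the inputs.
    norm = [w / total for w in weights]

    names = models[0].keys()
    merged: Params = {}
    for name in names:
        size = len(models[0][name])
        for m in models:
            if m.keys() != names or len(m[name]) != size:
                raise ValueError(f"parameter mismatch for {name!r}")
        merged[name] = [
            sum(w * m[name][i] for w, m in zip(norm, models)) for i in range(size)
        ]
    return merged


# Toy example (assumed values): average a fine-tuned model with its base model.
base = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
fine_tuned = {"layer.weight": [1.0, 4.0], "layer.bias": [3.0]}
merged = merge_models([fine_tuned, base], [0.5, 0.5])
print(merged)
```

With equal weights each merged parameter is the midpoint of the two inputs; shifting weight toward the base model would, under this sketch, pull the result back toward general-purpose behaviour.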

Hydrological variations in central China over the past millennium and their links to the Tropic Pacific and North Atlantic Oceans

Author: Chen Jianshun
Chen Shitao
Duan Fucai
Liao Zebo
Shao Qingfeng
Wang Yi
Zhang Zhenqiu
Zhao Kan
Publication venue: European Geosciences Union
Publication date: 10/03/2020

Variations in precipitation, known as the Meiyu rain, in the East Asian summer monsoon (EASM) domain during the last millennium could help illuminate the hydrological response to future global warming. Here we present a precisely dated and highly resolved stalagmite δ18O record from the Yongxing Cave, central China. Our new record, combined with a previously published one from the same cave, indicates that the Meiyu rain has changed dramatically in association with the global temperature change. In particular, our record shows that the Meiyu rain weakened during the Medieval Climate Anomaly (MCA), but intensified during the Little Ice Age (LIA). During the Current Warm Period (CWP), our record indicates a similar weakening of the Meiyu rain. Furthermore, during the MCA and CWP, our records show that the atmospheric precipitation is similarly wet in northern China and similarly dry in central China, but relatively wet during the CWP in southern China. This spatial discrepancy indicates a complicated localized response of the regional precipitation to the anthropogenic forcing. The weakened (intensified) Meiyu rain during the MCA (LIA) matches well with the warm (cold) phases of Northern Hemisphere surface air temperature. This Meiyu rain pattern also corresponds well with the climatic conditions over the Tropical Indo-Pacific warm pool. On the other hand, our record shows a strong association with the North Atlantic climate as well. The reduced (increased) Meiyu rain correlates well with positive (negative) phases of the North Atlantic Oscillation. In addition, our record links well with the strong (weak) Atlantic meridional overturning circulation during the MCA (LIA) period. All of the above-mentioned localized correspondences and remote teleconnections on decadal to centennial timescales indicate that the Meiyu rain is coupled closely with oceanic processes in the Tropical Pacific and North Atlantic Oceans during the MCA and LIA.

Sussex Research Online

Probabilistic hesitant fuzzy multiple attribute decision-making based on regret theory for the evaluation of venture capital projects

Author: Jiashu Liu
Shitao Zhang
Xiaodi Liu
Zengwen Wang
Publication venue: Informa UK Limited
Publication date: 01/01/2020

The selection of venture capital investment projects is one of the most important decision-making activities for venture capitalists. Due to the complexity of the investment market and the limited cognition of people, most venture capital investment decision problems are highly uncertain, and venture capitalists are often boundedly rational under uncertainty. To address such problems, this article presents an approach based on regret theory to probabilistic hesitant fuzzy multiple attribute decision-making. Firstly, when the information on the occurrence probabilities of all the elements in the probabilistic hesitant fuzzy element (P.H.F.E.) is unknown or partially known, two different mathematical programming models based on water-filling theory and the maximum entropy principle are provided to handle these complex situations. Secondly, to capture the psychological behaviours of venture capitalists, regret theory is utilised to solve the problem of selecting venture capital investment projects. Finally, a comparative analysis with the existing approaches is conducted to demonstrate the feasibility and applicability of the proposed method.

Hrčak - Portal of scientific journals of Croatia

Wasserstein distance-based probabilistic linguistic TODIM method with application to the evaluation of sustainable rural tourism potential

Author: Liu Xiaodi
Ma Zhenzhen
Wu Jian
Wu Zhangjiao
Zhang Shitao
Publication venue: Taylor and Francis Group and Juraj Dobrila University of Pula, Faculty of economics and tourism Dr. Mijo Mirković
Publication date: 01/01/2022

The evaluation of sustainable rural tourism potential is a key task in sustainable rural tourism development. Due to the complexity of the rural tourism development situation and the limited cognition of people, most assessment problems for sustainable rural tourism potential are highly uncertain, which brings challenges to the characterisation and measurement of evaluation information. Besides, decision-makers (DMs) usually do not exhibit complete rationality in the practical evaluation process. To tackle such problems, this paper proposes a new behavioural multi-attribute group decision-making (MAGDM) method with probabilistic linguistic term sets (PLTSs) by integrating a Wasserstein distance measure into the TODIM (an acronym in Portuguese of interactive and multicriteria decision making) method. Firstly, a new Wasserstein-based distance measure with PLTSs is defined, and some properties of the proposed distance are developed. Secondly, based on the correlation coefficient among attributes and the standard deviation of each attribute, an attribute weight determination method (called the PL-CRITIC method) is proposed. Subsequently, a Wasserstein distance-based probabilistic linguistic TODIM method is developed. Finally, the proposed method is applied to the evaluation of sustainable rural tourism potential, along with sensitivity and comparative analyses, as a means of illustrating the effectiveness and advantages of the new method.

Hrčak - Portal of scientific journals of Croatia
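
The abstract above builds on a Wasserstein distance between probabilistic linguistic term sets. As a rough illustration only, the sketch below computes the standard one-dimensional Wasserstein-1 distance between two probability distributions over the same ordered scale of linguistic terms, using the cumulative-distribution form. It is not the PLTS distance defined in the paper: the function name `wasserstein_1d`, the three-term scale, unit spacing between adjacent terms, and the example probabilities are all assumptions made for illustration.

```python
from itertools import accumulate
from typing import Sequence


def wasserstein_1d(p: Sequence[float], q: Sequence[float]) -> float:
    """Wasserstein-1 distance between two distributions on the same ordered scale.

    Terms are assumed to sit at positions 0, 1, ..., n-1 (unit spacing), so the
    distance equals the sum of absolute differences between the two CDFs.
    """
    if len(p) != len(q) or not p:
        raise ValueError("distributions must be non-empty and of equal length")
    if abs(sum(p) - 1.0) > 1e-9 or abs(sum(q) - 1.0) > 1e-9:
        raise ValueError("each distribution must sum to 1")
    cdf_p = list(accumulate(p))
    cdf_q = list(accumulate(q))
    # The last CDF entries are both 1, so they contribute nothing and are skipped.
    return sum(abs(a - b) for a, b in zip(cdf_p[:-1], cdf_q[:-1]))


# Hypothetical scale: index 0 = "poor", 1 = "fair", 2 = "good".
assessment_a = [0.5, 0.5, 0.0]  # leans toward "poor"/"fair"
assessment_b = [0.0, 0.5, 0.5]  # leans toward "fair"/"good"
print(wasserstein_1d(assessment_a, assessment_b))
```

Unlike a term-by-term comparison of probabilities, this distance grows with how far probability mass has to move along the ordered scale, which is the property that makes a Wasserstein-type measure attractive for ordinal linguistic assessments.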