Learning to Program with Natural Language
Large Language Models (LLMs) have shown remarkable performance in various
basic natural language tasks, which raises hope for achieving Artificial
General Intelligence. To complete a complex task, we still need a program
for the task first and then ask LLMs to follow the program to generate the
specific solution. We propose using natural language as a new programming
language to describe task procedures, making them easily understandable to both
humans and LLMs. The LLM is capable of directly generating natural language
programs, but these programs may still contain factual errors or incomplete
steps. Therefore, we further propose the Learning to Program (LP) method
to ask LLMs themselves to learn the natural language program based on the
training dataset of the complex task first and then use the learned program to
guide the inference. Our experiments on eight datasets covering five different
reasoning types demonstrate the effectiveness of our approach.
Further, our analysis experiment shows that the learned program can be directly
used to guide another LLM to improve its performance, which reveals a new
transfer learning paradigm. Comment: Work in progress
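The learn-then-guide idea above can be caricatured in a few lines of Python. This is a hypothetical sketch, not the paper's implementation: all helper names are made up, and `llm` is a stub standing in for a call to a real language model.

```python
# Hypothetical sketch of a Learning-to-Program style loop.
# `llm` is a stub; a real implementation would call an LLM API here.

def llm(prompt: str) -> str:
    # Placeholder response standing in for a real model call.
    return "1. Read the question. 2. Apply the procedure. 3. State the answer."

def learn_program(train_examples):
    """Ask the LLM to distill a reusable natural-language program
    from solved training examples of the complex task."""
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in train_examples)
    return llm(
        "From the solved examples below, write a step-by-step "
        "natural-language program for this kind of task:\n" + shots
    )

def solve(question: str, program: str) -> str:
    """Guide inference with the learned program instead of raw prompting."""
    return llm("Follow this program:\n" + program +
               "\nNow solve:\nQ: " + question + "\nA:")

program = learn_program([("2+2?", "4"), ("3+5?", "8")])
answer = solve("7+6?", program)
```

Because the learned program is plain text, handing it to a different `llm` function is all it takes to transfer it to another model, which is the transfer-learning observation the abstract mentions.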
GameEval: Evaluating LLMs on Conversational Games
The rapid advancements in large language models (LLMs) have presented
challenges in evaluating those models. Existing evaluation methods are either
reference-based or preference-based, which inevitably require human intervention
or introduce test bias caused by evaluator models. In this paper, we propose
GameEval, a novel approach to evaluating LLMs through goal-driven
conversational games, overcoming the limitations of previous methods. GameEval
treats LLMs as game players and assigns them distinct roles with specific goals
achieved by launching conversations of various forms, including discussion,
question answering, and voting. We design three unique games with cooperative
or adversarial objectives, accompanied by corresponding evaluation metrics, to
show how this new paradigm comprehensively evaluates model performance. Through
extensive experiments, we show that GameEval can effectively differentiate the
capabilities of various LLMs, providing a comprehensive assessment of their
integrated abilities to solve complex problems. Our public anonymous code is
available at https://github.com/GameEval/GameEval.
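The goal-driven setup can be sketched minimally as follows. The class and field names are hypothetical illustrations; the actual games and metrics are defined in the paper and repository.

```python
# Illustrative sketch of goal-driven game evaluation (hypothetical names).
from dataclasses import dataclass, field

@dataclass
class Player:
    name: str
    role: str   # e.g. a cooperative or adversarial role in the game
    goal: str

@dataclass
class Game:
    players: list
    transcript: list = field(default_factory=list)

    def speak(self, player: Player, utterance: str):
        # Conversations (discussion, QA, voting) accumulate in a transcript.
        self.transcript.append((player.name, utterance))

    def score(self, winning_role: str) -> dict:
        # Goal-driven metric: a player's score depends only on whether its
        # side achieved its goal; no reference answers, no judge model.
        return {p.name: 1.0 if p.role == winning_role else 0.0
                for p in self.players}

game = Game([Player("A", "spy", "avoid detection"),
             Player("B", "detective", "identify the spy")])
game.speak(game.players[0], "I vote for B.")
scores = game.score("detective")
```

Scoring by goal achievement rather than by a reference answer or an evaluator model is what sidesteps the human-intervention and test-bias limitations mentioned above.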
TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs
Artificial Intelligence (AI) has made incredible progress recently. On the
one hand, advanced foundation models like ChatGPT can offer powerful
conversation, in-context learning and code generation abilities on a broad
range of open-domain tasks. They can also generate high-level solution outlines
for domain-specific tasks based on the common sense knowledge they have
acquired. However, they still face difficulties with some specialized tasks
because they lack enough domain-specific data during pre-training, or because
their neural computations often make errors on tasks that require accurate
execution. On the other hand, there are also many existing models and
systems (symbolic-based or neural-based) that can do some domain-specific tasks
very well. However, due to their different implementations or working mechanisms,
they are not easily accessible or compatible with foundation models. Therefore,
there is a clear and pressing need for a mechanism that can leverage foundation
models to propose task solution outlines and then automatically match some of
the sub-tasks in the outlines to the off-the-shelf models and systems with
special functionalities to complete them. Inspired by this, we introduce
TaskMatrix.AI as a new AI ecosystem that connects foundation models with
millions of APIs for task completion. Unlike most previous work that aimed to
improve a single AI model, TaskMatrix.AI focuses more on using existing
foundation models (as a brain-like central system) and APIs of other AI models
and systems (as sub-task solvers) to achieve diversified tasks in both digital
and physical domains. As a position paper, we will present our vision of how to
build such an ecosystem, explain each key component, and use study cases to
illustrate both the feasibility of this vision and the main challenges we need
to address next.
Revealing the two-dimensional electronic structure and anisotropic superconductivity in a natural van der Waals superlattice (PbSe)NbSe2
Van der Waals superlattices are important for tailoring the electronic
structures and properties of layered materials. Here we report the
superconducting properties and electronic structure of a natural van der Waals
superlattice (PbSe)NbSe2. Anisotropic superconductivity with a
transition temperature Tc = 5.6 ± 0.1 K, which is higher than that of monolayer
NbSe2, is revealed by transport measurements on high-quality samples.
Angle-resolved photoemission spectroscopy (ARPES) measurements reveal the
two-dimensional electronic structure and a charge transfer of 0.43 electrons
per NbSe2 unit cell from the blocking PbSe layer. In addition,
polarization-dependent ARPES measurements reveal a significant circular
dichroism with opposite contrast at K and K' valleys, suggesting a significant
spin-orbital coupling and distinct orbital angular momentum. Our work suggests
natural van der Waals superlattices as an effective pathway for achieving
intriguing properties distinct from both the bulk and monolayer samples. Comment: 8 pages, 4 figures
Unsupervised Context Aware Sentence Representation Pretraining for Multi-lingual Dense Retrieval
Recent research demonstrates the effectiveness of using pretrained language
models (PLMs) to improve dense retrieval and multilingual dense retrieval. In
this work, we present a simple but effective monolingual pretraining task
called contrastive context prediction (CCP) to learn sentence representation by
modeling sentence-level contextual relations. By pulling the embeddings of
sentences in a local context closer together and pushing random negative samples
away, different languages can form isomorphic structures, so that sentence pairs
in two different languages are automatically aligned. Our experiments show that
model collapse and information leakage can easily occur during contrastive
training of language models, but a language-specific memory bank and an
asymmetric batch normalization operation play an essential role in preventing
collapse and information leakage, respectively. Besides, a post-processing
step for sentence embeddings is also very effective in improving retrieval
performance. On the multilingual sentence retrieval task Tatoeba, our model
achieves new SOTA results among methods without using bilingual data. Our model
also shows larger gains on Tatoeba when transferring between non-English pairs.
On two multi-lingual query-passage retrieval tasks, XOR Retrieve and Mr.TYDI,
our model even achieves two SOTA results, in both the zero-shot and supervised
settings, among all pretraining models using bilingual data.
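The contrastive objective described above resembles a standard InfoNCE loss over adjacent-sentence pairs. The NumPy sketch below is a simplified illustration under that assumption; it deliberately omits the language-specific memory bank and asymmetric batch normalization that the abstract credits with preventing collapse and leakage.

```python
# Simplified InfoNCE-style sketch of contrastive context prediction:
# anchors[i] and neighbors[i] embed adjacent sentences (positives);
# every other row in the batch serves as an in-batch negative.
import numpy as np

def ccp_loss(anchors, neighbors, temperature=0.05):
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    n = neighbors / np.linalg.norm(neighbors, axis=1, keepdims=True)
    logits = a @ n.T / temperature                # (B, B) cosine similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))            # positives on the diagonal

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))
aligned = ccp_loss(x, x)         # perfectly aligned pairs: low loss
shuffled = ccp_loss(x, x[::-1])  # mismatched pairs: higher loss
```

Minimizing this loss pulls each sentence toward its context neighbor and away from the rest of the batch, which is the mechanism claimed to induce isomorphic embedding structures across languages.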
Revealing the Heavy Quasiparticles in the Heavy-Fermion Superconductor CeCu2Si2
The superconducting order parameter of the first heavy-fermion superconductor
CeCu2Si2 is currently under debate. A key ingredient to understand its
superconductivity and physical properties is the quasiparticle dispersion and
Fermi surface, which remains elusive experimentally. Here we present
measurements from angle-resolved photoemission spectroscopy. Our results
emphasize the key role played by the Ce 4f electrons for the low-temperature
Fermi surface, highlighting a band-dependent conduction-f electron
hybridization. In particular, we find a very heavy quasi-two-dimensional
electron band near the bulk X point and moderately heavy three-dimensional hole
pockets near the Z point. Comparison with theoretical calculations reveals the
strong local correlation in this compound, calling for further theoretical
studies. Our results provide the electronic basis to understand the heavy
fermion behavior and superconductivity; implications for the enigmatic
superconductivity of this compound are also discussed.