
    Learning to Program with Natural Language

    Large Language Models (LLMs) have shown remarkable performance on a variety of basic natural language tasks, raising hopes of achieving Artificial General Intelligence. To complete a complex task, we still need a program for the task first and must then ask LLMs to follow that program to generate a specific solution. We propose using natural language as a new programming language to describe task procedures, making them easily understandable to both humans and LLMs. An LLM is capable of directly generating natural language programs, but these programs may still contain factual errors or incomplete steps. We therefore propose the Learning to Program (LP) method, which asks LLMs to first learn a natural language program from the training dataset of the complex task and then use the learned program to guide inference. Our experiments on reasoning tasks covering five different reasoning types (8 datasets) demonstrate the effectiveness of our approach. Further, our analysis experiments show that a learned program can be directly used to guide another LLM and improve its performance, revealing a new transfer learning paradigm. Comment: Work in progress.
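    The abstract describes a learn-then-infer loop: the LLM drafts a natural language program from training examples, revises it against the training data, and then follows it at inference time. Below is a minimal sketch of that loop; `call_llm` is a hypothetical stand-in for any LLM completion API, and the prompts are paraphrases of the idea, not the paper's actual templates.

    ```python
    # Sketch of a Learning to Program (LP)-style loop, under the assumptions above.

    def call_llm(prompt: str) -> str:
        """Placeholder: route to an LLM completion API of your choice."""
        raise NotImplementedError

    def learn_program(train_examples: list[dict], n_rounds: int = 3) -> str:
        """Draft a natural language program from examples, then revise it
        against the training set until it stops making errors."""
        program = call_llm(
            "Write a step-by-step procedure (a natural language program) "
            "for solving tasks like these:\n"
            + "\n".join(f"Q: {ex['question']}\nA: {ex['answer']}"
                        for ex in train_examples[:5])
        )
        for _ in range(n_rounds):
            errors = []
            for ex in train_examples:
                pred = call_llm(
                    f"Follow this procedure:\n{program}\n\nQ: {ex['question']}\nA:"
                )
                if pred.strip() != ex["answer"].strip():
                    errors.append(
                        f"Q: {ex['question']}\nExpected: {ex['answer']}\nGot: {pred}"
                    )
            if not errors:
                break  # the program already solves every training example
            program = call_llm(
                f"Revise this procedure to fix the errors below.\n\n"
                f"Procedure:\n{program}\n\nErrors:\n" + "\n\n".join(errors)
            )
        return program

    def infer(program: str, question: str) -> str:
        """Guide inference with the learned natural language program."""
        return call_llm(f"Follow this procedure:\n{program}\n\nQ: {question}\nA:")
    ```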

    GameEval: Evaluating LLMs on Conversational Games

    The rapid advancement of large language models (LLMs) has made evaluating those models challenging. Existing evaluation methods are either reference-based or preference-based, and thus inevitably require human intervention or introduce test bias caused by evaluator models. In this paper, we propose GameEval, a novel approach to evaluating LLMs through goal-driven conversational games that overcomes the limitations of previous methods. GameEval treats LLMs as game players and assigns them distinct roles with specific goals, achieved by launching conversations of various forms, including discussion, question answering, and voting. We design three unique games with cooperative or adversarial objectives, accompanied by corresponding evaluation metrics, to show how this new paradigm comprehensively evaluates model performance. Through extensive experiments, we show that GameEval can effectively differentiate the capabilities of various LLMs, providing a comprehensive assessment of their integrated abilities to solve complex problems. Our public anonymous code is available at https://github.com/GameEval/GameEval
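    To make the goal-driven game setup concrete, here is a hedged sketch of a turn-based discussion game harness: each player is an LLM with a secret role and goal, and performance is scored by whether the goal is met in the final transcript. The `Player` structure, prompts, and scoring hook are illustrative assumptions, not GameEval's actual games or metrics.

    ```python
    # Sketch of a goal-driven conversational game harness, under the
    # assumptions stated above.
    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class Player:
        name: str
        role_prompt: str            # secret role and goal assigned to this player
        llm: Callable[[str], str]   # the model under evaluation

    def play_discussion_game(
        players: list[Player],
        n_turns: int,
        goal_met: Callable[[list[str]], dict[str, bool]],
    ) -> dict[str, bool]:
        """Run a turn-based discussion among players with distinct goals and
        report, per player, whether the goal was achieved."""
        transcript: list[str] = []
        for _ in range(n_turns):
            for p in players:
                prompt = (
                    p.role_prompt
                    + "\n\nConversation so far:\n" + "\n".join(transcript)
                    + f"\n\n{p.name}, it is your turn. Speak:"
                )
                transcript.append(f"{p.name}: {p.llm(prompt)}")
        # Success is judged against each player's goal rather than against a
        # reference answer or a preference judge, which is the point of the setup.
        return goal_met(transcript)
    ```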

    TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs

    Artificial Intelligence (AI) has made incredible progress recently. On the one hand, advanced foundation models like ChatGPT offer powerful conversation, in-context learning, and code generation abilities on a broad range of open-domain tasks. They can also generate high-level solution outlines for domain-specific tasks based on the common sense knowledge they have acquired. However, they still face difficulties with some specialized tasks, either because they lack enough domain-specific data during pre-training or because they often make errors in their neural network computations on tasks that need accurate execution. On the other hand, there are many existing models and systems (symbolic-based or neural-based) that already do some domain-specific tasks very well. However, due to their different implementations or working mechanisms, they are not easily accessible to or compatible with foundation models. There is therefore a clear and pressing need for a mechanism that can leverage foundation models to propose task solution outlines and then automatically match some of the sub-tasks in the outlines to off-the-shelf models and systems with special functionalities to complete them. Inspired by this, we introduce TaskMatrix.AI, a new AI ecosystem that connects foundation models with millions of APIs for task completion. Unlike most previous work, which aimed to improve a single AI model, TaskMatrix.AI focuses on using existing foundation models (as a brain-like central system) and the APIs of other AI models and systems (as sub-task solvers) to achieve diverse tasks in both digital and physical domains. As a position paper, we present our vision of how to build such an ecosystem, explain each key component, and use case studies to illustrate both the feasibility of this vision and the main challenges we need to address next.
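    The outline-then-dispatch pattern the abstract describes (a foundation model decomposes a task, and sub-tasks are routed to specialized APIs) can be sketched roughly as follows. The registry, decorator, and matching rule are illustrative assumptions, not TaskMatrix.AI's actual interface.

    ```python
    # Sketch of outline-then-dispatch: decompose a task, match each sub-task
    # to a registered API by its description, fall back to the model itself.
    from typing import Callable

    API_REGISTRY: dict[str, Callable[[str], str]] = {}

    def register_api(description: str):
        """Expose a domain-specific model/system under a natural language description."""
        def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
            API_REGISTRY[description] = fn
            return fn
        return decorator

    def complete_task(task: str, llm: Callable[[str], str]) -> list[str]:
        # 1. The foundation model proposes a high-level solution outline.
        outline = llm(f"Break this task into sub-tasks, one per line:\n{task}")
        results = []
        for subtask in filter(None, outline.splitlines()):
            # 2. Match the sub-task to the best-described API (here the model
            #    itself picks a tool by its description).
            choice = llm(
                "Pick the best tool for the sub-task, answering with the tool "
                "description only.\nTools:\n" + "\n".join(API_REGISTRY)
                + f"\nSub-task: {subtask}\nTool:"
            ).strip()
            handler = API_REGISTRY.get(choice)
            # 3. Execute with the matched API, or fall back to the model.
            results.append(handler(subtask) if handler else llm(subtask))
        return results
    ```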

    Revealing the two-dimensional electronic structure and anisotropic superconductivity in a natural van der Waals superlattice (PbSe)1.14NbSe2

    Van der Waals superlattices are important for tailoring the electronic structures and properties of layered materials. Here we report the superconducting properties and electronic structure of a natural van der Waals superlattice, (PbSe)1.14NbSe2. Anisotropic superconductivity with a transition temperature Tc = 5.6 ± 0.1 K, higher than that of monolayer NbSe2, is revealed by transport measurements on high-quality samples. Angle-resolved photoemission spectroscopy (ARPES) measurements reveal the two-dimensional electronic structure and a charge transfer of 0.43 electrons per NbSe2 unit cell from the blocking PbSe layer. In addition, polarization-dependent ARPES measurements reveal a significant circular dichroism with opposite contrast at the K and K' valleys, suggesting significant spin-orbit coupling and distinct orbital angular momentum. Our work establishes natural van der Waals superlattices as an effective pathway to intriguing properties distinct from those of both bulk and monolayer samples. Comment: 8 pages, 4 figures.

    Unsupervised Context Aware Sentence Representation Pretraining for Multi-lingual Dense Retrieval

    Recent research demonstrates the effectiveness of using pretrained language models (PLMs) to improve dense retrieval and multilingual dense retrieval. In this work, we present a simple but effective monolingual pretraining task called contrastive context prediction (CCP), which learns sentence representations by modeling sentence-level contextual relations. By pulling the embeddings of sentences in a local context closer together and pushing random negative samples away, different languages form isomorphic structures, so that sentence pairs in two different languages are automatically aligned. Our experiments show that model collapse and information leakage can easily occur during contrastive training of a language model, but a language-specific memory bank and an asymmetric batch normalization operation play essential roles in preventing collapse and information leakage, respectively. In addition, post-processing of the sentence embeddings is very effective for achieving better retrieval performance. On the multilingual sentence retrieval task Tatoeba, our model achieves new SOTA results among methods that do not use bilingual data. Our model also shows larger gains on Tatoeba when transferring between non-English pairs. On two multilingual query-passage retrieval tasks, XOR Retrieve and Mr. TyDi, our model achieves SOTA results in both zero-shot and supervised settings, even among pretraining models that use bilingual data.
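    As a rough illustration of the contrastive objective described above, the following PyTorch sketch pulls each sentence embedding toward its neighboring (context) sentence and pushes it away from negatives drawn from a language-specific memory bank. The shapes, queue, and temperature are assumptions; this is not the paper's exact CCP implementation (in particular, the asymmetric batch normalization is omitted).

    ```python
    # Sketch of a context-prediction contrastive loss with a memory bank,
    # under the assumptions stated above.
    import torch
    import torch.nn.functional as F

    def ccp_loss(anchor: torch.Tensor,       # (B, D) sentence embeddings
                 context: torch.Tensor,      # (B, D) embeddings of neighboring sentences
                 memory_bank: torch.Tensor,  # (K, D) queued negatives from the same language
                 temperature: float = 0.05) -> torch.Tensor:
        anchor = F.normalize(anchor, dim=-1)
        context = F.normalize(context, dim=-1)
        negatives = F.normalize(memory_bank, dim=-1)
        # Positive logit: similarity of each sentence to its own context sentence.
        pos = torch.sum(anchor * context, dim=-1, keepdim=True)  # (B, 1)
        # Negative logits: similarities to memory-bank entries. Drawing negatives
        # from the same language removes the trivial "which language is this"
        # shortcut, one source of the information leakage the abstract mentions.
        neg = anchor @ negatives.t()                              # (B, K)
        logits = torch.cat([pos, neg], dim=1) / temperature
        labels = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
        return F.cross_entropy(logits, labels)
    ```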

    Revealing the Heavy Quasiparticles in the Heavy-Fermion Superconductor CeCu2Si2

    The superconducting order parameter of the first heavy-fermion superconductor, CeCu2Si2, is currently under debate. A key ingredient for understanding its superconductivity and physical properties is the quasiparticle dispersion and Fermi surface, which have remained elusive experimentally. Here we present measurements from angle-resolved photoemission spectroscopy. Our results emphasize the key role played by the Ce 4f electrons in the low-temperature Fermi surface, highlighting a band-dependent conduction-f electron hybridization. In particular, we find a very heavy quasi-two-dimensional electron band near the bulk X point and moderately heavy three-dimensional hole pockets near the Z point. Comparison with theoretical calculations reveals strong local correlations in this compound, calling for further theoretical studies. Our results provide the electronic basis for understanding the heavy-fermion behavior and superconductivity; implications for the enigmatic superconductivity of this compound are also discussed.