Learning to Program with Natural Language
Large Language Models (LLMs) have shown remarkable performance in various
basic natural language tasks, which raises hope for achieving Artificial
General Intelligence. To complete a complex task, we still need a program
for the task first and then ask LLMs to follow the program to generate the
specific solution. We propose using natural language as a new programming
language to describe task procedures, making them easily understandable to both
humans and LLMs. The LLM is capable of directly generating natural language
programs, but these programs may still contain factual errors or incomplete
steps. Therefore, we further propose the Learning to Program (LP) method
to ask LLMs themselves to learn the natural language program based on the
training dataset of the complex task first and then use the learned program to
guide the inference. Our experiments on eight datasets covering five different
reasoning types demonstrate the effectiveness of our approach.
Further, our analysis experiment shows that the learned program can be directly
used to guide another LLM to improve its performance, which reveals a new
transfer learning paradigm. Comment: Work in progress
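The learn-then-guide idea above can be caricatured in a few lines of Python. This is a hypothetical sketch, not the paper's implementation: all helper names are made up, and `llm` is a stub standing in for a call to a real language model.

```python
# Hypothetical sketch of a Learning-to-Program style loop.
# `llm` is a stub; a real implementation would call an LLM API here.

def llm(prompt: str) -> str:
    # Placeholder response standing in for a real model call.
    return "1. Read the question. 2. Apply the procedure. 3. State the answer."

def learn_program(train_examples):
    """Ask the LLM to distill a reusable natural-language program
    from solved training examples of the complex task."""
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in train_examples)
    return llm(
        "From the solved examples below, write a step-by-step "
        "natural-language program for this kind of task:\n" + shots
    )

def solve(question: str, program: str) -> str:
    """Guide inference with the learned program instead of raw prompting."""
    return llm("Follow this program:\n" + program +
               "\nNow solve:\nQ: " + question + "\nA:")

program = learn_program([("2+2?", "4"), ("3+5?", "8")])
answer = solve("7+6?", program)
```

Because the learned program is plain text, handing it to a different `llm` function is all it takes to transfer it to another model, which is the transfer-learning observation the abstract mentions.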
GameEval: Evaluating LLMs on Conversational Games
The rapid advancements in large language models (LLMs) have presented
challenges in evaluating those models. Existing evaluation methods are either
reference-based or preference-based, which inevitably require human intervention
or introduce test bias caused by evaluator models. In this paper, we propose
GameEval, a novel approach to evaluating LLMs through goal-driven
conversational games, overcoming the limitations of previous methods. GameEval
treats LLMs as game players and assigns them distinct roles with specific goals
achieved by launching conversations of various forms, including discussion,
question answering, and voting. We design three unique games with cooperative
or adversarial objectives, accompanied by corresponding evaluation metrics, to
show how this new paradigm comprehensively evaluates model performance. Through
extensive experiments, we show that GameEval can effectively differentiate the
capabilities of various LLMs, providing a comprehensive assessment of their
integrated abilities to solve complex problems. Our public anonymous code is
available at https://github.com/GameEval/GameEval.
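The goal-driven setup can be sketched minimally as follows. The class and field names are hypothetical illustrations; the actual games and metrics are defined in the paper and repository.

```python
# Illustrative sketch of goal-driven game evaluation (hypothetical names).
from dataclasses import dataclass, field

@dataclass
class Player:
    name: str
    role: str   # e.g. a cooperative or adversarial role in the game
    goal: str

@dataclass
class Game:
    players: list
    transcript: list = field(default_factory=list)

    def speak(self, player: Player, utterance: str):
        # Conversations (discussion, QA, voting) accumulate in a transcript.
        self.transcript.append((player.name, utterance))

    def score(self, winning_role: str) -> dict:
        # Goal-driven metric: a player's score depends only on whether its
        # side achieved its goal; no reference answers, no judge model.
        return {p.name: 1.0 if p.role == winning_role else 0.0
                for p in self.players}

game = Game([Player("A", "spy", "avoid detection"),
             Player("B", "detective", "identify the spy")])
game.speak(game.players[0], "I vote for B.")
scores = game.score("detective")
```

Scoring by goal achievement rather than by a reference answer or an evaluator model is what sidesteps the human-intervention and test-bias limitations mentioned above.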
TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs
Artificial Intelligence (AI) has made incredible progress recently. On the
one hand, advanced foundation models like ChatGPT can offer powerful
conversation, in-context learning and code generation abilities on a broad
range of open-domain tasks. They can also generate high-level solution outlines
for domain-specific tasks based on the common sense knowledge they have
acquired. However, they still face difficulties with some specialized tasks
because they lack enough domain-specific data during pre-training, or because
their neural computations often make errors on tasks that require accurate
execution. On the other hand, there are also many existing models and
systems (symbolic-based or neural-based) that can do some domain-specific tasks
very well. However, due to their different implementations or working mechanisms,
they are not easily accessible or compatible with foundation models. Therefore,
there is a clear and pressing need for a mechanism that can leverage foundation
models to propose task solution outlines and then automatically match some of
the sub-tasks in the outlines to the off-the-shelf models and systems with
special functionalities to complete them. Inspired by this, we introduce
TaskMatrix.AI as a new AI ecosystem that connects foundation models with
millions of APIs for task completion. Unlike most previous work that aimed to
improve a single AI model, TaskMatrix.AI focuses more on using existing
foundation models (as a brain-like central system) and APIs of other AI models
and systems (as sub-task solvers) to achieve diversified tasks in both digital
and physical domains. As a position paper, we will present our vision of how to
build such an ecosystem, explain each key component, and use study cases to
illustrate both the feasibility of this vision and the main challenges we need
to address next.
Revealing the two-dimensional electronic structure and anisotropic superconductivity in a natural van der Waals superlattice (PbSe)NbSe2
Van der Waals superlattices are important for tailoring the electronic
structures and properties of layered materials. Here we report the
superconducting properties and electronic structure of a natural van der Waals
superlattice (PbSe)NbSe2. Anisotropic superconductivity with a
transition temperature Tc = 5.6 ± 0.1 K, which is higher than that of monolayer
NbSe2, is revealed by transport measurements on high-quality samples.
Angle-resolved photoemission spectroscopy (ARPES) measurements reveal the
two-dimensional electronic structure and a charge transfer of 0.43 electrons
per NbSe2 unit cell from the blocking PbSe layer. In addition,
polarization-dependent ARPES measurements reveal a significant circular
dichroism with opposite contrast at K and K' valleys, suggesting a significant
spin-orbital coupling and distinct orbital angular momentum. Our work suggests
natural van der Waals superlattices as an effective pathway for achieving
intriguing properties distinct from both the bulk and monolayer samples. Comment: 8 pages, 4 figures
Unsupervised Context Aware Sentence Representation Pretraining for Multi-lingual Dense Retrieval
Recent research demonstrates the effectiveness of using pretrained language
models (PLMs) to improve dense retrieval and multilingual dense retrieval. In
this work, we present a simple but effective monolingual pretraining task
called contrastive context prediction (CCP) to learn sentence representation by
modeling sentence-level contextual relations. By pulling the embeddings of
sentences in a local context closer together and pushing random negative samples
away, different languages can form isomorphic structures, so that sentence pairs
in two different languages are automatically aligned. Our experiments show that
model collapse and information leakage can easily occur during contrastive
training of language models, but a language-specific memory bank and an
asymmetric batch normalization operation play an essential role in preventing
collapse and information leakage, respectively. Besides, a post-processing
step for sentence embeddings is also very effective in improving retrieval
performance. On the multilingual sentence retrieval task Tatoeba, our model
achieves new SOTA results among methods without using bilingual data. Our model
also shows larger gains on Tatoeba when transferring between non-English pairs.
On two multi-lingual query-passage retrieval tasks, XOR Retrieve and Mr.TYDI,
our model even achieves two SOTA results, in both the zero-shot and supervised
settings, among all pretraining models using bilingual data.
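The contrastive objective described above resembles a standard InfoNCE loss over adjacent-sentence pairs. The NumPy sketch below is a simplified illustration under that assumption; it deliberately omits the language-specific memory bank and asymmetric batch normalization that the abstract credits with preventing collapse and leakage.

```python
# Simplified InfoNCE-style sketch of contrastive context prediction:
# anchors[i] and neighbors[i] embed adjacent sentences (positives);
# every other row in the batch serves as an in-batch negative.
import numpy as np

def ccp_loss(anchors, neighbors, temperature=0.05):
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    n = neighbors / np.linalg.norm(neighbors, axis=1, keepdims=True)
    logits = a @ n.T / temperature                # (B, B) cosine similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))            # positives on the diagonal

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))
aligned = ccp_loss(x, x)         # perfectly aligned pairs: low loss
shuffled = ccp_loss(x, x[::-1])  # mismatched pairs: higher loss
```

Minimizing this loss pulls each sentence toward its context neighbor and away from the rest of the batch, which is the mechanism claimed to induce isomorphic embedding structures across languages.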
Revealing the Heavy Quasiparticles in the Heavy-Fermion Superconductor CeCu2Si2
The superconducting order parameter of the first heavy-fermion superconductor
CeCu2Si2 is currently under debate. A key ingredient to understand its
superconductivity and physical properties is the quasiparticle dispersion and
Fermi surface, which remains elusive experimentally. Here we present
measurements from angle-resolved photoemission spectroscopy. Our results
emphasize the key role played by the Ce 4f electrons for the low-temperature
Fermi surface, highlighting a band-dependent conduction-f electron
hybridization. In particular, we find a very heavy quasi-two-dimensional
electron band near the bulk X point and moderately heavy three-dimensional hole
pockets near the Z point. Comparison with theoretical calculations reveals the
strong local correlation in this compound, calling for further theoretical
studies. Our results provide the electronic basis to understand the heavy
fermion behavior and superconductivity; implications for the enigmatic
superconductivity of this compound are also discussed.