When Do Program-of-Thoughts Work for Reasoning?
The reasoning capabilities of Large Language Models (LLMs) play a pivotal
role in the realm of embodied artificial intelligence. Although there are
effective methods like program-of-thought prompting for LLMs which uses
programming language to tackle complex reasoning tasks, the specific impact of
code data on the improvement of reasoning capabilities remains under-explored.
To address this gap, we propose complexity-impacted reasoning score (CIRS),
which combines structural and logical attributes, to measure the correlation
between code and reasoning abilities. Specifically, we use the abstract syntax
tree to encode the structural information and calculate logical complexity by
considering the difficulty and the cyclomatic complexity. Through an empirical
analysis, we find that not all code data of arbitrary complexity can be
learned or understood by LLMs; an optimal level of complexity is critical to
improving reasoning abilities through program-aided prompting. We then design
an auto-synthesizing and stratifying algorithm and apply it to instruction
generation for mathematical reasoning and to code data filtering for code
generation tasks. Extensive results demonstrate the effectiveness of our
proposed approach. Code will be integrated into the EasyInstruct framework at
https://github.com/zjunlp/EasyInstruct.
Comment: Work in progress.
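The abstract's two ingredients, structural information from the abstract syntax tree and logical complexity via cyclomatic complexity, can be sketched with Python's standard `ast` module. The function names and the exact counting rule below are illustrative assumptions, not the paper's CIRS formula:

```python
import ast

def cyclomatic_complexity(source: str) -> int:
    # Approximate McCabe cyclomatic complexity:
    # 1 + the number of decision points in the parsed code.
    tree = ast.parse(source)
    decisions = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                 ast.BoolOp, ast.IfExp)
    return 1 + sum(isinstance(node, decisions) for node in ast.walk(tree))

def structural_size(source: str) -> int:
    # A simple structural attribute: total number of AST nodes.
    return sum(1 for _ in ast.walk(ast.parse(source)))

snippet = (
    "def gcd(a, b):\n"
    "    while b:\n"
    "        a, b = b, a % b\n"
    "    return a\n"
)
print(cyclomatic_complexity(snippet))  # one while loop -> 2
print(structural_size(snippet) > 5)    # True
```

Scores like these could then stratify code samples by complexity level, mirroring the paper's idea that only an optimal complexity band benefits reasoning.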
OceanGPT: A Large Language Model for Ocean Science Tasks
Ocean science, which delves into the oceans that are reservoirs of life and
biodiversity, is of great significance given that oceans cover over 70% of our
planet's surface. Recently, advances in Large Language Models (LLMs) have
transformed the paradigm in science. Despite the success in other domains,
current LLMs often fall short in catering to the needs of domain experts like
oceanographers, and the potential of LLMs for ocean science is under-explored.
The intrinsic reason may be the immense and intricate nature of ocean data as
well as the necessity for higher granularity and richness in knowledge. To
alleviate these issues, we introduce OceanGPT, the first-ever LLM in the ocean
domain, which is an expert in various ocean science tasks. We propose DoInstruct,
a novel framework to automatically obtain a large volume of ocean domain
instruction data, which generates instructions based on multi-agent
collaboration. Additionally, we construct the first oceanography benchmark,
OceanBench, to evaluate the capabilities of LLMs in the ocean domain. Through
comprehensive experiments, OceanGPT not only shows a higher level of knowledge
expertise for ocean science tasks but also gains preliminary embodied
intelligence capabilities in ocean technology. Code, data, and checkpoints will
soon be available at https://github.com/zjunlp/KnowLM.
Comment: Work in progress. Project website:
https://zjunlp.github.io/project/OceanGPT
EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models
Large Language Models (LLMs) usually suffer from knowledge cutoff or fallacy
issues: they are unaware of unseen events or generate text with incorrect
facts owing to outdated or noisy data. To this end, many knowledge
editing approaches for LLMs have emerged -- aiming to subtly inject/edit
updated knowledge or adjust undesired behavior while minimizing the impact on
unrelated inputs. Nevertheless, due to significant differences among various
knowledge editing methods and the variations in task setups, there is no
standard implementation framework available to the community, which hinders
practitioners from applying knowledge editing in real applications. To address
these issues, we propose EasyEdit, an easy-to-use knowledge editing framework
for LLMs. It supports various cutting-edge knowledge editing approaches and can
be readily applied to many well-known LLMs such as T5, GPT-J, and LLaMA.
Empirically, we report the knowledge editing results on LLaMA-2 with EasyEdit,
demonstrating that knowledge editing surpasses traditional fine-tuning in terms
of reliability and generalization. We have released the source code on GitHub
at https://github.com/zjunlp/EasyEdit, along with Google Colab tutorials and
comprehensive documentation for beginners to get started. Besides, we present
an online system for real-time knowledge editing, and a demo video at
http://knowlm.zjukg.cn/easyedit.mp4.
Comment: The project website is https://github.com/zjunlp/EasyEdit.
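The reliability and generalization criteria mentioned in the abstract can be made concrete with a toy evaluation harness. Everything below is a hypothetical sketch, not EasyEdit's actual API: the "model" is just a lookup table standing in for an edited LLM, and the three scores follow the standard knowledge-editing criteria (the edited prompt itself, paraphrases of it, and unrelated prompts that must keep their old answers):

```python
def evaluate_edit(model, edit, paraphrases, unrelated):
    """Score an edit on three standard criteria:
    reliability (the edited prompt), generalization (paraphrases),
    and locality (unrelated prompts keep their previous answers)."""
    reliability = float(model(edit["prompt"]) == edit["target"])
    generalization = sum(
        model(p) == edit["target"] for p in paraphrases) / len(paraphrases)
    locality = sum(
        model(p) == old for p, old in unrelated.items()) / len(unrelated)
    return reliability, generalization, locality

# Stub "model": a lookup table standing in for an edited LLM.
answers = {
    "Who is the president of the US?": "Joe Biden",
    "Who leads the United States?": "Joe Biden",
    "What is the capital of France?": "Paris",
}
model = answers.get

scores = evaluate_edit(
    model,
    edit={"prompt": "Who is the president of the US?",
          "target": "Joe Biden"},
    paraphrases=["Who leads the United States?"],
    unrelated={"What is the capital of France?": "Paris"},
)
print(scores)  # (1.0, 1.0, 1.0)
```

Fine-tuning tends to score well on reliability but poorly on locality, which is the gap the abstract's comparison highlights.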
Dart: A Framework for Grid-Based Database Resource Access and Discovery
The Data Grid serves as the data management solution widely adopted by existing data-intensive Grid applications. However, we argue that the core Grid data management demands can be better satisfied by introducing databases. In this paper, we provide a database-oriented resource management framework intended to integrate database resources with the Grid infrastructure. We start by outlining a sketch of the proposed Database Grid architecture and then focus on two base-level services, remote database access and database discovery, which we believe should be settled first as a necessary foundation for other, more application-driven high-level database services. We discuss how the design principles that apply to these base-level services can adapt to the characteristics of the Grid environment and how they can be nested within the OGSA paradigm.
Promoting Mechanisms of Logistic Capability of Pharmaceutical Wholesale Enterprise Based on Brusselator
DeepKE: A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population
We present an open-source and extensible knowledge extraction toolkit DeepKE,
supporting complicated low-resource, document-level, and multimodal scenarios
in knowledge base population. DeepKE implements various information extraction
tasks, including named entity recognition, relation extraction and attribute
extraction. With a unified framework, DeepKE allows developers and researchers
to customize datasets and models to extract information from unstructured data
according to their requirements. Specifically, DeepKE not only provides
various functional modules and model implementations for different tasks and
scenarios but also organizes all components within a consistent framework to
maintain sufficient modularity and extensibility. We release the source code
on GitHub at https://github.com/zjunlp/DeepKE, along with Google Colab
tutorials and comprehensive documentation for beginners. Besides, we present
an online system at http://deepke.openkg.cn/EN/re_doc_show.html for real-time
extraction on various tasks, together with a demo video.
Comment: Work in progress; the project website is http://deepke.zjukg.cn
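The "unified framework" idea, where developers plug task-specific extractors into one consistent interface, can be sketched with a small registry pattern. This is a hypothetical illustration of the design, not DeepKE's actual API; the task name, decorator, and toy rule are all invented for the example:

```python
# Registry mapping task names (e.g. "ner") to extractor callables.
EXTRACTORS = {}

def register(task):
    """Decorator that files an extractor function under a task name."""
    def wrap(fn):
        EXTRACTORS[task] = fn
        return fn
    return wrap

@register("ner")
def toy_ner(text):
    # Toy rule for illustration: treat capitalized tokens as entities.
    return [(tok, "ENT") for tok in text.split() if tok[:1].isupper()]

def extract(task, text):
    # Dispatch to whichever extractor is registered for the task.
    return EXTRACTORS[task](text)

print(extract("ner", "DeepKE supports Named Entity Recognition"))
# [('DeepKE', 'ENT'), ('Named', 'ENT'), ('Entity', 'ENT'), ('Recognition', 'ENT')]
```

Registering relation or attribute extractors under other task names would follow the same pattern, which is what keeps such a toolkit modular and extensible.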
Changing antimicrobial susceptibility and molecular characterisation of Neisseria gonorrhoeae isolates in Guangdong, China: in a background of rapidly rising epidemic
Discovery of a Novel, Orally Efficacious Liver X Receptor (LXR) β Agonist
This
article describes the application of Contour to the design
and discovery of a novel, potent, orally efficacious liver X receptor
β (LXRβ) agonist (<b>17</b>). Contour technology
is a structure-based drug design platform that generates molecules
using a context perceptive growth algorithm guided by a contact sensitive
scoring function. The growth engine uses binding site perception and
programmable growth capability to create drug-like molecules by assembling
fragments that naturally complement hydrophilic and hydrophobic features
of the protein binding site. Starting with a crystal structure of
LXRβ and a docked 2-(methylsulfonyl)benzyl alcohol fragment
(<b>6</b>), Contour was used to design agonists containing a
piperazine core. Compound <b>17</b> binds to LXRβ with
high affinity and to LXRα to a lesser extent, and induces the
expression of LXR target genes <i>in vitro</i> and <i>in vivo</i>. This molecule served as a starting point for further
optimization and generation of a candidate which is currently in human
clinical trials for treating atopic dermatitis.