
    Retentive Network: A Successor to Transformer for Large Language Models

    In this work, we propose Retentive Network (RetNet) as a foundation architecture for large language models, simultaneously achieving training parallelism, low-cost inference, and good performance. We theoretically derive the connection between recurrence and attention. Then we propose the retention mechanism for sequence modeling, which supports three computation paradigms, i.e., parallel, recurrent, and chunkwise recurrent. Specifically, the parallel representation allows for training parallelism. The recurrent representation enables low-cost O(1) inference, which improves decoding throughput and latency and reduces GPU memory usage without sacrificing performance. The chunkwise recurrent representation facilitates efficient long-sequence modeling with linear complexity, where each chunk is encoded in parallel while the chunks are summarized recurrently. Experimental results on language modeling show that RetNet achieves favorable scaling results, parallel training, low-cost deployment, and efficient inference. These intriguing properties make RetNet a strong successor to Transformer for large language models. Code will be available at https://aka.ms/retnet
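    The parallel/recurrent duality the abstract describes can be illustrated with a minimal sketch: a simplified single-head retention with decay factor gamma (no scaling or normalization; the shapes and gamma value here are illustrative choices, not the paper's exact formulation). The parallel form computes all positions at once with a decay mask; the recurrent form carries a d×d state and produces the identical outputs one token at a time.

```python
import numpy as np

# Hedged sketch of retention's two equivalent forms (simplified: one head,
# no scaling/normalization; T, d, gamma are illustrative, not from the paper).
rng = np.random.default_rng(0)
T, d, gamma = 6, 4, 0.9
Q, K, V = (rng.standard_normal((T, d)) for _ in range(3))

# Parallel form (training): O = (Q K^T * D) V, with decay mask
# D[n, m] = gamma^(n-m) for n >= m, else 0.
n, m = np.arange(T)[:, None], np.arange(T)[None, :]
D = np.where(n >= m, gamma ** (n - m), 0.0)
O_parallel = (Q @ K.T * D) @ V

# Recurrent form (O(1)-per-token inference): carry a single d x d state S,
# so decoding cost does not grow with sequence length.
S = np.zeros((d, d))
O_recurrent = np.zeros((T, d))
for t in range(T):
    S = gamma * S + np.outer(K[t], V[t])  # decayed state update
    O_recurrent[t] = Q[t] @ S             # readout for token t

# Both forms produce the same outputs.
assert np.allclose(O_parallel, O_recurrent)
```

The chunkwise form the abstract mentions interpolates between these two: positions within a chunk use the parallel path, while a recurrent state summarizes previous chunks, giving linear complexity in sequence length.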

    Can LLMs like GPT-4 outperform traditional AI tools in dementia diagnosis? Maybe, but not today

    Recent investigations show that large language models (LLMs), specifically GPT-4, not only have remarkable capabilities in common Natural Language Processing (NLP) tasks but also exhibit human-level performance on various professional and academic benchmarks. However, whether GPT-4 can be directly used in practical applications and replace traditional artificial intelligence (AI) tools in specialized domains requires further experimental validation. In this paper, we explore the potential of LLMs such as GPT-4 to outperform traditional AI tools in dementia diagnosis. Comprehensive comparisons between GPT-4 and traditional AI tools are conducted to examine their diagnostic accuracy in a clinical setting. Experimental results on two real clinical datasets show that, although LLMs like GPT-4 demonstrate potential for future advancements in dementia diagnosis, they currently do not surpass the performance of traditional AI tools. The interpretability and faithfulness of GPT-4 are also evaluated by comparison with real doctors. We discuss the limitations of GPT-4 in its current state and propose future research directions to enhance GPT-4 in dementia diagnosis.
    Comment: 16 pages, 6 figures

    FlexKBQA: A Flexible LLM-Powered Framework for Few-Shot Knowledge Base Question Answering

    Knowledge base question answering (KBQA) is a critical yet challenging task due to the vast number of entities within knowledge bases and the diversity of natural language questions posed by users. Unfortunately, the performance of most KBQA models tends to decline significantly in real-world scenarios where high-quality annotated data is insufficient. To mitigate the burden associated with manual annotation, we introduce FlexKBQA, which utilizes Large Language Models (LLMs) as program translators to address the challenges inherent in the few-shot KBQA task. Specifically, FlexKBQA leverages automated algorithms to sample diverse programs, such as SPARQL queries, from the knowledge base, which are subsequently converted into natural language questions via LLMs. This synthetic dataset facilitates training a specialized lightweight model for the KB. Additionally, to reduce the barriers of distribution shift between synthetic data and real user questions, FlexKBQA introduces an execution-guided self-training method to iteratively leverage unlabeled user questions. Furthermore, we explore harnessing the inherent reasoning capability of LLMs to enhance the entire framework. Consequently, FlexKBQA delivers substantial flexibility, encompassing data annotation, deployment, and being domain agnostic. Through extensive experiments on GrailQA, WebQSP, and KQA Pro, we observe that under few-shot and even the more challenging zero-shot scenarios, FlexKBQA achieves impressive results with a few annotations, surpassing all previous baselines and even approaching the performance of supervised models, achieving a remarkable 93% performance relative to the fully-supervised models. We posit that FlexKBQA represents a significant advancement towards exploring better integration of large and lightweight models. The code is open-sourced.
    Comment: Accepted as AAAI-24 Oral paper; Knowledge Base Question Answering; Large Language Model; Data Generation; Few-Shot & Zero-Shot
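    The data-synthesis loop the abstract describes (sample KB programs, verbalize them with an LLM, train on the resulting pairs) can be sketched roughly as follows. Everything here is a hypothetical stand-in: `llm_translate` is a stub for the actual LLM call, and the template, relation, and entity pools are toy examples, not the paper's components.

```python
import random

# Toy pools standing in for programs sampled from a real knowledge base.
TEMPLATES = [
    "SELECT ?x WHERE {{ ?x <{rel}> <{ent}> }}",
    "SELECT (COUNT(?x) AS ?c) WHERE {{ ?x <{rel}> <{ent}> }}",
]
RELATIONS = ["capital_of", "author_of"]
ENTITIES = ["France", "Hamlet"]

def sample_program(rng):
    # Step 1: automatically sample a diverse, executable program (here a
    # SPARQL-style query) from the knowledge base.
    tpl = rng.choice(TEMPLATES)
    return tpl.format(rel=rng.choice(RELATIONS), ent=rng.choice(ENTITIES))

def llm_translate(program):
    # Step 2: an LLM would verbalize the program into a natural question.
    # Stubbed here; a real system would prompt an LLM with the program.
    return f"Natural-language question for: {program}"

def synthesize(n, seed=0):
    # Step 3: collect (question, program) pairs; these train a lightweight
    # KBQA model, and execution-guided self-training would then iterate
    # on unlabeled real user questions.
    rng = random.Random(seed)
    return [(llm_translate(p), p) for p in (sample_program(rng) for _ in range(n))]

pairs = synthesize(3)
```

The key design point the abstract highlights is the direction of translation: sampling programs first guarantees every synthetic question has an executable, verifiably correct logical form, sidestepping manual annotation entirely.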

    A Modeling Study of the Responses of Mesosphere and Lower Thermosphere Winds to Geomagnetic Storms at Middle Latitudes

    Thermosphere Ionosphere Mesosphere Electrodynamics General Circulation Model (TIMEGCM) simulations are diagnostically analyzed to investigate the causes of mesosphere and lower thermosphere (MLT) wind changes at middle latitudes during the 17 April 2002 storm. In the early phase of the storm, middle-latitude upper thermospheric wind changes are greater and occur earlier than MLT wind changes. The horizontal wind changes cause downward vertical wind changes, which are transmitted to the MLT region. Adiabatic heating and heat advection associated with downward vertical winds cause MLT temperature increases. The pressure gradient produced by these temperature changes and the Coriolis force then drive strong equatorward meridional wind changes at night, which expand toward lower latitudes. Momentum advection is minor. As the storm evolves, the enhanced MLT temperatures produce upward vertical winds. These upward winds then lead to a decreased temperature, which alters the MLT horizontal wind pattern and causes poleward wind disturbances at higher latitudes.