60 research outputs found

Multiobjective Gate Assignment Based on Passenger Walking Distance and Fairness

Author: Linyan Zeng
Yu Jiang
Yuxiao Luo
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2013

Passenger walking distance is an important index of airport service quality. How to shorten the walking distance and balance the airlines' service quality is the focus of much research on airport gate assignment problems. To address these problems of airport passenger service quality, an optimization gate assignment model is established. The gate assignment model is based on minimizing the total walking distance of all passengers and balancing the average walking distance of passengers among different airlines. Lingo is used in the simulation of a large airport gate assignment. Test results show that the optimization model can effectively reduce the average walking distance of passengers, increase the number of flights assigned to gates, balance airline service quality, and enhance the overall service level of airports and airlines. The model provides a reference for airport gate preassignment.

Crossref

Directory of Open Access Journals

Revisiting Parallel Context Windows: A Frustratingly Simple Alternative and Chain-of-Thought Deterioration

Author: Dong Yuxiao
Liu Xiao
Men Kaiwen
Tang Jie
Yang Kejuan
Zeng Aohan
Publication date: 24/05/2023

We identify two crucial limitations in the evaluation of the recent parallel-integrated method Parallel Context Windows (PCW), which extends the maximum context lengths of language models, e.g., 2048 for LLaMA, by harnessing window-wise attention and positional embedding techniques. We first show that a simple yet strong baseline, weighted sum ensemble, is missing for the in-context few-shot classification. Moreover, on more challenging Chain-of-Thought (CoT) reasoning (e.g., HotpotQA), PCW would present unexpected deterioration regarding question miscomprehension and false inference. Based on our findings, we suggest that the existing PCW design may not guarantee sufficient improvement and practicality in handling lengthy documents in real-world applications. More community efforts on enabling language models' long context understanding ability should be paid.

arXiv.org e-Print Archive

AgentTuning: Enabling Generalized Agent Abilities for LLMs

Author: Dong Yuxiao
Liu Mingdao
Liu Xiao
Lu Rui
Tang Jie
Wang Bowen
Zeng Aohan
Publication date: 22/10/2023

Open large language models (LLMs) with great performance in various tasks have significantly advanced the development of LLMs. However, they are far inferior to commercial models such as ChatGPT and GPT-4 when acting as agents to tackle complex tasks in the real world. These agent tasks employ LLMs as the central controller responsible for planning, memorization, and tool utilization, necessitating both fine-grained prompting methods and robust LLMs to achieve satisfactory performance. Though many prompting methods have been proposed to complete particular agent tasks, there is a lack of research focusing on improving the agent capabilities of LLMs themselves without compromising their general abilities. In this work, we present AgentTuning, a simple and general method to enhance the agent abilities of LLMs while maintaining their general LLM capabilities. We construct AgentInstruct, a lightweight instruction-tuning dataset containing high-quality interaction trajectories. We employ a hybrid instruction-tuning strategy by combining AgentInstruct with open-source instructions from general domains. AgentTuning is used to instruction-tune the Llama 2 series, resulting in AgentLM. Our evaluations show that AgentTuning enables LLMs' agent capabilities without compromising general abilities. The AgentLM-70B is comparable to GPT-3.5-turbo on unseen agent tasks, demonstrating generalized agent capabilities. We open source the AgentInstruct and AgentLM-7B, 13B, and 70B models at https://github.com/THUDM/AgentTuning, serving open and powerful alternatives to commercial LLMs for agent tasks. Comment: 31 pages

arXiv.org e-Print Archive

CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation

Author: Cheng Jiale
Dong Yuxiao
Feng Zhuoer
Huang Minlie
Ke Pei
Lei Xuanyu
Liu Xiao
Tang Jie
Wang Hongning
Wang Shengyuan
Wen Bosi
Zeng Aohan
Publication date: 30/11/2023

Since the natural language processing (NLP) community started to make large language models (LLMs), such as GPT-4, act as a critic to evaluate the quality of generated texts, most of them only train a critique generation model of a specific scale on specific datasets. We argue that a comprehensive investigation on the key factor of LLM-based evaluation models, such as scaling properties, is lacking, so that it is still inconclusive whether these models have potential to replace GPT-4's evaluation in practical scenarios. In this paper, we propose a new critique generation model called CritiqueLLM, which includes a dialogue-based prompting method for high-quality referenced / reference-free evaluation data. Experimental results show that our model can achieve comparable evaluation performance to GPT-4 especially in system-level correlations, and even outperform GPT-4 in 3 out of 8 tasks in a challenging reference-free setting. We conduct detailed analysis to show promising scaling properties of our model in the quality of generated critiques. We also demonstrate that our generated critiques can act as scalable feedback to directly improve the generation quality of LLMs. Comment: 18 pages, 5 figures

arXiv.org e-Print Archive

Grafted c-kit+/SSEA1− eye-wall progenitor cells delay retinal degeneration in mice by regulating neural plasticity and forming new graft-to-host synapses

Author: Chen Xi
Chen Zehua
Fu Caiyun
Li Zhengya
Liu Xiaoli
Xu Haiwei
Yin Zheng Qin
Zeng Yuxiao
Zhao Chen
Zou Ting
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 18/02/2017

Background: Despite diverse pathogenesis, the common pathological change observed in age-related macular degeneration and in most hereditary retinal degeneration (RD) diseases is photoreceptor loss. Photoreceptor replacement by cell transplantation may be a feasible treatment for RD. The major obstacles to clinical translation of stem cell-based cell therapy in RD remain the difficulty of obtaining sufficient quantities of appropriate and safe donor cells and the poor integration of grafted stem cell-derived photoreceptors into the remaining retinal circuitry. Methods: Eye-wall c-kit+/stage-specific embryonic antigen 1 (SSEA1)− cells were isolated via fluorescence-activated cell sorting, and their self-renewal and differentiation potential were detected by immunochemistry and flow cytometry in vitro. After labeling with quantum nanocrystal dots and transplantation into the subretinal space of rd1 RD mice, differentiation and synapse formation by daughter cells of the eye-wall c-kit+/SSEA1− cells were evaluated by immunochemistry and western blotting. Morphological changes of the inner retina of rd1 mice after cell transplantation were demonstrated by immunochemistry. Retinal function of rd1 mice that received cell grafts was tested via flash electroretinograms and the light/dark transition test. Results: Eye-wall c-kit+/SSEA1− cells were self-renewing and clonogenic, and they retained their proliferative potential through more than 20 passages. Additionally, eye-wall c-kit+/SSEA1− cells were capable of differentiating into multiple retinal cell types including photoreceptors, bipolar cells, horizontal cells, amacrine cells, Müller cells, and retinal pigment epithelium cells and of transdifferentiating into smooth muscle cells and endothelial cells in vitro. The levels of synaptophysin and postsynaptic density-95 in the retinas of eye-wall c-kit+/SSEA1− cell-transplanted rd1 mice were significantly increased at 4 weeks post transplantation. The c-kit+/SSEA1− cells were capable of differentiating into functional photoreceptors that formed new synaptic connections with recipient retinas in rd1 mice. Transplantation also partially corrected the abnormalities of inner retina of rd1 mice. At 4 and 8 weeks post transplantation, the rd1 mice that received c-kit+/SSEA1− cells showed significant increases in a-wave and b-wave amplitude and the percentage of time spent in the dark area. Conclusions: Grafted c-kit+/SSEA1− cells restored the retinal function of rd1 mice via regulating neural plasticity and forming new graft-to-host synapses. Electronic supplementary material: The online version of this article (doi:10.1186/s13287-016-0451-8) contains supplementary material, which is available to authorized users.

Harvard University - DASH

GLM-130B: An Open Bilingual Pre-trained Model

Author: Chen Wenguang
Ding Ming
Dong Yuxiao
Du Zhengxiao
Lai Hanyu
Liu Xiao
Ma Zixuan
Tam Weng Lam
Tang Jie
Wang Zihan
Xia Xiao
Xu Yifan
Xue Yufei
Yang Zhuoyi
Zeng Aohan
Zhai Jidong
Zhang Peng
Zheng Wendi
Publication date: 25/10/2023

We introduce GLM-130B, a bilingual (English and Chinese) pre-trained language model with 130 billion parameters. It is an attempt to open-source a 100B-scale model at least as good as GPT-3 (davinci) and unveil how models of such a scale can be successfully pre-trained. Over the course of this effort, we face numerous unexpected technical and engineering challenges, particularly on loss spikes and divergence. In this paper, we introduce the training process of GLM-130B including its design choices, training strategies for both efficiency and stability, and engineering efforts. The resultant GLM-130B model offers significant outperformance over GPT-3 175B (davinci) on a wide range of popular English benchmarks while the performance advantage is not observed in OPT-175B and BLOOM-176B. It also consistently and significantly outperforms ERNIE TITAN 3.0 260B -- the largest Chinese language model -- across related benchmarks. Finally, we leverage a unique scaling property of GLM-130B to reach INT4 quantization without post training, with almost no performance loss, making it the first among 100B-scale models and more importantly, allowing its effective inference on 4× RTX 3090 (24G) or 8× RTX 2080 Ti (11G) GPUs, the most affordable GPUs required for using 100B-scale models. The GLM-130B model weights are publicly accessible and its code, training logs, related toolkit, and lessons learned are open-sourced at https://github.com/THUDM/GLM-130B/. Comment: Accepted to ICLR 202

arXiv.org e-Print Archive

AgentBench: Evaluating LLMs as Agents

Author: Deng Xiang
Ding Hangliang
Dong Yuxiao
Du Zhengxiao
Gu Yu
Huang Minlie
Lai Hanyu
Lei Xuanyu
Liu Xiao
Men Kaiwen
Shen Sheng
Su Yu
Sun Huan
Tang Jie
Xu Yifan
Yang Kejuan
Yu Hao
Zeng Aohan
Zhang Chenhui
Zhang Hanchen
Zhang Shudan
Zhang Tianjun
Publication date: 25/10/2023

Large Language Models (LLMs) are becoming increasingly smart and autonomous, targeting real-world pragmatic missions beyond traditional NLP tasks. As a result, there has been an urgent need to evaluate LLMs as agents on challenging tasks in interactive environments. We present AgentBench, a multi-dimensional evolving benchmark that currently consists of 8 distinct environments to assess LLM-as-Agent's reasoning and decision-making abilities in a multi-turn open-ended generation setting. Our extensive test over 27 API-based and open-sourced (OSS) LLMs shows that, while top commercial LLMs present a strong ability of acting as agents in complex environments, there is a significant disparity in performance between them and OSS competitors. We identify the typical reasons of failures in environments and LLMs, showing that poor long-term reasoning, decision-making, and instruction following abilities are the main obstacles for developing usable LLM agents. Training on code and high quality multi-turn alignment data could improve agent performance. Datasets, environments, and an integrated evaluation package for AgentBench are released at https://github.com/THUDM/AgentBench. Comment: 55 pages

arXiv.org e-Print Archive

Optimizing interplanar spacing, oxygen vacancies and micromorphology via lithium-ion pre-insertion into ammonium vanadate nanosheets for advanced cathodes in aqueous zinc-ion batteries

Author: Chen Ji
Chen Yuxiang
Lam Kwok-Ho
Li Yangjie
Lin Dunmin
Tan Xin
Wu Xingqiao
Zeng Yuxiao
Zhai Yijun
Zhang Xiaoqin
Zhang Xiaoyue
Zheng Qiaoji
Publication venue: Wiley
Publication date: 11/02/2024

Ammonium vanadates, featuring an N─H···O hydrogen bond network structure between NH4+ and V─O layers, have become popular cathode materials for aqueous zinc-ion batteries (AZIBs). Their appeal lies in their multi-electron transfer, high specific capacity, and facile synthesis. However, a major drawback arises as Zn2+ ions tend to form bonds with electronegative oxygen atoms between V─O layers during cycling, leading to irreversible structural collapse. Herein, Li+ pre-insertion into the intermediate layer of NH4V4O10 is proposed to enhance the electrochemical activity of ammonium vanadate cathodes for AZIBs, which extends the interlayer distance of NH4V4O10 to 9.8 Å and offers large interlaminar channels for Zn2+ (de)intercalation. Moreover, Li+ intercalation weakens the crystallinity, transforms the micromorphology from non-nanostructured strips to ultrathin nanosheets, and increases the level of oxygen defects, thus exposing more active sites for ion and electron transport, facilitating electrolyte penetration, and improving electrochemical kinetics of the electrode. In addition, the introduction of Li+ significantly reduces the bandgap by 0.18 eV, enhancing electron transfer in redox reactions. Leveraging these unique advantages, the Li+ pre-intercalated NH4V4O10 cathode exhibits a high reversible capacity of 486.1 mAh g−1 at 0.5 A g−1 and an impressive capacity retention rate of 72% after 5,000 cycles at 5 A g−1.

Enlighten

ChatRadio-Valuer: A Chat Large Language Model for Generalizable Radiology Report Generation Based on Multi-institution and Multi-system Data

Author: Cai Xiaoyan
Chen Hao
Dai Haixing
Dong Peixin
Guo Lei
Han Junwei
He Lei
Hu Xintao
Jiang Xi
Jiang Zuowei
Kui Xiaoyan
Li Ming
Li Xiang
Li Yiwei
Liu Jun
Liu Tianming
Liu Yuxiao
Liu Zhengliang
Pan Yi
Shang Youlan
Shen Dinggang
Wang Jiaqi
Wang Yisong
Wei Yaonai
Wu Zihao
Yang Li
Yang Longtao
Yao Jiaqi
Zeng Ying
Zhang Lu
Zhang Shu
Zhang Tuo
Zhang Xin
Zhang Yutong
Zhang Zhixue
Zhao Huan
Zhao Shijie
Zhao Wei
Zheng Chao
Zhong Tianyang
Zhu Dajiang
Zhu Ning
Publication date: 09/10/2023

Radiology report generation, as a key step in medical image analysis, is critical to the quantitative analysis of clinically informed decision-making levels. However, complex and diverse radiology reports with cross-source heterogeneity pose a huge generalizability challenge to the current methods under massive data volume, mainly because the style and normativity of radiology reports are obviously distinctive among institutions, body regions inspected and radiologists. Recently, the advent of large language models (LLM) offers great potential for recognizing signs of health conditions. To resolve the above problem, we collaborate with the Second Xiangya Hospital in China and propose ChatRadio-Valuer based on the LLM, a tailored model for automatic radiology report generation that learns generalizable representations and provides a basis pattern for model adaptation in sophisticated analysts' cases. Specifically, ChatRadio-Valuer is trained based on the radiology reports from a single institution by means of supervised fine-tuning, and then adapted to disease diagnosis tasks for human multi-system evaluation (i.e., chest, abdomen, muscle-skeleton, head, and maxillofacial & neck) from six different institutions in clinical-level events. The clinical dataset utilized in this study encompasses a remarkable total of 332,673 observations. From the comprehensive results on engineering indicators, clinical efficacy and deployment cost metrics, it can be shown that ChatRadio-Valuer consistently outperforms state-of-the-art models, especially ChatGPT (GPT-3.5-Turbo) and GPT-4 et al., in terms of the diseases diagnosis from radiology reports. ChatRadio-Valuer provides an effective avenue to boost model generalization performance and alleviate the annotation workload of experts to enable the promotion of clinical AI applications in radiology reports.

arXiv.org e-Print Archive

Research on Reflective Semiconductor Optical Amplifier and Its Application in Wavelength Division Multiplexed Passive Optical Network

Author: Zeng Yuxiao

Institutional Repositories DataBase (IRDB)