
    Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer

    Large language models (LLMs) such as T0, FLAN, and OPT-IML excel at multi-tasking under a unified instruction-following paradigm and also generalize remarkably well to unseen tasks. Despite their impressive performance, these LLMs, which range from several billion to hundreds of billions of parameters, demand substantial computational resources, making their training and inference expensive and inefficient. Furthermore, adapting these models to downstream applications, particularly complex tasks, is often infeasible due to the extensive hardware requirements for finetuning, even when parameter-efficient approaches such as prompt tuning are used. Additionally, the most powerful multi-task LLMs, such as OPT-IML-175B and FLAN-PaLM-540B, are not publicly accessible, severely limiting their customization potential. To address these challenges, we introduce a pretrained small scorer, Cappy, designed to enhance the performance and efficiency of multi-task LLMs. With merely 360 million parameters, Cappy either functions independently on classification tasks or serves as an auxiliary component that boosts LLM performance. Moreover, Cappy enables downstream supervision to be integrated efficiently without requiring LLM finetuning or access to LLM parameters. Our experiments demonstrate that, when working independently on 11 language understanding tasks from PromptSource, Cappy outperforms LLMs that are several orders of magnitude larger. In addition, on 45 complex tasks from BIG-Bench, Cappy boosts the performance of the advanced multi-task LLM FLAN-T5 by a large margin. Furthermore, Cappy can flexibly cooperate with other LLM adaptations, including finetuning and in-context learning, offering additional performance enhancement.
    Comment: In the proceedings of NeurIPS 2023; code and model available at https://github.com/tanyuqian/cappy and https://huggingface.co/btan2/cappy-large, respectively
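    As a rough illustration of how such a scorer might be used, the sketch below loads the linked Hugging Face checkpoint and reranks candidate LLM outputs by their Cappy score. The loading call, the tokenization of (instruction, candidate) pairs, and the single-logit regression head are assumptions based on the abstract and the linked pages, not a confirmed API; consult the repository for the exact interface.

```python
# Hypothetical sketch: rerank candidate answers from a larger multi-task LLM
# with a small scorer such as Cappy. The checkpoint name comes from the links
# above; the exact loading call and head shape are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("btan2/cappy-large")
scorer = AutoModelForSequenceClassification.from_pretrained("btan2/cappy-large")
scorer.eval()

instruction = "Summarize: The meeting covered budget cuts and a new hiring freeze."
candidates = [  # e.g. sampled from FLAN-T5 or another multi-task LLM
    "The meeting discussed budget cuts and a hiring freeze.",
    "The weather was pleasant during the meeting.",
]

with torch.no_grad():
    # Score each (instruction, candidate) pair; assume one regression logit each.
    inputs = tokenizer([instruction] * len(candidates), candidates,
                       return_tensors="pt", padding=True, truncation=True)
    scores = scorer(**inputs).logits.squeeze(-1)

print(candidates[int(scores.argmax())])  # keep the highest-scoring candidate
```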

    Redco: A Lightweight Tool to Automate Distributed Training of LLMs on Any GPU/TPUs

    The recent progress of AI can be largely attributed to large language models (LLMs). However, their escalating memory requirements introduce challenges for machine learning (ML) researchers and engineers. Addressing this requires developers to partition a large model and distribute it across multiple GPUs or TPUs, which demands considerable coding and intricate configuration with existing model-parallel tools such as Megatron-LM, DeepSpeed, and Alpa. These tools require expertise in machine learning systems (MLSys), creating a bottleneck in LLM development, particularly for developers without an MLSys background. In this work, we present Redco, a lightweight and user-friendly tool crafted to automate distributed training and inference for LLMs, as well as to simplify ML pipeline development. The design of Redco emphasizes two key aspects. First, to automate model parallelism, our study identifies two straightforward rules that generate tensor-parallel strategies for any given LLM. Integrating these rules into Redco enables effortless distributed LLM training and inference, eliminating the need for additional coding or complex configuration. We demonstrate this effectiveness by applying Redco to a set of LLM architectures, such as GPT-J, LLaMA, T5, and OPT, up to 66B parameters. Second, we propose a mechanism that allows diverse ML pipelines to be customized by defining merely three functions, eliminating redundant and formulaic code such as multi-host processing. This mechanism proves adaptable across a spectrum of ML algorithms, from foundational language modeling to complex algorithms like meta-learning and reinforcement learning. Consequently, Redco implementations contain far fewer lines of code than their official counterparts. The sketch below illustrates the three-function idea.
    Comment: Released under Apache License 2.0 at https://github.com/tanyuqian/redc
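    To make the "three functions" idea concrete, the toy example below specifies an entire (single-host, toy-scale) pipeline by writing only a collate function, a loss function, and a prediction function, while a generic driver loop stays task-agnostic. The function names and signatures are hypothetical and do not reproduce Redco's actual API; see the linked repository for the real interface.

```python
# Toy illustration of a pipeline defined by three user functions.
import numpy as np

def collate_fn(examples):
    # Turn raw examples into arrays the model consumes.
    x = np.array([e["x"] for e in examples], dtype=np.float32)
    y = np.array([e["y"] for e in examples], dtype=np.float32)
    return {"x": x, "y": y}

def loss_fn(params, batch):
    # Mean-squared error of a one-parameter linear model.
    pred = params["w"] * batch["x"]
    return float(np.mean((pred - batch["y"]) ** 2))

def pred_fn(params, batch):
    # Model predictions for a batch.
    return params["w"] * batch["x"]

def train(examples, params, lr=0.01, steps=200):
    # Generic driver: a real tool would also shard the model and data across
    # devices here; this toy version just runs plain gradient descent.
    batch = collate_fn(examples)
    for _ in range(steps):
        # Analytic gradient of the MSE loss with respect to w.
        grad_w = float(np.mean(2.0 * (params["w"] * batch["x"] - batch["y"]) * batch["x"]))
        params["w"] -= lr * grad_w
    return params

data = [{"x": float(i), "y": 3.0 * float(i)} for i in range(1, 6)]
params = train(data, {"w": 0.0})
print(round(params["w"], 2), loss_fn(params, collate_fn(data)))  # w approaches 3.0
```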

    A solvent evaporation route towards fabrication of hierarchically porous ZSM-11 with highly accessible mesopores

    A solvent evaporation route to generate an organosilane-modified dry gel and its transformation into hierarchically porous ZSM-11 is reported. The material features good pore connectivity and improved acid-site accessibility towards bulky substrates.

    Portability and networked learning environments

    The portability of educational software is defined as the likelihood of software usage, with or without adaptation, in an educational environment different from that for which it was originally designed and produced. Barriers and research relevant to the portability of electronic learning resources are discussed and organised into a portability-limiting factors model. With the increase in the number and scope of networked learning environments, portability issues take on a new dimension. Using electronic (study) books as an example, the portability problem space of networked learning environments is explored.

    Phase Stability of Hexagonal/cubic Boron Nitride Nanocomposites

    Boron nitride (BN) is an exceptional material, and among its polymorphs, the two-dimensional (2D) hexagonal and three-dimensional (3D) cubic phases (h-BN and c-BN) are the most common. The phase stability regimes of these BN phases are still under debate, and h-BN/c-BN phase transformations remain a topic of interest. Here, we investigate the phase stability of 2D/3D h-BN/c-BN nanocomposites and show that the co-existence of the two phases can lead to strong non-linear optical properties and low thermal conductivity at room temperature. Furthermore, spark-plasma sintering of the nanocomposite shows complete phase transformation to 2D h-BN with improved crystalline quality, where the 3D c-BN grain size governs the nucleation and growth kinetics. Our demonstration may inform phase engineering of BN-polymorph-based nanocomposites with desirable properties for optoelectronics and thermal energy management applications.
    Comment: 29 pages, 5 figures

    Cross-cultural portability of educational software: A communication-oriented approach
