    The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

    Recent research, such as BitNet, is paving the way for a new era of 1-bit Large Language Models (LLMs). In this work, we introduce a 1-bit LLM variant, namely BitNet b1.58, in which every single parameter (or weight) of the LLM is ternary {-1, 0, 1}. It matches the full-precision (i.e., FP16 or BF16) Transformer LLM with the same model size and training tokens in terms of both perplexity and end-task performance, while being significantly more cost-effective in terms of latency, memory, throughput, and energy consumption. More profoundly, the 1.58-bit LLM defines a new scaling law and recipe for training new generations of LLMs that are both high-performance and cost-effective. Furthermore, it enables a new computation paradigm and opens the door for designing specific hardware optimized for 1-bit LLMs.
    Comment: Work in progress
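    As a rough illustration of the core idea, here is a minimal Haskell sketch of absmean-style ternary quantization, in which weights are scaled by their mean absolute value and rounded into {-1, 0, 1}; the function names and the epsilon guard are illustrative assumptions, not code from the paper.

        -- Minimal sketch: quantize a weight tensor (here a flat list) to
        -- ternary values by scaling with the mean absolute weight (gamma)
        -- and rounding/clipping into {-1, 0, 1}. Illustrative only.
        quantizeTernary :: [Double] -> (Double, [Int])
        quantizeTernary ws = (gamma, map toTernary ws)
          where
            -- gamma: mean absolute value of the weights, used as the scale
            gamma = sum (map abs ws) / fromIntegral (length ws)
            -- small epsilon avoids division by zero for an all-zero tensor
            eps = 1e-8
            -- round w / gamma to the nearest integer, then clip to [-1, 1]
            toTernary w = max (-1) (min 1 (round (w / (gamma + eps))))

    For example, quantizeTernary [0.3, -1.2, 0.05] gives a scale of about 0.52 and the ternary weights [1, -1, 0].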

    A Framework for Generating Diverse Haskell-IO Exercise Tasks

    We present the design of a framework to automatically generate a large range of different exercise tasks on Haskell-I/O programming. Automatic task generation is useful in many different ways. Manual task creation is a time-consuming process, so automating it saves valuable time for the educator. Together with an automated assessment system, automatic task generation allows students to practice with as many exercise tasks as needed. Additionally, each student can be given a slightly different version of a task, reducing issues regarding plagiarism that arise naturally in an e-learning environment. Our task generation is centered around a specification language for I/O behavior that we developed in earlier work. The task generation framework, an EDSL in Haskell, provides powerful primitives for the creation of various artifacts, including program code, from specifications. We will not go into detail on the technical realization of these primitives. This article instead showcases how such artifacts and the framework as a whole can be used to build exercise task templates that can then be (randomly) instantiated.
    Comment: Part of WFLP 2020 pre-proceedings
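    The abstract does not show the framework's actual primitives, but a purely hypothetical Haskell sketch can illustrate the general shape of specification-driven, randomly instantiated task templates; the Spec type and all names below are invented for illustration (and randomRIO comes from the random package).

        import System.Random (randomRIO)

        -- Hypothetical toy specification of I/O behavior: a program that
        -- reads n integers and prints their sum.
        data Spec = ReadThenSumOf Int

        -- Render a task description from an instantiated specification.
        describeTask :: Spec -> String
        describeTask (ReadThenSumOf n) =
          "Write an IO program that reads " ++ show n
            ++ " integers (one per line) and prints their sum."

        -- Randomly instantiate the template, so each student can get a
        -- slightly different task variant.
        randomTask :: IO String
        randomTask = do
          n <- randomRIO (2, 5)
          pure (describeTask (ReadThenSumOf n))

    In the real framework, such specifications also drive other artifacts (e.g., sample solutions and automated assessment), which this sketch does not attempt to model.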