QUIK: Towards End-to-End 4-Bit Inference on Generative Large Language Models

Author: Alistarh Dan
Ashkboos Saleh
Frantar Elias
Hoefler Torsten
Markov Ilia
Ren Jie
Wang Xincheng
Zhong Tingxuan
Publication date: 02/11/2023

Large Language Models (LLMs) from the GPT family have become extremely popular, leading to a race towards reducing their inference costs to allow for efficient local computation. Yet, the vast majority of existing work focuses on weight-only quantization, which can reduce runtime costs in the memory-bound one-token-at-a-time generative setting, but does not address them in compute-bound scenarios, such as batched inference or prompt processing. In this paper, we address the general quantization problem, where both weights and activations should be quantized. We show, for the first time, that the majority of inference computations for large generative models such as LLaMA, OPT, and Falcon can be performed with both weights and activations being cast to 4 bits, in a way that leads to practical speedups, while at the same time maintaining good accuracy. We achieve this via a hybrid quantization strategy called QUIK, which compresses most of the weights and activations to 4-bit, while keeping some outlier weights and activations in higher-precision. The key feature of our scheme is that it is designed with computational efficiency in mind: we provide GPU kernels matching the QUIK format with highly-efficient layer-wise runtimes, which lead to practical end-to-end throughput improvements of up to 3.4x relative to FP16 execution. We provide detailed studies for models from the OPT, LLaMA-2 and Falcon families, as well as a first instance of accurate inference using quantization plus 2:4 sparsity. Code is available at: https://github.com/IST-DASLab/QUIK.
Comment: 16 page

arXiv.org e-Print Archive
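
The hybrid idea in the abstract above (most values stored at 4 bits, a few outliers kept in higher precision) can be illustrated with a small sketch. This is not the QUIK implementation or its GPU kernels. The outlier rule (largest magnitude), the single symmetric scale, and the names `quantize_with_outliers` and `dequantize` are all assumptions made for illustration.

```python
def quantize_with_outliers(values, num_outliers=1, bits=4):
    """Round-to-nearest symmetric quantization that leaves outliers untouched.

    Assumption: outliers are simply the largest-magnitude entries.
    """
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    outlier_idx = set(order[:num_outliers])

    qmax = 2 ** (bits - 1) - 1  # 7 for signed 4-bit codes
    inlier_max = max(
        (abs(v) for i, v in enumerate(values) if i not in outlier_idx),
        default=0.0,
    )
    # One scale for all inliers; fall back to 1.0 if every inlier is zero.
    scale = inlier_max / qmax if inlier_max > 0 else 1.0

    codes = {}
    for i, v in enumerate(values):
        if i not in outlier_idx:
            q = round(v / scale)
            codes[i] = max(-qmax - 1, min(qmax, q))  # clamp to the signed range

    outliers = {i: values[i] for i in outlier_idx}  # kept in full precision
    return codes, outliers, scale


def dequantize(codes, outliers, scale, length):
    """Rebuild an approximate vector from integer codes plus stored outliers."""
    return [outliers[i] if i in outliers else codes[i] * scale for i in range(length)]


values = [0.1, -0.2, 0.35, 8.0]
codes, outliers, scale = quantize_with_outliers(values, num_outliers=1)
restored = dequantize(codes, outliers, scale, len(values))
```

Because the value 8.0 is stored separately, the scale is set by the remaining small entries, so they are not crushed to zero as they would be if 8.0 determined the 4-bit range.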

Lithofacies logging identification for strongly heterogeneous deep-buried reservoirs based on improved Bayesian inversion: The Lower Jurassic sandstone, Central Junggar Basin, China

Author: Chao Li
Lan Yu
Likuan Zhang
Ming Cheng
Naigui Liu
Wenxiu Yang
Xincheng Ren
Yuhong Lei
Zengbao Zhang
Zhiping Zeng
Zongyuan Zheng
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2023

The strong heterogeneity characteristics of deep-buried clastic low-permeability reservoirs may lead to great risks in hydrocarbon exploration and development, which makes the accurate identification of reservoir lithofacies crucial for improving the obtained exploration results. Due to the very limited core data acquired from deep drilling, lithofacies logging identification has become the most important method for comprehensively obtaining the rock information of deep-buried reservoirs and is a fundamental task for carrying out reservoir characterization and geological modeling. In this study, a machine learning method is introduced to lithofacies logging identification, to explore an accurate lithofacies identification method for deep fluvial-delta sandstone reservoirs with frequent lithofacies changes. Here, the Sangonghe Formation in the Central Junggar Basin of China is taken as an example. The K-means-based synthetic minority oversampling technique (K-means SMOTE) is employed to solve the problem regarding the imbalanced lithofacies data categories used to calibrate logging data, and a probabilistic calibration method is introduced to correct the likelihood function. To address the situation in which traditional machine learning methods ignore the geological deposition process, we introduce a depositional prior for controlling the vertical spreading process based on a Markov chain and propose an improved Bayesian inversion process for training on the log data to identify lithofacies. The results of a series of experiments show that, compared with the traditional machine learning method, the new method improves the recognition accuracy by 20%, and the predicted petrographic vertical distribution results are consistent with geological constraints. In addition, SMOTE and probabilistic calibration can effectively handle data imbalance problems so that different categories can be adequately learned. Also, the introduction of the geological prior has a positive impact on the overall distribution, which significantly improves the accuracy and recall rate of the method. According to this comprehensive analysis, the proposed method greatly enhanced the identification of the lithofacies distributions in the Sangonghe Formation. Therefore, this method can provide a tool for logging lithofacies interpretation of deep and strongly heterogeneous clastic reservoirs in fluvial-delta and other depositional environments.

Directory of Open Access Journals
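
The abstract above combines a Markov-chain depositional prior with per-depth likelihoods from log data. A generic way to picture that combination is a forward Bayesian filter along the well, sketched below. This is not the authors' inversion procedure. The function name `markov_bayes_filter`, the two-facies setup, and every number in the example are hypothetical.

```python
def markov_bayes_filter(likelihoods, transition, initial_prior):
    """Forward filtering of facies probabilities along a depth sequence.

    likelihoods[k][j]: probability of the log reading at depth step k given facies j.
    transition[i][j]:  Markov probability of facies j following facies i vertically.
    """
    n = len(initial_prior)
    prior = list(initial_prior)
    posteriors = []
    for lik in likelihoods:
        # Bayes' rule: posterior is proportional to prior times likelihood.
        unnormalized = [p * l for p, l in zip(prior, lik)]
        total = sum(unnormalized)
        if total == 0:
            raise ValueError("prior and likelihood have no overlapping support")
        posterior = [u / total for u in unnormalized]
        posteriors.append(posterior)
        # The Markov chain turns this posterior into the prior for the next depth step.
        prior = [sum(posterior[i] * transition[i][j] for i in range(n)) for j in range(n)]
    return posteriors


# Hypothetical example with two facies (index 0 and index 1).
transition = [[0.9, 0.1], [0.2, 0.8]]
likelihoods = [[0.8, 0.2], [0.5, 0.5], [0.3, 0.7]]
posteriors = markov_bayes_filter(likelihoods, transition, [0.5, 0.5])
```

At the second depth step the log reading is uninformative (equal likelihoods), so the estimate falls back entirely on the Markov prior carried over from the step before it. That is the sense in which a depositional prior can constrain the vertical sequence.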

Design and Optimization of a Pressure Sensor Based on Serpentine-Shaped Graphene Piezoresistors for Measuring Low Pressure

Author: Xincheng Ren
Publication venue: 'MDPI AG'
Publication date: 30/06/2022

This thesis describes a novel microelectromechanical system (MEMS) piezoresistive pressure sensor based on serpentine-shaped graphene piezoresistors paired with trapezoidal prisms under the diaphragm for measuring low pressure. The finite element method (FEM) is utilized to analyze the mechanical stress and membrane deflection to enhance the degree of stress concentration in this unique sensor. The functional relationship between mechanical performance and dimension variables is established after using the curve fitting approach to handle the stress and deflection. Additionally, the Taguchi optimization method is employed to identify the best dimensions for the proposed structure. Then, the suggested design is compared to the other three designs in terms of operating performance. It is revealed that the recommended sensor can significantly improve sensitivity while maintaining extremely low nonlinearity. In this study, three different types of serpentine-shaped graphene piezoresistors are also designed, and their sensing capability is compared to silicon. The simulation results indicate that the pressure sensor with Type 2 graphene piezoresistors has a maximum sensitivity of 24.50 mV/psi and ultra-low nonlinearity of 0.06% FSS in the pressure range of 0–3 psi.

Multidisciplinary Digital Publishing Institute

Assessment of DBA-L Pressure Vessel Design Method by a Cylindrical Vessel with Hemispherical Ends

Author: Huang Xun
Li Hongjun
Xincheng Ren
Publication venue: 'EDP Sciences'
Publication date: 01/01/2019

Stress categorization is an essential procedure in Design by Analysis (DBA) pressure vessel design methods based on elastic analysis in the ASME and EN codes. It is difficult to implement, especially around structural discontinuities. A new elastic analysis method, DBA-L, was proposed recently to avoid stress categorization. A model of a cylindrical pressure vessel with hemispherical ends is used to check the validity of this method by comparing it with other design methods based on stress categorization procedures and elastic-plastic stress analysis from the ASME and EN codes. The results indicate that DBA-L is an economic and explicit method and can be used as an alternative to stress categorization.

Directory of Open Access Journals