Density Matters: Improved Core-set for Active Domain Adaptive Segmentation
Active domain adaptation has emerged as a solution to balance the expensive
annotation cost and the performance of trained models in semantic segmentation.
However, existing works usually ignore the correlation between selected samples
and their local context in feature space, which leads to suboptimal use of
annotation budgets. In this work, we revisit the theoretical bound of the
classical Core-set method and identify that the performance is closely related
to the local sample distribution around selected samples. To estimate the
density of local samples efficiently, we introduce a local proxy estimator with
Dynamic Masked Convolution and develop a Density-aware Greedy algorithm to
optimize the bound. Extensive experiments demonstrate the superiority of our
approach. Moreover, with very few labels, our scheme achieves comparable
performance to the fully supervised counterpart.
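As a rough illustration of the idea (not the paper's exact method), the sketch below weights the classical k-center greedy criterion by a local density estimate; the k-NN density proxy and the `density_aware_greedy` helper are illustrative assumptions, not the Dynamic Masked Convolution estimator described above.

```python
import numpy as np

def density_aware_greedy(features, densities, budget):
    """Greedy core-set selection biased toward dense regions (illustrative sketch).

    features:  (N, D) array of per-sample feature vectors.
    densities: (N,) array of local density estimates (higher = denser).
    budget:    number of samples to select.
    """
    selected = [int(np.argmax(densities))]  # start from the densest sample
    # distance of every point to its nearest selected centre
    dist = np.linalg.norm(features - features[selected[0]], axis=1)

    while len(selected) < budget:
        # weight the classical k-center coverage criterion by local density,
        # so uncovered points in dense neighbourhoods are preferred
        score = dist * densities
        score[selected] = -np.inf
        nxt = int(np.argmax(score))
        selected.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(features - features[nxt], axis=1))
    return selected

# toy usage: 200 random feature vectors, density from a simple k-NN estimate
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))
knn_dist = np.sort(np.linalg.norm(X[:, None] - X[None], axis=-1), axis=1)[:, 1:6]
density = 1.0 / (knn_dist.mean(axis=1) + 1e-8)
print(density_aware_greedy(X, density, budget=10))
```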
Model Compression and Efficient Inference for Large Language Models: A Survey
Transformer-based large language models have achieved tremendous success.
However, the significant memory and computational costs incurred during the
inference process make it challenging to deploy large models on
resource-constrained devices. In this paper, we investigate compression and
efficient inference methods for large language models from an algorithmic
perspective. Regarding taxonomy, similar to smaller models, compression and
acceleration algorithms for large language models can still be categorized into
quantization, pruning, distillation, compact architecture design, and dynamic
networks. However, large language models have two prominent characteristics
compared to smaller models: (1) Most compression algorithms require
finetuning or even retraining the model after compression. The most notable
aspect of large models is the very high cost associated with model finetuning
or training. Therefore, many algorithms for large models, such as quantization
and pruning, start to explore tuning-free algorithms. (2) Large models
emphasize versatility and generalization rather than performance on a single
task. Hence, many algorithms, such as knowledge distillation, focus on how to
preserve their versatility and generalization after compression. Since these
two characteristics were not very pronounced in early large models, we further
divide large language models into medium models and "real" large models.
Additionally, we provide an introduction to some mature frameworks for
efficient inference of large models, which can support basic compression or
acceleration algorithms, greatly facilitating model deployment for users.
Comment: 47 pages, reviews 380 papers. The work is ongoing.
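As a concrete, deliberately generic illustration of the tuning-free direction mentioned in point (1), the sketch below performs symmetric per-channel INT8 post-training quantization of a single weight matrix with no finetuning; it is not drawn from any specific method covered by the survey.

```python
import torch

def quantize_per_channel_int8(weight: torch.Tensor):
    """Symmetric per-output-channel INT8 quantization of a weight matrix.

    weight: (out_features, in_features) float tensor.
    Returns the int8 weights and per-channel scales needed to dequantize.
    """
    # one scale per output channel, chosen so the largest magnitude maps to 127
    max_abs = weight.abs().amax(dim=1, keepdim=True).clamp(min=1e-8)
    scale = max_abs / 127.0
    q = torch.clamp(torch.round(weight / scale), -128, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # recover an approximate float weight for inspection or fake-quant inference
    return q.float() * scale

# toy usage: quantize one linear layer's weights without any finetuning
w = torch.randn(256, 512)
q, s = quantize_per_channel_int8(w)
err = (dequantize(q, s) - w).abs().mean()
print(f"mean absolute quantization error: {err.item():.5f}")
```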
You Only Need 90K Parameters to Adapt Light: A Light Weight Transformer for Image Enhancement and Exposure Correction
Challenging illumination conditions (low-light, under-exposure and
over-exposure) in the real world not only cast an unpleasant visual appearance
but also degrade computer vision tasks. After the camera captures the raw-RGB
data, it renders standard sRGB images with the image signal processor (ISP). By
decomposing ISP pipeline into local and global image components, we propose a
lightweight, fast Illumination Adaptive Transformer (IAT) to restore the
normally lit sRGB image from either low-light or under/over-exposure conditions.
Specifically, IAT uses attention queries to represent and adjust the
ISP-related parameters such as colour correction and gamma correction. With only
~90k parameters and ~0.004 s processing time, our IAT consistently achieves
superior performance over SOTA on the current benchmark low-light enhancement
and exposure correction datasets. Competitive experimental performance also
demonstrates that our IAT significantly enhances object detection and semantic
segmentation tasks under various light conditions. Training code and pretrained
model is available at
https://github.com/cuiziteng/Illumination-Adaptive-Transformer.Comment: 23 page
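To picture the global, ISP-related parameters the abstract refers to, here is a minimal sketch that applies a 3x3 colour correction matrix and a scalar gamma to an image; the parameter values and the `apply_global_isp` helper are illustrative assumptions, not IAT's learned attention queries.

```python
import numpy as np

def apply_global_isp(img, ccm, gamma):
    """Apply a global colour correction matrix and gamma curve to an image.

    img:   (H, W, 3) float array in [0, 1].
    ccm:   (3, 3) colour correction matrix (e.g. predicted by a small network).
    gamma: scalar gamma value (e.g. predicted alongside the matrix).
    """
    out = img.reshape(-1, 3) @ ccm.T       # per-pixel linear colour transform
    out = np.clip(out, 0.0, 1.0)
    out = out ** gamma                      # global tone / exposure adjustment
    return out.reshape(img.shape)

# toy usage: brighten a stand-in for an under-exposed image with illustrative values
rng = np.random.default_rng(0)
dark = rng.uniform(0.0, 0.2, size=(64, 64, 3))
identity_ccm = np.eye(3)
brightened = apply_global_isp(dark, identity_ccm, gamma=0.45)
print(dark.mean(), brightened.mean())
```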
Improving the performance of models for one-step retrosynthesis through re-ranking
Retrosynthesis is at the core of organic chemistry. Recently, the rapid growth of artificial intelligence (AI) has spurred a variety of novel machine learning approaches for data-driven synthesis planning. These methods learn complex patterns from reaction databases in order to predict, for a given product, sets of reactants that can be used to synthesise that product. However, their performance, as measured by top-N accuracy in matching published reaction precedents, still leaves room for improvement. This work aims to enhance these models by learning to re-rank their reactant predictions. Specifically, we design and train an energy-based model that, for each product, learns to rank the published reactants as the top suggestion and the remaining reactant predictions lower. We show that re-ranking can significantly improve one-step models on the standard USPTO-50k benchmark dataset: RetroSim, a similarity-based method, improves from 35.7% to 51.8% top-1 accuracy, and NeuralSym, a deep learning method, from 45.7% to 51.3%. We also show that re-ranking the union of two models' suggestions can lead to better performance than either model alone. However, this method does not improve the state-of-the-art top-1 accuracy.
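As a sketch of the re-ranking step, assuming a trained energy model exposed as a simple callable (the `energy_model` interface and the string-overlap dummy below are hypothetical stand-ins, not the paper's implementation), candidate reactant sets can be scored and sorted by ascending energy.

```python
from typing import Callable, List, Tuple

def rerank_candidates(product: str,
                      candidates: List[str],
                      energy_model: Callable[[str, str], float]) -> List[Tuple[str, float]]:
    """Re-rank reactant predictions for one product by ascending energy.

    product:      product SMILES string.
    candidates:   reactant-set SMILES strings proposed by a one-step model.
    energy_model: callable returning a scalar energy for a (product, reactants)
                  pair; lower energy means a more plausible precedent.
    """
    scored = [(reactants, energy_model(product, reactants)) for reactants in candidates]
    return sorted(scored, key=lambda pair: pair[1])

# toy usage with a dummy energy function (character-overlap heuristic, purely illustrative)
def dummy_energy(product: str, reactants: str) -> float:
    return -float(len(set(product) & set(reactants)))

print(rerank_candidates("CCO", ["CC=O", "CCBr", "C#N"], dummy_energy))
```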