33 research outputs found

    Towards Vision Enhancing LLMs: Empowering Multimodal Knowledge Storage and Sharing in LLMs

    Recent advancements in multimodal large language models (MLLMs) have achieved significant multimodal generation capabilities, akin to GPT-4. These models predominantly map visual information into the language representation space, leveraging the vast knowledge and powerful text generation abilities of LLMs to produce multimodal instruction-following responses. This method could be termed LLMs for Vision, since it employs LLMs for visual-language understanding; yet we observe that such MLLMs neglect the potential of harnessing visual knowledge to enhance the overall capabilities of LLMs, which could be regarded as Vision Enhancing LLMs. In this paper, we propose an approach called MKS2, aimed at enhancing LLMs by empowering Multimodal Knowledge Storage and Sharing in LLMs. Specifically, we introduce the Modular Visual Memory, a component integrated into the internal blocks of LLMs and designed to store open-world visual information efficiently. Additionally, we present a soft Mixture-of-Multimodal-Experts architecture in LLMs to invoke multimodal knowledge collaboration during generation. Our comprehensive experiments demonstrate that MKS2 substantially augments the reasoning capabilities of LLMs in contexts that require physical or commonsense knowledge, and it also delivers competitive results on multimodal benchmarks.
    Comment: 12 pages, 4 figures
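    The soft expert-mixing idea described above can be illustrated with a toy sketch: a learned gate softly blends the LLM's original feed-forward path (a "textual expert") with a lookup into a small key/value store standing in for the Modular Visual Memory. All sizes, parameters, and function names below are invented placeholders under those assumptions, not the paper's actual architecture.

```python
import math
import random

random.seed(0)
D, K = 4, 3   # toy hidden size and number of visual-memory slots

def rand_mat(rows, cols):
    return [[random.gauss(0, 0.3) for _ in range(cols)] for _ in range(rows)]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

# Hypothetical parameters: the LLM's original FFN acts as a "textual
# expert"; a small key/value store stands in for the Modular Visual Memory.
W_ffn = rand_mat(D, D)
mem_keys = rand_mat(K, D)
mem_vals = rand_mat(K, D)
W_gate = rand_mat(2, D)          # soft router over the two experts

def textual_expert(h):
    return [max(x, 0.0) for x in matvec(W_ffn, h)]   # plain ReLU FFN

def visual_expert(h):
    attn = softmax(matvec(mem_keys, h))              # address visual memory
    return [sum(a * mem_vals[k][j] for k, a in enumerate(attn))
            for j in range(D)]                       # retrieved knowledge

def soft_moe(h):
    g_text, g_vis = softmax(matvec(W_gate, h))       # soft expert weights
    t, v = textual_expert(h), visual_expert(h)
    return [g_text * ti + g_vis * vi for ti, vi in zip(t, v)]

h = [random.gauss(0, 1) for _ in range(D)]           # one toy token state
out = soft_moe(h)
print(len(out))   # 4
```

    Because the gate is a softmax rather than a hard argmax, both knowledge sources contribute to every token, which is what "soft" mixing means here.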

    Generative Model for Models: Rapid DNN Customization for Diverse Tasks and Resource Constraints

    Unlike cloud-based deep learning models, which are often large and uniform, edge-deployed models usually demand customization for domain-specific tasks and resource-limited environments. Such customization can be costly and time-consuming due to the diversity of edge scenarios and the training load for each scenario. Although various approaches have been proposed for rapid resource-oriented customization and task-oriented customization respectively, achieving both at the same time is challenging. Drawing inspiration from generative AI and the modular composability of neural networks, we introduce NN-Factory, a one-for-all framework that generates customized lightweight models for diverse edge scenarios. The key idea is to use a generative model to directly produce the customized models instead of training them. The main components of NN-Factory are a modular supernet with pretrained modules that can be conditionally activated to accomplish different tasks, and a generative module assembler that manipulates the modules according to task and sparsity requirements. Given an edge scenario, NN-Factory can efficiently customize a compact model specialized for the edge task while satisfying the edge resource constraints, by searching for the optimal strategy to assemble the modules. In experiments on image classification and object detection tasks with different edge devices, NN-Factory generates high-quality task- and resource-specific models within a few seconds, faster than conventional model customization approaches by orders of magnitude.
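    The assemble-under-a-budget step can be sketched as follows. This is not NN-Factory's generative assembler (which is a learned model); a simple greedy search stands in for it, and the module names, FLOP costs, and task scores are invented for illustration.

```python
# Toy supernet: pretrained modules with an (invented) resource cost and
# an (invented) usefulness score for the target task.
modules = {
    "stem":    {"flops": 10, "score": 5.0},
    "block_a": {"flops": 30, "score": 4.0},
    "block_b": {"flops": 20, "score": 3.5},
    "block_c": {"flops": 40, "score": 6.0},
    "head":    {"flops": 5,  "score": 2.0},
}

def assemble(budget, required=("stem", "head")):
    """Greedy stand-in for the generative assembler: activate the
    modules with the best task-score per FLOP within the budget."""
    chosen = list(required)
    spent = sum(modules[m]["flops"] for m in chosen)
    optional = sorted(
        (m for m in modules if m not in chosen),
        key=lambda m: modules[m]["score"] / modules[m]["flops"],
        reverse=True,
    )
    for m in optional:
        if spent + modules[m]["flops"] <= budget:
            chosen.append(m)
            spent += modules[m]["flops"]
    return chosen, spent

plan, cost = assemble(budget=60)
print(plan, cost)   # ['stem', 'head', 'block_b'] 35
```

    The key property mirrored here is that task requirements (scores) and resource constraints (the FLOP budget) are satisfied in a single assembly pass, rather than by retraining.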

    LUT-NN: Empower Efficient Neural Network Inference with Centroid Learning and Table Lookup

    On-device Deep Neural Network (DNN) inference consumes significant computing resources and development effort. To alleviate this, we propose LUT-NN, the first system to empower inference by table lookup, reducing inference cost. LUT-NN learns the typical features for each operator, named centroids, and precomputes the results for these centroids to save in lookup tables. During inference, the results of the centroids closest to the inputs can be read directly from the table as the approximated outputs, without computation. LUT-NN integrates two major novel techniques: (1) differentiable centroid learning through backpropagation, which adapts three levels of approximation to minimize the accuracy impact of centroids; (2) table lookup inference execution, which comprehensively considers different levels of parallelism, memory access reduction, and dedicated hardware units for optimal performance. LUT-NN is evaluated on multiple real tasks, covering image recognition, speech recognition, and natural language processing. Compared to related work, LUT-NN improves accuracy by 66% to 92%, achieving a level similar to the original models. LUT-NN reduces cost along all dimensions, including FLOPs (≤16x), model size (≤7x), latency (≤6.8x), memory (≤6.5x), and power (≤41.7%).
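    The precompute-then-lookup mechanism can be shown in a few lines. This is only a sketch of the idea under toy assumptions: random weights and random centroids replace the differentiable centroid learning the abstract describes, and a whole-vector nearest-centroid search replaces the per-subvector product quantization a real system would use.

```python
import random

random.seed(1)
D_IN, D_OUT, N_CENT = 4, 3, 8          # toy sizes

# Hypothetical layer weights and "learned" centroids (random stand-ins
# for the centroids LUT-NN learns by backpropagation).
W = [[random.gauss(0, 1) for _ in range(D_OUT)] for _ in range(D_IN)]
centroids = [[random.gauss(0, 1) for _ in range(D_IN)]
             for _ in range(N_CENT)]

def linear(x):
    """The original operator: a small matrix-vector product."""
    return [sum(x[i] * W[i][j] for i in range(D_IN)) for j in range(D_OUT)]

# Offline: run each centroid through the operator once; store the results.
table = [linear(c) for c in centroids]

def lut_forward(x):
    """Approximate linear(x) by returning the precomputed result of the
    centroid nearest to x, skipping the multiply-accumulates entirely."""
    def dist2(c):
        return sum((ci - xi) ** 2 for ci, xi in zip(c, x))
    nearest = min(range(N_CENT), key=lambda k: dist2(centroids[k]))
    return table[nearest]

# An input that coincides with a stored centroid reads its row exactly.
print(lut_forward(centroids[5]) == table[5])   # True
```

    The trade-off is visible even at this scale: inference cost drops to a distance computation plus a table read, and accuracy depends entirely on how well the centroids cover the operator's typical inputs.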

    Accelerating In-Browser Deep Learning Inference on Diverse Edge Clients through Just-in-Time Kernel Optimizations

    Web applications are increasingly becoming the primary platform for AI service delivery, making in-browser deep learning (DL) inference more prominent. However, current in-browser inference systems fail to effectively utilize advanced web programming techniques or to customize kernels for various client devices, leading to suboptimal performance. To address these issues, this paper presents the first in-browser inference system, nn-JIT.web, which enables just-in-time (JIT) auto-generation of optimized kernels for both CPUs and GPUs during inference. The system achieves this with two novel web programming techniques that significantly reduce kernel generation time, compared to other tensor compilers such as TVM, while maintaining or even improving performance. The first technique, Tensor-Web Compiling Co-Design, lowers compiling costs by unifying tensor and web compiling and eliminating redundant and ineffective compiling passes. The second technique, Web-Specific Lite Kernel Optimization Space Design, reduces kernel tuning costs by focusing on web programming requirements and efficient hardware resource utilization, limiting the optimization space to only dozens of candidates. nn-JIT.web is evaluated on modern transformer models across a range of client devices, including mainstream CPUs and GPUs from ARM, Intel, AMD, and Nvidia. Results show that nn-JIT.web achieves up to 8.2x speedup within 30 seconds compared to the baselines across various models.
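    The core JIT idea, generating a kernel's source at run time with device- and shape-specific details baked in, can be sketched minimally. Real in-browser systems emit WebAssembly or WebGPU shader code; Python source generated and compiled via `exec` stands in here purely for illustration, and the function name is invented.

```python
# Minimal sketch of just-in-time kernel specialization: generate source
# for an elementwise kernel with the trip count and operation inlined,
# then "compile" it at runtime.
def jit_elementwise_kernel(op_expr, n):
    src = (
        f"def kernel(xs):\n"
        f"    out = [0.0] * {n}\n"
        f"    for i in range({n}):\n"      # trip count is a constant
        f"        x = xs[i]\n"
        f"        out[i] = {op_expr}\n"    # operation inlined, no dispatch
        f"    return out\n"
    )
    namespace = {}
    exec(src, namespace)                   # compile the specialized kernel
    return namespace["kernel"]

relu4 = jit_elementwise_kernel("x if x > 0 else 0.0", 4)
print(relu4([-1.0, 2.0, -3.0, 4.0]))      # [0.0, 2.0, 0.0, 4.0]
```

    Specialization removes per-element branching and dispatch overhead; the cost is the generation step itself, which is exactly what the paper's two techniques aim to keep small.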

    Nesting Forward Automatic Differentiation for Memory-Efficient Deep Neural Network Training

    An activation function is an element-wise mathematical function that plays a crucial role in deep neural networks (DNNs). Many novel and sophisticated activation functions have been proposed to improve DNN accuracy, but they also consume massive memory during training with back-propagation. In this study, we propose nested forward automatic differentiation (Forward-AD), specifically for element-wise activation functions, for memory-efficient DNN training. We deploy nested Forward-AD in two widely used deep learning frameworks, TensorFlow and PyTorch, which support static and dynamic computation graphs, respectively. Our evaluation shows that nested Forward-AD reduces the memory footprint by up to 1.97x compared to the baseline model and outperforms recomputation by 20% under the same memory reduction ratio.
    Comment: 8 pages, ICCD 202
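    Why forward-mode AD helps for element-wise activations can be seen with dual numbers: the activation's output and its local derivative are produced in one pass, so the backward pass does not need the stored pre-activation tensor for that op. The sketch below uses softplus as the activation; the class and function names are invented for illustration, not the paper's implementation.

```python
import math

class Dual:
    """Toy forward-mode AD number: a primal value plus a tangent."""
    def __init__(self, val, dot):
        self.val = val
        self.dot = dot

def softplus(d):
    """softplus(x) = log(1 + e^x), propagating the tangent in the same
    pass (the derivative is the sigmoid of x)."""
    e = math.exp(d.val)
    return Dual(math.log1p(e), (e / (1.0 + e)) * d.dot)

def act_with_grad(x):
    """Evaluate the activation and its local derivative together, so the
    backward pass needs no stored pre-activation tensor for this op."""
    out = softplus(Dual(x, 1.0))   # seed tangent 1.0 => dot equals f'(x)
    return out.val, out.dot

y, dy = act_with_grad(0.0)
print(round(y, 4), round(dy, 4))   # 0.6931 0.5
```

    For an element-wise function the tangent is a scalar per element, which is what makes nesting forward mode inside reverse mode cheap in memory.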

    Upconversion Luminescence and Magnetic Tuning of NaLuF4

    Fluorescent and magnetic bifunctional NaLuF4:Yb3+/Tm3+/Gd3+ nanocrystals were synthesized by the solvothermal method and subsequent surface modification. By changing the doping concentration of Gd3+, the shape, size, luminescent properties, and magnetic properties of the nanoparticles can be modulated. These NaLuF4:Yb3+/Tm3+/Gd3+ nanocrystals present efficient blue upconversion fluorescence and excellent paramagnetic properties at room temperature. Based on luminescence resonance energy transfer (LRET), the upconversion nanoparticles (UCNPs) were confirmed to be an efficient fluorescent nanoprobe for detecting acriflavine. The concentration of acriflavine is easily derived from the integral intensity ratio of the green (emission from acriflavine) to the blue (emission from the UCNPs) fluorescent signals. Based on this upconversion fluorescent nanoprobe, the detection limit for acriflavine can reach as low as 0.32 μg/mL.
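    The ratiometric readout described above amounts to a calibration curve: concentration as a function of the green/blue intensity ratio. The sketch below fits a least-squares line through calibration standards; the standards, slope, and intercept are entirely invented placeholders, as the abstract reports no calibration data.

```python
# Hypothetical ratiometric calibration for the acriflavine nanoprobe.
def fit_line(ratios, concs):
    """Least-squares line conc = a * ratio + b through calibration points."""
    n = len(ratios)
    mx = sum(ratios) / n
    my = sum(concs) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(ratios, concs))
         / sum((x - mx) ** 2 for x in ratios))
    return a, my - a * mx

# Invented calibration standards: (green/blue ratio, concentration in ug/mL)
ratios = [0.10, 0.25, 0.50, 0.80]
concs  = [1.0,  2.5,  5.0,  8.0]
a, b = fit_line(ratios, concs)

unknown_ratio = 0.40
print(a * unknown_ratio + b)   # estimated concentration in ug/mL
```

    A real assay would fit this line to measured standards and quote the detection limit from the calibration's noise floor.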

    Line identification of extreme ultraviolet spectra from aluminum ions in EAST Tokamak plasmas

    Extreme ultraviolet (EUV) spectra emitted from aluminum in the 5–340 Å wavelength range were observed in Experimental Advanced Superconducting Tokamak (EAST) discharges. Several spectral lines from aluminum ions with different degrees of ionization were successfully observed with sufficient spectral intensities and resolutions using three fast-time-response EUV spectrometers. The line identification uses three independent state-of-the-art computational codes for the atomic structure calculations, which provide the wavelengths and radiative transition probabilities (rate coefficients). These programs are HULLAC (Hebrew University - Lawrence Livermore Atomic Code), AUTOSTRUCTURE, and FAC (Flexible Atomic Code). Using three different codes allows us to resolve some ambiguities in identifying certain spectral lines and to assess the validity of the theoretical predictions.

    Production and characterization of a recombinant single-chain antibody against Hantaan virus envelope glycoprotein

    Hantaan virus (HTNV) is the prototype of the genus Hantavirus and causes hemorrhagic fever with renal syndrome, for which no specific therapeutics are available so far. Cell type-specific internalizing antibodies can be used to deliver therapeutics intracellularly to target cells and thus have potential application against HTNV infection. To achieve intracellular delivery of therapeutics, it is necessary to obtain antibodies that demonstrate sufficient cell type-specific binding, internalization, and the desired cellular trafficking. Here, we describe the prokaryotic expression, affinity purification, and functional testing of a single-chain Fv antibody fragment (scFv) against the HTNV envelope glycoprotein (GP), an HTNV-specific antigen normally located on the membranes of HTNV-infected cells. This HTNV GP-targeting antibody, scFv3G1, was produced in the cytoplasm of Escherichia coli cells as a soluble protein and was purified by immobilized metal affinity chromatography. The purified scFv possessed high specific antigen-binding activity toward HTNV GP and HTNV-infected Vero E6 cells and could be internalized into HTNV-infected cells, probably through a clathrin-dependent endocytosis pathway similar to that observed for transferrin. Our results show that the E. coli-produced scFv has potential applications in the targeted, intracellular delivery of therapeutics against HTNV infections.