42 research outputs found
SmartTrim: Adaptive Tokens and Attention Pruning for Efficient Vision-Language Models
Despite achieving remarkable performance on various vision-language tasks,
Transformer-based Vision-Language Models (VLMs) suffer from redundancy in
inputs and parameters, significantly hampering their efficiency in real-world
applications. Moreover, the degree of redundancy in token representations and
model parameters, such as attention heads, varies significantly for different
inputs. In light of these challenges, we propose SmartTrim, an adaptive
acceleration framework for VLMs, which adjusts the computational overhead per
instance. Specifically, we integrate lightweight modules into the original
backbone to identify and prune redundant token representations and attention
heads within each layer. Furthermore, we devise a self-distillation strategy to
enhance the consistency between the predictions of the pruned model and its
full-capacity counterpart. Experimental results across various vision-language
tasks consistently demonstrate that SmartTrim accelerates the original model by
2-3 times with minimal performance degradation, highlighting its effectiveness
and efficiency compared to previous approaches. Code will be available at
https://github.com/kugwzk/SmartTrim.
Comment: COLING-LREC 202
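To make the pruning mechanism more concrete, below is a minimal PyTorch sketch of per-instance token pruning with a lightweight scoring module, plus a self-distillation loss between the pruned and full-capacity predictions, in the spirit of the abstract above. The names (TokenTrimmer, keep_ratio, self_distillation_loss) and the hard top-k selection are illustrative assumptions, not the authors' released implementation, which would also need a differentiable relaxation of the selection during training.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TokenTrimmer(nn.Module):
    """Scores token representations and keeps only the top fraction per instance."""
    def __init__(self, hidden_dim: int, keep_ratio: float = 0.5):
        super().__init__()
        # Lightweight scorer attached to a backbone layer.
        self.scorer = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim // 4),
            nn.GELU(),
            nn.Linear(hidden_dim // 4, 1),
        )
        self.keep_ratio = keep_ratio

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len, hidden_dim)
        scores = self.scorer(tokens).squeeze(-1)               # (batch, seq_len)
        k = max(1, int(tokens.size(1) * self.keep_ratio))
        idx = scores.topk(k, dim=1).indices                    # kept-token indices
        idx = idx.unsqueeze(-1).expand(-1, -1, tokens.size(-1))
        return tokens.gather(1, idx)                           # (batch, k, hidden_dim)

def self_distillation_loss(pruned_logits, full_logits, temperature: float = 1.0):
    """KL divergence pushing the pruned model toward its full-capacity counterpart."""
    teacher = F.softmax(full_logits / temperature, dim=-1)
    student = F.log_softmax(pruned_logits / temperature, dim=-1)
    return F.kl_div(student, teacher, reduction="batchmean") * temperature ** 2
```

An analogous scoring head could gate attention heads per layer; the key design point suggested by the abstract is that the kept fraction is decided per input rather than fixed globally.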
The GATA factor HANABA TARANU promotes runner formation by regulating axillary bud initiation and outgrowth in cultivated strawberry
A runner is an elongated branch that develops from the axillary bud (AXB) in the leaf axil and is crucial for the clonal propagation of cultivated strawberry (Fragaria × ananassa Duch.). Runner formation occurs in at least two steps: AXB initiation and AXB outgrowth. HANABA TARANU (HAN) encodes a GATA transcription factor that affects AXB initiation in Arabidopsis and promotes branching in grass species, but the underlying mechanism is largely unknown. Here, the function of a strawberry HAN homolog, FaHAN, in runner formation was characterized. FaHAN transcripts can be detected in the leaf axils. Overexpression (OE) of FaHAN in strawberry increased the number of runners, mainly by enhancing AXB outgrowth. The expression of the strawberry homolog of BRANCHED1, a key inhibitor of AXB outgrowth in many plant species, was significantly downregulated in the AXBs of FaHAN-OE lines, whereas the expression of the strawberry homolog of SHOOT MERISTEMLESS, a marker gene for AXB initiation in Arabidopsis, was upregulated. Moreover, several genes of the gibberellin biosynthesis and cytokinin signaling pathways were activated, whereas auxin response pathway genes were repressed. Further assays indicated that FaHAN could be directly activated by FaNAC2, the overexpression of which in strawberry also increased the number of runners. Silencing of FaNAC2 or FaHAN inhibited AXB initiation and led to a higher proportion of dormant AXBs, confirming their roles in the control of runner formation. Taken together, our results reveal a FaNAC2-FaHAN pathway in the control of runner formation and provide a means to enhance the vegetative propagation of cultivated strawberry.
Peer reviewed
DreamLLM: Synergistic Multimodal Comprehension and Creation
This paper presents DreamLLM, a learning framework that first achieves
versatile Multimodal Large Language Models (MLLMs) empowered with the frequently
overlooked synergy between multimodal comprehension and creation. DreamLLM
operates on two fundamental principles. The first focuses on the generative
modeling of both language and image posteriors by direct sampling in the raw
multimodal space. This approach circumvents the limitations and information
loss inherent in external feature extractors like CLIP, yielding a more thorough
multimodal understanding. Second, DreamLLM fosters the generation
of raw, interleaved documents, modeling both text and image contents, along
with unstructured layouts. This allows DreamLLM to learn all conditional,
marginal, and joint multimodal distributions effectively. As a result, DreamLLM
is the first MLLM capable of generating free-form interleaved content.
Comprehensive experiments highlight DreamLLM's superior performance as a
zero-shot multimodal generalist, benefiting from the enhanced learning synergy.
Comment: see project page at https://dreamllm.github.io
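As a rough illustration of what "raw, interleaved documents" might look like as training data, here is a toy Python sketch that flattens alternating text and image segments into a single stream, with a placeholder token marking where image content would be generated. The segment classes and the <image> placeholder are assumptions for illustration only, not DreamLLM's actual data format.

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass
class TextSegment:
    text: str

@dataclass
class ImageSegment:
    path: str  # raw image whose content the model would learn to generate directly

Segment = Union[TextSegment, ImageSegment]

def flatten_document(segments: List[Segment]) -> List[str]:
    """Flatten an interleaved document into one token-like stream, marking
    image positions with a placeholder where generation would occur."""
    stream: List[str] = []
    for seg in segments:
        if isinstance(seg, TextSegment):
            stream.extend(seg.text.split())
        else:
            stream.append("<image>")
    return stream

doc = [
    TextSegment("Step 1: whisk the batter."),
    ImageSegment("batter.jpg"),
    TextSegment("Step 2: cook until golden."),
]
print(flatten_document(doc))
# ['Step', '1:', 'whisk', 'the', 'batter.', '<image>', 'Step', '2:', 'cook', 'until', 'golden.']
```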
COIG-CQIA: Quality is All You Need for Chinese Instruction Fine-tuning
Recently, there have been significant advancements in large language models
(LLMs), particularly focused on the English language. These advancements have
enabled LLMs to understand and execute complex instructions with
unprecedented accuracy and fluency. Despite this progress, however, there
remains a noticeable gap in the development of Chinese instruction tuning. The
unique linguistic features and cultural depth of the Chinese language pose
challenges for instruction tuning tasks. Existing datasets are either derived
from English-centric LLMs or are ill-suited for aligning with the interaction
patterns of real-world Chinese users. To bridge this gap, we introduce
COIG-CQIA, a high-quality Chinese instruction tuning dataset. Our aim is to
build a diverse, wide-ranging instruction-tuning dataset to better align model
behavior with human interactions. To this end, we collect a high-quality
human-written corpus from various sources on the Chinese Internet, including
Q&A communities, Wikis, examinations, and existing NLP datasets. This corpus
was rigorously filtered and carefully processed to form the COIG-CQIA dataset.
Furthermore, we train models of various scales on different subsets of CQIA and
conduct in-depth evaluations and analyses. The findings from our experiments
offer valuable insights for selecting and developing Chinese instruction-tuning
datasets. We also find that models trained on CQIA-Subset achieve competitive
results in human assessment as well as knowledge and security benchmarks. Data
are available at https://huggingface.co/datasets/m-a-p/COIG-CQIA
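The abstract mentions rigorous filtering of human-written sources; the hypothetical Python sketch below shows the kind of simple rule-based quality gate such a pipeline might include. The field names and thresholds are assumptions for illustration, not the actual COIG-CQIA processing code.

```python
import re

def keep_example(example: dict,
                 min_instruction_chars: int = 10,
                 min_response_chars: int = 20) -> bool:
    """Return True if an (instruction, response) pair passes basic quality checks."""
    instruction = example.get("instruction", "").strip()
    response = example.get("response", "").strip()
    if len(instruction) < min_instruction_chars or len(response) < min_response_chars:
        return False        # too short to be a useful training example
    if re.search(r"https?://\S+", response):
        return False        # bare links often signal low-effort answers
    if "\ufffd" in response:
        return False        # encoding damage
    return True

corpus = [
    {"instruction": "请用一句话解释什么是指令微调。",
     "response": "指令微调是在预训练模型上用指令-回答数据继续训练，使其更好地遵循人类指令。"},
    {"instruction": "你好", "response": "你好！"},
]
filtered = [ex for ex in corpus if keep_example(ex)]  # keeps only the first example
```

A real pipeline would add deduplication, toxicity screening, and source-specific normalization on top of such rules.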
A Comprehensive Study of Knowledge Editing for Large Language Models
Large Language Models (LLMs) have shown extraordinary capabilities in
understanding and generating text that closely mirrors human communication.
However, a primary limitation lies in the significant computational demands
during training, arising from their extensive parameterization. This challenge
is further intensified by the dynamic nature of the world, necessitating
frequent updates to LLMs to correct outdated information or integrate new
knowledge, thereby ensuring their continued relevance. Note that many
applications demand continual model adjustments post-training to address
deficiencies or undesirable behaviors. There is an increasing interest in
efficient, lightweight methods for on-the-fly model modifications. To this end,
recent years have seen a burgeoning of techniques for knowledge editing in
LLMs, which aim to efficiently modify LLMs' behaviors within specific domains
while preserving overall performance across various inputs. In this paper, we
first define the knowledge editing problem and then provide a comprehensive
review of cutting-edge approaches. Drawing inspiration from educational and
cognitive research theories, we propose a unified categorization criterion that
classifies knowledge editing methods into three groups: resorting to external
knowledge, merging knowledge into the model, and editing intrinsic knowledge.
Furthermore, we introduce a new benchmark, KnowEdit, for a comprehensive
empirical evaluation of representative knowledge editing approaches.
Additionally, we provide an in-depth analysis of knowledge location, which can
give a deeper understanding of the knowledge structures inherent within LLMs.
Finally, we discuss several potential applications of knowledge editing,
outlining its broad and impactful implications.
Comment: Ongoing work; 52 pages, 282 citations; the benchmark is available at
https://huggingface.co/datasets/zjunlp/KnowEdit, code is available at
https://github.com/zjunlp/EasyEdit, and a paper list is available at
https://github.com/zjunlp/KnowledgeEditingPaper
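To illustrate the first category in the survey's taxonomy, resorting to external knowledge, here is a minimal Python sketch in which edited facts live in an external memory consulted before the frozen base model. The class and method names are illustrative assumptions; they are not an API from EasyEdit or KnowEdit.

```python
from typing import Callable, Dict

class ExternalMemoryEditor:
    """Stores corrected facts outside the model; the model weights are never modified."""
    def __init__(self, base_model: Callable[[str], str]):
        self.base_model = base_model       # frozen LLM, wrapped as prompt -> answer
        self.edits: Dict[str, str] = {}    # edit key -> corrected answer

    def apply_edit(self, key: str, new_answer: str) -> None:
        """Register a corrected fact keyed by a phrase it should apply to."""
        self.edits[key.lower()] = new_answer

    def generate(self, prompt: str) -> str:
        # Naive substring lookup; real methods would use learned retrieval and scoping
        # to decide when an edit applies and when to defer to the base model.
        for key, answer in self.edits.items():
            if key in prompt.lower():
                return answer
        return self.base_model(prompt)

editor = ExternalMemoryEditor(base_model=lambda p: "(answer from the frozen base model)")
editor.apply_edit("capital of australia", "Canberra")
print(editor.generate("What is the capital of Australia?"))  # -> Canberra
```

The other two categories in the survey (merging knowledge into the model and editing intrinsic knowledge) instead modify parameters, either by adding trained components or by locating and rewriting specific weights.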