
    SmartTrim: Adaptive Tokens and Attention Pruning for Efficient Vision-Language Models

    Despite achieving remarkable performance on various vision-language tasks, Transformer-based Vision-Language Models (VLMs) suffer from redundancy in their inputs and parameters, which significantly hampers their efficiency in real-world applications. Moreover, the degree of redundancy in token representations and model parameters, such as attention heads, varies significantly across inputs. In light of these challenges, we propose SmartTrim, an adaptive acceleration framework for VLMs that adjusts the computational overhead per instance. Specifically, we integrate lightweight modules into the original backbone to identify and prune redundant token representations and attention heads within each layer. Furthermore, we devise a self-distillation strategy to enhance the consistency between the predictions of the pruned model and its full-capacity counterpart. Experimental results across various vision-language tasks consistently demonstrate that SmartTrim accelerates the original model by 2-3 times with minimal performance degradation, highlighting its effectiveness and efficiency compared with previous approaches. Code will be available at https://github.com/kugwzk/SmartTrim.
    Comment: COLING-LREC 202
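    To make the pruning mechanism concrete, the sketch below (ours, not the released SmartTrim code) illustrates per-instance token pruning: a small scorer attached to a layer rates each token and drops the low-scoring ones, so easier inputs keep fewer tokens. The class name TokenTrimmer, the scorer architecture, and the keep threshold are illustrative assumptions.

        import torch
        import torch.nn as nn

        class TokenTrimmer(nn.Module):
            """Lightweight scorer that decides, per input, which tokens to keep."""
            def __init__(self, hidden_dim, keep_threshold=0.5):
                super().__init__()
                self.scorer = nn.Sequential(
                    nn.Linear(hidden_dim, hidden_dim // 4),
                    nn.GELU(),
                    nn.Linear(hidden_dim // 4, 1),
                )
                self.keep_threshold = keep_threshold

            def forward(self, tokens):
                # tokens: (1, seq_len, hidden_dim); one example at a time for simplicity
                scores = torch.sigmoid(self.scorer(tokens)).squeeze(-1)  # (1, seq_len)
                keep = scores[0] >= self.keep_threshold                  # boolean keep mask
                keep[0] = True                                           # always keep the [CLS]-style token
                return tokens[:, keep, :]                                # pruned sequence

        trimmer = TokenTrimmer(hidden_dim=768)
        patches = torch.randn(1, 196, 768)   # e.g. ViT patch embeddings
        print(trimmer(patches).shape)        # fewer tokens survive for "easy" inputs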

    The GATA factor HANABA TARANU promotes runner formation by regulating axillary bud initiation and outgrowth in cultivated strawberry

    A runner, an elongated branch, develops from the axillary bud (AXB) in the leaf axil and is crucial for the clonal propagation of cultivated strawberry (Fragaria × ananassa Duch.). Runner formation occurs in at least two steps: AXB initiation and AXB outgrowth. HANABA TARANU (HAN) encodes a GATA transcription factor that affects AXB initiation in Arabidopsis and promotes branching in grass species, but the underlying mechanism is largely unknown. Here, the function of a strawberry HAN homolog, FaHAN, in runner formation was characterized. FaHAN transcripts can be detected in the leaf axils. Overexpression (OE) of FaHAN increased the number of runners in strawberry, mainly by enhancing AXB outgrowth. The expression of the strawberry homolog of BRANCHED1, a key inhibitor of AXB outgrowth in many plant species, was significantly downregulated in the AXBs of FaHAN-OE lines, whereas the expression of the strawberry homolog of SHOOT MERISTEMLESS, a marker gene for AXB initiation in Arabidopsis, was upregulated. Moreover, several genes of the gibberellin biosynthesis and cytokinin signaling pathways were activated, whereas auxin response pathway genes were repressed. Further assays indicated that FaHAN could be directly activated by FaNAC2, whose overexpression in strawberry also increased the number of runners. Silencing of FaNAC2 or FaHAN inhibited AXB initiation and led to a higher proportion of dormant AXBs, confirming their roles in the control of runner formation. Taken together, our results reveal a FaNAC2-FaHAN pathway controlling runner formation and provide a means to enhance the vegetative propagation of cultivated strawberry.

    DreamLLM: Synergistic Multimodal Comprehension and Creation

    This paper presents DreamLLM, a learning framework that, for the first time, achieves versatile Multimodal Large Language Models (MLLMs) empowered with the frequently overlooked synergy between multimodal comprehension and creation. DreamLLM operates on two fundamental principles. The first is generative modeling of both the language and image posteriors by direct sampling in the raw multimodal space. This approach circumvents the limitations and information loss inherent to external feature extractors like CLIP, and a more thorough multimodal understanding is obtained. Second, DreamLLM fosters the generation of raw, interleaved documents, modeling both text and image content along with unstructured layouts. This allows DreamLLM to learn all conditional, marginal, and joint multimodal distributions effectively. As a result, DreamLLM is the first MLLM capable of generating free-form interleaved content. Comprehensive experiments highlight DreamLLM's superior performance as a zero-shot multimodal generalist, benefiting from the enhanced learning synergy.
    Comment: see the project page at https://dreamllm.github.io
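    As a rough illustration of the interleaved-document idea (our toy sketch, not DreamLLM's implementation), the snippet below linearizes a document of text spans and image slots into a single token stream; the <image> placeholder token, the ImageSlot type, and the linearize helper are assumptions for illustration only.

        from dataclasses import dataclass
        from typing import List, Union

        @dataclass
        class ImageSlot:
            """Position where an image is generated conditioned on the preceding context."""
            pass

        Document = List[Union[str, ImageSlot]]

        doc: Document = [
            "Step 1: whisk the pancake batter.",
            ImageSlot(),
            "Step 2: cook until golden brown.",
            ImageSlot(),
        ]

        def linearize(doc: Document) -> List[str]:
            """Flatten an interleaved document into the stream a language model would see."""
            stream: List[str] = []
            for item in doc:
                if isinstance(item, ImageSlot):
                    stream.append("<image>")   # placeholder whose hidden state could condition an image decoder
                else:
                    stream.extend(item.split())
            return stream

        print(linearize(doc))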

    COIG-CQIA: Quality is All You Need for Chinese Instruction Fine-tuning

    Recently, there have been significant advancements in large language models (LLMs), particularly focused on the English language. These advancements have enabled LLMs to understand and execute complex instructions with unprecedented accuracy and fluency. However, there remains a noticeable gap in the development of Chinese instruction tuning. The unique linguistic features and cultural depth of the Chinese language pose challenges for instruction tuning tasks. Existing datasets are either derived from English-centric LLMs or are ill-suited to the interaction patterns of real-world Chinese users. To bridge this gap, we introduce COIG-CQIA, a high-quality Chinese instruction tuning dataset. Our aim is to build a diverse, wide-ranging instruction-tuning dataset that better aligns model behavior with human interactions. To this end, we collect a high-quality, human-written corpus from various sources on the Chinese Internet, including Q&A communities, wikis, examinations, and existing NLP datasets. This corpus was rigorously filtered and carefully processed to form the COIG-CQIA dataset. Furthermore, we train models of various scales on different subsets of CQIA and conduct in-depth evaluations and analyses. The findings from our experiments offer valuable insights for selecting and developing Chinese instruction-tuning datasets. We also find that models trained on CQIA-Subset achieve competitive results in human assessment as well as on knowledge and security benchmarks. Data are available at https://huggingface.co/datasets/m-a-p/COIG-CQIA
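    To give a flavour of the rule-based filtering such a corpus typically undergoes (a hypothetical sketch, not the paper's actual pipeline, with English examples for brevity), the snippet below applies a few heuristic quality checks to (instruction, response) pairs; all rules and thresholds are illustrative assumptions.

        import re

        def keep_example(instruction, response):
            """Heuristic quality filters applied to one (instruction, response) pair."""
            if len(instruction.strip()) < 5 or len(response.strip()) < 20:
                return False                      # too short to be informative
            if re.search(r"https?://\S+", response):
                return False                      # drop link-dominated answers
            if "As an AI language model" in response:
                return False                      # avoid obviously synthetic boilerplate
            if response.count(response.strip()[0]) / max(len(response), 1) > 0.5:
                return False                      # degenerate character repetition
            return True

        examples = [
            {"instruction": "Explain what machine learning is.",
             "response": "Machine learning studies algorithms that improve automatically through data."},
            {"instruction": "hi", "response": "ok"},
        ]
        print([keep_example(e["instruction"], e["response"]) for e in examples])  # [True, False]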

    A Comprehensive Study of Knowledge Editing for Large Language Models

    Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication. However, a primary limitation lies in their significant computational demands during training, arising from their extensive parameterization. This challenge is further intensified by the dynamic nature of the world, which necessitates frequent updates to LLMs to correct outdated information or integrate new knowledge, thereby ensuring their continued relevance. Many applications also demand continual model adjustments after training to address deficiencies or undesirable behaviors, so there is increasing interest in efficient, lightweight methods for on-the-fly model modification. To this end, recent years have seen a burgeoning of techniques for knowledge editing of LLMs, which aim to efficiently modify an LLM's behavior within a specific domain while preserving overall performance across other inputs. In this paper, we first define the knowledge editing problem and then provide a comprehensive review of cutting-edge approaches. Drawing inspiration from educational and cognitive research theories, we propose a unified categorization criterion that classifies knowledge editing methods into three groups: resorting to external knowledge, merging knowledge into the model, and editing intrinsic knowledge. Furthermore, we introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches. Additionally, we provide an in-depth analysis of knowledge location, which offers a deeper understanding of the knowledge structures inherent within LLMs. Finally, we discuss several potential applications of knowledge editing, outlining its broad and impactful implications.
    Comment: Ongoing work; 52 pages, 282 citations; benchmark available at https://huggingface.co/datasets/zjunlp/KnowEdit; code available at https://github.com/zjunlp/EasyEdit; paper list available at https://github.com/zjunlp/KnowledgeEditingPaper
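    The three-way taxonomy can be illustrated with a toy sketch (our simplification, not any specific editing method): the "model" is a plain dictionary holding a stale fact, and the three functions stand in for resorting to external knowledge, merging knowledge via an add-on memory, and editing intrinsic knowledge in place.

        # A stale association stored "in the weights" of a toy model
        base_model = {"capital_of_France": "Paris", "latest_GPT_release": "GPT-3"}

        def external_knowledge(query, retrieved_notes):
            # 1) resort to external knowledge: answer from retrieved context when available
            return retrieved_notes.get(query, base_model.get(query))

        def merged_memory(query, patch_memory):
            # 2) merge knowledge into the model: an add-on memory shadows the base parameters
            return patch_memory.get(query, base_model.get(query))

        def edit_intrinsic(query, new_value):
            # 3) edit intrinsic knowledge: overwrite the stored association itself
            base_model[query] = new_value
            return base_model[query]

        notes = {"latest_GPT_release": "GPT-4"}
        print(external_knowledge("latest_GPT_release", notes))   # GPT-4 (base model untouched)
        print(merged_memory("latest_GPT_release", notes))        # GPT-4 (base model untouched)
        print(edit_intrinsic("latest_GPT_release", "GPT-4"))     # GPT-4 (base model changed)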