2 research outputs found

    LION : Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge

    Multimodal Large Language Models (MLLMs) have endowed LLMs with the ability to perceive and understand multi-modal signals. However, most existing MLLMs adopt vision encoders pretrained on coarsely aligned image-text pairs, leading to insufficient extraction of, and reasoning over, visual knowledge. To address this issue, we devise a dual-Level vIsual knOwledge eNhanced Multimodal Large Language Model (LION), which empowers the MLLM by injecting visual knowledge at two levels. 1) Progressive incorporation of fine-grained spatial-aware visual knowledge. We design a vision aggregator that cooperates with region-level vision-language (VL) tasks to incorporate fine-grained spatial-aware visual knowledge into the MLLM. To alleviate the conflict between image-level and region-level VL tasks during incorporation, we devise a dedicated stage-wise instruction-tuning strategy with a mixture of adapters. This progressive incorporation scheme promotes mutual reinforcement between these two kinds of VL tasks. 2) Soft prompting of high-level semantic visual evidence. We provide the MLLM with high-level semantic visual evidence by leveraging diverse image tags. To mitigate the potential influence of imperfect predicted tags, we propose a soft prompting method that embeds a learnable token into the tailored text instruction. Comprehensive experiments on several multi-modal benchmarks demonstrate the superiority of our model (e.g., improvements of 5% in accuracy on VSR and 3% in CIDEr on TextCaps over InstructBLIP, and 5% in accuracy on RefCOCOg over Kosmos-2).
    Comment: Technical Report. Project page: https://rshaojimmy.github.io/Projects/JiuTian-LION Code: https://github.com/rshaojimmy/JiuTia

    Effects of Pig Manure and Its Organic Fertilizer Application on Archaea and Methane Emission in Paddy Fields

    Paddy fields account for 10% of global CH4 emissions, and the application of manure may increase CH4 emissions. In this study, high-throughput sequencing technology was used to investigate the effects of manure application on CH4 emissions and methanogens in paddy soil. Three treatments were studied: a control treatment (CK), pig manure (PM), and organic fertilizer (OF). The results showed that the contents of Zn, Cr and Ni in paddy soil increased with the application of manure, but the contents of heavy metals gradually decreased as the rice grew. The Shannon index and Ace index showed that the application of pig manure and organic fertilizer had little effect on the diversity and richness of soil Archaea. Community composition analysis showed that Methanobacterium, Methanobrevibacter, Methanosphaera, Methanosarcina and Rice_Cluster_I were the main methanogens in paddy soil after manure and organic fertilizer application. Soil environmental factors changed after manure application, among which total potassium (TK) and total nitrogen (TN) were the main environmental factors affecting methanogens in paddy soil. The changes in soil environmental factors affected the community composition of methanogens, and the increase in the relative abundance of methanogens may be the main reason for the increase in CH4 emission flux. Both pig manure and organic fertilizer application increased the relative abundance of methanogens and the CH4 emission flux in paddy soil, and pig manure had a bigger impact than organic fertilizer.