Search CORE

13 research outputs found

ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training

Author: Chi Zewen
He Conghui
Huang Heyan
Mao Xian-Ling
Xu Minghao
Zhang Wentao
Zheng Heqi
Zhuo Le
Publication date: 27/02/2024

We propose ProtLLM, a versatile cross-modal large language model (LLM) for both protein-centric and protein-language tasks. ProtLLM features a unique dynamic protein mounting mechanism, enabling it to handle complex inputs where the natural language text is interspersed with an arbitrary number of proteins. Besides, we propose the protein-as-word language modeling approach to train ProtLLM. By developing a specialized protein vocabulary, we equip the model with the capability to predict not just natural language but also proteins from a vast pool of candidates. Additionally, we construct a large-scale interleaved protein-text dataset, named InterPT, for pre-training. This dataset comprehensively encompasses both (1) structured data sources like protein annotations and (2) unstructured data sources like biological research papers, thereby endowing ProtLLM with crucial knowledge for understanding proteins. We evaluate ProtLLM on classic supervised protein-centric tasks and explore its novel protein-language applications. Experimental results demonstrate that ProtLLM not only achieves superior performance against protein-specialized baselines on protein-centric tasks but also induces zero-shot and in-context learning capabilities on protein-language tasks.
Comment: https://protllm.github.io/project

arXiv.org e-Print Archive

Cross-Lingual Natural Language Generation via Pre-Training

Author: Chi Zewen
Dong Li
Huang Heyan
Mao Xian-Ling
Wang Wenhui
Wei Furu
Publication venue: Association for the Advancement of Artificial Intelligence
Publication date: 22/11/2019

In this work we focus on transferring supervision signals of natural language generation (NLG) tasks between multiple languages. We propose to pretrain the encoder and the decoder of a sequence-to-sequence model under both monolingual and cross-lingual settings. The pre-training objective encourages the model to represent different languages in the shared space, so that we can conduct zero-shot cross-lingual transfer. After the pre-training procedure, we use monolingual data to fine-tune the pre-trained model on downstream NLG tasks. Then the sequence-to-sequence model trained in a single language can be directly evaluated beyond that language (i.e., accepting multi-lingual input and producing multi-lingual output). Experimental results on question generation and abstractive summarization show that our model outperforms the machine-translation-based pipeline methods for zero-shot cross-lingual generation. Moreover, cross-lingual transfer improves NLG performance of low-resource languages by leveraging rich-resource language data. Our implementation and data are available at https://github.com/CZWin32768/xnlg

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Quaternion-valued Correlation Learning for Few-Shot Semantic Segmentation

Author: Huang Guoheng
Ling Wing-Kuen
Liu Hongrui
Pun Chi-Man
Yuan Xiaochen
Zheng Zewen
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 15/05/2023

Few-shot segmentation (FSS) aims to segment unseen classes given only a few annotated samples. Encouraging progress has been made for FSS by leveraging semantic features learned from base classes with sufficient training samples to represent novel classes. The correlation-based methods lack the ability to consider interaction of the two subspace matching scores due to the inherent nature of the real-valued 2D convolutions. In this paper, we introduce a quaternion perspective on correlation learning and propose a novel Quaternion-valued Correlation Learning Network (QCLNet), with the aim to alleviate the computational burden of high-dimensional correlation tensor and explore internal latent interaction between query and support images by leveraging operations defined by the established quaternion algebra. Specifically, our QCLNet is formulated as a hyper-complex valued network and represents correlation tensors in the quaternion domain, which uses quaternion-valued convolution to explore the external relations of query subspace when considering the hidden relationship of the support sub-dimension in the quaternion space. Extensive experiments on the PASCAL-5i and COCO-20i datasets demonstrate that our method outperforms the existing state-of-the-art methods effectively. Our code is available at https://github.com/zwzheng98/QCLNet
Comment: for associated paper file, see https://ieeexplore.ieee.org/document/9954424?source=authoraler

arXiv.org e-Print Archive
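
The QCLNet abstract above relies on "operations defined by the established quaternion algebra". As a minimal illustration of that algebra only, the sketch below implements the Hamilton product of two quaternions, the basic operation that quaternion-valued layers are generally built on. It is not taken from the QCLNet code base; the function name `hamilton_product` and the tuple layout (real, i, j, k) are assumptions made for this example.

```python
from typing import Tuple

# Assumed layout for this sketch: (real, i, j, k) components.
Quaternion = Tuple[float, float, float, float]


def hamilton_product(p: Quaternion, q: Quaternion) -> Quaternion:
    """Return the Hamilton product p * q (non-commutative)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


# Basis quaternions, used to check the defining identities.
I = (0.0, 1.0, 0.0, 0.0)
J = (0.0, 0.0, 1.0, 0.0)
K = (0.0, 0.0, 0.0, 1.0)

ij = hamilton_product(I, J)  # equals k
ji = hamilton_product(J, I)  # equals -k, showing non-commutativity
```

Because each output component mixes all four input components, a single quaternion multiplication couples the channels it operates on, which is the kind of interaction the abstract says real-valued 2D convolutions lack.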

Evaluation of the impact of crop residue on fractional vegetation cover estimation by vegetation indices over conservation tillage cropland: a simulation study

Author: Chi Xu (486590)
Lin Liu (74495)
Yanling Ding (8783243)
Yunchao Chen (504384)
Zewen Dai (14121519)
Publication date: 15/11/2022

Accurate estimation of fractional vegetation cover (FVC) is of great significance to agricultural production. Crop residue management affects crop residue cover (CRC) over croplands. Crop and crop residue on the soil surface both contribute to overall canopy reflectance. Few studies, however, have examined the effect of crop residue on vegetation indices (VIs) and estimated FVC. The present study evaluated the response of eight commonly used VIs to crop residues and the FVC uncertainty caused by crop residue based on the dimidiate pixel model (DPM), using simulated reflectance of low-tilled cropland from a three-dimensional radiative transfer model. The absolute difference (AD) was used to quantify the spectral difference between crop residues and soils in red and near-infrared wavelengths. Increases in normalized difference VI (NDVI), ratio VI (RVI), transformed soil-adjusted VI (TSAVI), and normalized difference phenology index (NDPI) were observed when green crops were mixed with crop residue that had negative ADs with soils, but decreases in enhanced VI (EVI), perpendicular VI (PVI), SAVI, and litter-soil-adjusted VI (L-SAVI) were observed when crop residue was present under medium and high vegetation cover. The presence of crop residue with a positive AD with soils reduced NDVI, RVI, TSAVI, and NDPI while increasing the other VIs. Crop residue had the least impact on EVI- and SAVI-based DPMs, with FVC estimation uncertainty less than 0.1, followed by the NDPI- and L-SAVI-based models, while DPMs based on NDVI and RVI performed poorly. Each VI-based DPM's estimated uncertainty was highly correlated with AD values. Furthermore, the majority of the VI-based models were sensitive to solar position except for the NDPI-based model. Our findings highlight the need to consider the impact of crop residue on FVC retrieval over low-tilled cropland in future research.

FigShare
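
The abstract above estimates FVC with the dimidiate pixel model (DPM) applied to vegetation indices. As a rough sketch of how that works, the code below computes NDVI and RVI from red and near-infrared reflectance and then applies the standard DPM scaling between a bare-soil and a full-vegetation endmember. The function names, the clipping to [0, 1], and all reflectance values are assumptions for illustration; none of them come from the study.

```python
def ndvi(red: float, nir: float) -> float:
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)


def rvi(red: float, nir: float) -> float:
    """Ratio vegetation index."""
    return nir / red


def dimidiate_pixel_fvc(vi: float, vi_soil: float, vi_veg: float) -> float:
    """Estimate FVC by scaling a pixel's VI between two endmembers.

    vi_soil is the VI of bare soil and vi_veg the VI of full vegetation.
    Clipping the result to [0, 1] is a common convention and an
    assumption of this sketch.
    """
    fvc = (vi - vi_soil) / (vi_veg - vi_soil)
    return min(1.0, max(0.0, fvc))


# Hypothetical reflectances, chosen only to make the arithmetic easy to follow.
soil_vi = ndvi(red=0.20, nir=0.30)    # bare-soil endmember
veg_vi = ndvi(red=0.05, nir=0.45)     # full-vegetation endmember
pixel_vi = ndvi(red=0.125, nir=0.375) # mixed pixel

fvc_estimate = dimidiate_pixel_fvc(pixel_vi, soil_vi, veg_vi)
```

In this formulation, anything that shifts the background index away from the assumed soil endmember, such as crop residue that is brighter or darker than the soil, biases the FVC estimate, which is the effect the study quantifies.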