4,753 research outputs found

Recommended from our members

NAD tagSeq reveals that NAD+-capped RNAs are mostly produced from a large number of protein-coding genes in Arabidopsis.

Author: Cai Zongwei
Chen Xuemei
Ni Min
Shao Xiaojian
Xia Yiji
Zhang Hailei
Zhang Shoudong
Zhong Huan
Publication venue: eScholarship, University of California
Publication date: 01/06/2019

The 5' end of a eukaryotic mRNA transcript generally has a 7-methylguanosine (m7G) cap that protects mRNA from degradation and mediates almost all other aspects of gene expression. Some RNAs in Escherichia coli, yeast, and mammals were recently found to contain an NAD+ cap. Here, we report the development of the method NAD tagSeq for transcriptome-wide identification and quantification of NAD+-capped RNAs (NAD-RNAs). The method uses an enzymatic reaction and then a click chemistry reaction to label NAD-RNAs with a synthetic RNA tag. The tagged RNA molecules can be enriched and directly sequenced using the Oxford Nanopore sequencing technology. NAD tagSeq can allow more accurate identification and quantification of NAD-RNAs, as well as reveal the sequences of whole NAD-RNA transcripts using single-molecule RNA sequencing. Using NAD tagSeq, we found that NAD-RNAs in Arabidopsis were produced by at least several thousand genes, most of which are protein-coding genes, with the majority of these transcripts coming from <200 genes. For some Arabidopsis genes, over 5% of their transcripts were NAD capped. Gene ontology terms overrepresented in the 2,000 genes that produced the highest numbers of NAD-RNAs are related to photosynthesis, protein synthesis, and responses to cytokinin and stresses. The NAD-RNAs in Arabidopsis generally have the same overall sequence structures as the canonical m7G-capped mRNAs, although most of them appear to have a shorter 5' untranslated region (5' UTR). The identification and quantification of NAD-RNAs and revelation of their sequence features can provide essential steps toward understanding the functions of NAD-RNAs

eScholarship - University of California

Beyond MLE: Convex Learning for Text Generation

Author: Feng Yang
Ma Zhengrui
Shao Chenze
Zhang Min
Publication date: 26/10/2023

Maximum likelihood estimation (MLE) is a statistical method used to estimate the parameters of a probability distribution that best explain the observed data. In the context of text generation, MLE is often used to train generative language models, which can then be used to generate new text. However, we argue that MLE is not always necessary and optimal, especially for closed-ended text generation tasks like machine translation. In these tasks, the goal of the model is to generate the most appropriate response, which does not necessarily require it to estimate the entire data distribution with MLE. To this end, we propose a novel class of training objectives based on convex functions, which enables text generation models to focus on highly probable outputs without having to estimate the entire data distribution. We investigate the theoretical properties of the optimal predicted distribution when applying convex functions to the loss, demonstrating that convex functions can sharpen the optimal distribution, thereby enabling the model to better capture outputs with high probabilities. Experiments on various text generation tasks and models show the effectiveness of our approach. It enables autoregressive models to bridge the gap between greedy and beam search, and facilitates the learning of non-autoregressive models with a maximum improvement of 9+ BLEU points. Moreover, our approach also has a significant impact on large language models (LLMs), substantially enhancing their generative capability on various tasks. Source code is available at https://github.com/ictnlp/Convex-Learning. Comment: NeurIPS 202

arXiv.org e-Print Archive
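
The sharpening claim in the Convex Learning abstract above can be illustrated with a toy calculation. This is only a sketch under stated assumptions: the objective form "maximize the expected value of f(q(y)) under the data distribution p", the three-outcome distribution, the choice f(q) = q^2, and the helper name `best_on_grid` are all illustrative and are not taken from the abstract or the released code. With f = log (the MLE case) the best predicted distribution reproduces p, whereas with the convex f it collapses onto the single most probable output.

```python
import math

def best_on_grid(p, score, n=10, allow_zero=True):
    """Search all distributions q = (a/n, b/n, c/n) over three outcomes and
    return the integer counts (a, b, c) maximizing sum_i p[i] * score(q[i])."""
    lo = 0 if allow_zero else 1  # log(0) is undefined, so the MLE search skips zeros
    best, best_val = None, -math.inf
    for a in range(lo, n + 1):
        for b in range(lo, n + 1 - a):
            c = n - a - b
            if c < lo:
                continue
            q = (a / n, b / n, c / n)
            val = sum(pi * score(qi) for pi, qi in zip(p, q))
            if val > best_val:
                best, best_val = (a, b, c), val
    return best

# Hypothetical data distribution over three candidate outputs.
p = (0.5, 0.3, 0.2)

# MLE-style objective: expected log-probability. Its optimum matches p.
mle_opt = best_on_grid(p, math.log, allow_zero=False)

# Convex objective (assumed form, f(q) = q**2): the optimum is one-hot.
convex_opt = best_on_grid(p, lambda q: q ** 2)

print("MLE optimum (counts out of 10):   ", mle_opt)
print("Convex optimum (counts out of 10):", convex_opt)
```

The grid search is deliberately naive; it stands in for an analytic argument and is only practical for a handful of outcomes.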

Non-autoregressive Streaming Transformer for Simultaneous Translation

Author: Feng Yang
Guo Shoutao
Ma Zhengrui
Shao Chenze
Zhang Min
Zhang Shaolei
Publication date: 23/10/2023

Simultaneous machine translation (SiMT) models are trained to strike a balance between latency and translation quality. However, training these models to achieve high quality while maintaining low latency often leads to a tendency for aggressive anticipation. We argue that this issue stems from the autoregressive architecture upon which most existing SiMT models are built. To address it, we propose the non-autoregressive streaming Transformer (NAST), which comprises a unidirectional encoder and a non-autoregressive decoder with intra-chunk parallelism. We enable NAST to generate the blank token or repetitive tokens to adjust its READ/WRITE strategy flexibly, and train it to maximize the non-monotonic latent alignment with an alignment-based latency loss. Experiments on various SiMT benchmarks demonstrate that NAST outperforms previous strong autoregressive SiMT baselines. Comment: EMNLP 2023 main conference; Source code is available at https://github.com/ictnlp/NAS

arXiv.org e-Print Archive

Butane-1,2,3,4-tetracarboxylic acid–4,4′-bipyridine (1/2)

Author: Li
Min Shao
Ming-Xing Li
Najafpour
Ning Zhang
Sheldrick
Wang
Xue-Min Shi
Publication venue: International Union of Crystallography
Publication date: 01/08/2009

The hydrothermal reaction of butane-1,2,3,4-tetracarboxylic acid (H4butca), 4,4′-bipyridine (bipy) and Mn(SO4)2·H2O afforded a new co-crystal, C8H10O8·2C10H8N2 or H4butca·2(bipy), in which strong O—H⋯N hydrogen-bonding and weak π–π stacking [centroid–centroid distance = 3.8459 (19) Å] interactions assemble the organic molecules into a three-dimensional supramolecular framework. C—H⋯O interactions are also present. The whole molecule has inversion symmetry

Crossref

Directory of Open Access Journals

PubMed Central

Discriminating bipartite mixed states by local operations

Author: Fei Shao-Ming
Lai Le-Min
Wang Zhi-Xi
Zhang Fu-Lin
Zhang Jin-Hua
Publication venue: American Physical Society (APS)
Publication date: 01/01/2020

Unambiguous state discrimination of two mixed bipartite states via local operations and classical communications (LOCC) is studied and compared with the result of a scheme realized via global measurement. We show that the success probability of a global scheme for mixed-state discrimination can be achieved perfectly by the local scheme. In addition, we simulate this discrimination via a pair of pure entangled bipartite states. This simulation is perfect for local rather than global schemes due to the existence of entanglement and global coherence in the pure states. We also prove that the LOCC protocol and sequential state discrimination (SSD) can be interpreted in a unified view. We then hybridize the LOCC protocol with three protocols (SSD, reproducing and broadcasting) relying on classical communications. Such hybridizations extend the gaps between the optimal success probability of global and local schemes, which can be eliminated only for the SSD rather than the other two protocols

arXiv.org e-Print Archive

MPG.PuRe

Molecular phylogeny of the antiangiogenic and neurotrophic serpin, pigment epithelium derived factor in vertebrates

Author: Barnstable Colin J
Tombran-Tink Joyce
Xu Xuming
Zhang Samuel Shao-Min
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Pigment epithelium derived factor (PEDF), a member of the serpin family, regulates cell proliferation, promotes survival of neurons, and blocks growth of new blood vessels in mammals. Defining the molecular phylogeny of PEDF by bioinformatic analysis is one approach to understanding the link between its gene structure and its function in these biological processes. RESULTS: From a comprehensive search of available DNA databases we identified a single PEDF gene in all vertebrate species examined. These included four mammalian and six non-mammalian vertebrate species in which PEDF had not previously been described. A five-gene cluster around PEDF was found in an approximately 100 kb region in mammals, birds, and amphibians. In ray-finned fish these genes are scattered over three chromosomes, although only one PEDF gene was consistently found. The PEDF gene is absent in invertebrates including Drosophila melanogaster (D. melanogaster), Caenorhabditis elegans (C. elegans), and sea squirt (C. intestinalis). The PEDF gene is transcribed in all vertebrate phyla, suggesting it is biologically active throughout vertebrate evolution. The multiple actions of PEDF are likely conserved in evolution since it has the same gene structure across phyla, although the size of the gene ranges from 48.3 kb in X. tropicalis to 2.9 kb in fugu, with human PEDF at a size of 15.6 kb. A strong similarity in the proximal 200 bp of the PEDF promoter in mammals suggests the existence of a possible regulatory region across phyla. Using a non-synonymous/synonymous substitution rate ratio we show that mammalian and fish PEDFs have similar ratios of <0.13, reflecting strong purifying selection on the PEDF gene. A large number of repetitive transposable elements of the SINE and LINE class were found with random distribution in both the promoter and introns of mammalian PEDF.
CONCLUSION: The PEDF gene first appears in vertebrates and our studies suggest that the regulation and biological actions of this gene are preserved across vertebrates. This comprehensive analysis of the PEDF gene across phyla provides new information that will aid further characterization of common functional motifs of this serpin in biological processes

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

The therapeutic evaluation and mechanism on treating bronchial hyper-responsiveness cough by ziyinqingre prescription

Author: Cui Yi xin
Li Shao dan
Liu Yi
Yang Min hui
Zhang Jun Xiu
Zhang Yin
Publication venue: African Journals Online (AJOL)
Publication date: 06/09/2016

Objective: To assess the effects of Ziyinqingre prescription on airway resistance (Rrs), airway response threshold (Dmin), airway conductance (sGrs), and the inflammatory cytokines interleukin-4 (IL-4) and interferon-γ (IFN-γ) in patients with bronchial hyper-responsiveness (BHR) cough. Method: 84 subjects diagnosed with BHR were randomly divided into a Traditional Chinese Medicine group (42 subjects) and a control group (42 subjects). The Traditional Chinese Medicine group received Ziyinqingre prescription twice a day and the control group received 10 mg Montelukast Sodium tablets once a day for two weeks. Improvement in clinical symptoms and changes in the levels of Rrs, Dmin, sGrs, IL-4, and IFN-γ were observed. Results: After treatment, the symptoms of the Chinese medicine group were clearly alleviated, and the outcome was more satisfactory than that of the control group. Compared with the control group, the level of Dmin increased and the sGrs level decreased more markedly (P<0.05); the level of IL-4 decreased and the IFN-γ level increased more markedly in the Chinese medicine group (P<0.05). Conclusion: Ziyinqingre prescription not only improves BHR patients’ symptoms but also reduces the level of bronchial responsiveness, indicating a better curative effect of Chinese medicine. The mechanism is probably the relief of airway inflammation through maintaining the balance between Th1 and Th2 cells. Keywords: Ziyinqingre prescription; cough; bronchial hyper-responsiveness; therapeutic mechanis

AJOL - African Journals Online

GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest

Author: Chen Kai
Chen Shoufa
Luo Ping
Shao Wenqi
Sun Peize
Xiao Min
Zhang Shilong
Zhang Wenwei
Publication date: 07/07/2023

Instruction tuning large language models (LLMs) on image-text pairs has achieved unprecedented vision-language multimodal abilities. However, their vision-language alignments are built only at the image level; the lack of region-level alignment limits their progress toward fine-grained multimodal understanding. In this paper, we propose instruction tuning on region-of-interest. The key design is to reformulate the bounding box in the format of a spatial instruction. The interleaved sequences of visual features extracted by the spatial instruction and the language embedding are input to the LLM, which is trained on the transformed region-text data in instruction tuning format. Our region-level vision-language model, termed GPT4RoI, brings a brand-new conversational and interactive experience beyond image-level understanding. (1) Controllability: Users can interact with our model by both language and spatial instructions to flexibly adjust the detail level of the question. (2) Capacities: Our model supports not only single-region spatial instruction but also multi-region. This unlocks more region-level multimodal capacities such as detailed region caption and complex region reasoning. (3) Composition: Any off-the-shelf object detector can be a spatial instruction provider so as to mine informative object attributes from our model, like color, shape, material, action, relation to other objects, etc. The code, data, and demo can be found at https://github.com/jshilong/GPT4RoI. Comment: Code has been released at https://github.com/jshilong/GPT4Ro

arXiv.org e-Print Archive

Recommended from our members

A Robust Gene Expression Prognostic Signature for Overall Survival in High-Grade Serous Ovarian Cancer.

Author: Hang Bo
Jin Yu-Lan
Mao Jian-Hua
Snijders Antoine M
Wang Pin
Xiong Guang-Wu
Yang Shao-Min
Zhang Xiao-Wei
Zhao Yue
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

The objective of this research was to develop a robust gene expression-based prognostic signature and scoring system for predicting overall survival (OS) of patients with high-grade serous ovarian cancer (HGSOC). Transcriptomic data of HGSOC patients were obtained from six independent studies in the NCBI GEO database. Genes significantly deregulated and associated with OS in HGSOCs were selected using GEO2R and Kaplan-Meier analysis with log-rank testing, respectively. Enrichment analysis for biological processes and pathways was performed using Gene Ontology analysis. A resampling/cross-validation method with Cox regression analysis was used to identify a novel gene expression-based signature associated with OS, and a prognostic scoring system was developed and further validated in nine independent HGSOC datasets. We first identified 488 significantly deregulated genes in HGSOC patients, of which 232 were found to be significantly associated with their OS. These genes were significantly enriched for cell cycle division, epithelial cell differentiation, p53 signaling pathway, vasculature development, and other processes. A novel 11-gene prognostic signature was identified and a prognostic scoring system was developed, which robustly predicted OS in HGSOC patients in 100 sampling test sets. The scoring system was further validated successfully in nine additional HGSOC public datasets. In conclusion, our integrative bioinformatics study combining transcriptomic and clinical data established an 11-gene prognostic signature for robust and reproducible prediction of OS in HGSOC patients. This signature could be of clinical value for guiding therapeutic selection and individualized treatment

eScholarship - University of California
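
As a rough illustration of how a gene expression-based prognostic scoring system of the kind described in the ovarian cancer abstract above is commonly applied, the sketch below computes a weighted sum of expression values and splits patients at the median score. Everything specific here is an assumption made for illustration: the gene names, the coefficients, the patient values, and the median cutoff are hypothetical and are not the 11-gene signature or the scoring rule from the study.

```python
from statistics import median

# Hypothetical Cox regression coefficients (log hazard ratios) per gene.
# These are placeholders, not the genes or weights reported in the study.
COEFFS = {"GENE_A": 0.8, "GENE_B": -0.5, "GENE_C": 0.3}

def risk_score(expression, coeffs=COEFFS):
    """Linear predictor: sum of coefficient * expression over signature genes."""
    return sum(beta * expression[gene] for gene, beta in coeffs.items())

def stratify(patients, coeffs=COEFFS):
    """Label each patient 'high' or 'low' risk relative to the median score."""
    scores = {pid: risk_score(expr, coeffs) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}

# Made-up normalized expression values for three patients.
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 0.0},
    "P2": {"GENE_A": 0.0, "GENE_B": 2.0, "GENE_C": 1.0},
    "P3": {"GENE_A": 1.0, "GENE_B": 1.0, "GENE_C": 0.0},
}

groups = stratify(patients)
print(groups)
```

In practice the coefficients would come from a fitted Cox model and the cutoff would be chosen and validated on independent datasets, as the abstract describes.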