TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise
Large Language Models (LLMs) exhibit impressive reasoning and data
augmentation capabilities in various NLP tasks. However, what about small
models? In this work, we propose TeacherLM-7.1B, capable of annotating relevant
fundamentals, chain of thought, and common mistakes for most NLP samples, which
makes annotation more than just an answer, thus allowing other models to learn
"why" instead of just "what". The TeacherLM-7.1B model achieved a zero-shot
score of 52.3 on MMLU, surpassing most models with over 100B parameters. Even
more remarkable is its data augmentation ability. Based on TeacherLM-7.1B, we
augmented 58 NLP datasets and taught various student models with different
parameters from OPT and BLOOM series in a multi-task setting. The experimental
results indicate that the data augmentation provided by TeacherLM has brought
significant benefits. We will release the TeacherLM series of models and
augmented datasets as open-source. Comment: 5 figures, 15 pages
Genome-wide identification and expression analysis of 3-ketoacyl-CoA synthase gene family in rice (Oryza sativa L.) under cadmium stress
3-Ketoacyl-CoA synthase (KCS) is the key rate-limiting enzyme for the synthesis of very long-chain fatty acids (VLCFAs) in plants and determines the carbon chain length of VLCFAs. However, a comprehensive study of KCSs in Oryza sativa has not yet been reported. In this study, we identified 22 OsKCS genes in rice, which are unevenly distributed on nine chromosomes. The OsKCS gene family is divided into six subclasses. Many cis-acting elements related to plant growth, light, hormone, and stress response were enriched in the promoters of OsKCS genes. Gene duplication played a crucial role in the expansion of the OsKCS gene family, which has undergone strong purifying selection. Quantitative real-time polymerase chain reaction (qRT-PCR) results revealed that most KCS genes are constitutively expressed. We also found that KCS genes responded differently to exogenous cadmium stress in japonica and indica backgrounds, and that the KCS genes with higher expression in leaves and seeds may have functions under cadmium stress. This study provides a basis for further understanding the functions of KCS genes and the biosynthesis of VLCFAs in rice.
The complete chloroplast genomes of Camellia chrysanthoides and Camellia achrysantha
Camellia chrysanthoides H. T. Chang and Camellia achrysantha H. T. Chang et S. Y. Liang are two threatened yellow camellia species endemic to southwestern Guangxi, China. Here, we report the complete chloroplast (cp) genomes of C. chrysanthoides and C. achrysantha for the first time. The total cp genome of C. chrysanthoides is 156,959 bp and contains a large single-copy (LSC, 86,564 bp) region, a small single-copy (SSC, 18,267 bp) region, and a pair of inverted repeat (IR, 26,064 bp) regions. The cp genome of C. achrysantha is 156,658 bp and includes an LSC region of 86,249 bp, SSC region of 18,243 bp, and two IR regions of 26,083 bp each. Both C. chrysanthoides and C. achrysantha have 136 genes, including 93 protein-coding genes, 35 tRNA genes, and eight rRNA genes.
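The region lengths reported above can be cross-checked with simple arithmetic, since a quadripartite chloroplast genome consists of one LSC, one SSC, and two IR copies. The following is a minimal sketch of that check, not part of the original study; the names `GENOMES`, `quadripartite_length`, and `check_genomes` are hypothetical and introduced only for illustration, and the numbers are copied from the abstract.

```python
# Region lengths in bp, copied from the abstract above.
# The dictionary layout and all names here are illustrative assumptions.
GENOMES = {
    "C. chrysanthoides": {"LSC": 86_564, "SSC": 18_267, "IR": 26_064, "total": 156_959},
    "C. achrysantha": {"LSC": 86_249, "SSC": 18_243, "IR": 26_083, "total": 156_658},
}

# Gene categories reported for both species.
GENE_COUNTS = {"protein_coding": 93, "tRNA": 35, "rRNA": 8, "total": 136}


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite genome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


def check_genomes(genomes: dict) -> dict:
    """Map each species to True if its regions sum to the reported total."""
    return {
        name: quadripartite_length(g["LSC"], g["SSC"], g["IR"]) == g["total"]
        for name, g in genomes.items()
    }


if __name__ == "__main__":
    for species, consistent in check_genomes(GENOMES).items():
        print(species, "consistent" if consistent else "inconsistent")
    parts = sum(v for k, v in GENE_COUNTS.items() if k != "total")
    print("gene categories sum to total:", parts == GENE_COUNTS["total"])
```

Under these assumptions, both reported genome totals and the gene count are internally consistent with their components.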