7 research outputs found

    MiChao-HuaFen 1.0: A Specialized Pre-trained Corpus Dataset for Domain-specific Large Models

    With the advancement of deep learning technologies, general-purpose large models such as GPT-4 have demonstrated exceptional capabilities across various domains. Nevertheless, there remains a demand for high-quality, domain-specific outputs in areas like healthcare, law, and finance. This paper first evaluates the existing large models for specialized domains and discusses their limitations. To cater to the specific needs of certain domains, we introduce the "MiChao-HuaFen 1.0" pre-trained corpus dataset, tailored for the news and governmental sectors. The dataset, sourced from publicly available internet data from 2022, underwent multiple rounds of cleansing and processing to ensure high quality and reliable origins, with provisions for consistent and stable updates. This dataset not only supports the pre-training of large models for Chinese vertical domains but also aids in propelling deep learning research and applications in related fields. Comment: 4 pages, 2 figures

    Impact of sequential (first- to third-generation) EGFR-TKI treatment on corrected QT interval in NSCLC patients

    Objective: To evaluate the impact of sequential (first- to third-generation) epidermal growth factor receptor tyrosine kinase inhibitor (EGFR-TKI) treatment on top-corrected QT interval (top-QTc) in non-small cell lung cancer (NSCLC) patients. Methods: We retrospectively reviewed the medical records of NSCLC patients undergoing sequential EGFR-TKI treatment at Shanghai Chest Hospital between October 2016 and August 2021. The heart rate (HR), top-QT interval, and top-QTc of their ECGs were extracted from the institutional database and analyzed. Logistic regression was performed to identify predictors for top-QTc prolongation. Results: Overall, 228 patients were enrolled. Compared with baseline (median, 368 ms, same below), both first-generation (376 ms vs. 368 ms, p < 0.001) and sequential third-generation EGFR-TKIs (376 ms vs. 368 ms, p = 0.002) prolonged top-QT interval to a similar extent (p = 0.635). Top-QTc (438 ms vs. 423 ms, p < 0.001) and HR (81 bpm vs. 79 bpm, p = 0.008) increased after first-generation EGFR-TKI treatment. Further top-QTc prolongation (453 ms vs. 438 ms, p < 0.001) and HR increase (88 bpm vs. 81 bpm, p < 0.001) occurred after treatment advanced. Notably, as HR elevated during treatment, top-QT interval paradoxically increased rather than decreased, and the top-QTc increased rather than slightly fluctuated. Moreover, such phenomena were more significant after treatment advanced. After adjusting for confounding factors, pericardial effusion and lower serum potassium levels were independent predictors of additional QTc prolongation during sequential third-generation EGFR-TKI treatment. Conclusion: First-generation EGFR-TKI could prolong top-QTc, and sequential third-generation EGFR-TKI induced further prolongation. Top-QT interval paradoxically increased and top-QTc significantly increased as HR elevated, which was more significant after sequential EGFR-TKI treatment. Pericardial effusion and lower serum potassium levels were independent predictors of additional QTc prolongation after sequential EGFR-TKI treatment.

    Transcriptome-Wide Identification and Expression Profiling of SPX Domain-Containing Members in Responses to Phosphorus Deprivation of Pinus massoniana

    The SPX domain-encoding proteins are believed to play important roles in phosphorus (Pi) homeostasis and signal transduction in plants. However, the overall profile of SPXs and their responses to phosphorus deficiency in pines remain undefined. In this study, we screened the transcriptome data of Pinus massoniana in response to phosphorus deprivation. Ten SPX domain-containing genes were identified. Based on the conserved domains, the P. massoniana SPX genes were divided into four subfamilies: SPX, SPX-MFS, SPX-EXS, and SPX-RING. RNA-seq analysis revealed that PmSPX genes were differentially expressed in response to phosphorus deprivation. Furthermore, real-time quantitative PCR (RT-qPCR) showed that PmSPX1 and PmSPX4 had different expression patterns in different tissues under phosphorus stress. A 2284 bp promoter sequence upstream of PmSPX1 was obtained by the genome walking method. A cis-element analysis indicated that the PmSPX1 promoter contained several phosphorus stress response-related elements (e.g., two P1BS elements, a PHO element, and a W-box). In addition, the previously obtained PmSPX2 promoter sequence contained a W-box, and yeast one-hybrid analysis in this study showed that PmWRKY75 could directly bind to the PmSPX2 promoter. These results reveal the foundational functions of PmSPXs in maintaining plant phosphorus homeostasis.

    InternLM2 Technical Report

    The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI). However, replicating such advancements in open-source models has been challenging. This paper introduces InternLM2, an open-source LLM that outperforms its predecessors in comprehensive evaluations across 6 dimensions and 30 benchmarks, long-context modeling, and open-ended subjective evaluations through innovative pre-training and optimization techniques. The pre-training process of InternLM2 is meticulously detailed, highlighting the preparation of diverse data types including text, code, and long-context data. InternLM2 efficiently captures long-term dependencies, initially trained on 4k tokens before advancing to 32k tokens in pre-training and fine-tuning stages, exhibiting remarkable performance on the 200k "Needle-in-a-Haystack" test. InternLM2 is further aligned using Supervised Fine-Tuning (SFT) and a novel Conditional Online Reinforcement Learning from Human Feedback (COOL RLHF) strategy that addresses conflicting human preferences and reward hacking. By releasing InternLM2 models in different training stages and model sizes, we provide the community with insights into the model's evolution.