104 research outputs found

    How well do Large Language Models perform in Arithmetic tasks?

    Full text link
    Large language models have developed emergent abilities, including chain-of-thought reasoning, that let them answer math word problems step by step. Solving math word problems requires not only decomposing the problem via chain-of-thought but also calculating the arithmetic expression at each step correctly. To the best of our knowledge, no prior work focuses on evaluating the arithmetic ability of large language models. In this work, we propose an arithmetic dataset, MATH 401, to test the latest large language models, including GPT-4, ChatGPT, InstructGPT, Galactica, and LLaMA, on various arithmetic expressions, and we provide a detailed analysis of their abilities. MATH 401 and the evaluation code are released at https://github.com/GanjinZero/math401-llm.
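    Evaluation of this kind reduces to prompting a model with an expression and checking its answer against the ground truth. As a rough illustration (not the actual MATH 401 harness), a minimal grader might look like the sketch below; `query_model`, the prompt format, and the tolerance are all assumptions for illustration.

```python
# Minimal sketch of arithmetic evaluation for an LLM (not the MATH 401 harness).
import re

def query_model(prompt: str) -> str:
    """Hypothetical stand-in for any chat/completions API call."""
    raise NotImplementedError("plug in your LLM API call here")

def last_number(text: str) -> float | None:
    """Extract the last number in the model's reply as the predicted answer."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", text)
    return float(matches[-1]) if matches else None

def evaluate(expressions: list[str], rel_tol: float = 1e-3) -> float:
    """Return accuracy of the model over a list of arithmetic expressions."""
    correct = 0
    for expr in expressions:
        truth = eval(expr)  # expressions are trusted arithmetic strings here
        reply = query_model(f"Calculate: {expr} =")
        pred = last_number(reply)
        if pred is not None and abs(pred - truth) <= rel_tol * max(1.0, abs(truth)):
            correct += 1
    return correct / len(expressions)
```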

    Crustal and Upper Mantle Structure Beneath the Northeastern Tibetan Plateau from Joint Analysis of Receiver Functions and Rayleigh Wave Dispersions

    Get PDF
    The crustal and upper mantle velocity structure of the northeastern Tibetan Plateau is obtained from joint analysis of receiver functions and Rayleigh wave dispersions. The resulting velocity model reveals a close correlation between the thick (>60 km) crust and the presence of an intracrustal low-velocity zone beneath the Qiangtang and Songpan-Ganzi terranes as well as the northwestern Qilian orogen. However, a high crustal Vp/Vs ratio is found only beneath the Qiangtang and Songpan-Ganzi terranes. The crustal low-velocity zone does not appear in the west Qinling and southeastern Qilian orogens, which have relatively thin (∼50 km) crust, indicating that crustal channel flow is not the primary mechanism by which the northeastern Tibetan Plateau grows. A continuous low-velocity zone from the mid-to-lower crust down to 160 km beneath the eastern Kunlun fault suggests induced local mantle upwelling after partial detachment of the lithosphere.

    RRHF: Rank Responses to Align Language Models with Human Feedback without tears

    Full text link
    Reinforcement Learning from Human Feedback (RLHF) facilitates the alignment of large language models with human preferences, significantly enhancing the quality of interactions between humans and these models. InstructGPT implements RLHF through several stages, including Supervised Fine-Tuning (SFT), reward model training, and Proximal Policy Optimization (PPO). PPO, however, is sensitive to hyperparameters and requires a minimum of four models in its standard implementation, which makes it hard to train. In contrast, we propose a novel learning paradigm called RRHF, which scores responses generated by different sampling policies and learns to align them with human preferences through a ranking loss. RRHF can efficiently align language model output probabilities with human preferences as robustly as fine-tuning, and it needs only one or two models during tuning. In addition, RRHF can be considered an extension of SFT and reward modeling while being simpler than PPO in terms of coding, model counts, and hyperparameters. The entire alignment process can be accomplished within a single RRHF training session. We evaluate RRHF using LLaMA and Alpaca on Helpful and Harmless data, demonstrating performance comparable to PPO. Comment: Code available at https://github.com/GanjinZero/RRHF
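    The core of the approach is a ranking loss over length-normalized response log-probabilities plus a fine-tuning loss on the best response. A minimal PyTorch sketch of such an objective is shown below; tensor shapes and the exact reduction are illustrative rather than taken from the RRHF codebase.

```python
# Sketch of an RRHF-style objective for k sampled responses to one query.
import torch

def rrhf_loss(logprobs: torch.Tensor, lengths: torch.Tensor,
              rewards: torch.Tensor) -> torch.Tensor:
    """logprobs: (k,) summed token log-probs of k responses under the model;
    lengths: (k,) token counts; rewards: (k,) human/reward-model scores."""
    scores = logprobs / lengths                          # length-normalized p_i
    diff = scores.unsqueeze(1) - scores.unsqueeze(0)     # diff[i, j] = p_i - p_j
    worse = rewards.unsqueeze(1) < rewards.unsqueeze(0)  # pairs with r_i < r_j
    # Penalize any lower-reward response the model scores above a higher one.
    rank_loss = torch.clamp(diff, min=0)[worse].sum()
    # Cross-entropy (negative log-likelihood) on the best-rewarded response.
    ft_loss = -logprobs[rewards.argmax()]
    return rank_loss + ft_loss
```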

    #InsTag: Instruction Tagging for Analyzing Supervised Fine-tuning of Large Language Models

    Full text link
    Foundation language models acquire instruction-following ability through supervised fine-tuning (SFT). Diversity and complexity are considered critical factors of a successful SFT dataset, yet their definitions remain obscure and lack quantitative analysis. In this work, we propose InsTag, an open-set fine-grained tagger, to tag samples within SFT datasets based on semantics and intentions, and we define instruction diversity and complexity in terms of these tags. We obtain 6.6K tags describing a comprehensive range of user queries. We then analyze popular open-source SFT datasets and find that model ability grows with more diverse and complex data. Based on this observation, we propose an InsTag-based data selector to pick 6K diverse and complex samples from open-source datasets and fine-tune models on the selected data. The resulting models, TagLM, outperform open-source models trained on considerably larger SFT datasets, as evaluated by MT-Bench, echoing the importance of query diversity and complexity. We open-source InsTag at https://github.com/OFA-Sys/InsTag.
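    In this setting, complexity can be proxied by how many tags a query carries and diversity by how many distinct tags a selected subset covers. The sketch below shows one plausible greedy selector in that spirit; it is not the actual InsTag selector, and the per-sample tag sets are assumed to be precomputed by the tagger.

```python
# Illustrative tag-based data selector (not the actual InsTag implementation).
def select(samples: list[dict], budget: int) -> list[dict]:
    """Each sample is {'query': str, 'tags': set[str]}. Greedily prefer
    samples that are complex (many tags) and add unseen tags (diversity)."""
    chosen, covered = [], set()
    # Complexity proxy: visit samples with the most tags first.
    pool = sorted(samples, key=lambda s: len(s["tags"]), reverse=True)
    for sample in pool:
        if len(chosen) >= budget:
            break
        if sample["tags"] - covered:   # contributes at least one new tag
            chosen.append(sample)
            covered |= sample["tags"]
    return chosen
```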

    Scaling Relationship on Learning Mathematical Reasoning with Large Language Models

    Full text link
    Mathematical reasoning is a challenging task for large language models (LLMs), and its scaling relationship with respect to LLM capacity is under-explored. In this paper, we investigate how pre-training loss, supervised data amount, and augmented data amount influence the reasoning performance of a supervised LLM. We find that pre-training loss is a better indicator of the model's performance than the model's parameter count. We apply supervised fine-tuning (SFT) with different amounts of supervised data and empirically find a log-linear relation between data amount and model performance; better models improve less as the supervised dataset grows. To augment more data samples for improving model performance without any human effort, we propose Rejection sampling Fine-Tuning (RFT). RFT uses supervised models to generate and collect correct reasoning paths as augmented fine-tuning datasets. We find that when the augmented samples contain more distinct reasoning paths, RFT improves mathematical reasoning performance more. We also find that RFT brings more improvement for less performant LLMs. Furthermore, combining rejection samples from multiple models pushes LLaMA-7B to an accuracy of 49.3% on GSM8K, significantly outperforming the supervised fine-tuning (SFT) accuracy of 35.9%. Comment: Working in Progress
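    The data-collection step of RFT amounts to sampling many reasoning paths per problem, rejecting those with wrong final answers, and keeping distinct correct paths as fine-tuning data. Below is a rough sketch under simplifying assumptions: the paper deduplicates paths by their equations, while this sketch dedupes by normalized text, and the `sample_paths` and `final_answer` callables are hypothetical stand-ins supplied by the user.

```python
# Sketch of RFT-style augmented data collection (not the exact implementation).
from typing import Callable, Iterable

def collect_rft_data(problems: Iterable[dict],
                     sample_paths: Callable[..., list[str]],
                     final_answer: Callable[[str], str],
                     k: int = 100) -> list[dict]:
    """problems: dicts with 'question' and gold 'answer' fields."""
    augmented = []
    for prob in problems:
        seen = set()
        for path in sample_paths(prob["question"], n=k, temperature=0.7):
            if final_answer(path) != prob["answer"]:
                continue                      # reject incorrect reasoning paths
            key = " ".join(path.split())      # crude distinctness check
            if key not in seen:
                seen.add(key)
                augmented.append({"question": prob["question"],
                                  "response": path})
    return augmented
```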

    Redco: A Lightweight Tool to Automate Distributed Training of LLMs on Any GPU/TPUs

    Full text link
    The recent progress of AI can be largely attributed to large language models (LLMs). However, their escalating memory requirements introduce challenges for machine learning (ML) researchers and engineers. Addressing this requires developers to partition a large model and distribute it across multiple GPUs or TPUs, which demands considerable coding and intricate configuration with existing model-parallel tools such as Megatron-LM, DeepSpeed, and Alpa. These tools require expertise in machine learning systems (MLSys), creating a bottleneck in LLM development, particularly for developers without an MLSys background. In this work, we present Redco, a lightweight and user-friendly tool crafted to automate distributed training and inference for LLMs, as well as to simplify ML pipeline development. The design of Redco emphasizes two key aspects. First, to automate model parallelism, our study identifies two straightforward rules to generate tensor parallel strategies for any given LLM. Integrating these rules into Redco enables effortless distributed LLM training and inference, eliminating the need for additional coding or complex configuration. We demonstrate its effectiveness by applying Redco to a set of LLM architectures, such as GPT-J, LLaMA, T5, and OPT, at sizes up to 66B. Second, we propose a mechanism that allows diverse ML pipelines to be customized by defining merely three functions, eliminating redundant and formulaic code such as multi-host processing. This mechanism proves adaptable across a spectrum of ML algorithms, from foundational language modeling to complex algorithms like meta-learning and reinforcement learning. Consequently, Redco implementations contain far fewer lines of code than their official counterparts. Comment: Released under Apache License 2.0 at https://github.com/tanyuqian/redco
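    Redco's actual API and parallelization rules are not reproduced here, but the tensor parallelism such tools automate can be illustrated generically: a weight matrix is split column-wise across devices, each device computes its slice of the output, and the slices are gathered. The NumPy sketch below demonstrates only that idea, not Redco's interface.

```python
# Generic illustration of column-wise tensor parallelism for one linear layer.
import numpy as np

def split_linear(W: np.ndarray, n_devices: int) -> list[np.ndarray]:
    """Partition weight columns so each device holds one shard W[:, shard]."""
    return np.array_split(W, n_devices, axis=1)

def parallel_forward(x: np.ndarray, shards: list[np.ndarray]) -> np.ndarray:
    # Each device computes x @ shard independently; outputs concatenate along
    # the feature axis (an all-gather in a real multi-device system).
    return np.concatenate([x @ W_i for W_i in shards], axis=-1)

# Sanity check: the sharded result matches the unsharded computation.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 6))
x = rng.normal(size=(2, 8))
shards = split_linear(W, n_devices=3)
assert np.allclose(parallel_forward(x, shards), x @ W)
```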

    Case Report: Chlamydia psittaci pneumonia complicated by Guillain-Barré syndrome detected using metagenomic next-generation sequencing

    Get PDF
    Psittacosis and Guillain-Barré syndrome are both rare clinical diseases, and their combination has rarely been reported. Here, we report a case of Chlamydia psittaci pneumonia complicated by Guillain-Barré syndrome. The patient initially presented with high fever, difficulty breathing, and fatigue. Chest computerised tomography indicated large consolidation opacities in both lungs. Metagenomic next-generation sequencing clearly identified the pathogen as C. psittaci. The patient's fever subsided after targeted antibiotic treatment, but the difficulty breathing and fatigue worsened, and the patient developed symmetric limb numbness and weakness. Lumbar puncture, electrophysiological examination, and the clinical characteristics were suggestive of Guillain-Barré syndrome, and the symptoms improved after treatment with human immunoglobulin. The results of this study suggest that metagenomic next-generation sequencing is useful for the rapid diagnosis of pulmonary infectious agents. Psittacosis may be closely associated with the development of Guillain-Barré syndrome, although more cases are needed to support this conclusion; early targeted antibiotic treatment, immunotherapy, and basic supportive treatment are essential for improving outcomes.

    Clinical significance of a point mutation in DNA polymerase beta (POLB) gene in gastric cancer.

    Get PDF
    Gastric cancer (GC) is a major cause of global cancer mortality. Genetic variations in DNA repair genes can modulate DNA repair capability and, consequently, have been associated with the risk of developing cancer. We previously identified a T to C point mutation at nucleotide 889 (T889C) in the DNA polymerase beta (POLB) gene, which encodes a key enzyme in base excision repair, in primary GCs. The purpose of this study was to evaluate the mutation and expression of POLB in a larger cohort and to identify possible prognostic roles of POLB alterations in GC. Primary GC specimens and their matched normal adjacent tissues were collected at the time of surgery. DNA, RNA, and protein samples were isolated from GC specimens and cell lines. Mutations were detected by PCR-RFLP/DHPLC and sequencing analysis. POLB gene expression was examined by RT-PCR, tissue microarray, Western blotting, and immunofluorescence assays. The function of the mutation was evaluated by chemosensitivity, MTT, Transwell Matrigel invasion, and host cell reactivation assays. The T889C mutation was detected in 18 (10.17%) of 177 GC patients and was associated with POLB overexpression, lymph node metastases, and poor tumor differentiation. In addition, patients with the mutation had significantly shorter survival times than those without it following postoperative chemotherapy. Furthermore, cell lines with the T889C mutation in the POLB gene were more resistant to treatment with 5-fluorouracil, cisplatin, and epirubicin than those with wild-type POLB. Forced expression of the POLB gene with the T889C mutation resulted in enhanced cell proliferation, invasion, and resistance to anticancer drugs, along with increased DNA repair capability. These results suggest that the POLB gene with the T889C mutation in surgically resected primary gastric tissues may be clinically useful for predicting responsiveness to chemotherapy in patients with GC. The POLB alteration may serve as a prognostic biomarker for GC.

    Clinicopathological characteristics of gastrointestinal schwannomas: A retrospective analysis of 78 cases

    Get PDF
    Introduction: Schwannomas are tumors arising from Schwann cells of the neural sheath and rarely occur in the gastrointestinal tract. The aim of the present study was to analyze the clinicopathological features and treatment outcomes of gastrointestinal schwannomas (GISs). Methods: Patients diagnosed with GISs in our hospital from January 2010 to December 2021 were selected. Data on demographic characteristics, clinical symptoms, treatment methods and outcomes, pathological results, and follow-up results were retrospectively collected and analyzed. Results: A total of 78 patients with 79 GISs were included; the female-to-male ratio was 55:23, and the average age was 52.12 ± 12.26 years. One-third (26/78) of the patients were asymptomatic. A total of 79 GISs were removed, with an average size of 3.63 ± 2.03 cm (range, 0.3–10 cm). As for tumor location, 54 GISs were in the stomach, 14 in the esophagus, 2 in the duodenum, 6 in the colorectum (4 in the colon and 2 in the rectum), and the remaining 3 in the small intestine. A total of 23 and 55 patients underwent endoscopic and surgical resection, respectively. Compared with surgical resection, endoscopic resection was associated with smaller tumor diameter, lower cost, and shorter hospital stay. Pathological results revealed that S100 was positive in all GISs. No recurrence was noticed during a median follow-up of 45 months (range, 6–148 months). Conclusion: GISs are rare gastrointestinal tumors with favorable prognoses, most commonly seen in the stomach and diagnosed by pathological findings with immunohistochemical staining. Surgical resection remains the standard method for removing GISs, while endoscopic resection may serve as an alternative for selected patients and may be attempted for GISs with a diameter of <3 cm and no signs of malignancy.