31 research outputs found

    Baichuan 2: Open Large-scale Language Models

    Full text link
    Large language models (LLMs) have demonstrated remarkable performance on a variety of natural language tasks based on just a few examples of natural language instructions, reducing the need for extensive feature engineering. However, most powerful LLMs are closed-source or limited in their capability for languages other than English. In this technical report, we present Baichuan 2, a series of large-scale multilingual language models containing 7 billion and 13 billion parameters, trained from scratch on 2.6 trillion tokens. Baichuan 2 matches or outperforms other open-source models of similar size on public benchmarks like MMLU, CMMLU, GSM8K, and HumanEval. Furthermore, Baichuan 2 excels in vertical domains such as medicine and law. We will release all pre-training model checkpoints to benefit the research community in better understanding the training dynamics of Baichuan 2. Comment: Baichuan 2 technical report. GitHub: https://github.com/baichuan-inc/Baichuan

    A Retrospective Cohort Study: Association Of Sex And Race Differences On 30-Day Retention In Addiction Treatment For Substance Use Disorders After Discharge

    No full text
    Objective: Little is known about the association of sex and racial differences with access to SUD treatment, or whether these disparities persist despite hospitalization. The main objective was to determine the association of race and sex differences with 30-day retention in addiction treatment for all substance use disorders after discharge. Design: A retrospective cohort study using the Yale Addiction Medicine Consult Service database. Subjects: Patients with SUD who were hospitalized at Yale New Haven Hospital from Oct 26th, 2018 to Jun 30th, 2021 were eligible, yielding 2557 observations. Methods: Mixed-effect multivariable logistic regression of 30-day retention in addiction treatment after discharge on sex and race, with the following predictors: gender, race, age, ethnicity, history of medication treatment, housing status, transportation, length of stay, primary substance use disorder, infective endocarditis, and HCV. Results: In the multivariable model, no association was found between sex (OR 1.2, 95% CI 0.92-1.57, p=0.18) or race (OR 1.33, 95% CI 0.96-1.84, p=0.08; OR 0.60, 95% CI 0.26-1.40, p=0.24) differences and 30-day retention in addiction treatment. However, in the bivariate model, both history of medication treatment and primary substance use disorder showed a statistically significant association with attending treatment. These two covariates could be the primary contributors to the race differences in treatment retention. Conclusion: There is no significant association between sex differences and 30-day retention in treatment. Future studies are necessary to understand how patients are influenced by history of medication treatment and primary substance use disorder; the effectiveness of SUD hospitalization treatment in addressing disparities also merits study.

    Neural Question Generation with Answer Pivot

    No full text
    Neural question generation (NQG) is the task of generating questions from a given context with deep neural networks. Previous answer-aware NQG methods suffer from the problem that the generated answers focus on entities and most of the questions are trivial to answer. Answer-agnostic NQG methods reduce the bias towards named entities and increase the model's degrees of freedom, but sometimes generate unanswerable questions, which are not valuable for the subsequent machine reading comprehension system. In this paper, we treat the answers as the hidden pivot for question generation and combine the question generation and answer selection processes in a joint model. We achieve state-of-the-art results on the SQuAD dataset according to automatic metrics and human evaluation.

    MoS2

    No full text

    Tasty Burgers, Soggy Fries: Probing Aspect Robustness in Aspect-Based Sentiment Analysis

    Full text link
    Aspect-based sentiment analysis (ABSA) aims to predict the sentiment towards a specific aspect in the text. However, existing ABSA test sets cannot be used to probe whether a model can distinguish the sentiment of the target aspect from the non-target aspects. To solve this problem, we develop a simple but effective approach to enrich ABSA test sets. Specifically, we generate new examples to disentangle the confounding sentiments of the non-target aspects from the target aspect's sentiment. Based on the SemEval 2014 dataset, we construct the Aspect Robustness Test Set (ARTS) as a comprehensive probe of the aspect robustness of ABSA models. Over 92% of the ARTS data show high fluency and the desired sentiment on all aspects by human evaluation. Using ARTS, we analyze the robustness of nine ABSA models, and observe, surprisingly, that their accuracy drops by up to 69.73%. We explore several ways to improve aspect robustness, and find that adversarial training can improve models' performance on ARTS by up to 32.85%. Our code and new test set are available at https://github.com/zhijing-jin/ARTS_TestSet Comment: EMNLP 2020, long paper

    How to Evaluate the Rice Cultivation Suitability?

    No full text
    To allocate farmland resources rationally and plan the farming industry scientifically, we take Yizheng City in Jiangsu Province as the research object and select 13 indicators. Based on the Farmland Resources Management Information System of Yizheng City, we establish an AHP model and a membership function model to evaluate the suitability of farmland for rice. The results show that highly suitable areas account for 10.2% of the total farmland area, suitable areas for 56.08%, marginally suitable areas for 25.50%, and unsuitable areas for 8.22%. There is a significant positive correlation between the surveyed actual rice yield and the suitability index obtained through the evaluation (R2=0.1964, 319 samples); the actual rice yield in the highly suitable areas is higher than in the marginally suitable and suitable areas, and the rice yield is lowest in the unsuitable areas.