94 research outputs found

    Beyond Static Datasets: A Deep Interaction Approach to LLM Evaluation

    Large Language Models (LLMs) have made progress in various real-world tasks, which stimulates requirements for the evaluation of LLMs. Existing LLM evaluation methods are mainly supervised signal-based which depends on static datasets and cannot evaluate the ability of LLMs in dynamic real-world scenarios where deep interaction widely exists. Other LLM evaluation methods are human-based which are costly and time-consuming and are incapable of large-scale evaluation of LLMs. To address the issues above, we propose a novel Deep Interaction-based LLM-evaluation framework. In our proposed framework, LLMs' performances in real-world domains can be evaluated from their deep interaction with other LLMs in elaborately designed evaluation tasks. Furthermore, our proposed framework is a general evaluation method that can be applied to a host of real-world tasks such as machine translation and code generation. We demonstrate the effectiveness of our proposed method through extensive experiments on four elaborately designed evaluation tasks

    Identifiable Cognitive Diagnosis with Encoder-decoder for Modelling Students' Performance

    Cognitive diagnosis aims to diagnose students' knowledge proficiencies based on their response scores on exam questions, which is the basis of many domains such as computerized adaptive testing. Existing cognitive diagnosis models (CDMs) follow a proficiency-response paradigm, which views diagnostic results as learnable embeddings that are the cause of students' responses and learns the diagnostic results through optimization. However, such a paradigm can easily lead to unidentifiable diagnostic results and the explainability overfitting problem, which is harmful to the quantification of students' learning performance. To address these problems, we propose a novel identifiable cognitive diagnosis framework. Specifically, we first propose a flexible diagnostic module that directly diagnoses identifiable and explainable examinee traits and question features from response logs. Next, we leverage a general predictive module to reconstruct response logs from the diagnostic results to ensure the preciseness of the latter. We furthermore propose an implementation of the framework, i.e., ID-CDM, to demonstrate the availability of the former. Finally, we demonstrate the identifiability, explainability and preciseness of diagnostic results of ID-CDM through experiments on four public real-world datasets

    How to manifest the fertilizer reduction effect of pro-environmental agricultural technologies? From the perspective of farmers’ perception and behavioral adoption

    Introduction: The ecological and environmental pollution problem at the source of agriculture cannot be ignored, and the manifestation of the fertilizer reduction effect of pro-environmental agricultural technologies (PEATs) will help motivate farmers to adopt technology, thereby promoting sustainable agricultural development. Methods: From the dual perspectives of farmers’ perception and behavior effects, this paper uses 607 survey data of Chinese farmers, and an endogenous switching regression model is employed to identify the influencing factors of farmers’ adoption of PEATs and manifest its fertilizer reduction effect. Results and discussion: The results of the perception survey show that the farmers’ recognition of the fertilizer reduction effect of PEATs is not high, and the technical effect needs to be further demonstrated. Moreover, the estimated results suggest that PEATs can significantly reduce the fertilizer application of farmers. Specifically, if farmers who have adopted PEATs had not adopted them, they would apply more chemical fertilizer; conversely, farmers who have not adopted PEATs would use less chemical fertilizer if they did. Overall, the main influencing factors for farmers adopting PEATs include education level, government officials, cultivated land area, soil fertility, information access channels, and the distance of home-agricultural technology station. This study aims to provide empirical evidence for the formulation of strategies and plans to promote sustainable agricultural development

    Molecular vasculogenic mimicry–Related signatures predict clinical outcomes and therapeutic responses in bladder cancer: Results from real-world cohorts

    Bladder cancer (BLCA) is a heterogeneous disease, and there are many classical molecular subtypes that reflect tumor immune microenvironment (TME) heterogeneity but their clinical utility is limited and correct individual treatment and prognosis cannot be predicted based on them. To find reliable and effective biomarkers and tools for predicting patients’ clinical responses to several therapies, we developed a new systemic indicator of molecular vasculogenic mimicry (VM)–related genes mediated by molecular subtypes based on the Xiangya cohort and additional external BLCA cohorts using a random forest algorithm. A correlation was then done between the VM_Score and classical molecular subtypes, clinical outcomes, immunophenotypes, and treatment options for BLCA. With the VM_Score, it is possible to predict classical molecular subtypes, immunophenotypes, prognosis, and therapeutic potential of BLCA with high accuracy. The VM_Scores of high levels indicate a more anticancer immune response but a worse prognosis due to a more basal and inflammatory phenotype. The VM_Score was also found associated with low sensitivity to antiangiogenic and targeted therapies targeting the FGFR3, β-catenin, and PPAR-γ pathways but with high sensitivity to cancer immunotherapy, neoadjuvant chemotherapy, and radiotherapy. A number of aspects of BLCA biology were reflected in the VM_Score, providing new insights into precision medicine. Additionally, the VM_Score may be used as an indicator of pan-cancer immunotherapy response and prognosis

    OR-008 ERK-BAX signaling is involved in GLP-1-mediated antidepressant effects of metformin and exercise in CUMS mice

    Objective: Both depression itself and antidepressant medication have been reported to be significantly related to the risk of type 2 diabetes mellitus (T2DM). Glucagon-like peptide-1 (GLP-1), a treatment target for T2DM, has a neuroprotective effect. As an enhancer and sensitiser of GLP-1, metformin has been reported to be safe for neurodevelopment. The present study aimed to determine whether and how GLP-1 mediates antidepressant effects of metformin and exercise in mice. Methods: Male C57BL/6 mice were exposed to chronic unpredictable mild stress (CUMS) for 8 weeks. From the 4th week, CUMS mice were subjected to oral metformin treatment and/or treadmill running. A videocomputerized tracking system was used to record behaviors of mice for a 5-min session. ELISA, western blotting and immunohistochemistry were used to examine gene expression in mouse serum or hippocampus. Results: Our results supported the validity of metformin as a useful antidepressant; moreover, treadmill running favored metformin effects on exploratory behaviors and serum corticosterone levels. CUMS reduced GLP-1 protein levels and phosphorylation levels of extracellular signal-regulated kinase 1/2 (ERK1/2), but increased protein levels of B-cell lymphoma 2-associated X-protein (BAX) in mouse hippocampus. All these changes were restored by both single and combined treatment with metformin and exercise. Conclusions: Our findings have demonstrated that ERK-BAX signaling is involved in GLP-1-mediated antidepressant effects of metformin and exercise, which may provide a novel topic for future clinical research

    Co-exposure to multiple vitamins and the risk of all-cause mortality in patients with diabetes

    Objective: Although the effect of vitamins on the risk of mortality in diabetic patients has been reported, most studies focus on individual vitamins. However, humans are often exposed to multiple vitamins simultaneously in daily life. Therefore, it is worth exploring the effects of co-exposure to multiple vitamins on the risk of mortality in diabetic patients. Methods: This study included diabetic patients aged ≥20 years who participated in NHANES from 2003 to 2006. An unsupervised K-means clustering method was used to cluster eight vitamins in serum into several patterns of co-exposure to multiple vitamins, and the Cox proportional hazards model was used to evaluate the impact of different patterns of co-exposure to multiple vitamins on the risk of all-cause mortality in diabetic patients. Results: Three patterns of co-exposure to multiple vitamins were generated based on K-means clustering, namely, low-level, moderate-level, and high-level. Among the 484 diabetic patients, with a median follow-up of 13.7 years, a total of 211 deaths occurred. After adjusting for covariates, the individual vitamins had varying effects on the risk of all-cause mortality in diabetic patients. Compared to the low-level group of co-exposure to multiple vitamins, the high-level group significantly reduced the risk of all-cause mortality in diabetic patients, with a HR of 0.42 (95% CI: 0.20, 0.87). Subgroup analysis demonstrated that high levels of co-exposure to multiple vitamins significantly reduced the risk of all-cause mortality in males, individuals aged ≥ 60 years, and non-Hispanic White people with diabetes compared to the low-level group, with HR of 0.42 (95% CI: 0.18, 0.98), 0.53 (95% CI: 0.26, 0.98), and 0.26 (95% CI: 0.12, 0.58) respectively. Conclusion: While individual vitamins had different effects on the risk of all-cause mortality in patients with diabetes, high-level co-exposure to multiple vitamins significantly reduced the risk of all-cause mortality in patients with diabetes, with differences observed among genders, ages, and race. This suggests that when developing vitamin intervention strategies for patients with diabetes, consideration should be given not only to the dosage of individual vitamins but also to the variations between different population groups

    Bulked segregant RNA-seq reveals complex resistance expression profile to powdery mildew in wild emmer wheat W762

    Powdery mildew, caused by Blumeria graminis f. sp. tritici (Bgt), is one of the most destructive fungal diseases threatening global wheat production. Exploring powdery mildew resistance (Pm) gene(s) and dissecting the molecular mechanism of the host resistance are critical to effectively and reasonably control this disease. Durum wheat (Triticum turgidum L. var. durum Desf.) is an important gene donor for wheat improvement against powdery mildew. In this study, a resistant durum wheat accession W762 was used to investigate its potential resistance component(s) and profile its expression pattern in responding to Bgt invasion using bulked segregant RNA-Seq (BSR-Seq) and further qRT-PCR verification. Genetic analysis showed that the powdery mildew resistance in W762 did not meet monogenic inheritance and a complex genetic model might exist within the population of W762 × Langdon (susceptible durum wheat). After BSR-Seq, 6,196 consistently different single nucleotide polymorphisms (SNPs) were called between resistant and susceptible parents and bulks, and among them, 763 SNPs were assigned to the chromosome arm 7B. Subsequently, 3,653 differentially expressed genes (DEGs) between resistant and susceptible parents and bulks were annotated and analyzed by Gene Ontology (GO), Cluster of Orthologous Groups (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment. Potential regulated genes were selected and their temporal expression patterns following Bgt inoculation were analyzed. As a result, nine disease-related genes showed distinctive expression profiles after Bgt invasion and might serve as potential targets to regulate the resistance against powdery mildew in W762. Our study could lay a foundation for analysis of the molecular mechanism and also provide potential targets for the improvement of durable resistance against powdery mildew