13 research outputs found

    On the Calibration of Large Language Models and Alignment

    As large language models attract increasing attention and find widespread application, challenges of reliability arise alongside them. Confidence calibration, an effective method for gauging the reliability of deep models, serves as a crucial tool for assessing and improving their reliability. However, the calibration of such models remains comparatively underexplored. In this work, we conduct a systematic examination of the calibration of aligned language models throughout the entire construction process, including pretraining and alignment training. At each stage, we investigate how different training settings, such as parameter scales and training data, affect model calibration. To thoroughly assess model calibration, we evaluate models on the three aspects of greatest concern: generation, factuality and understanding. Our work sheds light on whether popular LLMs are well-calibrated and how the training process influences model calibration. Comment: to be published in Findings of EMNLP-202
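As a rough illustration of what "confidence calibration" measures, the sketch below computes expected calibration error (ECE), one widely used metric. The abstract does not say which metric the authors use, so the choice of ECE, the function name, and the equal-width binning are all assumptions made for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Illustrative ECE: weighted mean gap between confidence and accuracy.

    confidences: predicted probabilities in [0, 1] for the chosen answers.
    correct:     booleans, True where the chosen answer was right.
    n_bins:      number of equal-width confidence bins (an assumption).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    # Group predictions into equal-width bins; confidence 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Each bin contributes its |confidence - accuracy| gap, weighted by size.
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece
```

A well-calibrated model has a value near zero: among answers given with 90% confidence, about 90% are correct.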

    Entity Structure Within and Throughout: Modeling Mention Dependencies for Document-Level Relation Extraction

    Entities, as the essential elements in relation extraction tasks, exhibit certain structure. In this work, we formulate such structure as distinctive dependencies between mention pairs. We then propose SSAN, which incorporates these structural dependencies within the standard self-attention mechanism and throughout the overall encoding stage. Specifically, we design two alternative transformation modules inside each self-attention building block to produce attentive biases so as to adaptively regularize its attention flow. Our experiments demonstrate the usefulness of the proposed entity structure and the effectiveness of SSAN. It significantly outperforms competitive baselines, achieving new state-of-the-art results on three popular document-level relation extraction datasets. We further provide ablation and visualization to show how the entity structure guides the model for better relation extraction. Our code is publicly available. Comment: Accepted to AAAI 202

    Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning

    Enhancing the instruction-following ability of Large Language Models (LLMs) primarily demands substantial instruction-tuning datasets. However, the sheer volume of these imposes a considerable computational burden and annotation cost. To investigate a label-efficient instruction tuning method that allows the model itself to actively sample subsets that are equally or even more effective, we introduce a self-evolving mechanism DiverseEvol. In this process, a model iteratively augments its training subset to refine its own performance, without requiring any intervention from humans or more advanced LLMs. The key to our data sampling technique lies in the enhancement of diversity in the chosen subsets, as the model selects new data points most distinct from any existing ones according to its current embedding space. Extensive experiments across three datasets and benchmarks demonstrate the effectiveness of DiverseEvol. Our models, trained on less than 8% of the original dataset, maintain or improve performance compared with finetuning on full data. We also provide empirical evidence to analyze the importance of diversity in instruction data and the iterative scheme as opposed to one-time sampling. Our code is publicly available at https://github.com/OFA-Sys/DiverseEvol.git
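The selection step described above (picking new data points most distinct from those already chosen, measured in the model's embedding space) can be read as a greedy farthest-point selection. The sketch below is one plausible interpretation under that assumption, not the authors' implementation; the function names, the Euclidean distance, and the toy embeddings are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_most_distinct(embeddings, selected, k):
    """Greedily pick up to k indices farthest from the already-selected set.

    embeddings: list of vectors, one per candidate data point.
    selected:   indices already in the training subset (must be non-empty).
    k:          number of new points to add in this round.
    """
    chosen = list(selected)
    if not chosen:
        raise ValueError("need at least one already-selected point")

    # Distance from every point to its nearest already-chosen point.
    min_dist = [min(euclidean(e, embeddings[j]) for j in chosen) for e in embeddings]

    new_points = []
    for _ in range(k):
        candidates = [i for i in range(len(embeddings)) if i not in chosen]
        if not candidates:
            break
        # The most "distinct" candidate is the one farthest from the chosen set.
        best = max(candidates, key=lambda i: min_dist[i])
        chosen.append(best)
        new_points.append(best)
        # Update nearest-chosen distances to account for the new pick.
        for i, e in enumerate(embeddings):
            min_dist[i] = min(min_dist[i], euclidean(e, embeddings[best]))
    return new_points


# Toy 2-D embeddings; index 0 is the current training subset.
toy = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0), (0.0, 3.0), (5.0, 4.0)]
picked = select_most_distinct(toy, selected=[0], k=2)
```

In an iterative scheme like the one the abstract describes, the embeddings would be recomputed with the retrained model before each new selection round.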

    Qwen Technical Report

    Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans. In this work, we introduce Qwen, the first installment of our large language model series. Qwen is a comprehensive language model series that encompasses distinct models with varying parameter counts. It includes Qwen, the base pretrained language models, and Qwen-Chat, the chat models finetuned with human alignment techniques. The base language models consistently demonstrate superior performance across a multitude of downstream tasks, and the chat models, particularly those trained using Reinforcement Learning from Human Feedback (RLHF), are highly competitive. The chat models possess advanced tool-use and planning capabilities for creating agent applications, showcasing impressive performance even when compared to bigger models on complex tasks like utilizing a code interpreter. Furthermore, we have developed coding-specialized models, Code-Qwen and Code-Qwen-Chat, as well as mathematics-focused models, Math-Qwen-Chat, which are built upon base language models. These models demonstrate significantly improved performance in comparison with open-source models, and fall slightly behind the proprietary models. Comment: 59 pages, 5 figures

    Endoscopic versus minimally invasive surgical approach for infected necrotizing pancreatitis: a systematic review and meta-analysis of randomized controlled trials

    Background/Aims: Acute pancreatitis is a common condition of the digestive system, but it sometimes develops into a severe form. In about 10–20% of patients, necrosis of the pancreas or its periphery occurs. Although most have aseptic necrosis, 30% of cases will develop infected necrotizing pancreatitis (INP). INP requires a critical treatment approach. The minimally invasive surgical approach (MIS) and endoscopy are the management methods. This meta-analysis compares the outcomes of MIS and endoscopic treatments. Methods: We searched a medical database until December 2022 to compare the results of endoscopic and MIS procedures for INP. We selected eligible randomized controlled trials (RCTs) that reported treatment complications for the meta-analysis. Results: Five RCTs comparing a total of 284 patients were included in the meta-analysis. Among them, 139 patients underwent MIS, while 145 underwent endoscopic procedures. The results showed significant differences (p  0.05) in the RR for death, bleeding, incisional hernia, percutaneous drainage, pancreatic endocrine deficiency, pancreatic exocrine deficiency, or the need for enzyme use. Conclusions: Endoscopic management of INP performs better than surgical treatment owing to its lower complication rate and higher patient quality of life

    Probing the Photonic Spin-Orbit Interactions in the Near Field of Nanostructures

    Photonic spin-orbit interactions (SOI) provide a new design paradigm for functional nanomaterials and nanostructures, and have especially accelerated advances in spin-orbit photonics. The Berry phase, or geometric phase, a salient property of SOI, plays a vital role in this process. Thus, the characterization of photonic SOI processes together with the Berry phase is in high demand for studies such as the optical spin-Hall effect, spin-to-vortex conversion, and the Rashba effect. Here, a spin-selective and phase-resolved near-field microscopic method is proposed and experimentally demonstrated for real-time probing and direct visualization of photonic SOI at the mesoscale, and a 3D tomographic technique for imaging the spatial evolution of the optical phases is also realized. By analyzing a metallic metasurface as a spin-to-vortex conversion platform, the abrupt geometric phase and the spatially evolving dynamic phases are directly measured and intuitively illustrated. This work provides a powerful tool for the study of spin-orbit phenomena in near-field optics, and holds promise for directly exploring the spin-dependent surface states in plasmonics and photonic topological insulators

    Inverse Association of Plasma Vanadium Concentrations with Gestational Diabetes Mellitus

    Vanadium compounds have been identified as beneficial for the control of glucose homeostasis. We aimed to explore the association of plasma vanadium (V) with gestational diabetes mellitus (GDM). We performed a case-control study including 252 newly diagnosed GDM cases and 252 controls matched by age, parity, and gestational age. Fasting blood samples were collected from each participant at GDM screening (≥24 weeks of gestation). The plasma concentrations of V were determined using inductively coupled plasma mass spectrometry. Plasma V levels were significantly lower in the GDM group than in the control group (p p p < 0.05). The present study indicates an inverse association of plasma V with GDM. Further prospective cohort studies are required to validate our results

    Differences in seed characteristics, germination and seedling growth of Suaeda salsa grown in intertidal zone and on saline inland

    The ecological restoration of saline land in the Yellow River Delta is essential for the sustainability of this region. Halophytic species, like Suaeda salsa, are critical for the restoration process. However, potential differences in traits of heteromorphic seeds collected from the intertidal zone and inland conditions have been largely overlooked. The seeds were analyzed for hardness, nutrient elements, and secretions, while structural differences were observed under a stereomicroscope. Germination percentages of the different seed types and subsequent seedling growth were also recorded. Our study found that the black seeds from the intertidal zone had the highest hardness compared with the three other seed types. Nutrient analysis revealed that brown seeds had a higher iron (Fe) content than black seeds. Accordingly, brown seed embryos were greener than their black seed counterparts due to iron's role in chlorophyll synthesis. Our results also revealed that brown seeds secreted greater amounts of exudates than black seeds. Finally, both the intertidal brown seeds and the inland-grown brown seeds had higher germination percentages and better early seedling growth than the corresponding black seeds. The differential characteristics between dimorphic seeds and seedlings may influence their environmental adaptation in different saline environments

    Tanshinone IIA Pretreatment Renders Free Flaps against Hypoxic Injury through Activating Wnt Signaling and Upregulating Stem Cell-Related Biomarkers

    Partial or total flap necrosis after flap transplantation is sometimes clinically encountered in reconstructive surgery, often as a result of a period of hypoxia that exceeds the tolerance of the flap tissue. In this study, we investigated whether tanshinone IIA (TSA) pretreatment can protect flap tissue against hypoxic injury and improve its viability. Primary epithelial cells isolated from the dorsal skin of mice were pretreated with TSA for two weeks. Cell counting kit-8 and Trypan Blue assays were carried out to examine the proliferation of TSA-pretreated cells after exposure to cobalt chloride. Polymerase chain reaction and Western blot analysis were then used to determine the expression of β-catenin, GSK-3β, SOX2, and OCT4 in TSA-treated cells. In vivo, after mice were pretreated with TSA for two weeks, a reproducible ischemic flap model was implemented, and the area of surviving tissue in the transplanted flaps was measured. Immunohistochemistry was also conducted to examine the related biomarkers mentioned above. Epidermal cells pretreated with TSA showed enhanced resistance to hypoxia. Activation of the Wnt signaling pathway in TSA-pretreated cells was characterized by the upregulation of β-catenin and the downregulation of GSK-3β. The expression of SOX2 and OCT4, which are controlled by Wnt signaling, was also higher in TSA-pretreated epithelial cells. In the reproducible ischemic flap model, pretreatment with TSA enhanced resistance to hypoxia and increased the area of surviving tissue in transplanted flaps. The expression of Wnt signaling pathway components, stem-cell related biomarkers, and CD34, which are involved in the regeneration of blood vessels, was also upregulated in TSA-pretreated flap tissue. The results show that TSA pretreatment protects free flaps against hypoxic injury and increases the area of surviving tissue by activating Wnt signaling and upregulating stem cell-related biomarkers