64 research outputs found
Evaluating Large Language Models: A Comprehensive Survey
Large language models (LLMs) have demonstrated remarkable capabilities across
a broad spectrum of tasks. They have attracted significant attention and been
deployed in numerous downstream applications. Nevertheless, akin to a
double-edged sword, LLMs also present potential risks. They could suffer from
private data leaks or yield inappropriate, harmful, or misleading content.
Additionally, the rapid progress of LLMs raises concerns about the potential
emergence of superintelligent systems without adequate safeguards. To
effectively capitalize on LLM capacities as well as ensure their safe and
beneficial development, it is critical to conduct a rigorous and comprehensive
evaluation of LLMs.
This survey endeavors to offer a panoramic perspective on the evaluation of
LLMs. We categorize the evaluation of LLMs into three major groups: knowledge
and capability evaluation, alignment evaluation and safety evaluation. In
addition to the comprehensive review on the evaluation methodologies and
benchmarks on these three aspects, we collate a compendium of evaluations
pertaining to LLMs' performance in specialized domains, and discuss the
construction of comprehensive evaluation platforms that cover LLM evaluations
on capabilities, alignment, safety, and applicability.
We hope that this comprehensive overview will stimulate further research
interests in the evaluation of LLMs, with the ultimate goal of making
evaluation serve as a cornerstone in guiding the responsible development of
LLMs. We envision that this will channel their evolution into a direction that
maximizes societal benefit while minimizing potential risks. A curated list of
related papers has been made publicly available at
https://github.com/tjunlp-lab/Awesome-LLMs-Evaluation-Papers.
Comment: 111 pages
Deep Fuzzy Tree for Large-Scale Hierarchical Visual Classification
Deep learning models often use a flat softmax layer to classify samples after feature extraction in visual classification tasks. However, it is hard to find the true label among a massive number of classes in a single decision. In this scenario, hierarchical classification has proved to be an effective solution and can be used to replace the softmax layer. A key issue in hierarchical classification is constructing a good label structure, which strongly affects classification performance. Several works have addressed this issue, but they have limitations and are mostly designed heuristically. In this paper, inspired by fuzzy rough set theory, we propose a deep fuzzy tree model that learns a better tree structure and classifiers for hierarchical classification with theoretical guarantees. Experimental results show the effectiveness and efficiency of the proposed model on various visual classification datasets.
Superior energy density through tailored dopant strategies in multilayer ceramic capacitors
The Gerson–Marshall (1959) relationship predicts an increase in dielectric breakdown strength (BDS), and therefore in recoverable energy density (Wrec), with decreasing dielectric layer thickness. This relationship holds, however, only if the total resistivity of the dielectric is sufficiently high and the electrical microstructure is homogeneous (no short-circuit diffusion paths). BiFeO3–SrTiO3 (BF–ST) is a promising base for developing high energy density capacitors, but Bi-rich compositions, which have the highest polarisability per unit volume, are ferroelectric rather than relaxor and are electrically too conductive. Here, we present a systematic strategy to optimise BDS and maximum polarisation via: (i) Nb-doping to increase resistivity by eliminating hole conduction and promoting electrical homogeneity and (ii) alloying with a third perovskite end-member, BiMg2/3Nb1/3O3 (BMN), to reduce long-range polar coupling without decreasing the average ionic polarisability. These strategies result in an increase in BDS to give Wrec = 8.2 J cm−3 at 460 kV cm−1 for BF–ST–0.03Nb–0.1BMN ceramics, which, when incorporated in a multilayer capacitor with dielectric layers of 8 μm thickness, gives BDS > 1000 kV cm−1 and Wrec = 15.8 J cm−3.
Mechanism of enhanced energy storage density in AgNbO3-based lead-free antiferroelectrics
The mechanisms underpinning high energy storage density in lead-free Ag1–3xNdxTayNb1-yO3 antiferroelectric (AFE) ceramics have been investigated. Rietveld refinements of in-situ synchrotron X-ray data reveal that the structure remains quadrupled and orthorhombic under electric field (E) but adopts a non-centrosymmetric space group, Pmc21, in which the cations exhibit a ferrielectric configuration. Nd and Ta doping both stabilize the AFE structure, thereby increasing the AFE-ferrielectric switching field from 150 to 350 kV cm−1. Domain size and correlation length of AFE/ferrielectric coupling reduce with Nd doping, leading to slimmer hysteresis loops. The maximum polarization (Pmax) is optimized through A-site aliovalent doping which also decreases electrical conductivity, permitting the application of a larger E. These effects combine to enhance energy storage density to give Wrec = 6.5 J cm−3 for Ag0.97Nd0.01Ta0.20Nb0.80O3
Overcoming Wnt–β-catenin dependent anticancer therapy resistance in leukaemia stem cells
Leukaemia stem cells (LSCs) underlie cancer therapy resistance but targeting these cells remains difficult. The Wnt–β-catenin and PI3K–Akt pathways cooperate to promote tumorigenesis and resistance to therapy. In a mouse model in which both pathways are activated in stem and progenitor cells, LSCs expanded under chemotherapy-induced stress. Since Akt can activate β-catenin, inhibiting this interaction might target therapy-resistant LSCs. High-throughput screening identified doxorubicin (DXR) as an inhibitor of the Akt–β-catenin interaction at low doses. Here we repurposed DXR as a targeted inhibitor rather than a broadly cytotoxic chemotherapy. Targeted DXR reduced Akt-activated β-catenin levels in chemoresistant LSCs and reduced LSC tumorigenic activity. Mechanistically, β-catenin binds multiple immune-checkpoint gene loci, and targeted DXR treatment inhibited expression of multiple immune checkpoints specifically in LSCs, including PD-L1, TIM3 and CD24. Overall, LSCs exhibit distinct properties of immune resistance that are reduced by inhibiting Akt-activated β-catenin. These findings suggest a strategy for overcoming cancer therapy resistance and immune escape
Study on Serviceability of Transition Section Between Road and Tunnel
The determination of the allowable differential settlement in a bridge transition is a key problem in preventing vehicle jump at the bridge head, but there are few theoretical research achievements in this area at home and abroad. In this paper, four different road surface structures of the Sanyangchuan tunnel and the lead project are studied. The allowable differential settlement of asphalt pavement is calculated using an asphalt pavement layered system, and the allowable differential settlement is calculated using the Laplace transform.
Brain Iron Metabolism, Redox Balance and Neurological Diseases
The incidence of neurological diseases, such as Parkinson’s disease, Alzheimer’s disease and stroke, is increasing. An increasing number of studies have correlated these diseases with brain iron overload and the resulting oxidative damage. Brain iron deficiency has also been closely linked to neurodevelopment. These neurological disorders seriously affect the physical and mental health of patients and bring heavy economic burdens to families and society. Therefore, it is important to maintain brain iron homeostasis and to understand the mechanism of brain iron disorders affecting reactive oxygen species (ROS) balance, resulting in neural damage, cell death and, ultimately, leading to the development of disease. Evidence has shown that many therapies targeting brain iron and ROS imbalances have good preventive and therapeutic effects on neurological diseases. This review highlights the molecular mechanisms, pathogenesis and treatment strategies of brain iron metabolism disorders in neurological diseases
Solvent- And Base-Free Oxidation of 5-Hydroxymethylfurfural over a PdO/AlPO4-5 Catalyst under Mild Conditions
A solvent-free method was proposed to upgrade the biomass-derived compound 5-hydroxymethylfurfural (HMF). The oxidation of HMF to produce 2,5-furandicarboxylic acid (FDCA) has been examined in the presence of O2 without the addition of solvent and base. In contrast to the reaction in H2O solvent, where conversion of the aldehyde group on HMF is the initial oxidation step, the hydroxyl group on HMF was oxidized first and FDCA was finally generated without the addition of solvent. The role of O2 is to replenish the active oxygen species consumed on the catalyst surface. The oxidation of HMF to FDCA proceeded due to the solvent-free effect. An 83.6% FDCA selectivity at 38.8% HMF conversion was measured with a PdO/AlPO4-5 catalyst at 80 °C for 5 h, and a reaction mechanism was proposed.
Funding Information: Financial support from the Natural Science Foundation of China under Contracts 21808163 and 21690083 is gratefully acknowledged. Publisher Copyright: © 2021 American Chemical Society. Peer reviewed
- …