38 research outputs found

    Benchmarking Large Language Models in Retrieval-Augmented Generation

    Retrieval-Augmented Generation (RAG) is a promising approach for mitigating the hallucination of large language models (LLMs). However, existing research lacks rigorous evaluation of the impact of retrieval-augmented generation on different large language models, which makes it challenging to identify the potential bottlenecks in the capabilities of RAG for different LLMs. In this paper, we systematically investigate the impact of Retrieval-Augmented Generation on large language models. We analyze the performance of different large language models in 4 fundamental abilities required for RAG: noise robustness, negative rejection, information integration, and counterfactual robustness. To this end, we establish the Retrieval-Augmented Generation Benchmark (RGB), a new corpus for RAG evaluation in both English and Chinese. RGB divides the instances within the benchmark into 4 separate testbeds based on the fundamental ability required to resolve each case. We then evaluate 6 representative LLMs on RGB to diagnose the challenges of current LLMs when applying RAG. Evaluation reveals that while LLMs exhibit a certain degree of noise robustness, they still struggle significantly with negative rejection, information integration, and dealing with false information. These outcomes indicate that there is still a considerable journey ahead to effectively apply RAG to LLMs.

    A novel fireworks factor and improved elite strategy based on back propagation neural networks for state-of-charge estimation of lithium-ion batteries.

    The state of charge (SOC) of a lithium-ion battery is one of the key parameters of the battery management system. In SOC estimation, the Back Propagation (BP) neural network algorithm tends to converge to a local optimum, which leads to low estimation accuracy. The Fireworks Elite Genetic Algorithm (FEG-BP) is proposed to optimize the BP neural network; it not only addresses the tendency of the traditional neural network algorithm to fall into a local optimum but also overcomes the limitations of the traditional neural network algorithm. The search ability of the improved algorithm is significantly enhanced, the error is smaller, and the propagation speed is faster. Using experimental charging and discharging data, the proposed FEG-BP neural network is compared with the traditional genetic neural network algorithm (GA-BP), and the results are analyzed. The results show that the standard genetic BP neural network algorithm keeps the prediction error within 7%, while FEG-BP reduces the error to within 3%.

    An improved rainflow algorithm combined with linear criterion for the accurate li-ion battery residual life prediction.

    Li-ion battery health assessment has been widely used in electric vehicles, unmanned aerial vehicles, and other fields. In this paper, a new linear prediction method is proposed. By weakening the sensitivity of the Rainflow algorithm to peak data, the algorithm can be applied to batteries: it accurately counts the number of Li-ion battery cycles and skips the cumbersome step of parameter identification. Then, a linear criterion based on the idea of proportion is proposed, which makes the life prediction of Li-ion batteries linear. Verified on multiple sets of data, the prediction error of this method is kept within 2.53%. The method is computationally efficient and simple to apply, and provides a new idea for battery life prediction in the fields of electric vehicles and aerospace.

    SWE-SPHysics Simulation of Dam Break Flows at South-Gate Gorges Reservoir

    This paper applies a Smoothed Particle Hydrodynamics (SPH) approach to solve the Shallow Water Equations (SWEs) in order to study practical dam-break flows. The computational program is based on the open source code SWE-SPHysics, where a Monotone Upstream-centered Scheme for Conservation Laws (MUSCL) reconstruction method is used to improve the Riemann solution with Lax-Friedrichs flux. A virtual boundary particle method is applied to treat the solid boundary. The model is first tested on two benchmark collapses of water columns with a downstream obstacle. Subsequently, the model is applied to forecast a prototype dam-break flood that might occur in the South-Gate Gorges Reservoir area of Qinghai Province, China. The results show that the SWE-SPH modeling approach could provide a promising simulation tool for practical dam-break flows at engineering scale.
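    For readers unfamiliar with the flux named in this abstract, the following is a minimal sketch of a Lax-Friedrichs-type numerical flux for the 1D shallow water equations. It is not taken from SWE-SPHysics: it assumes a simple two-state (left/right) interface rather than SPH particle pairs, uses the local (Rusanov) variant in which the dissipation coefficient is the largest local wave speed, and omits the MUSCL reconstruction. The names `physical_flux` and `lax_friedrichs_flux` are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def velocity(h, hu):
    """Depth-averaged velocity u = hu / h, with dry states treated as at rest."""
    return hu / h if h > 0.0 else 0.0


def physical_flux(h, hu):
    """Exact flux of the 1D shallow water equations for the state (h, hu)."""
    u = velocity(h, hu)
    return (hu, hu * u + 0.5 * G * h * h)


def lax_friedrichs_flux(left, right):
    """Local Lax-Friedrichs (Rusanov) flux between two states (h, hu).

    Assumption: the dissipation coefficient is the largest local wave
    speed |u| + sqrt(g h) of the two states, not the global dx/dt value.
    """
    h_l, hu_l = left
    h_r, hu_r = right
    speed_l = abs(velocity(h_l, hu_l)) + math.sqrt(G * h_l)
    speed_r = abs(velocity(h_r, hu_r)) + math.sqrt(G * h_r)
    a = max(speed_l, speed_r)
    f_l = physical_flux(h_l, hu_l)
    f_r = physical_flux(h_r, hu_r)
    # Central average of the physical fluxes minus a dissipation term.
    return tuple(
        0.5 * (fl + fr) - 0.5 * a * (ur - ul)
        for fl, fr, ul, ur in zip(f_l, f_r, left, right)
    )


# Idealized dam break: still water, 2 m deep on the left, 1 m on the right.
mass_flux, momentum_flux = lax_friedrichs_flux((2.0, 0.0), (1.0, 0.0))
```

    In the dam-break call above, the mass flux is positive, meaning water moves from the deeper side to the shallower side. When both states are identical, the dissipation term vanishes and the function returns the physical flux.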

    Benchmarking Knowledge-Enhanced Commonsense Question Answering via Knowledge-to-Text Transformation

    A fundamental ability of humans is to utilize commonsense knowledge in language understanding and question answering. In recent years, many knowledge-enhanced Commonsense Question Answering (CQA) approaches have been proposed. However, it remains unclear: (1) How far can we get by exploiting external knowledge for CQA? (2) How much of the potential of knowledge has been exploited in current CQA models? (3) Which are the most promising directions for future CQA? To answer these questions, we benchmark knowledge-enhanced CQA by conducting extensive experiments on multiple standard CQA datasets using a simple and effective knowledge-to-text transformation framework. Experiments show that: (1) Our knowledge-to-text framework is effective and achieves state-of-the-art performance on the CommonsenseQA dataset, providing a simple and strong knowledge-enhanced baseline for CQA; (2) The potential of knowledge is still far from being fully exploited in CQA: there is a significant performance gap between current models and our models with golden knowledge; and (3) Context-sensitive knowledge selection, heterogeneous knowledge exploitation, and commonsense-rich language models are promising CQA directions.