The Life Cycle of Knowledge in Big Language Models: A Survey
Knowledge plays a critical role in artificial intelligence. Recently, the extensive success of pre-trained language models (PLMs) has attracted significant attention to how knowledge can be acquired, maintained, updated and used by language models. Despite the enormous amount of related studies, there is still no unified view of how knowledge circulates within language models throughout the learning, tuning, and application processes, which may prevent us from understanding the connections between current lines of progress or from recognizing existing limitations. In this survey, we revisit PLMs as knowledge-based systems by dividing the life cycle of knowledge in PLMs into five critical periods and investigating how knowledge circulates as it is built, maintained and used. To this end, we systematically review existing studies of each period of the knowledge life cycle, summarize the main challenges and current limitations, and discuss future directions.
Comment: paperlist: https://github.com/c-box/KnowledgeLifecycl
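One concrete way the "knowledge use" period discussed above is commonly examined in the literature is cloze-style probing of a masked PLM. Below is a minimal sketch of that idea, assuming a Hugging Face fill-mask pipeline and an illustrative query; neither the model name nor the query is prescribed by the survey.

```python
# Minimal cloze-style probing sketch (illustrative only; the model and the
# query are assumptions, not artifacts of the survey itself).
from transformers import pipeline

# Load a masked language model and ask it to fill a factual cloze query.
probe = pipeline("fill-mask", model="bert-base-uncased")

query = "The capital of France is [MASK]."
for prediction in probe(query, top_k=3):
    # Each prediction contains the filled-in token and the model's probability.
    print(prediction["token_str"], round(prediction["score"], 3))
```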
Benchmarking Large Language Models in Retrieval-Augmented Generation
Retrieval-Augmented Generation (RAG) is a promising approach for mitigating the hallucination of large language models (LLMs). However, existing research lacks rigorous evaluation of the impact of retrieval-augmented generation on different large language models, which makes it challenging to identify the potential bottlenecks in the capabilities of RAG for different LLMs. In this paper, we systematically investigate the impact of Retrieval-Augmented Generation on large language models. We analyze the performance of different large language models in 4 fundamental abilities required for RAG: noise robustness, negative rejection, information integration, and counterfactual robustness. To this end, we establish the Retrieval-Augmented Generation Benchmark (RGB), a new corpus for RAG evaluation in both English and Chinese. RGB divides the instances within the benchmark into 4 separate testbeds based on which of the aforementioned fundamental abilities is required to resolve each case. We then evaluate 6 representative LLMs on RGB to diagnose the challenges of current LLMs when applying RAG. The evaluation reveals that while LLMs exhibit a certain degree of noise robustness, they still struggle significantly with negative rejection, information integration, and dealing with false information. These assessment outcomes indicate that there is still a considerable way to go before RAG can be applied effectively to LLMs.
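To make the testbed setup concrete, here is a rough sketch of how a noise-robustness instance of this kind could be scored: relevant and noisy documents are mixed into the prompt at a chosen ratio, and the generation is checked for a gold answer string. The instance fields and the query_llm() hook are hypothetical placeholders, not the released RGB code.

```python
# Illustrative sketch of scoring one noise-robustness instance. Field names
# (question, positive_docs, noisy_docs, answers) and query_llm() are assumed
# for the example, not taken from the benchmark's actual data format.
import random

def build_prompt(question, positive_docs, noisy_docs, noise_ratio=0.5):
    """Blend relevant and irrelevant documents at the given noise ratio."""
    k = len(positive_docs) + len(noisy_docs)
    n_noise = int(k * noise_ratio)
    docs = positive_docs[: k - n_noise] + noisy_docs[:n_noise]
    random.shuffle(docs)
    context = "\n".join(f"Document {i + 1}: {d}" for i, d in enumerate(docs))
    return f"{context}\n\nQuestion: {question}\nAnswer:"

def score_instance(instance, query_llm):
    """Return 1 if any gold answer string appears in the model output."""
    prompt = build_prompt(instance["question"],
                          instance["positive_docs"],
                          instance["noisy_docs"])
    output = query_llm(prompt)  # any chat/completion call can be plugged in here
    return int(any(ans.lower() in output.lower() for ans in instance["answers"]))
```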
Toward Unified Controllable Text Generation via Regular Expression Instruction
Controllable text generation is a fundamental aspect of natural language generation, with numerous methods proposed for different constraint types. However, these approaches often require significant architectural or decoding modifications, making them difficult to apply to additional constraints or to combinations of constraints. To address this, our paper introduces Regular Expression Instruction (REI), which utilizes an instruction-based mechanism to fully exploit the advantages of regular expressions and uniformly model diverse constraints. Specifically, REI supports all popular fine-grained controllable generation constraints, i.e., lexical, positional, and length, as well as their complex combinations, via regular expression-style instructions. Our method only requires fine-tuning on medium-scale language models or few-shot, in-context learning on large language models, and requires no further adjustment when applied to various constraint combinations. Experiments demonstrate that our straightforward approach yields high success rates and adaptability to various constraints while maintaining competitiveness in automatic metrics and outperforming most previous baselines.
Comment: Accepted at IJCNLP-AACL 202
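As a rough illustration of the regular-expression-style idea (not the paper's released prompts or templates), a constraint can be written as a pattern, handed to the model inside the instruction, and then reused to verify whether a generation satisfies it.

```python
# Sketch of expressing and checking a lexical + length constraint as a regex.
# The instruction wording is an assumption for illustration; only the
# verification step below is actually executed here.
import re

# Constraint: the sentence must mention "climate" and contain 5-20 words.
pattern = r"^(?=(?:\S+\s+){4,19}\S+$)(?=.*\bclimate\b).*$"

# The pattern would be embedded in the instruction given to the model.
instruction = f"Write one sentence whose text matches this regular expression: {pattern}"

def satisfies(text: str) -> bool:
    """Verify a candidate generation against the constraint pattern."""
    return re.match(pattern, text.strip()) is not None

# Hand-written candidates stand in for model outputs in this sketch.
print(satisfies("Rising climate risks are reshaping how coastal cities plan."))  # True
print(satisfies("Too short."))                                                   # False
```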