A Survey on Legal Question Answering Systems
Many legal professionals think that the explosion of information about local,
regional, national, and international legislation makes their practice more
costly, time-consuming, and even error-prone. There are two main reasons for
this: most legislation is unstructured, and the sheer volume and pace at which
laws are released cause information overload in daily tasks. In the legal
domain, the research community agrees that a system able to generate automatic
responses to legal questions could have a substantial practical impact on
daily activities. The degree of usefulness is such that even a semi-automatic
solution could significantly reduce the workload to be faced. This is mainly
because a Question Answering system could automatically process a massive
amount of legal resources to answer a question or resolve a doubt in seconds,
which means it could save effort, money, and time for many professionals in
the legal sector. In this work, we quantitatively and qualitatively survey the
solutions that currently exist to meet this challenge.
Comment: 57 pages, 1 figure, 10 tables
Exploring the State of the Art in Legal QA Systems
Answering questions related to the legal domain is a complex task, primarily
due to the intricate nature and diverse range of legal document systems.
Providing an accurate answer to a legal query typically requires specialized
knowledge of the relevant domain, which makes the task challenging even for
human experts. Question Answering (QA) systems are designed to generate
answers to questions asked in natural language. They use natural language
processing to understand questions and search through information to find
relevant answers. QA systems have various practical applications, including
customer service, education, research, and cross-lingual communication, but
they face challenges such as improving natural language understanding and
handling complex and ambiguous questions. At present, there is a lack of
surveys that discuss legal question answering. To address this gap, we provide
a comprehensive survey that reviews 14 benchmark datasets for question
answering in the legal field and presents a comprehensive review of
state-of-the-art legal question answering deep learning models. We cover the
different architectures and techniques used in these studies as well as the
performance and limitations of these models. Moreover, we have established a
public GitHub repository where we regularly upload the most recent articles,
open data, and source code. The repository is available at:
\url{https://github.com/abdoelsayed2016/Legal-Question-Answering-Review}
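
As an illustrative aside to the retrieve-then-answer pattern this abstract
describes, the following is a minimal, self-contained Python sketch that ranks
a toy corpus of provisions against a question. The corpus texts, function
names, and bag-of-words cosine scoring are all illustrative assumptions
standing in for the neural retrievers and readers the surveyed models use, not
components of any surveyed system.

import math
from collections import Counter

# Toy "corpus" of legal provisions; the texts are invented placeholders,
# not real statutes.
CORPUS = {
    "art_1": "The tenant must pay rent on the first day of each month.",
    "art_2": "The landlord must return the security deposit within thirty days.",
    "art_3": "Either party may terminate the lease with sixty days written notice.",
}

def tokens(text):
    # Crude lowercase tokenization; real systems use proper NLP pipelines.
    return text.lower().replace(".", " ").replace(",", " ").replace("?", " ").split()

def cosine(a, b):
    # Cosine similarity between two bags of words.
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def answer(question, corpus=CORPUS):
    # Retrieve step: rank provisions by similarity to the question and
    # return the best-matching one as the answer passage.
    q = tokens(question)
    return max(corpus.items(), key=lambda kv: cosine(q, tokens(kv[1])))

print(answer("When must the deposit be returned?"))
# -> ('art_2', 'The landlord must return the security deposit within thirty days.')

The deep learning models the survey covers replace the cosine scorer with
learned dense or cross-encoder relevance models, but the pipeline shape,
score every candidate passage and read off the top match, is the same.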
Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity
This survey addresses the crucial issue of factuality in Large Language
Models (LLMs). As LLMs find applications across diverse domains, the
reliability and accuracy of their outputs become vital. We define the
Factuality Issue as the probability that LLMs produce content inconsistent
with established facts. We first delve into the implications of these
inaccuracies, highlighting the potential consequences and challenges posed by
factual errors in LLM outputs. Subsequently, we analyze the mechanisms through
which LLMs store and process facts, seeking the primary causes of factual
errors. Our discussion then transitions to methodologies for evaluating LLM
factuality, emphasizing key metrics, benchmarks, and studies. We further
explore strategies for enhancing LLM factuality, including approaches tailored
to specific domains. We focus on two primary LLM configurations, standalone
LLMs and retrieval-augmented LLMs that utilize external data, and detail
their unique challenges and potential enhancements. Our survey offers a
structured guide for researchers aiming to fortify the factual reliability of
LLMs.
Comment: 62 pages; 300+ references
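
As a concrete illustration of the kind of metric such factuality evaluations
build on, here is a minimal Python sketch that scores the fraction of
generated claims supported by a reference fact set. Everything in it, the
triple format, the reference set, and the name fact_precision, is a
hypothetical toy rather than a metric defined in the survey; real evaluations
extract claims with NLP pipelines and check them against curated knowledge
bases or retrieved evidence.

# Reference facts as (subject, relation, object) triples; the triples and
# relation names are invented for illustration.
REFERENCE_FACTS = {
    ("paris", "capital_of", "france"),
    ("water", "boiling_point_celsius", "100"),
}

def fact_precision(claims, reference=REFERENCE_FACTS):
    # Share of extracted claims that appear verbatim in the reference set.
    if not claims:
        return 0.0
    supported = sum(1 for claim in claims if claim in reference)
    return supported / len(claims)

# Claims assumed to have been extracted from an LLM's output by some
# upstream information-extraction step (not shown here).
generated_claims = [
    ("paris", "capital_of", "france"),          # consistent with reference
    ("water", "boiling_point_celsius", "90"),   # factual error
]
print(fact_precision(generated_claims))  # -> 0.5

Exact-match against triples is the simplest possible instance; the benchmarks
the survey reviews soften this with entailment models, retrieval, and
human judgment.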
Scaling Laws for Forgetting When Fine-Tuning Large Language Models
We study and quantify the problem of forgetting when fine-tuning pre-trained
large language models (LLMs) on a downstream task. We find that
parameter-efficient fine-tuning (PEFT) strategies, such as Low-Rank Adapters
(LoRA), still suffer from catastrophic forgetting. In particular, we identify a
strong inverse linear relationship between the fine-tuning performance and the
amount of forgetting when fine-tuning LLMs with LoRA. We further obtain precise
scaling laws that show forgetting increases as a shifted power law in the
number of parameters fine-tuned and the number of update steps. We also examine
the impact of forgetting on knowledge, reasoning, and the safety guardrails
trained into Llama 2 7B chat. Our study suggests that forgetting cannot be
avoided through early stopping or by varying the number of parameters
fine-tuned. We believe this opens up an important safety-critical direction for
future research to evaluate and develop fine-tuning schemes which mitigate
forgettin
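
To make the stated functional form concrete, the following LaTeX sketch writes
out one generic shifted power law in the number of fine-tuned parameters N and
update steps S, together with the inverse linear relation. The specific
parameterization and the symbols A, B, E, alpha, a, b are illustrative
assumptions, not the fitted law reported in the paper.

% Generic shifted power law in fine-tuned parameters N and update steps S
% (illustrative form; A, B, E, \alpha are assumed symbols, not the paper's
% fitted constants):
\[
  \mathcal{L}_{\mathrm{forget}}(N, S) \;=\; A\,\bigl(B + N S\bigr)^{\alpha} - E,
  \qquad \alpha > 0.
\]
% And the reported inverse linear relationship, written here as forgetting
% growing as fine-tuning loss falls (slope b assumed):
\[
  \mathcal{L}_{\mathrm{forget}} \;\approx\; a - b\,\mathcal{L}_{\mathrm{finetune}},
  \qquad b > 0.
\]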
Memory-Aware Attentive Control for Community Question Answering With Knowledge-Based Dual Refinement