4 research outputs found
Memory-Augmented LLM Personalization with Short- and Long-Term Memory Coordination
Large Language Models (LLMs), such as GPT-3.5, have exhibited remarkable
proficiency in comprehending and generating natural language. However, their
unpersonalized generation paradigm may result in suboptimal user-specific
outcomes. Typically, users converse differently based on their knowledge and
preferences. This necessitates the task of enhancing user-oriented LLMs, which
remains underexplored. While one could fully train an LLM for this objective, the
resource consumption is prohibitive. Prior research has explored memory-based
methods to store and retrieve knowledge to enhance generation without
retraining for new queries. However, we contend that a memory module alone is
inadequate to capture a user's preferences, while fully training an LLM can be
excessively costly. In this study, we propose a novel computational bionic
memory mechanism, equipped with a parameter-efficient fine-tuning schema, to
personalize LLMs. Our extensive experimental results demonstrate the
effectiveness and superiority of the proposed approach. To encourage further
research in this area, we release a new conversation dataset generated
entirely by an LLM from an open-source medical corpus, along with our
implementation code.
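The coordination of a small always-on short-term memory with a similarity-retrieved long-term store can be sketched as below. This is an illustrative toy, not the paper's implementation: the class and function names are invented, the bag-of-words similarity stands in for real embeddings, and the LLM call is omitted entirely.

```python
# Toy sketch of short-/long-term memory coordination for personalization
# (illustrative only; names and similarity measure are assumptions, not the
# paper's method). Recent turns are always included; older turns are
# retrieved by similarity and prepended so a frozen LLM can condition on
# user-specific context without retraining.
from collections import Counter

def _bow(text):
    """Toy bag-of-words vector; a real system would use dense embeddings."""
    return Counter(text.lower().split())

def _cosine(a, b):
    overlap = sum(a[t] * b[t] for t in a if t in b)
    norm = (sum(v * v for v in a.values()) ** 0.5) * \
           (sum(v * v for v in b.values()) ** 0.5)
    return overlap / norm if norm else 0.0

class UserMemory:
    """short_size (>= 1) most recent turns form the short-term memory;
    everything older forms the long-term pool."""
    def __init__(self, short_size=2, capacity=100):
        self.short_size = short_size
        self.capacity = capacity
        self.entries = []                       # (text, vector) pairs, oldest first

    def write(self, text):
        self.entries.append((text, _bow(text)))
        self.entries = self.entries[-self.capacity:]   # drop oldest when full

    def retrieve(self, query, k=2):
        short = [t for t, _ in self.entries[-self.short_size:]]
        long_pool = self.entries[:-self.short_size]
        qv = _bow(query)
        ranked = sorted(long_pool, key=lambda e: _cosine(qv, e[1]), reverse=True)
        return [t for t, _ in ranked[:k]] + short

def build_prompt(memory, query, k=2):
    """Prepend retrieved memories to the user query for the (omitted) LLM."""
    lines = [f"[memory] {m}" for m in memory.retrieve(query, k)]
    return "\n".join(lines + [f"[user] {query}"])
```

The parameter-efficient fine-tuning part of the paper (adapting a small set of weights rather than the full model) is orthogonal to this retrieval step and is not shown here.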
RexUIE: A Recursive Method with Explicit Schema Instructor for Universal Information Extraction
Universal Information Extraction (UIE) is an area of interest due to the
challenges posed by varying targets, heterogeneous structures, and
demand-specific schemas. However, previous works have only achieved limited
success by unifying a few tasks, such as Named Entity Recognition (NER) and
Relation Extraction (RE), which fall short of being authentic UIE models
particularly when extracting other general schemas such as quadruples and
quintuples. Additionally, these models used an implicit structural schema
instructor, which could lead to incorrect links between types, hindering the
model's generalization and performance in low-resource scenarios. In this
paper, we redefine the authentic UIE with a formal formulation that encompasses
almost all extraction schemas. To the best of our knowledge, we are the first
to introduce UIE for any kind of schema. In addition, we propose RexUIE, which
is a Recursive Method with Explicit Schema Instructor for UIE. To avoid
interference between different types, we reset the position ids and attention
mask matrices. RexUIE shows strong performance under both full-shot and
few-shot settings and achieves state-of-the-art results on the tasks of
extracting complex schemas.
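The abstract's idea of resetting position IDs and attention masks so that multiple schema prompts do not interfere can be sketched as follows. This is a guess at the general technique, not RexUIE's code: it assumes one shared text segment followed by several schema-prompt segments, each allowed to attend to the text and to itself only.

```python
# Illustrative sketch (not RexUIE's implementation) of per-segment position
# resetting and attention masking. Segment 0 is the shared input text; each
# later segment is one schema prompt. Schema prompts restart their position
# counter right after the text and are masked from seeing one another.
def build_positions_and_mask(segment_lengths):
    """segment_lengths[0] = text length; the rest are schema-prompt lengths.
    Returns (position_ids, mask) where mask[i][j] == 1 iff token i may
    attend to token j."""
    positions, seg_ids = [], []
    text_len = segment_lengths[0]
    for seg, length in enumerate(segment_lengths):
        start = 0 if seg == 0 else text_len      # reset positions after the text
        positions.extend(range(start, start + length))
        seg_ids.extend([seg] * length)
    n = len(seg_ids)
    mask = [[1 if (seg_ids[j] == 0 or seg_ids[j] == seg_ids[i]) else 0
             for j in range(n)]
            for i in range(n)]
    return positions, mask
```

Because every schema prompt sees the same positions and only the shared text, adding or removing one prompt cannot change what the others attend to, which is the interference-avoidance property the abstract describes.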
PPN: Parallel Pointer-based Network for Key Information Extraction with Complex Layouts
Key Information Extraction (KIE) is a challenging multimodal task that aims
to extract structured value semantic entities from visually rich documents.
Although significant progress has been made, there are still two major
challenges that need to be addressed. Firstly, existing datasets have
relatively fixed layouts and a limited number of semantic entity categories,
creating a significant gap between these datasets and complex real-world
scenarios. Secondly, existing methods follow a two-stage pipeline strategy,
which may lead to error propagation. Moreover, they are difficult to apply in
situations where unseen semantic entity categories emerge. To address the
first challenge, we propose a new large-scale
human-annotated dataset named Complex Layout form for key information
EXtraction (CLEX), which consists of 5,860 images with 1,162 semantic entity
categories. To solve the second challenge, we introduce Parallel Pointer-based
Network (PPN), an end-to-end model that can be applied in zero-shot and
few-shot scenarios. PPN leverages the implicit clues between semantic entities
to assist extraction, and its parallel extraction mechanism allows it to
extract multiple results simultaneously and efficiently. Experiments on the
CLEX dataset demonstrate that PPN outperforms existing state-of-the-art methods
while also offering a much faster inference speed.
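The parallel pointer decoding idea can be illustrated with a toy span decoder. This is a generic sketch of pointer-based span extraction, not PPN's actual architecture: it assumes the model emits per-token start and end scores for each queried category, and decodes all categories in one pass rather than entity by entity.

```python
# Toy sketch of parallel pointer decoding (illustrative; not the PPN code).
# Categories are supplied as queries, so unseen categories at test time only
# require new score vectors, not a retrained classifier head.
def decode_spans(start_scores, end_scores, threshold=0.5):
    """start_scores / end_scores: {category: [per-token score, ...]}.
    Returns {category: [(start_idx, end_idx), ...]}."""
    spans = {}
    for cat in start_scores:
        starts = [i for i, s in enumerate(start_scores[cat]) if s >= threshold]
        ends = [i for i, s in enumerate(end_scores[cat]) if s >= threshold]
        found = []
        for s in starts:
            # pair each start with the nearest end at or after it
            e = next((e for e in ends if e >= s), None)
            if e is not None:
                found.append((s, e))
        spans[cat] = found
    return spans
```

All categories and all spans per category are decoded from the same score tensors, which is what makes the extraction parallel rather than a sequential two-stage pipeline.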