4 research outputs found
ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation
With large language models (LLMs) achieving remarkable breakthroughs in
natural language processing (NLP) domains, LLM-enhanced recommender systems
have received much attention and are being actively explored. In this
paper, we focus on adapting and empowering a pure large language model for
zero-shot and few-shot recommendation tasks. First and foremost, we identify
and formulate the lifelong sequential behavior incomprehension problem for LLMs
in recommendation domains, i.e., LLMs fail to extract useful information from a
textual context of long user behavior sequence, even if the length of context
is far from reaching the context limitation of LLMs. To address such an issue
and improve the recommendation performance of LLMs, we propose a novel
framework, namely Retrieval-enhanced Large Language models (ReLLa) for
recommendation tasks in both zero-shot and few-shot settings. For zero-shot
recommendation, we perform semantic user behavior retrieval (SUBR) to improve
the data quality of testing samples, which greatly reduces the difficulty for
LLMs to extract the essential knowledge from user behavior sequences. As for
few-shot recommendation, we further design retrieval-enhanced instruction
tuning (ReiT) by adopting SUBR as a data augmentation technique for training
samples. Specifically, we develop a mixed training dataset consisting of both
the original data samples and their retrieval-enhanced counterparts. We conduct
extensive experiments on a real-world public dataset (i.e., MovieLens-1M) to
demonstrate the superiority of ReLLa compared with existing baseline models, as
well as its capability for lifelong sequential behavior comprehension.
Comment: Under Revie
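The retrieval idea behind SUBR can be illustrated with a minimal sketch: instead of keeping the most recent K behaviors, keep the K behaviors most semantically similar to the target item. The abstract does not specify the embedding model, the similarity measure, or the ordering of retrieved items, so cosine similarity, the toy vectors, and the chronological re-ordering below are all assumptions, and `retrieve_behaviors` is a hypothetical name rather than part of the ReLLa code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve_behaviors(history, target_vec, k):
    """Pick the k behaviors most similar to the target item.

    history: list of (item_name, embedding) pairs in chronological order.
    Assumption: the selected items are returned in their original
    chronological order; the abstract does not state this.
    """
    ranked = sorted(
        range(len(history)),
        key=lambda i: cosine(history[i][1], target_vec),
        reverse=True,
    )
    kept = sorted(ranked[:k])  # restore chronological order
    return [history[i][0] for i in kept]


# Toy 2-d embeddings, purely illustrative.
history = [
    ("A", [1.0, 0.0]),
    ("B", [0.0, 1.0]),
    ("C", [0.9, 0.1]),
    ("D", [0.0, 1.0]),
]
target = [1.0, 0.0]

recent_k = [name for name, _ in history[-2:]]
semantic_k = retrieve_behaviors(history, target, k=2)
```

With this toy history, the most recent two behaviors are C and D, while the similarity-based selection keeps A and C, the two items closest to the target.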
CodeApex: A Bilingual Programming Evaluation Benchmark for Large Language Models
With the emergence of Large Language Models (LLMs), there has been a
significant improvement in the programming capabilities of models, attracting
growing attention from researchers. We propose CodeApex, a bilingual benchmark
dataset focusing on the programming comprehension and code generation abilities
of LLMs. CodeApex comprises three types of multiple-choice questions:
conceptual understanding, commonsense reasoning, and multi-hop reasoning,
designed to evaluate LLMs on programming comprehension tasks. Additionally,
CodeApex utilizes algorithmic questions and corresponding test cases to assess
the code quality generated by LLMs. We evaluate 14 state-of-the-art LLMs,
including both general-purpose and specialized models. GPT exhibits the best
programming capabilities, achieving approximate accuracies of 50% and 56% on
the two tasks, respectively. There is still significant room for improvement in
programming tasks. We hope that CodeApex can serve as a reference for
evaluating the coding capabilities of LLMs, further promoting their development
and growth. Datasets are released at https://github.com/APEXLAB/CodeApex.git.
The CodeApex submission website is https://apex.sjtu.edu.cn/codeapex/.
Comment: 21 page
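Scoring generated code against test cases can be sketched as follows. The abstract does not define how CodeApex computes accuracy, so this sketch assumes a problem counts as solved only when the candidate passes every test case; `passes_all` and `accuracy` are hypothetical helpers, and the candidates are plain Python callables rather than model output.

```python
def passes_all(candidate, test_cases):
    """Return True if candidate(*args) equals expected for every (args, expected) pair."""
    for args, expected in test_cases:
        try:
            if candidate(*args) != expected:
                return False
        except Exception:
            # A crashing candidate counts as a failed test case.
            return False
    return True


def accuracy(candidates, suites):
    """Fraction of problems whose candidate passes its whole test suite."""
    solved = sum(passes_all(candidates[name], suites[name]) for name in suites)
    return solved / len(suites)


# Two toy problems: one correct candidate, one buggy candidate.
suites = {
    "add": [((1, 2), 3), ((0, 0), 0)],
    "square": [((3,), 9), ((-2,), 4)],
}
candidates = {
    "add": lambda a, b: a + b,
    "square": lambda x: x * 2,  # deliberately wrong
}

score = accuracy(candidates, suites)
```

Here one of the two candidates passes its suite, so `score` is 0.5. A real harness would also need sandboxing and time limits, which this sketch omits.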
Learning Enhanced Representations for Tabular Data via Neighborhood Propagation
Prediction over tabular data is an essential and fundamental problem in many
important downstream tasks. However, existing methods either take a data
instance of the table independently as input or do not fully utilize the
multi-rows features and labels to directly change and enhance the target data
representations. In this paper, we propose to 1) construct a hypergraph from
relevant data instance retrieval to model the cross-row and cross-column
patterns of those instances, and 2) perform message Propagation to Enhance the
target data instance representation for Tabular prediction tasks. Specifically,
our specially-designed message propagation step benefits from 1) fusion of
label and features during propagation, and 2) locality-aware high-order feature
interactions. Experiments on two important tabular data prediction tasks
validate the superiority of the proposed PET model against other baselines.
Additionally, we demonstrate the effectiveness of the model components and the
feature enhancement ability of PET via various ablation studies and
visualizations. The code is available at https://github.com/KounianhuaDu/PET
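The retrieve-then-propagate idea can be illustrated with a much simpler stand-in: retrieve the nearest labeled rows and append their averaged features and labels to the target row. This is a one-step neighborhood aggregation, not the hypergraph message propagation of PET; the Euclidean distance, the mean aggregation, and the names `nearest_rows` and `enhance` are assumptions made for illustration only.

```python
import math


def nearest_rows(target, pool, k):
    """Return the k (features, label) rows from pool closest to target (Euclidean)."""
    return sorted(pool, key=lambda row: math.dist(target, row[0]))[:k]


def enhance(target, pool, k):
    """Append mean neighbor features and mean neighbor label to the target features.

    Assumes k >= 1 and a non-empty pool of labeled rows.
    """
    neighbors = nearest_rows(target, pool, k)
    dim = len(target)
    mean_feats = [
        sum(feats[j] for feats, _ in neighbors) / len(neighbors) for j in range(dim)
    ]
    mean_label = sum(label for _, label in neighbors) / len(neighbors)
    return list(target) + mean_feats + [mean_label]


# Toy labeled pool: (features, label).
pool = [
    ([0.0, 0.0], 0),
    ([1.0, 1.0], 1),
    ([1.0, 0.0], 1),
    ([5.0, 5.0], 0),
]
enhanced = enhance([1.0, 0.5], pool, k=2)
```

For the target row `[1.0, 0.5]`, the two nearest rows are `[1.0, 1.0]` and `[1.0, 0.0]`, both labeled 1, so `enhanced` is `[1.0, 0.5, 1.0, 0.5, 1.0]`: the original features, the neighbor feature mean, and the neighbor label mean.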