MIRACLE: Towards Personalized Dialogue Generation with Latent-Space Multiple Personal Attribute Control
Personalized dialogue systems aim to endow the chatbot agent with more
anthropomorphic traits for human-like interactions. Previous approaches have
explored explicit user profile modeling using text descriptions, implicit
derivation of user embeddings, or handcrafted prompts for ChatGPT-like
models. However, textual personas are limited in describing multi-faceted
attributes (e.g., language style, inner character nuances),
implicit embeddings suffer from personality sparsity, and handcrafted prompts
lack fine-grained and stable controllability. Hence, these approaches may
struggle with complex personalized dialogue generation tasks that require
generating controllable responses with multiple personal attributes. To this
end, we propose MIRACLE, a novel personalized dialogue generation method
through MultIple PeRsonal Attributes Control within Latent-Space
Energy-based Models. Specifically, our approach
first disentangles complex personality into multi-faceted attributes.
Subsequently, we employ a conditional variational auto-encoder to align with
the dense personalized responses within a latent joint attribute space. We have
also tailored a dedicated energy function and customized the ordinary
differential equations sampling method to offer flexible attribute composition
and precise attribute control. Extensive experiments demonstrate that
MIRACLE outperforms several strong baselines in terms of personality
controllability and response generation quality. Our dataset and code are
available at https://github.com/LZY-the-boys/MIRACLE
Comment: Accepted by EMNLP 2023 Findings
Distantly-Supervised Named Entity Recognition with Adaptive Teacher Learning and Fine-grained Student Ensemble
Distantly-Supervised Named Entity Recognition (DS-NER) effectively alleviates
the data scarcity problem in NER by automatically generating training samples.
Unfortunately, the distant supervision may induce noisy labels, thus
undermining the robustness of the learned models and restricting the practical
application. To relieve this problem, recent works adopt self-training
teacher-student frameworks to gradually refine the training labels and improve
the generalization ability of NER models. However, we argue that the
performance of the current self-training frameworks for DS-NER is severely
underestimated by their plain designs, including both inadequate student
learning and coarse-grained teacher updating. Therefore, in this paper, we make
the first attempt to alleviate these issues by proposing: (1) adaptive teacher
learning, which jointly trains two teacher-student networks and considers
both consistent and inconsistent predictions between the two teachers,
thus promoting comprehensive student learning; and (2) a fine-grained student
ensemble that updates each fragment of the teacher model with a temporal moving
average of the corresponding fragment of the student, which enhances consistent
predictions on each model fragment against noise. To verify the effectiveness
of our proposed method, we conduct experiments on four DS-NER datasets. The
experimental results demonstrate that our method significantly surpasses
previous SOTA methods.
Comment: Accepted at AAAI 202
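The fine-grained student ensemble described in this abstract (each teacher fragment tracks a temporal moving average of the matching student fragment) can be illustrated with a small sketch. This is not the authors' implementation: the fragment names, the flat-list weight representation, the per-fragment decay values, and the helper names `ema_update_fragment` and `fine_grained_ema` are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical representation: fragment name -> flat list of weights.
Params = Dict[str, List[float]]


def ema_update_fragment(teacher: List[float], student: List[float],
                        decay: float) -> List[float]:
    """Exponential moving average of one fragment: decay*teacher + (1-decay)*student."""
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


def fine_grained_ema(teacher: Params, student: Params,
                     decays: Dict[str, float]) -> Params:
    """Update every teacher fragment from the corresponding student fragment.

    Using a separate decay per fragment is an assumption of this sketch,
    not something stated in the abstract.
    """
    return {
        name: ema_update_fragment(teacher[name], student[name], decays[name])
        for name in teacher
    }


# Toy example with two made-up fragments.
teacher = {"encoder": [1.0, 1.0], "classifier": [0.0]}
student = {"encoder": [0.0, 2.0], "classifier": [1.0]}
decays = {"encoder": 0.5, "classifier": 0.75}

updated = fine_grained_ema(teacher, student, decays)
print(updated)  # {'encoder': [0.5, 1.5], 'classifier': [0.25]}
```

Updating fragment by fragment, rather than applying one average to the whole model, is what lets each part of the teacher smooth out student noise at its own rate.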
A Survey on Arabic Named Entity Recognition: Past, Recent Advances, and Future Trends
As more and more Arabic texts emerge on the Internet, extracting important
information from these texts becomes especially useful. As a fundamental
technology, named entity recognition (NER) serves as the core component in
information extraction technology, while also playing a critical role in many
other Natural Language Processing (NLP) systems, such as question answering and
knowledge graph building. In this paper, we provide a comprehensive review of
the development of Arabic NER, especially the recent advances in deep learning
and pre-trained language models. Specifically, we first introduce the background
of Arabic NER, including the characteristics of Arabic and existing resources
for Arabic NER. Then, we systematically review the development of Arabic NER
methods. Traditional Arabic NER systems focus on feature engineering and
designing domain-specific rules. In recent years, deep learning methods have
achieved significant progress by representing texts via continuous vector
representations. With the growth of pre-trained language models, Arabic NER
has achieved better performance. Finally, we summarize the methodological gap
between Arabic NER and NER methods for other languages, which helps outline
future directions for Arabic NER.
Comment: Accepted by IEEE TKD
Efficient Document-level Event Extraction via Pseudo-Trigger-aware Pruned Complete Graph
Most previous studies of document-level event extraction focus on
building argument chains in an autoregressive way, which achieves some
success but is inefficient in both training and inference. In contrast to
these studies, we propose a fast and lightweight model named PTPCG. In
our model, we design a novel strategy for event argument combination together
with a non-autoregressive decoding algorithm via pruned complete graphs, which
are constructed under the guidance of the automatically selected pseudo
triggers. Compared to previous systems, our system achieves competitive
results with 19.8% of the parameters and much lower resource consumption, taking
only 3.8% of the GPU hours for training and running up to 8.5 times faster at
inference. Besides, our model shows superior compatibility with datasets with (or
without) triggers, and the pseudo triggers can supplement annotated
triggers to make further improvements. Code is available at
https://github.com/Spico197/DocEE .
Comment: Accepted to IJCAI'202
TREA: Tree-Structure Reasoning Schema for Conversational Recommendation
Conversational recommender systems (CRS) aim to trace the dynamic
interests of users through dialogues in a timely manner and generate relevant
responses for item recommendations. Recently, various external knowledge bases
(especially knowledge graphs) have been incorporated into CRS to enhance the
understanding of conversation contexts. However, recent reasoning-based models
rely heavily on simplified structures such as linear structures or
fixed-hierarchical structures for causality reasoning, and hence cannot fully
capture sophisticated relationships among utterances with external knowledge. To
address this, we propose a novel Tree structure Reasoning schEmA named TREA.
TREA constructs a multi-hierarchical scalable tree as the reasoning structure
to clarify the causal relationships between mentioned entities, and fully
utilizes historical conversations to generate more reasonable and suitable
responses for recommended results. Extensive experiments on two public CRS
datasets have demonstrated the effectiveness of our approach.
Comment: Accepted by ACL 2023 main conference