Text style is highly abstract, as it encompasses various aspects of a
speaker's characteristics, habits, logical thinking, and the content they
express. However, previous text-style transfer tasks have primarily focused on
data-driven approaches, lacking in-depth analysis and research from the
perspectives of linguistics and cognitive science. In this paper, we introduce
a novel task called Text Speech-Style Transfer (TSST). The main objective is to
further explore topics related to human cognition, such as personality and
emotion, based on the capabilities of existing large language models (LLMs).
Considering this objective and the distinctive characteristics of oral speech
in real-life scenarios, we train multi-dimensional evaluation models for TSST
(covering filler words, vividness, interactivity, and emotionality) and
validate their correlation with human assessments. We thoroughly analyze the
performance of several LLMs and identify areas where further improvement is
needed. Moreover, driven by our evaluation models, we have
released a new corpus that improves the capabilities of LLMs in generating text
with speech-style characteristics. In summary, we present the TSST task, a new
benchmark for style transfer that emphasizes human-oriented evaluation, and use
it to explore and advance the performance of current LLMs.
Comment: Work in progress