Conversational recommendation systems (CRS) effectively address information
asymmetry by dynamically eliciting user preferences through multi-turn
interactions. Existing CRS methods widely assume that users have clear
preferences. Under this assumption, the agent fully trusts user feedback and
treats acceptance or rejection signals as strong indicators for filtering items
and reducing the candidate space, which may lead to over-filtering.
However, in reality, users' preferences are often vague and volatile, with
uncertainty about their desires and changing decisions during interactions.
To address this issue, we introduce a novel scenario called Vague Preference
Multi-round Conversational Recommendation (VPMCR), which considers users' vague
and volatile preferences in CRS. VPMCR employs a soft estimation mechanism to
assign a non-zero confidence score to every candidate item to be displayed,
naturally avoiding the over-filtering problem. In the VPMCR setting, we
introduce a solution called Adaptive Vague Preference Policy Learning (AVPPL),
which consists of two main components: Uncertainty-aware Soft Estimation (USE)
and Uncertainty-aware Policy Learning (UPL). USE estimates the uncertainty of
users' vague feedback and captures their dynamic preferences using a
choice-based preference extraction module and a time-aware decaying strategy.
UPL leverages the preference distribution estimated by USE to guide the
conversation and adapt to changes in users' preferences to make recommendations
or ask for attributes.
Our extensive experiments demonstrate the effectiveness of our method in the
VPMCR scenario, highlighting its potential for practical applications and
improving the overall performance and applicability of CRS in real-world
settings, particularly for users with vague or dynamic preferences.
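
To make the idea of soft estimation with time-aware decay concrete, the
following is a minimal sketch and not the USE module itself. It assumes that
each turn yields a set of chosen and a set of non-chosen attributes, that older
turns are down-weighted by an exponential factor, and that a softmax turns the
raw scores into confidences. The function name, the `decay` parameter, and the
scoring rule are all hypothetical choices made for illustration.

```python
import math


def soft_item_confidence(item_attrs, feedback, decay=0.8):
    """Return a non-zero confidence for every candidate item.

    item_attrs: dict mapping item id -> set of attribute names.
    feedback:   list of (chosen, not_chosen) attribute sets, oldest turn first.
    decay:      assumed exponential down-weighting of older turns, in (0, 1].
    """
    if not item_attrs:
        return {}

    last_turn = len(feedback) - 1
    raw = {}
    for item, attrs in item_attrs.items():
        score = 0.0
        for turn, (chosen, not_chosen) in enumerate(feedback):
            # Recent turns count more: weight 1 for the latest turn,
            # decay for the one before it, decay**2 before that, etc.
            weight = decay ** (last_turn - turn)
            score += weight * (len(attrs & chosen) - len(attrs & not_chosen))
        raw[item] = score

    # Softmax keeps every item in play with a strictly positive score,
    # instead of hard-filtering items that matched a rejected attribute.
    peak = max(raw.values())
    exps = {item: math.exp(s - peak) for item, s in raw.items()}
    total = sum(exps.values())
    return {item: e / total for item, e in exps.items()}


# Hypothetical catalogue and a two-turn conversation.
items = {"a": {"rock", "live"}, "b": {"jazz"}, "c": {"pop"}}
turns = [({"jazz"}, {"rock"}), ({"rock"}, {"pop"})]
confidence = soft_item_confidence(items, turns, decay=0.8)
```

In this toy conversation the user rejects "rock" in the first turn and chooses
it in the second. Item "a" is therefore not discarded; it keeps a positive
confidence, and a smaller `decay` lets the more recent choice dominate.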