Recommendation systems are ubiquitous yet often difficult for users to
control and adjust when recommendation quality is poor. This has motivated the
development of conversational recommendation systems (CRSs), with control over
recommendations provided through natural language feedback. However, building
conversational recommendation systems requires conversational training data
involving user utterances paired with items that cover a diverse range of
preferences. Such data has proved challenging to collect scalably using
conventional methods like crowdsourcing. We address this challenge in the context of
item-set recommendation, noting the increasing attention to this task motivated
by use cases like music, news and recipe recommendation. We present a new
technique, TalkTheWalk, that synthesizes realistic high-quality conversational
data by leveraging domain expertise encoded in widely available curated item
collections, showing how these can be transformed into corresponding item set
curation conversations. Specifically, TalkTheWalk generates a sequence of
hypothetical yet plausible item sets returned by a system, then uses a language
model to produce corresponding user utterances. Applying TalkTheWalk to music
recommendation, we generate over one million diverse playlist curation
conversations. A human evaluation shows that the conversations contain
consistent utterances with relevant item sets, nearly matching the quality of
a small human-collected conversational dataset for this task. At the same time, when
the synthetic corpus is used to train a CRS, it improves Hits@100 by 10.5
points on a benchmark dataset over standard baselines and is preferred over the
top-performing baseline in an online evaluation.
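
For readers unfamiliar with the metric, the following is a minimal sketch of one common way to compute Hits@k. The abstract does not spell out the exact definition used in the evaluation, so the function names, the "at least one relevant item in the top k" criterion, and the averaging over conversation turns are all assumptions made for illustration.

```python
from typing import Hashable, Sequence, Set


def hits_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int = 100) -> float:
    """Return 1.0 if any relevant item appears among the top-k ranked items, else 0.0.

    Assumed definition: a "hit" means at least one relevant item is retrieved.
    """
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def mean_hits_at_k(
    rankings: Sequence[Sequence[Hashable]],
    relevants: Sequence[Set[Hashable]],
    k: int = 100,
) -> float:
    """Average Hits@k over a collection of queries (e.g. conversation turns)."""
    scores = [hits_at_k(r, rel, k) for r, rel in zip(rankings, relevants)]
    return sum(scores) / len(scores) if scores else 0.0
```

Under this reading, an improvement of 10.5 points would correspond to the mean score, expressed as a percentage, rising by 10.5.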