Query-document relevance prediction is a critical problem in Information
Retrieval systems. This problem has increasingly been tackled using
(pretrained) transformer-based models which are finetuned using large
collections of labeled data. However, in specialized domains such as e-commerce
and healthcare, the viability of this approach is limited by the dearth of
large in-domain datasets. To address this paucity, recent methods leverage these
powerful models to generate high-quality task and domain-specific synthetic
data. Prior work has largely explored synthetic data generation or query
generation (QGen) for Question-Answering (QA) and binary (yes/no) relevance
prediction, where, for instance, the QGen models are given a document and
trained to generate a query relevant to that document. However, in many
problems, we have a more fine-grained notion of relevance than a simple yes/no
label. Thus, in this work, we conduct a detailed study into how QGen approaches
can be leveraged for nuanced relevance prediction. We demonstrate that --
contrary to claims from prior works -- current QGen approaches fall short of
the more conventional cross-domain transfer-learning approaches. Via empirical
studies spanning three public e-commerce benchmarks, we identify new shortcomings
of existing QGen approaches -- including their inability to distinguish between
different grades of relevance. To address this, we introduce label-conditioned
QGen models, which incorporate knowledge about the different relevance grades. While
our experiments demonstrate that these modifications help improve the performance
of QGen techniques, we also find that QGen approaches struggle to capture the
full nuance of the relevance label space and, as a result, the generated queries
are not faithful to the desired relevance label.

Comment: In Proceedings of ACM SIGIR Workshop on eCommerce (SIGIR eCom 23)