Dialogue State Tracking (DST) is critical for comprehensively interpreting
user and system utterances, thereby forming the cornerstone of efficient
dialogue systems. Past research has sought to enhance DST performance by
altering the model structure or integrating additional features such as graph
relations, but these approaches often require additional pre-training with
external dialogue corpora. In this study, we propose DSTEA,
improving Dialogue State Tracking via Entity Adaptive pre-training, which
enhances the encoder by intensively training on key entities in dialogue
utterances. DSTEA identifies these pivotal entities from input dialogues
using four different methods: ontology information, named-entity
recognition, the spaCy library, and the flair library. Subsequently, it employs
selective knowledge masking to train the model effectively. Remarkably, DSTEA
only requires pre-training without the direct infusion of extra knowledge into
the DST model. This approach yielded substantial performance improvements
for four robust DST models on MultiWOZ 2.0, 2.1, and 2.2, with joint goal
accuracy increasing by up to 2.69 percentage points (from 52.41% to 55.10%). Further
validation of DSTEA's efficacy was provided through comparative experiments
considering various entity types and different entity adaptive pre-training
configurations, such as masking strategy and masking rate.
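
The idea of selective knowledge masking can be illustrated with a minimal sketch. The abstract does not specify the exact procedure, so everything below is an assumption made for illustration: the helper names (`find_entity_positions`, `selective_mask`), the whitespace tokenization, the exact-match entity lookup, and the policy of spending the masking budget on entity tokens first and filling any remainder with random non-entity tokens. It is not the DSTEA implementation.

```python
import random

MASK = "[MASK]"  # placeholder mask token (assumed; real tokenizers define their own)


def find_entity_positions(tokens, entities):
    """Return sorted token indices covered by any entity phrase.

    Matching is case-insensitive and exact on whitespace-split words,
    which is a simplification chosen for this sketch.
    """
    lowered = [t.lower() for t in tokens]
    positions = set()
    for phrase in entities:
        words = phrase.lower().split()
        n = len(words)
        if n == 0:
            continue
        for i in range(len(lowered) - n + 1):
            if lowered[i:i + n] == words:
                positions.update(range(i, i + n))
    return sorted(positions)


def selective_mask(tokens, entities, mask_rate=0.15, seed=0):
    """Mask roughly `mask_rate` of the tokens, preferring entity tokens.

    Returns (masked_tokens, labels), where labels[i] holds the original
    token at masked positions and None elsewhere.
    """
    rng = random.Random(seed)
    budget = max(1, round(len(tokens) * mask_rate))

    entity_pos = find_entity_positions(tokens, entities)
    entity_set = set(entity_pos)
    other_pos = [i for i in range(len(tokens)) if i not in entity_set]

    # Entity tokens come first in the candidate order; non-entity tokens
    # are used only if the budget exceeds the number of entity tokens.
    rng.shuffle(entity_pos)
    rng.shuffle(other_pos)
    chosen = (entity_pos + other_pos)[:budget]

    masked = list(tokens)
    labels = [None] * len(tokens)
    for i in chosen:
        labels[i] = tokens[i]
        masked[i] = MASK
    return masked, labels


if __name__ == "__main__":
    utterance = "i need a cheap hotel in the centre".split()
    # Hypothetical entity list, e.g. slot values drawn from an ontology.
    masked, labels = selective_mask(utterance, ["cheap", "centre"], mask_rate=0.25)
    print(" ".join(masked))
```

In this sketch the entity list stands in for whichever source supplies the key entities (ontology values or the output of an entity recognizer), and `mask_rate` corresponds to the masking-rate configuration mentioned above.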