Recently, resources and tasks were proposed to go beyond state tracking in
dialogue systems. An example is the frame tracking task, which requires
recording multiple frames, one for each user goal set during the dialogue. This
allows a user, for instance, to compare items corresponding to different goals.
This paper proposes a model which takes as input the list of frames created so
far during the dialogue, the current user utterance as well as the dialogue
acts, slot types, and slot values associated with this utterance. The model
then outputs the frame being referenced by each triple of dialogue act, slot
type, and slot value. We show that on the recently published Frames dataset,
this model significantly outperforms a previously proposed rule-based baseline.
In addition, we propose an extensive analysis of the frame tracking task by
dividing it into sub-tasks and assessing their difficulty with respect to our
model