Despite the great impact of lies on human societies and a meager 54% human
accuracy for Deception Detection (DD), Machine Learning systems that perform
automated DD are still not viable for deployment in real-life settings
due to data scarcity.
due to data scarcity. Few publicly available DD datasets exist and the creation
of new datasets is hindered by the conceptual distinction between low-stakes
and high-stakes lies. Theoretically, the two kinds of lies are so distinct that
a dataset of one kind could not be used for applications for the other kind.
Although it is easier to acquire data on low-stakes deception, since it can
be simulated (faked) in controlled settings, such lies lack the significance
and depth of genuine high-stakes lies, which are much harder to obtain yet
are of greater practical interest for automated DD systems. To investigate
whether this distinction holds in practice, we design several experiments
comparing a high-stakes DD dataset and a low-stakes DD dataset, evaluating
both with a Deep Learning classifier that works exclusively from video data.
In our experiments, a network trained on
low-stakes lies achieved better accuracy classifying high-stakes deception than
low-stakes deception, although using low-stakes lies as an augmentation strategy for the
high-stakes dataset decreased its accuracy.