Verifying the integrity of voice recording evidence for criminal
investigations is an integral part of an audio forensic analyst's work. Here,
one focus is on detecting deletion or insertion operations, so-called audio
splicing. While this is a rather simple way to alter spoken statements,
careful editing can yield quite convincing results. For difficult cases or large
amounts of data, automated tools can support analysts in detecting potential
editing locations. To this end, several analytical and deep learning methods
have been proposed. Still, few address unconstrained splicing scenarios as
expected in practice. With SigPointer, we propose a pointer network framework
for continuous input that uncovers splice locations naturally and more
efficiently than existing approaches. Extensive experiments on forensically
challenging data like strongly compressed and noisy signals quantify the
benefit of the pointer mechanism, with performance increases of about 6 to
10 percentage points.

Comment: accepted at Interspeech 202