AMR Compressed-Domain Analysis for Multimedia Forensics Double Compression Detection
An audio recording must be authentic to be admitted as evidence in a criminal prosecution, so that the speech is preserved with maximum fidelity and interpretation mistakes are prevented. The AMR (adaptive multi-rate) encoder is a worldwide standard for speech compression and for GSM mobile network transmission, including 3G and 4G. In addition, AMR is an audio file format standard, with the extension AMR, that uses the same compression algorithm. Due to its extensive use in mobile networks and its wide availability on modern smartphones, the AMR format is frequently encountered in audio authenticity cases in which forgeries are sought. Such examinations belong to the multimedia forensics field, which includes, among other techniques, double compression detection, i.e., determining whether a given AMR file was decompressed and compressed again. AMR double compression detection is a complex engineering problem whose solution is still in progress. In general terms, if an AMR file is double compressed, it is not an original and was likely doctored. The works published in the literature on double compression detection extract features from the decoded waveform of AMR files. In this paper, a new approach to AMR double compression detection is proposed which, instead of processing the decoded audio, uses the encoded version to extract compressed-domain linear prediction (LP) coefficient-based features. Statistical analysis of these features shows that they can be used to detect AMR double compression effectively, so they can be considered a promising path toward solving the AMR double compression problem with artificial neural networks.
Speech assessment and characterization for law enforcement applications
Speech signals acquired, transmitted or stored in non-ideal conditions are often degraded by
one or more effects including, for example, additive noise. These degradations alter the signal
properties in a manner that deteriorates the intelligibility or quality of the speech signal. In
the law enforcement context such degradations are commonplace due to the limitations in
the audio collection methodology, which is often required to be covert. In severe degradation
conditions, the acquired signal may become unintelligible, losing its value in an investigation,
and, in less severe conditions, a loss in signal quality may be encountered, which can lead to
higher transcription time and cost.
This thesis proposes a non-intrusive speech assessment framework from which algorithms for
speech quality and intelligibility assessment are derived, to guide the collection and transcription
of law enforcement audio. These methods are trained on a large database labelled using
intrusive techniques (whose performance is verified with subjective scores) and shown to perform
favorably when compared with existing non-intrusive techniques. Additionally, a non-intrusive
CODEC identification and verification algorithm is developed which can identify a CODEC with
an accuracy of 96.8 % and detect the presence of a CODEC with an accuracy higher than 97 %
in the presence of additive noise.
Finally, the speech description taxonomy framework is developed, with the aim of characterizing
various aspects of a degraded speech signal, including the mechanism that results in a signal
with particular characteristics, the vocabulary that can be used to describe those degradations
and the measurable signal properties that can characterize the degradations. The taxonomy is
implemented as a relational database that facilitates the modeling of the relationships between
various attributes of a signal and promises to be a useful tool for training and guiding audio
analysts.
Statistical pattern recognition for audio-forensics : empirical investigations on the application scenarios audio steganalysis and microphone forensics
Magdeburg, Univ., Faculty of Computer Science, Diss., 2013, by Christian Krätze