421 research outputs found

AMR Compressed-Domain Analysis for Multimedia Forensics Double Compression Detection

Author: José Fabrizio Pereira Sampaio
Publication venue: ANP Editora
Publication date: 01/06/2019

An audio recording must be authentic to be admitted as evidence in a criminal prosecution, so that the speech is preserved with maximum fidelity and errors of interpretation are prevented. The AMR (adaptive multi-rate) codec is a worldwide standard for speech compression and for transmission over GSM mobile networks, including 3G and 4G. It is also the basis of an audio file format, with the extension AMR, that uses the same compression algorithm. Because of its extensive use in mobile networks and its wide availability on modern smartphones, the AMR format appears in audio authenticity cases that involve searching for forgeries. Such examinations belong to the field of multimedia forensics, which includes, among other techniques, double compression detection, i.e., determining whether a given AMR file was decompressed and compressed again. AMR double compression detection is a complex engineering problem that remains open. In general terms, if an AMR file is double compressed, it is not an original and was likely doctored. Published work on double compression detection extracts features from the decoded waveform of AMR files. This paper proposes a new approach to AMR double compression detection which, instead of processing decoded audio, uses the encoded version to extract compressed-domain features based on linear prediction (LP) coefficients. Statistical analysis of these features shows that they can detect AMR double compression effectively, which makes them a promising path toward solving the AMR double compression problem with artificial neural networks.
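As background on the LP coefficients mentioned in this abstract, the following is a minimal sketch of how such coefficients are conventionally estimated from a frame of samples with the Levinson-Durbin recursion. It is not the paper's method: the abstract states that the features are taken from the encoded AMR data rather than recomputed from decoded audio. The function names, the sign convention A(z) = 1 + a1 z^-1 + ... + ap z^-p, and the toy autocorrelation values are assumptions made for illustration.

```python
def autocorrelation(frame, order):
    """Return autocorrelation lags r[0..order] of one frame of samples."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(order + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LP normal equations from autocorrelation lags r.

    Returns (a, err), where a = [1, a1, ..., ap] are the coefficients of the
    prediction-error filter A(z) and err is the final prediction error energy.
    """
    if r[0] <= 0:
        raise ValueError("r[0] must be positive (frame has no energy)")
    a = [1.0] + [0.0] * order
    err = float(r[0])
    for i in range(1, order + 1):
        # Correlation between the current predictor's error and lag i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return a, err


# Toy lags of a first-order autoregressive process (illustrative values only).
coeffs, residual = levinson_durbin([1.0, 0.5, 0.25], order=2)
print(coeffs, residual)
```

In a compressed-domain setting the coefficients would be read from the codec's own parameters instead of being recomputed like this, so the sketch only shows what the quantity represents.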

Directory of Open Access Journals

Towards Unconstrained Audio Splicing Detection and Localization with Neural Networks

Author: Hirsch Germans
Moussa Denise
Riess Christian
Publication date: 31/03/2023

Freely available and easy-to-use audio editing tools make it straightforward to perform audio splicing. Convincing forgeries can be created by combining various speech samples from the same person. Detection of such splices is important both in the public sector, when considering misinformation, and in a legal context, to verify the integrity of evidence. Unfortunately, most existing detection algorithms for audio splicing use handcrafted features and make specific assumptions. However, criminal investigators are often faced with audio samples from unconstrained sources with unknown characteristics, which raises the need for more generally applicable methods. With this work, we aim to take a first step towards unconstrained audio splicing detection to address this need. We simulate various attack scenarios in the form of post-processing operations that may disguise splicing. We propose a Transformer sequence-to-sequence (seq2seq) network for splicing detection and localization. Our extensive evaluation shows that the proposed method outperforms existing dedicated approaches for splicing detection [3, 10] as well as the general-purpose networks EfficientNet [28] and RegNet [25]. Comment: Accepted at MMFORWILD 2022, ICPR Workshops. Code: https://faui1-gitlab.cs.fau.de/denise.moussa/audio-splicing-localizatio

arXiv.org e-Print Archive

Point to the Hidden: Exposing Speech Audio Splicing via Signal Pointer Nets

Author: Hirsch Germans
Moussa Denise
Riess Christian
Wankerl Sebastian
Publication date: 02/08/2023

Verifying the integrity of voice recording evidence for criminal investigations is an integral part of an audio forensic analyst's work. Here, one focus is on detecting deletion or insertion operations, so-called audio splicing. While this is a rather easy way to alter spoken statements, careful editing can yield quite convincing results. For difficult cases or large amounts of data, automated tools can help detect potential editing locations. To this end, several analytical and deep learning methods have been proposed. Still, few address unconstrained splicing scenarios as expected in practice. With SigPointer, we propose a pointer network framework for continuous input that uncovers splice locations naturally and more efficiently than existing works. Extensive experiments on forensically challenging data, such as strongly compressed and noisy signals, quantify the benefit of the pointer mechanism, with performance increases of about 6 to 10 percentage points. Comment: accepted at Interspeech 202

arXiv.org e-Print Archive

Phone based heart and lung functions monitor

Author: Silva João Filipe Trindade da
Publication date: 01/01/2011

Integrated Master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia. Universidade do Porto. 201

Repositório Aberto da Universidade do Porto

VoipLoc : VoIP call provenance using acoustic side-channels

Author: Nagaraja Shishir
Shah Ryan
Publication date: 31/07/2019

We develop a novel technique to determine call provenance in anonymous VoIP communications using acoustic side-channels. The technique exploits location-attributable information embedded within audio speech data. The victim’s speech is exploited as an excitation signal, which is modulated (acted upon) by the acoustic reflection characteristics of the victim’s location. We show that leading VoIP communication channels faithfully transfer this information between sender-receiver pairs, enabling passive receivers to extract a location fingerprint and establish call provenance. To establish provenance, a fingerprint is compared against a database of labelled fingerprints to identify a match. The technique is fully passive, does not depend on any characteristic background sounds, is speaker independent, and is robust to lossy network conditions. Evaluation using a corpus of recordings of VoIP conversations over the Tor network confirms that recording locations can be fingerprinted and detected remotely with a low false-positive rate.
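The matching step described in this abstract, comparing a fingerprint against a database of labelled fingerprints, can be sketched as a nearest-neighbour lookup. The abstract does not say how fingerprints are represented or compared, so the fixed-length vectors, the cosine similarity measure, the 0.9 threshold, and the names `cosine_similarity` and `match_fingerprint` are all assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_fingerprint(query, database, threshold=0.9):
    """Return (label, score) of the best match, or (None, score) if below threshold.

    `database` maps a location label to a reference fingerprint vector.
    The threshold is a hypothetical tuning parameter.
    """
    best_label, best_score = None, -1.0
    for label, reference in database.items():
        score = cosine_similarity(query, reference)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score


# Made-up three-dimensional fingerprints, for illustration only.
database = {
    "office": [0.9, 0.1, 0.3],
    "stairwell": [0.2, 0.8, 0.5],
}
print(match_fingerprint([0.85, 0.15, 0.35], database))
print(match_fingerprint([0.0, 0.0, 1.0], database))
```

Rejecting matches below a threshold is one simple way to keep the false-positive rate down when the query location is not in the database at all.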

University of Strathclyde Institutional Repository

Detecting autism, emotions and social signals using AdaBoost

Author: Busa-Fekete Róbert
Gosztolya Gábor
Tóth László
Publication venue: Interspeech
Publication date: 01/01/2013

SZTE Publicatio Repozitórium - SZTE - Repository of Publications

Roadmap for Music Information ReSearch

Author: Benetos E.
Chudy M.
Dixon S.
Flexer A.
Gomez E.
Gouyon F.
Herrera P.
Jorda S.
Magas M.
Paytuvi O.
Peeters G.
Schlüter J.
Serra X.
Vinet H.
Widmer G.
Publication venue: MIRES Consortium
Publication date: 01/01/2013

City Research Online

UPF Digital Repository

Secure covert communications over streaming media using dynamic steganography

Author: Peng Jinghui

Streaming technologies such as VoIP are widely embedded into commercial and industrial applications, so it is imperative to address data security issues before the problems become serious. This thesis describes a theoretical and experimental investigation of secure covert communications over streaming media using dynamic steganography. A covert VoIP communications system was developed in C++ to support the implementation of the work. A new information-theoretical model of secure covert communications over streaming media was constructed to depict the security scenarios in streaming media-based steganographic systems under passive attacks. The model involves a stochastic process that models an information source for covert VoIP communications and the theory of hypothesis testing, which is used to analyse the adversary's detection performance. The potential of hardware-based true random key generation and chaotic interval selection for innovative applications in covert VoIP communications was explored. A scheme that uses the CPU's read time stamp counter as an entropy source was designed to generate true random numbers as secret keys for streaming media steganography. A novel interval selection algorithm was devised to choose data embedding locations in VoIP streams at random, using random sequences generated from a chaotic process. A steganographic algorithm based on dynamic key updating and transmission, which integrates a one-way cryptographic accumulator into dynamic key exchange for covert VoIP communications, was devised to provide secure key exchange for covert communications over streaming media. Analysis based on the discrete logarithm problem, together with steganalysis using the t-test, indicated that the algorithm has the advantage of being the most solid method of key distribution over a public channel.
The effectiveness of the new steganographic algorithm for covert communications over streaming media was examined by means of security analysis, steganalysis using nonparametric Mann-Whitney-Wilcoxon statistical testing, and performance and robustness measurements. The algorithm achieved an average data embedding rate of 800 bps, comparable to other related algorithms. The results indicated that the algorithm has little or no impact on real-time VoIP communications in terms of speech quality (< 5% change in PESQ with hidden data), signal distortion (6% change in SNR after steganography) and imperceptibility, and that it is more secure and effective in addressing the security problems than other related algorithms.
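The idea of choosing embedding locations from a chaotic sequence can be illustrated with a short sketch. The abstract does not name the chaotic process, so the logistic map used here is a swapped-in stand-in, and the parameter r = 3.99, the seed acting as a shared secret, and the names `chaotic_gaps` and `embedding_positions` are assumptions made for illustration, not the thesis's algorithm.

```python
def chaotic_gaps(seed, count, max_gap, r=3.99):
    """Generate `count` gaps in [1, max_gap] from logistic-map iterates.

    `seed` must lie strictly between 0 and 1 and plays the role of a shared
    secret: the same seed reproduces the same gap sequence.
    """
    if not 0.0 < seed < 1.0:
        raise ValueError("seed must be strictly between 0 and 1")
    if max_gap < 1:
        raise ValueError("max_gap must be at least 1")
    x = seed
    gaps = []
    for _ in range(count):
        x = r * x * (1.0 - x)  # logistic map iterate, stays inside (0, 1)
        gaps.append(1 + min(int(x * max_gap), max_gap - 1))
    return gaps


def embedding_positions(seed, count, max_gap):
    """Turn the gap sequence into strictly increasing packet (or sample) indices."""
    positions = []
    position = -1
    for gap in chaotic_gaps(seed, count, max_gap):
        position += gap
        positions.append(position)
    return positions


# Both ends derive the same positions from the same hypothetical secret seed.
sender = embedding_positions(0.3141, count=10, max_gap=8)
receiver = embedding_positions(0.3141, count=10, max_gap=8)
print(sender == receiver, sender)
```

A practical system would also need both ends to compute identical floating-point results, or to use an integer formulation of the map, which this sketch does not address.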

UWL Repository