466 research outputs found

    Detection and localization of double compression in MP3 audio tracks

    This work exploits the traces that double compression leaves in the statistics of quantized modified discrete cosine transform (MDCT) coefficients to derive a single measure that makes it possible to decide whether an MP3 file is singly or doubly compressed and, in the latter case, to estimate the bit rate of the first compression. Moreover, the proposed method and two state-of-the-art methods have been applied to short temporal windows of the track, allowing possibly tampered portions of the MP3 file under analysis to be localized. Experiments confirm the good performance of the proposed scheme and demonstrate that current detection methods are useful for tampering localization, thus offering a new tool for the forensic analysis of MP3 audio tracks.
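
A minimal sketch of why double compression can leave statistical traces, under simplifying assumptions: uniformly spaced synthetic values stand in for MDCT coefficients, and plain uniform quantizers with hypothetical step sizes 5.0 and 3.0 stand in for MP3 quantization at two bit rates. It is not the measure derived in the paper, and all function names are illustrative.

```python
from collections import Counter


def quantize(values, step):
    """Map each value to the index of its nearest quantization level."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    """Reconstruct approximate values from quantization indices."""
    return [i * step for i in indices]


def empty_bins(indices):
    """Return the histogram bins between min and max that hold no samples."""
    hist = Counter(indices)
    lo, hi = min(indices), max(indices)
    return [b for b in range(lo, hi + 1) if hist[b] == 0]


# Synthetic "coefficients": a dense, evenly spaced sweep from -50 to 50.
coeffs = [i / 100.0 for i in range(-5000, 5001)]

# Single compression: quantize once with the final step size.
single = quantize(coeffs, 3.0)

# Double compression: quantize with a coarser first step, reconstruct,
# then quantize again with the final step size.
double = quantize(dequantize(quantize(coeffs, 5.0), 5.0), 3.0)

print("empty bins, single:", empty_bins(single))
print("empty bins, double:", empty_bins(double))
```

Under these assumptions the singly quantized histogram has no empty bins, whereas the doubly quantized one shows regularly spaced gaps. Irregularities of this kind are the sort of trace that double-compression detectors look for.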

    Audio Signal Processing Using Time-Frequency Approaches: Coding, Classification, Fingerprinting, and Watermarking

    Audio signals are information-rich, nonstationary signals that play an important role in our day-to-day communication, perception of the environment, and entertainment. Because of their nonstationary nature, time-only or frequency-only approaches are inadequate for analyzing these signals; a joint time-frequency (TF) approach is a better choice for processing them efficiently. In this digital era, compression, intelligent indexing for content-based retrieval, classification, and protection of digital audio content are a few of the areas that cover the majority of audio signal processing applications. In this paper, we present a comprehensive array of TF methodologies that successfully address applications in all of these areas. A TF-based audio coding scheme with a novel psychoacoustic model, music classification, audio classification of environmental sounds, audio fingerprinting, and audio watermarking are presented to demonstrate the advantages of using time-frequency approaches in analyzing and extracting information from audio signals.

    Point to the Hidden: Exposing Speech Audio Splicing via Signal Pointer Nets

    Verifying the integrity of voice recording evidence for criminal investigations is an integral part of an audio forensic analyst's work. Here, one focus is on detecting deletion or insertion operations, so-called audio splicing. While this is a rather easy way to alter spoken statements, careful editing can yield quite convincing results. For difficult cases or large amounts of data, automated tools can help detect potential editing locations. To this end, several analytical and deep learning methods have been proposed. Still, few address unconstrained splicing scenarios as expected in practice. With SigPointer, we propose a pointer network framework for continuous input that uncovers splice locations naturally and more efficiently than existing works. Extensive experiments on forensically challenging data, such as strongly compressed and noisy signals, quantify the benefit of the pointer mechanism, with performance increases of about 6 to 10 percentage points. Comment: accepted at Interspeech 202

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives that measure the performance of multimedia search engines. From a socio-economic perspective, we take stock of the impact and legal consequences of these technical advances and point out future directions of research.

    An SNMP-based audio distribution service architecture

    Master's dissertation in Telematic Networks and Services Engineering. The constant growth in the integration and popularity of “Internet of Things” devices is affecting home automation systems, a sector in which new technologies have been introduced in recent years. These automation systems integrate devices that can be anywhere in the house, connected to a home network through either a wired or a wireless connection. A home automation system can be used to control air conditioning, lighting, pool control systems, home-entertainment systems and much more. Within the field of home-entertainment systems, the best-known technologies are the Digital Living Network Alliance and the Digital Audio Access Protocol, which provide interoperability so that digital media content can be shared between devices across a home network. However, these technologies have the disadvantages of being proprietary, having restricted access to documentation, relying on complex architectures and concepts, and not being optimized for specific purposes such as audio distribution. The main goal of this project was to prove that it is possible to use standardized protocols, such as the Simple Network Management Protocol, together with open source tools to develop a music distribution service with features similar to those of existing proprietary technologies. The implemented prototype allows a user to manage and play audio from a music collection stored on a single home audio server. The system architecture enables audio streaming between the server and the various devices in the same local network. Furthermore, the music collection can integrate virtual audio files that are available from external music sources, such as iTunes.

    Digital audio watermarking for broadcast monitoring and content identification

    Copyright legislation was prompted exactly 300 years ago by a desire to protect authors against exploitation of their work by others. For modern content owners, Digital Rights Management (DRM) issues have become very important since the advent of the Internet. Piracy, or illegal copying, costs content owners billions of dollars every year. DRM is just one tool that can assist content owners in exercising their rights. Two categories of DRM technologies have evolved in digital signal processing recently, namely digital fingerprinting and digital watermarking. One area of copyright that is consistently overlooked in DRM developments is 'Public Performance'. The research described in this thesis analysed the administration of public performance rights within the music industry in general, with specific focus on the collective rights and broadcasting sectors in Ireland. Limitations in the administration of artists' rights were identified, and the impact of these limitations on the careers of developing artists was evaluated. A digital audio watermarking scheme is proposed that would meet the requirements of both the broadcast and collective rights sectors. The goal of the scheme is to embed a standard identifier within an audio signal by modifying its spectral properties in such a way that the identifier is robust and perceptually transparent. Modification of the audio signal spectrum was attempted in a variety of ways. A method based on a super-resolution frequency identification technique was found to be most effective. The watermarking scheme was evaluated for robustness and found to be extremely effective in recovering embedded watermarks in music signals using a semi-blind decoding process. The final digital audio watermarking algorithm facilitates the development of other applications in broadcast monitoring for the purposes of equitable royalty distribution, along with additional applications and extension to other domains.

    Improved steganalysis technique based on least significant bit using artificial neural network for MP3 files

    MP3 is one of the most widely used digital audio formats, providing a high compression ratio with reliable quality. This widespread use has made MP3 audio files excellent covers for carrying hidden information in audio steganography on the Internet. Emerging interest in uncovering such hidden information has opened up a field of research called steganalysis, which looks at the detection of hidden messages in a specific medium. Unfortunately, the detection accuracy in steganalysis is affected by bit rate, sampling rate of the data type, compression rate, file track size and standard, as well as the benchmark dataset of MP3 files. This thesis therefore proposes an effective technique for steganalysis of MP3 audio files that derives a combination of features from MP3 file properties. Several trials were run to select relevant features of MP3 files, such as total harmonic distortion, power spectral density, and peak signal-to-noise ratio (PSNR), for investigating the correlation between different channels of MP3 signals. The least significant bit (LSB) technique was used in the detection of embedded secret files in stego-objects. This involved reading the stego-objects for statistical evaluation of possible locations of secret messages and classifying these locations as having either a high or a low tendency to contain secret messages. A feed-forward neural network with 3 layers, the traingdx training function, and an activation function for each layer was also used. The network vector contains information about all features and is used to create a network for the given learning process. Finally, an evaluation process involving the ANN test, which compared the results with previous techniques, was performed. A 97.92% accuracy rate was recorded when detecting MP3 files under 96 kbps compression. These experimental results showed that the proposed approach was effective in detecting embedded information in MP3 files. It demonstrated a significant improvement in detection accuracy at low embedding rates compared with previous work.
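
For illustration only, a minimal sketch of what least significant bit (LSB) embedding does to integer audio samples, and how the hidden bits are read back. It assumes plain integer PCM samples rather than MP3 bitstream data, and it does not reproduce the thesis's feature extraction or neural network classifier. `embed_lsb` and `extract_lsb` are hypothetical helper names.

```python
def embed_lsb(samples, bits):
    """Return a copy of samples whose first len(bits) LSBs carry the bits."""
    if len(bits) > len(samples):
        raise ValueError("not enough samples to hold the message")
    stego = list(samples)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract_lsb(samples, n_bits):
    """Read the lowest bit of the first n_bits samples."""
    return [s & 1 for s in samples[:n_bits]]


# Hypothetical cover samples and message bits.
samples = [1000, -3, 42, 7, 0, -128, 255, 16]
bits = [1, 0, 1, 1, 0, 0, 1, 0]

stego = embed_lsb(samples, bits)
recovered = extract_lsb(stego, len(bits))
print(stego)
print(recovered)
```

Each sample changes by at most one quantization step, which is why LSB embedding is hard to hear and why steganalysis relies on statistical features instead of direct inspection.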

    Audio content identification

    The development and research of content-based music information retrieval (MIR) applications in recent years have shown that automatically generating content descriptions that enable the identification and classification of pieces of musical audio is a manageable task. Given the huge amount of digital music available and the growth of the corresponding databases, research is investigating how to automate the typical tasks involved in managing audio data. This thesis gives a general introduction to music information retrieval techniques, focusing on the automatic identification of audio material and on the comparison of similarity-based approaches with purely content-based fingerprint technology. On the one hand, similarity retrieval systems try to model various aspects of the human auditory system and, with it, perceptual similarity. On the other hand, fingerprints or signatures aim to identify music exactly, without any assessment of similar-sounding titles. To work out the differences between these approaches and their consequences, several experiments were performed that show how robust and adaptable an identification system must be. Rhythm Patterns, a similarity-based feature extraction scheme, and FDMF, a freely available fingerprint algorithm, were compared in 24 test cases, with a focus on the highest possible identification accuracy. Similarity features such as Rhythm Patterns reach identification rates of up to 89.53% in these test scenarios, clearly outperforming the fingerprint approach examined. A careful choice of features yields very promising results both for varied excerpts of the tracks and after substantial signal modifications.