122 research outputs found

    New Digital Audio Watermarking Algorithms for Copyright Protection

    Get PDF
    This thesis investigates the development of digital audio watermarking in addressing issues such as copyright protection. Over the past two decades, many digital watermarking algorithms have been developed, each with its own advantages and disadvantages. The main aim of this thesis was to develop a new watermarking algorithm within an existing Fast Fourier Transform framework. This resulted in the development of a Complex Spectrum Phase Evolution based watermarking algorithm. In this new implementation, the embedding positions were generated dynamically thereby rendering it more difficult for an attacker to remove, and watermark information was embedded by manipulation of the spectral components in the time domain thereby reducing any audible distortion. Further improvements were attained when the embedding criteria was based on bin location comparison instead of magnitude, thereby rendering it more robust against those attacks that interfere with the spectral magnitudes. However, it was discovered that this new audio watermarking algorithm has some disadvantages such as a relatively low capacity and a non-consistent robustness for different audio files. Therefore, a further aim of this thesis was to improve the algorithm from a different perspective. Improvements were investigated using an Singular Value Decomposition framework wherein a novel observation was discovered. Furthermore, a psychoacoustic model was incorporated to suppress any audible distortion. This resulted in a watermarking algorithm which achieved a higher capacity and a more consistent robustness. The overall result was that two new digital audio watermarking algorithms were developed which were complementary in their performance thereby opening more opportunities for further research

    Digital audio watermarking with semi-blind detection for in-car music content identification

    Get PDF
    Recent developments in audio watermarking techniques have gone some way towards promoting an industry-wide acceptance of digital audio watermarking as a process that will eventually be used in all audio (and video) production. The predominant focus of such watermarking research has been in the area of content protection, because the prevention of illegal copying is an area of concern for content owners. However, digital audio watermarking may also be used for other purposes, such as the added-value option of real-time content identification of music. While computer-based users of music enjoy the opportunity to identify unknown audio using online tools, identification of audio in an offline domestic or in-car scenario is not so easily achieved. This paper discusses with an area of digital audio watermarking that would facilitate real-time in-car identification of the artists, title and/or other meta-data relating to music being broadcast by radio

    A robust audio watermarking scheme based on reduced singular value decomposition and distortion removal

    Get PDF
    This paper presents a blind audio watermarking algorithm based on the reduced singular value decomposition(RSVD). A new observation on one of the resulting unitary matrices is uncovered. The proposed scheme manipulates coefficients based on this observation in order to embed watermark bits. To preserve audio fidelity a threshold- based distortion control technique is applied and this is further supplemented by distortion suppression utilizing psychoacoustic principles. Test results on real music signals show that this watermarking scheme is in the range of imperceptibility for human hearing, is accurate and also robust against MP3 compression at various bit rates as well as other selected attacks. The data payload is comparatively high compared to existing audio watermarking schemes

    Audio Signal Processing Using Time-Frequency Approaches: Coding, Classification, Fingerprinting, and Watermarking

    Get PDF
    Audio signals are information rich nonstationary signals that play an important role in our day-to-day communication, perception of environment, and entertainment. Due to its non-stationary nature, time- or frequency-only approaches are inadequate in analyzing these signals. A joint time-frequency (TF) approach would be a better choice to efficiently process these signals. In this digital era, compression, intelligent indexing for content-based retrieval, classification, and protection of digital audio content are few of the areas that encapsulate a majority of the audio signal processing applications. In this paper, we present a comprehensive array of TF methodologies that successfully address applications in all of the above mentioned areas. A TF-based audio coding scheme with novel psychoacoustics model, music classification, audio classification of environmental sounds, audio fingerprinting, and audio watermarking will be presented to demonstrate the advantages of using time-frequency approaches in analyzing and extracting information from audio signals.</p

    Digital audio watermarking for broadcast monitoring and content identification

    Get PDF
    Copyright legislation was prompted exactly 300 years ago by a desire to protect authors against exploitation of their work by others. With regard to modern content owners, Digital Rights Management (DRM) issues have become very important since the advent of the Internet. Piracy, or illegal copying, costs content owners billions of dollars every year. DRM is just one tool that can assist content owners in exercising their rights. Two categories of DRM technologies have evolved in digital signal processing recently, namely digital fingerprinting and digital watermarking. One area of Copyright that is consistently overlooked in DRM developments is 'Public Performance'. The research described in this thesis analysed the administration of public performance rights within the music industry in general, with specific focus on the collective rights and broadcasting sectors in Ireland. Limitations in the administration of artists' rights were identified. The impact of these limitations on the careers of developing artists was evaluated. A digital audio watermarking scheme is proposed that would meet the requirements of both the broadcast and collective rights sectors. The goal of the scheme is to embed a standard identifier within an audio signal via modification of its spectral properties in such a way that it would be robust and perceptually transparent. Modification of the audio signal spectrum was attempted in a variety of ways. A method based on a super-resolution frequency identification technique was found to be most effective. The watermarking scheme was evaluated for robustness and found to be extremely effective in recovering embedded watermarks in music signals using a semi-blind decoding process. The final digital audio watermarking algorithm proposed facilitates the development of other applications in the domain of broadcast monitoring for the purposes of equitable royalty distribution along with additional applications and extension to other domains

    Applications de la représentation parcimonieuse perceptuelle par graphe de décharges (Spikegramme) pour la protection du droit d’auteur des signaux sonores

    Get PDF
    Chaque année, le piratage mondial de la musique coûte plusieurs milliards de dollars en pertes économiques, pertes d’emplois et pertes de gains des travailleurs ainsi que la perte de millions de dollars en recettes fiscales. La plupart du piratage de la musique est dû à la croissance rapide et à la facilité des technologies actuelles pour la copie, le partage, la manipulation et la distribution de données musicales [Domingo, 2015], [Siwek, 2007]. Le tatouage des signaux sonores a été proposé pour protéger les droit des auteurs et pour permettre la localisation des instants où le signal sonore a été falsifié. Dans cette thèse, nous proposons d’utiliser la représentation parcimonieuse bio-inspirée par graphe de décharges (spikegramme), pour concevoir une nouvelle méthode permettant la localisation de la falsification dans les signaux sonores. Aussi, une nouvelle méthode de protection du droit d’auteur. Finalement, une nouvelle attaque perceptuelle, en utilisant le spikegramme, pour attaquer des systèmes de tatouage sonore. Nous proposons tout d’abord une technique de localisation des falsifications (‘tampering’) des signaux sonores. Pour cela nous combinons une méthode à spectre étendu modifié (‘modified spread spectrum’, MSS) avec une représentation parcimonieuse. Nous utilisons une technique de poursuite perceptive adaptée (perceptual marching pursuit, PMP [Hossein Najaf-Zadeh, 2008]) pour générer une représentation parcimonieuse (spikegramme) du signal sonore d’entrée qui est invariante au décalage temporel [E. C. Smith, 2006] et qui prend en compte les phénomènes de masquage tels qu’ils sont observés en audition. Un code d’authentification est inséré à l’intérieur des coefficients de la représentation en spikegramme. Puis ceux-ci sont combinés aux seuils de masquage. Le signal tatoué est resynthétisé à partir des coefficients modifiés, et le signal ainsi obtenu est transmis au décodeur. Au décodeur, pour identifier un segment falsifié du signal sonore, les codes d’authentification de tous les segments intacts sont analysés. Si les codes ne peuvent être détectés correctement, on sait qu’alors le segment aura été falsifié. Nous proposons de tatouer selon le principe à spectre étendu (appelé MSS) afin d’obtenir une grande capacité en nombre de bits de tatouage introduits. Dans les situations où il y a désynchronisation entre le codeur et le décodeur, notre méthode permet quand même de détecter des pièces falsifiées. Par rapport à l’état de l’art, notre approche a le taux d’erreur le plus bas pour ce qui est de détecter les pièces falsifiées. Nous avons utilisé le test de l’opinion moyenne (‘MOS’) pour mesurer la qualité des systèmes tatoués. Nous évaluons la méthode de tatouage semi-fragile par le taux d’erreur (nombre de bits erronés divisé par tous les bits soumis) suite à plusieurs attaques. Les résultats confirment la supériorité de notre approche pour la localisation des pièces falsifiées dans les signaux sonores tout en préservant la qualité des signaux. Ensuite nous proposons une nouvelle technique pour la protection des signaux sonores. Cette technique est basée sur la représentation par spikegrammes des signaux sonores et utilise deux dictionnaires (TDA pour Two-Dictionary Approach). Le spikegramme est utilisé pour coder le signal hôte en utilisant un dictionnaire de filtres gammatones. Pour le tatouage, nous utilisons deux dictionnaires différents qui sont sélectionnés en fonction du bit d’entrée à tatouer et du contenu du signal. Notre approche trouve les gammatones appropriés (appelés noyaux de tatouage) sur la base de la valeur du bit à tatouer, et incorpore les bits de tatouage dans la phase des gammatones du tatouage. De plus, il est montré que la TDA est libre d’erreur dans le cas d’aucune situation d’attaque. Il est démontré que la décorrélation des noyaux de tatouage permet la conception d’une méthode de tatouage sonore très robuste. Les expériences ont montré la meilleure robustesse pour la méthode proposée lorsque le signal tatoué est corrompu par une compression MP3 à 32 kbits par seconde avec une charge utile de 56.5 bps par rapport à plusieurs techniques récentes. De plus nous avons étudié la robustesse du tatouage lorsque les nouveaux codec USAC (Unified Audion and Speech Coding) à 24kbps sont utilisés. La charge utile est alors comprise entre 5 et 15 bps. Finalement, nous utilisons les spikegrammes pour proposer trois nouvelles méthodes d’attaques. Nous les comparons aux méthodes récentes d’attaques telles que 32 kbps MP3 et 24 kbps USAC. Ces attaques comprennent l’attaque par PMP, l’attaque par bruit inaudible et l’attaque de remplacement parcimonieuse. Dans le cas de l’attaque par PMP, le signal de tatouage est représenté et resynthétisé avec un spikegramme. Dans le cas de l’attaque par bruit inaudible, celui-ci est généré et ajouté aux coefficients du spikegramme. Dans le cas de l’attaque de remplacement parcimonieuse, dans chaque segment du signal, les caractéristiques spectro-temporelles du signal (les décharges temporelles ;‘time spikes’) se trouvent en utilisant le spikegramme et les spikes temporelles et similaires sont remplacés par une autre. Pour comparer l’efficacité des attaques proposées, nous les comparons au décodeur du tatouage à spectre étendu. Il est démontré que l’attaque par remplacement parcimonieux réduit la corrélation normalisée du décodeur de spectre étendu avec un plus grand facteur par rapport à la situation où le décodeur de spectre étendu est attaqué par la transformation MP3 (32 kbps) et 24 kbps USAC.Abstract : Every year global music piracy is making billion dollars of economic, job, workers’ earnings losses and also million dollars loss in tax revenues. Most of the music piracy is because of rapid growth and easiness of current technologies for copying, sharing, manipulating and distributing musical data [Domingo, 2015], [Siwek, 2007]. Audio watermarking has been proposed as one approach for copyright protection and tamper localization of audio signals to prevent music piracy. In this thesis, we use the spikegram- which is a bio-inspired sparse representation- to propose a novel approach to design an audio tamper localization method as well as an audio copyright protection method and also a new perceptual attack against any audio watermarking system. First, we propose a tampering localization method for audio signal, based on a Modified Spread Spectrum (MSS) approach. Perceptual Matching Pursuit (PMP) is used to compute the spikegram (which is a sparse and time-shift invariant representation of audio signals) as well as 2-D masking thresholds. Then, an authentication code (which includes an Identity Number, ID) is inserted inside the sparse coefficients. For high quality watermarking, the watermark data are multiplied with masking thresholds. The time domain watermarked signal is re-synthesized from the modified coefficients and the signal is sent to the decoder. To localize a tampered segment of the audio signal, at the decoder, the ID’s associated to intact segments are detected correctly, while the ID associated to a tampered segment is mis-detected or not detected. To achieve high capacity, we propose a modified version of the improved spread spectrum watermarking called MSS (Modified Spread Spectrum). We performed a mean opinion test to measure the quality of the proposed watermarking system. Also, the bit error rates for the presented tamper localization method are computed under several attacks. In comparison to conventional methods, the proposed tamper localization method has the smallest number of mis-detected tampered frames, when only one frame is tampered. In addition, the mean opinion test experiments confirms that the proposed method preserves the high quality of input audio signals. Moreover, we introduce a new audio watermarking technique based on a kernel-based representation of audio signals. A perceptive sparse representation (spikegram) is combined with a dictionary of gammatone kernels to construct a robust representation of sounds. Compared to traditional phase embedding methods where the phase of signal’s Fourier coefficients are modified, in this method, the watermark bit stream is inserted by modifying the phase of gammatone kernels. Moreover, the watermark is automatically embedded only into kernels with high amplitudes where all masked (non-meaningful) gammatones have been already removed. Two embedding methods are proposed, one based on the watermark embedding into the sign of gammatones (one dictionary method) and another one based on watermark embedding into both sign and phase of gammatone kernels (two-dictionary method). The robustness of the proposed method is shown against 32 kbps MP3 with an embedding rate of 56.5 bps while the state of the art payload for 32 kbps MP3 robust iii iv watermarking is lower than 50.3 bps. Also, we showed that the proposed method is robust against unified speech and audio codec (24 kbps USAC, Linear predictive and Fourier domain modes) with an average payload of 5 − 15 bps. Moreover, it is shown that the proposed method is robust against a variety of signal processing transforms while preserving quality. Finally, three perceptual attacks are proposed in the perceptual sparse domain using spikegram. These attacks are called PMP, inaudible noise adding and the sparse replacement attacks. In PMP attack, the host signals are represented and re-synthesized with spikegram. In inaudible noise attack, the inaudible noise is generated and added to the spikegram coefficients. In sparse replacement attack, each specific frame of the spikegram representation - when possible - is replaced with a combination of similar frames located in other parts of the spikegram. It is shown than the PMP and inaudible noise attacks have roughly the same efficiency as the 32 kbps MP3 attack, while the replacement attack reduces the normalized correlation of the spread spectrum decoder with a greater factor than when attacking with 32 kbps MP3 or 24 kbps unified speech and audio coding (USAC)

    Digital Watermarking for Verification of Perception-based Integrity of Audio Data

    Get PDF
    In certain application fields digital audio recordings contain sensitive content. Examples are historical archival material in public archives that preserve our cultural heritage, or digital evidence in the context of law enforcement and civil proceedings. Because of the powerful capabilities of modern editing tools for multimedia such material is vulnerable to doctoring of the content and forgery of its origin with malicious intent. Also inadvertent data modification and mistaken origin can be caused by human error. Hence, the credibility and provenience in terms of an unadulterated and genuine state of such audio content and the confidence about its origin are critical factors. To address this issue, this PhD thesis proposes a mechanism for verifying the integrity and authenticity of digital sound recordings. It is designed and implemented to be insensitive to common post-processing operations of the audio data that influence the subjective acoustic perception only marginally (if at all). Examples of such operations include lossy compression that maintains a high sound quality of the audio media, or lossless format conversions. It is the objective to avoid de facto false alarms that would be expectedly observable in standard crypto-based authentication protocols in the presence of these legitimate post-processing. For achieving this, a feasible combination of the techniques of digital watermarking and audio-specific hashing is investigated. At first, a suitable secret-key dependent audio hashing algorithm is developed. It incorporates and enhances so-called audio fingerprinting technology from the state of the art in contentbased audio identification. The presented algorithm (denoted as ”rMAC” message authentication code) allows ”perception-based” verification of integrity. This means classifying integrity breaches as such not before they become audible. As another objective, this rMAC is embedded and stored silently inside the audio media by means of audio watermarking technology. This approach allows maintaining the authentication code across the above-mentioned admissible post-processing operations and making it available for integrity verification at a later date. For this, an existent secret-key ependent audio watermarking algorithm is used and enhanced in this thesis work. To some extent, the dependency of the rMAC and of the watermarking processing from a secret key also allows authenticating the origin of a protected audio. To elaborate on this security aspect, this work also estimates the brute-force efforts of an adversary attacking this combined rMAC-watermarking approach. The experimental results show that the proposed method provides a good distinction and classification performance of authentic versus doctored audio content. It also allows the temporal localization of audible data modification within a protected audio file. The experimental evaluation finally provides recommendations about technical configuration settings of the combined watermarking-hashing approach. Beyond the main topic of perception-based data integrity and data authenticity for audio, this PhD work provides new general findings in the fields of audio fingerprinting and digital watermarking. The main contributions of this PhD were published and presented mainly at conferences about multimedia security. These publications were cited by a number of other authors and hence had some impact on their works
    corecore