152 research outputs found
New Digital Audio Watermarking Algorithms for Copyright Protection
This thesis investigates the development of digital audio watermarking in addressing issues such as copyright protection. Over the past two decades, many digital watermarking algorithms have been developed, each with its own advantages and disadvantages. The main aim of this thesis was to develop a new watermarking algorithm within an existing Fast Fourier Transform framework. This resulted in the development of a Complex Spectrum Phase Evolution based watermarking algorithm. In this new implementation, the embedding positions were generated dynamically, thereby rendering the watermark more difficult for an attacker to remove, and watermark information was embedded by manipulation of the spectral components in the time domain, thereby reducing any audible distortion. Further improvements were attained when the embedding criterion was based on bin location comparison instead of magnitude, thereby rendering the algorithm more robust against those attacks that interfere with the spectral magnitudes.
However, it was discovered that this new audio watermarking algorithm has some disadvantages, such as a relatively low capacity and inconsistent robustness across different audio files. Therefore, a further aim of this thesis was to improve the algorithm from a different perspective.
Improvements were investigated using a Singular Value Decomposition framework, wherein a novel observation was made. Furthermore, a psychoacoustic model was incorporated to suppress any audible distortion. This resulted in a watermarking algorithm which achieved a higher capacity and a more consistent robustness.
The overall result was that two new digital audio watermarking algorithms were developed which were complementary in their performance, thereby opening more opportunities for further research.
On the data hiding theory and multimedia content security applications
This dissertation is a comprehensive study of digital steganography for multimedia content protection. With the increasing development of Internet technology, the protection and enforcement of multimedia property rights have become a great concern to multimedia authors and distributors. Watermarking technologies provide a possible solution for this problem.
The dissertation first briefly introduces the current watermarking schemes, including their applications in video, image and audio. Most available embedding schemes are based on direct Spread Sequence (SS) modulation. A small-valued pseudo-random signature sequence is embedded into the host signal and the information is extracted via correlation. The correlation detection problem is discussed at the beginning. It is concluded that the correlator is not optimum in oblivious detection. The Maximum Likelihood detector is derived and some feasible suboptimal detectors are also analyzed. Through the calculation of the extraction Bit Error Rate (BER), it is revealed that the SS scheme is not very efficient due to its poor host noise suppression. The watermark domain selection problem is addressed subsequently. Some implications on hiding capacity and reliability are also studied. The last topic in the SS modulation scheme is the sequence selection. The relationship between sequence bandwidth and synchronization requirement is detailed in the work. It is demonstrated that the white sequence commonly used in watermarking may not really boost watermark security.
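The embed-then-correlate idea described above can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names (`pn_sequence`, `embed`, `detect`), the embedding strength `alpha`, the block length and the Gaussian stand-in for the host signal are all assumptions chosen for illustration.

```python
import random


def pn_sequence(key, length):
    """Pseudo-random +/-1 signature sequence derived from a secret key (assumed design)."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed(host, bits, key, alpha=0.2):
    """Add a low-amplitude signed copy of the signature to one host block per bit."""
    chips = len(host) // len(bits)
    pn = pn_sequence(key, chips)
    marked = list(host)
    for i, bit in enumerate(bits):
        sign = 1.0 if bit else -1.0
        for j in range(chips):
            marked[i * chips + j] += alpha * sign * pn[j]
    return marked


def detect(marked, n_bits, key):
    """Oblivious detection: correlate each block with the signature, without the host."""
    chips = len(marked) // n_bits
    pn = pn_sequence(key, chips)
    bits = []
    for i in range(n_bits):
        block = marked[i * chips:(i + 1) * chips]
        # The correlation equals +/-alpha plus a residual term caused by the host itself.
        corr = sum(s * p for s, p in zip(block, pn)) / chips
        bits.append(1 if corr > 0 else 0)
    return bits


# Hypothetical host: unit-variance Gaussian noise standing in for audio samples.
host_rng = random.Random(0)
host = [host_rng.gauss(0.0, 1.0) for _ in range(8 * 4096)]
message = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed(host, message, key=1234)
recovered = detect(marked, len(message), key=1234)
```

In this sketch the host acts as noise in the correlation sum, so each bit needs a long block (4096 samples here) to be recovered reliably from a weak signature. That host interference is the inefficiency the abstract refers to as poor host noise suppression.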
To address the host noise suppression problem, the hidden communication is modeled as a general hypothesis testing problem and a set partitioning scheme is proposed. Simulation studies and mathematical analysis confirm that it outperforms the SS schemes in host noise suppression. The proposed scheme demonstrates improvement over the existing embedding schemes.
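The abstract does not spell out the proposed set partitioning scheme, so the sketch below substitutes scalar quantization index modulation (QIM), a well-known partition-based embedding method, purely to illustrate how partitioning the signal space can remove host interference. The step size `delta` and the function names are assumptions, not taken from the dissertation.

```python
def qim_embed(sample, bit, delta=1.0):
    """Move the sample to the nearest point of the lattice assigned to `bit`.

    The real line is partitioned into two interleaved lattices:
    bit 0 -> {k * delta}, bit 1 -> {k * delta + delta / 2}.
    """
    offset = bit * delta / 2.0
    return round((sample - offset) / delta) * delta + offset


def qim_decode(sample, delta=1.0):
    """Pick the bit whose lattice lies closest to the received sample."""
    distances = [abs(sample - qim_embed(sample, b, delta)) for b in (0, 1)]
    return 0 if distances[0] <= distances[1] else 1


# Hypothetical host samples and message bits.
host = [0.37, -2.81, 5.04, 1.66, -0.12, 3.49]
message = [1, 0, 0, 1, 1, 0]
marked = [qim_embed(x, b) for x, b in zip(host, message)]

# A perturbation smaller than delta / 4 cannot push a sample into the other set.
noisy = [y + 0.2 * (-1) ** i for i, y in enumerate(marked)]
decoded = [qim_decode(y) for y in noisy]
```

Decoding here depends only on which set the received sample falls in, not on the original host value, which is the sense in which a partition-based scheme suppresses host noise.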
Data hiding in audio signals is explored next. Audio data hiding is believed to be a more challenging task due to human sensitivity to audio artifacts and the advanced features of current compression techniques. The human psychoacoustic model and human music understanding are also covered in the work. Then, as a typical audio perceptual compression scheme, the popular MP3 compression is visited at some length. Several schemes (amplitude modulation, phase modulation and noise substitution) are presented together with some experimental results. As a case study, a music bitstream encryption scheme is proposed. In all these applications, the human psychoacoustic model plays a very important role. A more advanced audio analysis model is introduced to reveal implications on music understanding. In the last part, conclusions and future research are presented.
Security of Electronic Patient Record using Imperceptible DCT-SVD based Audio Watermarking Technique
A robust and highly imperceptible audio watermarking technique is presented to secure the electronic patient record of Parkinson’s Disease (PD) affected patients. The proposed DCT-SVD based watermarking technique introduces minimal changes in speech such that the accuracy in classification of PD affected person’s speech and healthy person’s speech is retained. To achieve high imperceptibility, the voiced part of the speech is considered for embedding the watermark. It is shown that the proposed watermarking technique is robust to common signal processing attacks. The practicability of the proposed technique is tested by creating an Android application to record and watermark the speech signal. The classification of PD affected speech is done using a Support Vector Machine (SVM) classifier on a cloud server.
High capacity data embedding schemes for digital media
High capacity image data hiding methods and robust high capacity digital audio watermarking algorithms are studied in this thesis. The main results of this work are the development of novel algorithms with state-of-the-art performance, high capacity and transparency for image data hiding and robustness, high capacity and low distortion for audio watermarking. Societat de la informació i el coneixement (Information and Knowledge Society).
Collaborative Watermarking for Adversarial Speech Synthesis
Advances in neural speech synthesis have brought us technology that is not
only close to human naturalness, but is also capable of instant voice cloning
with little data, and is highly accessible with pre-trained models available.
Naturally, the potential flood of generated content raises the need for
synthetic speech detection and watermarking. Recently, considerable research
effort in synthetic speech detection has been related to the Automatic Speaker
Verification and Spoofing Countermeasure Challenge (ASVspoof), which focuses on
passive countermeasures. This paper takes a complementary view to generated
speech detection: a synthesis system should make an active effort to watermark
the generated speech in a way that aids detection by another machine, but
remains transparent to a human listener. We propose a collaborative training
scheme for synthetic speech watermarking and show that a HiFi-GAN neural
vocoder collaborating with the ASVspoof 2021 baseline countermeasure models
consistently improves detection performance over conventional classifier
training. Furthermore, we demonstrate how collaborative training can be paired
with augmentation strategies for added robustness against noise and
time-stretching. Finally, listening tests demonstrate that collaborative
training has little adverse effect on perceptual quality of vocoded speech.
Comment: Accepted to ICASSP 202