
    New Digital Audio Watermarking Algorithms for Copyright Protection

    This thesis investigates the development of digital audio watermarking for applications such as copyright protection. Over the past two decades, many digital watermarking algorithms have been developed, each with its own advantages and disadvantages. The main aim of this thesis was to develop a new watermarking algorithm within an existing Fast Fourier Transform framework. This resulted in the development of a Complex Spectrum Phase Evolution based watermarking algorithm. In this new implementation, the embedding positions were generated dynamically, thereby making the watermark more difficult for an attacker to remove, and watermark information was embedded by manipulation of the spectral components in the time domain, thereby reducing any audible distortion. Further improvements were attained when the embedding criterion was based on bin location comparison instead of magnitude, thereby making the algorithm more robust against attacks that interfere with the spectral magnitudes. However, this new audio watermarking algorithm was found to have some disadvantages, such as a relatively low capacity and inconsistent robustness across different audio files. Therefore, a further aim of this thesis was to improve the algorithm from a different perspective. Improvements were investigated using a Singular Value Decomposition framework, wherein a novel observation was made. Furthermore, a psychoacoustic model was incorporated to suppress any audible distortion. This resulted in a watermarking algorithm which achieved a higher capacity and a more consistent robustness. The overall result was that two new digital audio watermarking algorithms were developed which were complementary in their performance, thereby opening more opportunities for further research.

    Security of Electronic Patient Record using Imperceptible DCT-SVD based Audio Watermarking Technique

    A robust and highly imperceptible audio watermarking technique is presented to secure the electronic patient record of a Parkinson’s Disease (PD) affected patient. The proposed DCT-SVD based watermarking technique introduces minimal changes in speech such that the accuracy in classification of a PD affected person’s speech and a healthy person’s speech is retained. To achieve high imperceptibility, the voiced part of the speech is considered for embedding the watermark. It is shown that the proposed watermarking technique is robust to common signal processing attacks. The practicability of the proposed technique is tested by creating an Android application to record and watermark the speech signal. The classification of PD affected speech is done using a Support Vector Machine (SVM) classifier on a cloud server.

    A new data embedding method for mpeg layer III audio steganography

    A new method of MP3 steganography is proposed with emphasis on increasing the steganographic capacity of the carrier medium. This paper proposes a data embedding algorithm to hide more information in the compressed bitstream of MP3 audio files. The sign bits of Huffman codes are selected as the stego-object according to the Huffman coding characteristic in the Count1 region. Extraction of the hidden message does not require the original MP3 audio file, and the size of the MP3 file does not change in the embedding step. The proposed method achieves a much higher information embedding capacity with lower computational complexity compared with the MP3Stego tool. Experimental results show excellent imperceptibility for the new algorithm.
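The sign-bit idea described in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes the Count1 region has already been decoded into a list of quantized values in {-1, 0, 1}, and it assumes the convention that a sign bit of 1 means a negative value. The function names `embed_in_signs` and `extract_from_signs` are hypothetical. Because only signs change, the number of coded values, and hence the length of the coded data, stays the same.

```python
def embed_in_signs(count1_values, bits):
    """Hide bits in the signs of nonzero values (assumed convention: 1 = negative).

    count1_values: decoded Count1-region values, each in {-1, 0, 1} (assumption).
    bits: iterable of 0/1 message bits.
    """
    bits = list(bits)
    capacity = sum(1 for v in count1_values if v != 0)
    if len(bits) > capacity:
        raise ValueError("message longer than the number of available sign bits")

    stego = list(count1_values)
    k = 0  # index of the next message bit to embed
    for i, v in enumerate(stego):
        if k == len(bits):
            break
        if v == 0:
            continue  # zero values carry no sign bit
        stego[i] = -abs(v) if bits[k] else abs(v)
        k += 1
    return stego


def extract_from_signs(count1_values, n_bits):
    """Read the first n_bits sign bits back; the original cover is not needed."""
    signs = [1 if v < 0 else 0 for v in count1_values if v != 0]
    return signs[:n_bits]


if __name__ == "__main__":
    cover = [1, 0, -1, 1, 0, 0, -1, 1]
    stego = embed_in_signs(cover, [1, 0, 1])
    print(stego, extract_from_signs(stego, 3))
```

Extraction here reads only the stego values, which mirrors the abstract's statement that the original file is not required to recover the message.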

    CHORUS Deliverable 3.3: Vision Document - Intermediate version

    The goal of the CHORUS vision document is to create a high level vision on audio-visual search engines in order to give guidance to the future R&D work in this area (in line with the mandate of CHORUS as a Coordination Action). This current intermediate draft of the CHORUS vision document (D3.3) is based on the previous CHORUS vision documents D3.1 to D3.2 and on the results of the six CHORUS Think-Tank meetings held in March, September and November 2007 as well as in April, July and October 2008, and on the feedback from other CHORUS events. The outcome of the six Think-Tank meetings will not only benefit the participants, who are stakeholders and experts from academia and industry: CHORUS, as a coordination action of the EC, will feed back the findings (see Summary) to the projects under its purview and, via its website, to the whole community working in the domain of AV content search. A few subsections of this deliverable are to be completed after the eighth (and presumably last) Think-Tank meeting in spring 2009.

    A robust audio watermarking scheme based on reduced singular value decomposition and distortion removal

    This paper presents a blind audio watermarking algorithm based on the reduced singular value decomposition (RSVD). A new observation on one of the resulting unitary matrices is uncovered. The proposed scheme manipulates coefficients based on this observation in order to embed watermark bits. To preserve audio fidelity, a threshold-based distortion control technique is applied, and this is further supplemented by distortion suppression utilizing psychoacoustic principles. Test results on real music signals show that this watermarking scheme is in the range of imperceptibility for human hearing, is accurate, and is robust against MP3 compression at various bit rates as well as other selected attacks. The data payload is high compared to existing audio watermarking schemes.

    On the data hiding theory and multimedia content security applications

    This dissertation is a comprehensive study of digital steganography for multimedia content protection. With the increasing development of Internet technology, protection and enforcement of multimedia property rights has become a great concern to multimedia authors and distributors. Watermarking technologies provide a possible solution for this problem. The dissertation first briefly introduces the current watermarking schemes, including their applications in video, image and audio. Most available embedding schemes are based on direct Spread Sequence (SS) modulation. A small-valued pseudo-random signature sequence is embedded into the host signal and the information is extracted via correlation. The correlation detection problem is discussed at the beginning. It is concluded that the correlator is not optimum in oblivious detection. The Maximum Likelihood detector is derived and some feasible suboptimal detectors are also analyzed. Through the calculation of the extraction Bit Error Rate (BER), it is revealed that the SS scheme is not very efficient due to its poor host noise suppression. The watermark domain selection problem is addressed subsequently. Some implications on hiding capacity and reliability are also studied. The last topic in the SS modulation scheme is sequence selection. The relationship between sequence bandwidth and synchronization requirement is detailed in the work. It is demonstrated that the white sequence commonly used in watermarking may not really boost watermark security. To address the host noise suppression problem, the hidden communication is modeled as a general hypothesis testing problem and a set partitioning scheme is proposed. Simulation studies and mathematical analysis confirm that it outperforms the SS schemes in host noise suppression. The proposed scheme demonstrates improvement over the existing embedding schemes. Data hiding in audio signals is explored next. 
Audio data hiding is believed to be a more challenging task due to human sensitivity to audio artifacts and the advanced features of current compression techniques. The human psychoacoustic model and human music understanding are also covered in the work. Then, as a typical audio perceptual compression scheme, the popular MP3 compression is examined at some length. Several schemes (amplitude modulation, phase modulation and noise substitution) are presented together with some experimental results. As a case study, a music bitstream encryption scheme is proposed. In all these applications, the human psychoacoustic model plays a very important role. A more advanced audio analysis model is introduced to reveal implications on music understanding. In the last part, conclusions and future research are presented.
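The spread-sequence embedding and correlation detection summarized in this abstract can be sketched as follows. This is a generic textbook-style illustration under stated assumptions, not the dissertation's detector: the host is a plain list of samples, the signature is a keyed ±1 sequence, and the names `pn_sequence`, `embed_bit`, `detect_bit` and the strength parameter `alpha` are hypothetical.

```python
import random


def pn_sequence(length, key):
    """Keyed pseudo-random +/-1 signature sequence (assumed form of the signature)."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed_bit(host, bit, key, alpha=0.2):
    """Add a low-amplitude signature to the host; its polarity encodes the bit."""
    pn = pn_sequence(len(host), key)
    polarity = 1.0 if bit else -1.0
    return [x + alpha * polarity * p for x, p in zip(host, pn)]


def detect_bit(received, key):
    """Oblivious (blind) detection: correlate with the signature, take the sign.

    The correlation equals +/-alpha plus a host-dependent term; that second
    term is the host noise the abstract refers to.
    """
    pn = pn_sequence(len(received), key)
    corr = sum(y * p for y, p in zip(received, pn)) / len(received)
    return (1 if corr > 0 else 0), corr


if __name__ == "__main__":
    rng = random.Random(0)
    host = [rng.gauss(0.0, 1.0) for _ in range(4096)]  # stand-in for a host signal
    marked = embed_bit(host, 1, key=42)
    print(detect_bit(marked, key=42))
```

In this sketch the host term in the correlation shrinks only as the sequence gets longer, so a short sequence or a small `alpha` makes detection unreliable. That is one way to see why the abstract describes the SS scheme as limited by host noise.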

    Human visual based perception of steganographic images

    In 2014 it was estimated that 1.8 billion images were uploaded daily to the Internet, and in 2018 it is estimated that 3.2 billion images are shared daily. Some of these uploaded images may contain hidden information that can potentially be malicious (e.g. an image that contains hidden information regarding terrorism recruitment) or may cause serious damage (e.g. an employee wishing to hide sensitive company details in an image file and exporting the image to third parties). This research studied the most effective methods for manipulating images to hide information (Data Loss). Significant work has been done on computational algorithmic detection. Yet the desired output from this work was to find the point at which a human can no longer visually establish the difference between an original image and a manipulated image. This research examines the extent of use for file formats, bit depth alterations, least significant bits, message and audio concealment, and watermark and filtering techniques for image steganography. The findings of this study indicated that audio insertion and picture insertion into cover image files are the strongest in deceiving the human eye. These results have been categorised for human visual perception in image-based steganography.
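Of the techniques listed in this abstract, least-significant-bit embedding is the simplest to illustrate. The sketch below is a generic example, not the study's procedure: it assumes the cover image is available as a flat sequence of 8-bit channel values, and the names `lsb_embed` and `lsb_extract` are hypothetical. Each channel value changes by at most 1, which is why the change is hard to see.

```python
def lsb_embed(pixels, message):
    """Write the message bits (most significant bit first) into pixel LSBs.

    pixels: bytes-like sequence of 8-bit channel values (assumed layout).
    message: bytes to hide.
    """
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover is too small for the message")

    stego = bytearray(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it to the bit
    return bytes(stego)


def lsb_extract(pixels, n_bytes):
    """Rebuild n_bytes of hidden data from the LSBs of the first 8 * n_bytes values."""
    out = bytearray()
    for k in range(n_bytes):
        value = 0
        for p in pixels[8 * k: 8 * k + 8]:
            value = (value << 1) | (p & 1)
        out.append(value)
    return bytes(out)


if __name__ == "__main__":
    cover = bytes(range(64))  # stand-in for real pixel data
    stego = lsb_embed(cover, b"hi")
    print(lsb_extract(stego, 2))
```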

    The Effect Of Acoustic Variability On Automatic Speaker Recognition Systems

    This thesis examines the influence of acoustic variability on automatic speaker recognition systems (ASRs) with three aims: (i) to measure ASR performance under five commonly encountered acoustic conditions; (ii) to contribute towards ASR system development with the provision of new research data; (iii) to assess ASR suitability for forensic speaker comparison (FSC) application and investigative/pre-forensic use. The thesis begins with a literature review and explanation of relevant technical terms. Five categories of research experiments then examine ASR performance, reflective of conditions influencing speech quantity (inhibitors) and speech quality (contaminants), acknowledging that quality often influences quantity. Experiments pertain to: net speech duration, signal to noise ratio (SNR), reverberation, frequency bandwidth and transcoding (codecs). The ASR system is placed under scrutiny with examination of settings and optimum conditions (e.g. matched/unmatched test audio and speaker models). Output is examined in relation to baseline performance, and metrics assist in informing whether ASRs should be applied to suboptimal audio recordings. Results indicate that modern ASRs are relatively resilient to low and moderate levels of the acoustic contaminants and inhibitors examined, whilst remaining sensitive to higher levels. The thesis provides discussion on issues such as the complexity and fragility of the speech signal path, speaker variability, difficulty in measuring conditions and mitigation (thresholds and settings). The application of ASRs to casework is discussed with recommendations, acknowledging the different modes of operation (e.g. investigative usage) and current UK limitations regarding presenting ASR output as evidence in criminal trials. 
In summary, and in the context of acoustic variability, the thesis recommends that ASRs could be applied to pre-forensic cases, accepting that extraneous issues remain which require governance, such as validation of method (ASR standardisation) and population data selection. However, ASRs remain unsuitable for broad forensic application, with many acoustic conditions causing irrecoverable speech data loss that contributes to high error rates.

    Final Research Report for Sound Design and Audio Player

    This deliverable describes the work on Task 4.3 Algorithms for sound design and feature developments for audio player. The audio player runs on the in-store player (ISP) and takes care of rendering the music playlists via beat-synchronous automatic DJ mixing, taking advantage of the rich musical content description extracted in T4.2 (beat markers, structural segmentation into intro and outro, musical and sound content classification). The deliverable covers prototypes and final results on: (1) automatic beat-synchronous mixing by beat alignment and time stretching – we developed an algorithm for beat alignment and scheduling of time-stretched tracks; (2) compensation of play duration changes introduced by time stretching – in order to make the playlist generator independent of beat mixing, we chose to readjust the tempo of played tracks such that their stretched duration is the same as their original duration; (3) prospective research on the extraction of data from DJ mixes – to alleviate the lack of extensive ground truth databases of DJ mixing practices, we propose steps towards extracting this data from existing mixes by alignment and unmixing of the tracks in a mix. We also show how these methods can be evaluated even without labelled test data, and propose an open dataset for further research; (4) a description of the software player module, a GUI-less application to run on the ISP that performs streaming of tracks from disk and beat-synchronous mixing. The estimation of cue points where tracks should cross-fade is now described in D4.7 Final Research Report on Auto-Tagging of Music. EC/H2020/688122/EU/Artist-to-Business-to-Business-to-Consumer Audio Branding System/ABC D