9 research outputs found

    A Study on Audio Data Hiding Techniques for Acoustic Data Transmission Robust to Reverberant Environments

    Doctoral dissertation, Seoul National University Graduate School, Department of Electrical and Computer Engineering, February 2014. 김남수. In this dissertation, audio data hiding methods suitable for acoustic data transmission are studied. Acoustic data transmission refers to a technique that communicates data over a short range through the air between a loudspeaker and a microphone. Audio data hiding refers to a technique that embeds message signals into audio such as music or speech. The audio signal with the embedded message is played back by the loudspeaker at the transmitter and recorded by the microphone at the receiver without any additional communication devices. Data hiding methods for acoustic data transmission require higher robustness and a higher data rate than those for other applications. Among the conventional methods, the acoustic orthogonal frequency division multiplexing (AOFDM) technique was developed for reliable communication at a reasonable bit rate. The conventional methods, including AOFDM, are nevertheless considered deficient in transmission performance or audio quality. To overcome this limitation, the modulated complex lapped transform (MCLT) is introduced in the second chapter of the dissertation. A system using the MCLT does not produce blocking artifacts, which may degrade the quality of the resulting data-embedded audio signal. Moreover, the interference among adjacent coefficients due to the overlap property is analyzed so that it can be exploited for data embedding and extraction. In the third chapter of the dissertation, a novel audio data hiding method for acoustic data transmission using the MCLT is proposed. In the proposed system, the audio signal is transformed by the MCLT and the phases of the coefficients are modified to embed the message, based on the fact that human auditory perception is more sensitive to variation in the magnitude spectrum than in the phase.
In the proposed method, the perceived quality of the data-embedded audio signal can be kept close to that of the original audio while transmitting data at several hundred bits per second (bps). The experimental results have shown that the audio quality and transmission performance of the proposed system are better than those of the AOFDM-based system. Moreover, several techniques have been found to further improve the performance of the proposed acoustic data transmission system: magnitude modification incorporating a masking threshold (MM), clustering-based decoding (CLS), and spectral magnitude adjustment (SMA). In the fourth chapter of the dissertation, an audio data hiding technique more suitable for acoustic data transmission in reverberant environments is proposed. This approach incorporates techniques widely deployed in wireless communication. First, a proper range of MCLT lengths for coping with reverberant environments is analyzed based on wireless communication theory. Second, a channel estimation technique based on the Wiener estimator is applied, in conjunction with a suitable data packet structure, to compensate for the effect of the channel. The experimental results show that an MCLT length longer than the reverberation time is robust in reverberant environments, at the cost of the quality of the data-embedded audio. The experimental results have also shown that the proposed method is robust against various forms of attack such as signal processing, overwriting, and malicious removal. However, the most severe problem is finding a window length that satisfies both inaudible distortion and robust data transmission in reverberant environments. When the phase of the audio signal is modified, significant quality degradation is highly likely if the time-frequency transform is very long, because of the pre-echo phenomenon.
In the fifth chapter, therefore, a segmental SNR adjustment (SSA) technique is proposed that further modifies the spectral components to attenuate the pre-echo. In the proposed SSA technique, the segmental SNR is calculated from a short-length MCLT analysis and its minimum value is limited to a desired value. The experimental results have shown that the SSA algorithm with a long MCLT length attenuates the pre-echo effectively, so that data can be transmitted more reliably while preserving good audio quality. In addition, a good trade-off between audio quality and transmission performance can be achieved by adjusting only a single parameter of the SSA algorithm. If more than one microphone is available, a diversity technique, which takes advantage of transmitting duplicates through statistically independent channels, could be useful for enhancing transmission reliability. In the sixth chapter, the acoustic data transmission technique is extended to a multi-microphone scheme based on combining. In the combining-based multichannel method, synchronization and channel estimation are performed separately on each received signal, and the received signals are then linearly combined so that the SNR is increased. The most notable property of the combining-based technique is its compatibility with the acoustic data transmission system using a single microphone.
From the series of experiments, the proposed multichannel method has been found to enhance transmission performance despite the statistical dependency between the channels.
Contents:
Abstract; List of Figures; List of Tables
Chapter 1 Introduction: 1.1 Audio Data Hiding and Acoustic Data Transmission; 1.2 Previous Methods (1.2.1 Audio Watermarking Based Methods; 1.2.2 Wireless Communication Based Methods); 1.3 Performance Evaluation (1.3.1 Audio Quality; 1.3.2 Data Transmission Performance); 1.4 Outline of the Dissertation
Chapter 2 Modulated Complex Lapped Transform: 2.1 Introduction; 2.2 MCLT; 2.3 Fast Computation Algorithm; 2.4 Derivation of Interference Terms in MCLT; 2.5 Summary
Chapter 3 Acoustic Data Transmission Based on MCLT: 3.1 Introduction; 3.2 Data Embedding (3.2.1 Message Frame; 3.2.2 Synchronization Frame; 3.2.3 Data Packet Structure); 3.3 Data Extraction; 3.4 Techniques for Performance Enhancement (3.4.1 Magnitude Modification Based on Frequency Masking; 3.4.2 Clustering-based Decoding; 3.4.3 Spectral Magnitude Adjustment Algorithm); 3.5 Experimental Results (3.5.1 Comparison with Acoustic OFDM; 3.5.2 Performance Improvements by Magnitude Modification and Clustering based Decoding; 3.5.3 Performance Improvements by Spectral Magnitude Adjustment); 3.6 Summary
Chapter 4 Robust Acoustic Data Transmission against Reverberant Environments: 4.1 Introduction; 4.2 Data Embedding (4.2.1 Data Embedding; 4.2.2 MCLT Length; 4.2.3 Data Packet Structure); 4.3 Data Extraction (4.3.1 Synchronization; 4.3.2 Channel Estimation and Compensation; 4.3.3 Data Decoding); 4.4 Experimental Results (4.4.1 Robustness to Reverberation; 4.4.2 Audio Quality; 4.4.3 Robustness to Doppler Effect; 4.4.4 Robustness to Attacks); 4.5 Summary
Chapter 5 Segmental SNR Adjustment for Audio Quality Enhancement: 5.1 Introduction; 5.2 Segmental SNR Adjustment Algorithm; 5.3 Experimental Results (5.3.1 System Configurations; 5.3.2 Audio Quality Test; 5.3.3 Robustness to Attacks; 5.3.4 Transmission Performance of Recorded Signals in Indoor Environment; 5.3.5 Error correction using convolutional coding); 5.4 Summary
Chapter 6 Multichannel Acoustic Data Transmission: 6.1 Introduction; 6.2 Multichannel Techniques for Robust Data Transmission (6.2.1 Diversity Techniques for Multichannel System; 6.2.2 Combining-based Multichannel Acoustic Data Transmission); 6.3 Experimental Results (6.3.1 Room Environments; 6.3.2 Transmission Performance of Simulated Environments; 6.3.3 Transmission Performance of Recorded Signals in Reverberant Environment); 6.4 Summary
Chapter 7 Conclusions; Bibliography; Abstract in Korean
Docto
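The segmental SNR limiting described in the abstract above can be illustrated with a minimal sketch. This is not the dissertation's SSA algorithm: the abstract only states that the segmental SNR is computed from a short-length MCLT analysis and that its minimum is limited, so the version below is a simplified time-domain stand-in. The function names, the non-overlapping framing, and the choice to scale the embedding distortion per frame are all assumptions made for illustration.

```python
import math


def segmental_snr_db(original, embedded, frame_len):
    """Return the SNR in dB of each non-overlapping frame (hypothetical helper)."""
    snrs = []
    for start in range(0, len(original) - frame_len + 1, frame_len):
        ref = original[start:start + frame_len]
        emb = embedded[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        noise_energy = sum((y - x) ** 2 for x, y in zip(ref, emb))
        if noise_energy == 0.0:
            snrs.append(math.inf)       # frame carries no embedding distortion
        elif signal_energy == 0.0:
            snrs.append(-math.inf)      # distortion placed in a silent frame
        else:
            snrs.append(10.0 * math.log10(signal_energy / noise_energy))
    return snrs


def limit_min_segmental_snr(original, embedded, frame_len, min_snr_db):
    """Scale down the distortion in frames whose SNR falls below min_snr_db.

    Assumption: time-domain scaling stands in for the spectral modification
    used in the dissertation. Samples after the last full frame are left as-is.
    """
    out = list(embedded)
    snrs = segmental_snr_db(original, embedded, frame_len)
    for i, snr in enumerate(snrs):
        if snr >= min_snr_db:
            continue
        # Amplitude gain that raises the frame SNR exactly to min_snr_db.
        gain = 10.0 ** ((snr - min_snr_db) / 20.0)
        start = i * frame_len
        for n in range(start, start + frame_len):
            out[n] = original[n] + gain * (embedded[n] - original[n])
    return out


original = [1.0] * 8
embedded = [1.5] * 4 + [1.01] * 4   # first frame is heavily distorted
adjusted = limit_min_segmental_snr(original, embedded, frame_len=4, min_snr_db=20.0)
print(segmental_snr_db(original, adjusted, 4))
```

In this sketch `min_snr_db` plays the role of the single tuning parameter mentioned in the abstract: raising it protects audio quality, while lowering it leaves more embedding energy for transmission.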

    New Digital Audio Watermarking Algorithms for Copyright Protection

    This thesis investigates the development of digital audio watermarking for addressing issues such as copyright protection. Over the past two decades, many digital watermarking algorithms have been developed, each with its own advantages and disadvantages. The main aim of this thesis was to develop a new watermarking algorithm within an existing Fast Fourier Transform framework. This resulted in a Complex Spectrum Phase Evolution based watermarking algorithm. In this new implementation, the embedding positions were generated dynamically, making the watermark more difficult for an attacker to remove, and watermark information was embedded by manipulation of the spectral components in the time domain, reducing any audible distortion. Further improvements were attained when the embedding criterion was based on bin location comparison instead of magnitude, making the algorithm more robust against attacks that interfere with the spectral magnitudes. However, this new audio watermarking algorithm was found to have some disadvantages, such as a relatively low capacity and inconsistent robustness across different audio files. Therefore, a further aim of this thesis was to improve the algorithm from a different perspective. Improvements were investigated using a Singular Value Decomposition framework, wherein a novel observation was made. Furthermore, a psychoacoustic model was incorporated to suppress any audible distortion. This resulted in a watermarking algorithm that achieved a higher capacity and more consistent robustness. The overall result was that two new digital audio watermarking algorithms were developed which were complementary in their performance, thereby opening more opportunities for further research.

    Digital Watermarking for Verification of Perception-based Integrity of Audio Data

    In certain application fields, digital audio recordings contain sensitive content. Examples are historical archival material in public archives that preserve our cultural heritage, or digital evidence in the context of law enforcement and civil proceedings. Because of the powerful capabilities of modern multimedia editing tools, such material is vulnerable to doctoring of the content and forgery of its origin with malicious intent. Inadvertent data modification and mistaken origin can also be caused by human error. Hence, the credibility and provenance of such audio content, in terms of an unadulterated and genuine state, and the confidence about its origin are critical factors. To address this issue, this PhD thesis proposes a mechanism for verifying the integrity and authenticity of digital sound recordings. It is designed and implemented to be insensitive to common post-processing operations on the audio data that influence the subjective acoustic perception only marginally (if at all). Examples of such operations include lossy compression that maintains a high sound quality of the audio media, or lossless format conversions. The objective is to avoid the de facto false alarms that would be expected in standard crypto-based authentication protocols in the presence of such legitimate post-processing. To achieve this, a feasible combination of digital watermarking and audio-specific hashing is investigated. First, a suitable secret-key dependent audio hashing algorithm is developed. It incorporates and enhances so-called audio fingerprinting technology from the state of the art in content-based audio identification. The presented algorithm (denoted as the "rMAC" message authentication code) allows "perception-based" verification of integrity. This means that integrity breaches are classified as such only once they become audible.
As another objective, this rMAC is embedded and stored silently inside the audio media by means of audio watermarking technology. This approach allows the authentication code to be maintained across the above-mentioned admissible post-processing operations and made available for integrity verification at a later date. For this, an existing secret-key dependent audio watermarking algorithm is used and enhanced in this thesis work. To some extent, the dependency of the rMAC and of the watermarking processing on a secret key also allows the origin of a protected audio file to be authenticated. To elaborate on this security aspect, this work also estimates the brute-force effort required of an adversary attacking this combined rMAC-watermarking approach. The experimental results show that the proposed method distinguishes and classifies authentic versus doctored audio content well. It also allows the temporal localization of audible data modification within a protected audio file. Finally, the experimental evaluation provides recommendations about technical configuration settings of the combined watermarking-hashing approach. Beyond the main topic of perception-based data integrity and data authenticity for audio, this PhD work provides new general findings in the fields of audio fingerprinting and digital watermarking. The main contributions of this PhD were published and presented mainly at conferences about multimedia security. These publications were cited by a number of other authors and hence had some impact on their works.

    Digital audio watermarking for broadcast monitoring and content identification

    Copyright legislation was prompted exactly 300 years ago by a desire to protect authors against exploitation of their work by others. With regard to modern content owners, Digital Rights Management (DRM) issues have become very important since the advent of the Internet. Piracy, or illegal copying, costs content owners billions of dollars every year. DRM is just one tool that can assist content owners in exercising their rights. Two categories of DRM technologies have evolved in digital signal processing recently, namely digital fingerprinting and digital watermarking. One area of Copyright that is consistently overlooked in DRM developments is 'Public Performance'. The research described in this thesis analysed the administration of public performance rights within the music industry in general, with specific focus on the collective rights and broadcasting sectors in Ireland. Limitations in the administration of artists' rights were identified. The impact of these limitations on the careers of developing artists was evaluated. A digital audio watermarking scheme is proposed that would meet the requirements of both the broadcast and collective rights sectors. The goal of the scheme is to embed a standard identifier within an audio signal via modification of its spectral properties in such a way that it would be robust and perceptually transparent. Modification of the audio signal spectrum was attempted in a variety of ways. A method based on a super-resolution frequency identification technique was found to be most effective. The watermarking scheme was evaluated for robustness and found to be extremely effective in recovering embedded watermarks in music signals using a semi-blind decoding process. The final digital audio watermarking algorithm proposed facilitates the development of other applications in the domain of broadcast monitoring for the purposes of equitable royalty distribution along with additional applications and extension to other domains

    Machine Annotation of Traditional Irish Dance Music

    The work presented in this thesis is validated in experiments using 130 real-world field recordings of traditional music from sessions, classes, concerts and commercial recordings. Test audio includes solo and ensemble playing on a variety of instruments recorded in real-world settings such as noisy public sessions. Results are reported using standard measures from the field of information retrieval (IR), including accuracy, error, precision and recall, and the system is compared to alternative approaches for content-based music information retrieval (CBMIR) common in the literature.

    Proceedings of the 7th Sound and Music Computing Conference

    Proceedings of the SMC2010 - 7th Sound and Music Computing Conference, July 21st - July 24th 2010

    Unified Theory for Biorthogonal Modulated Filter Banks

    Modulated filter banks (MFBs) are practical signal decomposition tools for M-channel multirate systems. They combine high subfilter selectivity with efficient realization based on polyphase filters and block transforms. Consequently, the O(M^2) computational burden of a general filter bank (FB) is reduced to O(M log2 M), a complexity order comparable with that of FFT-like transforms. Often hidden from plain sight, these versatile digital signal processing tools play an important role in various professional and everyday applications of information and communications technology, including audiovisual communications and media storage (e.g., audio codecs for low-energy music playback in portable devices, as well as communication waveform processing and channelization). The algorithmic efficiency implies low cost, small size, and extended battery life, bringing the devices close to our skins. The main objective of this thesis is to formulate a generalized and unified approach to the MFBs, which includes, in addition to the deep theoretical background behind these banks, both their design using appropriate optimization techniques and efficient algorithmic realizations. The FBs discussed in this thesis are discrete-time time-frequency decomposition/reconstruction systems, or equivalently analysis-synthesis systems, where the subfilters are generated through modulation from either a single prototype filter or two prototype filters. The perfect reconstruction (PR) property is a particularly important characteristic of the MFBs and is the core theme of this thesis.
In the presented biorthogonal arbitrary-delay exponentially modulated filter bank (EMFB), the PR property can also be maintained for complex-valued signals. The EMFB concept is quite flexible, since it can respond to the various requirements placed on a subband processing system: low-delay PR prototype design, subfilters with symmetric impulse responses, efficient algorithms, and a definition that covers odd- and even-stacked cosine-modulated FBs as special cases. Oversampling schemes for the subsignals prove to be advantageous in subband processing problems that require phase information about the localized frequency components. In addition, the MFBs have strong connections with lapped transform (LT) theory, especially with the class of LTs grounded in parametric window functions.
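As a rough illustration of the complexity reduction quoted in this abstract, the following worked arithmetic compares the two operation-count orders for two channel counts. The values of M are arbitrary choices made here, and the constant factors hidden by the O-notation are ignored, so the ratios indicate scaling only and not actual operation counts.

```latex
\begin{align*}
M = 64:   &\quad M^2 = 4096, \quad M\log_2 M = 64 \cdot 6 = 384, \quad \frac{M^2}{M\log_2 M} = \frac{64}{6} \approx 10.7 \\
M = 1024: &\quad M^2 = 1\,048\,576, \quad M\log_2 M = 1024 \cdot 10 = 10\,240, \quad \frac{M^2}{M\log_2 M} = \frac{1024}{10} = 102.4
\end{align*}
```

The ratio equals M / log2 M, so the advantage of the modulated structure grows with the number of channels.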

    The Whitworthian 2009-2010

    The Whitworthian student newspaper, September 2009-May 2010. https://digitalcommons.whitworth.edu/whitworthian/1094/thumbnail.jp