15 research outputs found

    A Study on Audio Data Hiding Techniques for Acoustic Data Transmission Robust to Reverberant Environments

    Doctoral dissertation -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, February 2014. 김남수. In this dissertation, audio data hiding methods suitable for acoustic data transmission are studied. Acoustic data transmission refers to a technique that communicates data over short-range aerial space between a loudspeaker and a microphone. Audio data hiding refers to a technique that embeds message signals into audio such as music or speech. The audio signal with the embedded message is played back by the loudspeaker at the transmitter and recorded by the microphone at the receiver, without any additional communication devices. Data hiding methods for acoustic data transmission require higher robustness and a higher data rate than those for other applications. Among the conventional methods, the acoustic orthogonal frequency division multiplexing (AOFDM) technique was developed to provide reliable communication at a reasonable bit rate. The conventional methods, including AOFDM, are nevertheless considered deficient in transmission performance or audio quality. To overcome this limitation, the modulated complex lapped transform (MCLT) is introduced in the second chapter of the dissertation. A system using the MCLT does not produce blocking artifacts, which may degrade the quality of the resulting data-embedded audio signal. Moreover, the interference among adjacent coefficients caused by the overlap property is analyzed so that it can be exploited for data embedding and extraction. In the third chapter, a novel audio data hiding method for acoustic data transmission using the MCLT is proposed. In the proposed system, the audio signal is transformed by the MCLT and the phases of the coefficients are modified to embed the message, based on the fact that human auditory perception is more sensitive to variation in the magnitude spectra than in the phase. 
In the proposed method, the perceived quality of the data-embedded audio signal can be kept close to that of the original audio while transmitting data at several hundred bits per second (bps). The experimental results show that the audio quality and transmission performance of the proposed system are better than those of the AOFDM-based system. Moreover, several techniques were found to further improve the performance of the proposed acoustic data transmission system: magnitude modification based on a masking threshold (MM), clustering-based decoding (CLS), and spectral magnitude adjustment (SMA). In the fourth chapter, an audio data hiding technique better suited to acoustic data transmission in reverberant environments is proposed. This approach incorporates techniques widely deployed in wireless communication. First, a proper range of MCLT lengths for coping with reverberant environments is analyzed based on wireless communication theory. Second, a channel estimation technique based on the Wiener estimator is applied, in conjunction with a suitable data packet structure, to compensate for the effect of the channel. The experimental results show that an MCLT length longer than the reverberation time is robust against reverberant environments, at the cost of the quality of the data-embedded audio. They also show that the proposed method is robust against various forms of attack, such as signal processing, overwriting, and malicious removal. The most difficult problem, however, is finding a window length that satisfies both inaudible distortion and robust data transmission in reverberant environments. When the phase of the audio signal is modified, a very long time-frequency transform is highly likely to cause significant quality degradation due to the pre-echo phenomenon. 
In the fifth chapter, therefore, a segmental SNR adjustment (SSA) technique is proposed that further modifies the spectral components to attenuate the pre-echo. In the proposed SSA technique, the segmental SNR is calculated from a short-length MCLT analysis and its minimum value is limited to a desired value. The experimental results show that the SSA algorithm with a long MCLT length attenuates the pre-echo effectively, so that data can be transmitted more reliably while preserving good audio quality. In addition, a good trade-off between audio quality and transmission performance can be achieved by adjusting only a single parameter of the SSA algorithm. When more than one microphone is available, a diversity technique, which takes advantage of transmitting duplicates through statistically independent channels, could be useful for enhancing transmission reliability. In the sixth chapter, the acoustic data transmission technique is extended to a multi-microphone scheme based on combining. In the combining-based multichannel method, synchronization and channel estimation are performed separately on each received signal, and the received signals are then linearly combined so that the SNR is increased. The most notable property of the combining-based technique is its compatibility with the acoustic data transmission system that uses a single microphone. 
From the series of experiments, the proposed multichannel method has been found to be useful for enhancing transmission performance despite the statistical dependency between the channels.
Contents:
Abstract; List of Figures; List of Tables
Chapter 1 Introduction: 1.1 Audio Data Hiding and Acoustic Data Transmission; 1.2 Previous Methods (1.2.1 Audio Watermarking Based Methods; 1.2.2 Wireless Communication Based Methods); 1.3 Performance Evaluation (1.3.1 Audio Quality; 1.3.2 Data Transmission Performance); 1.4 Outline of the Dissertation
Chapter 2 Modulated Complex Lapped Transform: 2.1 Introduction; 2.2 MCLT; 2.3 Fast Computation Algorithm; 2.4 Derivation of Interference Terms in MCLT; 2.5 Summary
Chapter 3 Acoustic Data Transmission Based on MCLT: 3.1 Introduction; 3.2 Data Embedding (3.2.1 Message Frame; 3.2.2 Synchronization Frame; 3.2.3 Data Packet Structure); 3.3 Data Extraction; 3.4 Techniques for Performance Enhancement (3.4.1 Magnitude Modification Based on Frequency Masking; 3.4.2 Clustering-based Decoding; 3.4.3 Spectral Magnitude Adjustment Algorithm); 3.5 Experimental Results (3.5.1 Comparison with Acoustic OFDM; 3.5.2 Performance Improvements by Magnitude Modification and Clustering based Decoding; 3.5.3 Performance Improvements by Spectral Magnitude Adjustment); 3.6 Summary
Chapter 4 Robust Acoustic Data Transmission against Reverberant Environments: 4.1 Introduction; 4.2 Data Embedding (4.2.1 Data Embedding; 4.2.2 MCLT Length; 4.2.3 Data Packet Structure); 4.3 Data Extraction (4.3.1 Synchronization; 4.3.2 Channel Estimation and Compensation; 4.3.3 Data Decoding); 4.4 Experimental Results (4.4.1 Robustness to Reverberation; 4.4.2 Audio Quality; 4.4.3 Robustness to Doppler Effect; 4.4.4 Robustness to Attacks); 4.5 Summary
Chapter 5 Segmental SNR Adjustment for Audio Quality Enhancement: 5.1 Introduction; 5.2 Segmental SNR Adjustment Algorithm; 5.3 Experimental Results (5.3.1 System Configurations; 5.3.2 Audio Quality Test; 5.3.3 Robustness to Attacks; 5.3.4 Transmission Performance of Recorded Signals in Indoor Environment; 5.3.5 Error correction using convolutional coding); 5.4 Summary
Chapter 6 Multichannel Acoustic Data Transmission: 6.1 Introduction; 6.2 Multichannel Techniques for Robust Data Transmission (6.2.1 Diversity Techniques for Multichannel System; 6.2.2 Combining-based Multichannel Acoustic Data Transmission); 6.3 Experimental Results (6.3.1 Room Environments; 6.3.2 Transmission Performance of Simulated Environments; 6.3.3 Transmission Performance of Recorded Signals in Reverberant Environment); 6.4 Summary
Chapter 7 Conclusions
Bibliography; Abstract in Korean
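
    The idea of limiting the minimum segmental SNR, as described in the abstract above, can be illustrated with a small sketch. This is not the dissertation's SSA algorithm: it works on plain time-domain frames rather than on a short-length MCLT analysis, and the function names, the frame length, and the 20 dB floor are assumptions chosen only for illustration. In every frame whose SNR falls below the floor, the embedding distortion is scaled down until the frame sits exactly at the floor.

```python
import math


def segmental_snr_db(original, embedded, frame_len):
    """Per-frame SNR in dB of the original signal against the embedding distortion."""
    snrs = []
    for start in range(0, len(original) - frame_len + 1, frame_len):
        ref = original[start:start + frame_len]
        mod = embedded[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        noise_energy = sum((y - x) ** 2 for x, y in zip(ref, mod))
        if noise_energy == 0.0:
            snrs.append(math.inf)    # frame was not modified at all
        elif signal_energy == 0.0:
            snrs.append(-math.inf)   # distortion on top of a silent frame
        else:
            snrs.append(10.0 * math.log10(signal_energy / noise_energy))
    return snrs


def limit_min_segmental_snr(original, embedded, frame_len, floor_db):
    """Return a copy of `embedded` in which no full frame has an SNR below `floor_db`.

    Time-domain stand-in (an assumption): the distortion in an offending frame
    is multiplied by one gain so that the frame SNR equals the floor.
    """
    out = list(embedded)
    for start in range(0, len(original) - frame_len + 1, frame_len):
        ref = original[start:start + frame_len]
        mod = embedded[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        noise_energy = sum((y - x) ** 2 for x, y in zip(ref, mod))
        allowed_noise = signal_energy / (10.0 ** (floor_db / 10.0))
        if noise_energy > allowed_noise:
            gain = math.sqrt(allowed_noise / noise_energy)
            for i, (x, y) in enumerate(zip(ref, mod)):
                out[start + i] = x + gain * (y - x)
    return out
```

    The single parameter `floor_db` plays the role of the trade-off knob: raising it removes more distortion (better audio quality) and leaves less embedded signal for the receiver, lowering it does the opposite.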

    New Digital Audio Watermarking Algorithms for Copyright Protection

    This thesis investigates the development of digital audio watermarking for addressing issues such as copyright protection. Over the past two decades, many digital watermarking algorithms have been developed, each with its own advantages and disadvantages. The main aim of this thesis was to develop a new watermarking algorithm within an existing Fast Fourier Transform framework. This resulted in a watermarking algorithm based on Complex Spectrum Phase Evolution. In this new implementation, the embedding positions were generated dynamically, making the watermark more difficult for an attacker to remove, and the watermark information was embedded by manipulating the spectral components in the time domain, reducing audible distortion. Further improvements were attained when the embedding criterion was based on bin location comparison instead of magnitude, making the algorithm more robust against attacks that interfere with the spectral magnitudes. However, this new audio watermarking algorithm was found to have some disadvantages, such as a relatively low capacity and inconsistent robustness across different audio files. A further aim of this thesis was therefore to improve the algorithm from a different perspective. Improvements were investigated using a Singular Value Decomposition framework, in which a novel observation was made. Furthermore, a psychoacoustic model was incorporated to suppress audible distortion. This resulted in a watermarking algorithm that achieved a higher capacity and more consistent robustness. Overall, two new digital audio watermarking algorithms were developed that are complementary in their performance, opening more opportunities for further research

    Efficient and robust audio fingerprinting

    Single-Microphone Speech Dereverberation based on Multiple-Step Linear Predictive Inverse Filtering and Spectral Subtraction

    Single-channel speech dereverberation is the challenging problem of deconvolving the reverberation produced by the room impulse response from the speech signal when only one observation of the reverberant signal (one microphone) is available. Although mild reverberation helps in perceiving the speech (or any audio) signal, the adverse effect of reverberation, particularly at high levels, can both deteriorate the performance of automatic recognition systems and make speech less intelligible to humans. Single-microphone speech dereverberation is more challenging than multi-microphone speech dereverberation, since it does not allow spatial processing of different observations of the signal. A review of recent single-channel dereverberation techniques reveals that those based on linear predictive (LP) residual enhancement are the most promising. Spectral subtraction has also been used effectively for dereverberation, particularly when long reflections are involved. Using LP residuals and spectral subtraction as two promising tools, a new dereverberation technique is proposed. The first stage of the proposed technique consists of pre-whitening followed by delayed long-term LP filtering, in which the kurtosis or skewness of the LP residuals is maximized to control the weight updates of the inverse filter. The second stage consists of nonlinear spectral subtraction. The proposed two-stage scheme leads to two separate algorithms, depending on whether kurtosis or skewness maximization is used to establish a feedback function for the weight updates of the adaptive inverse filter. It is shown that the proposed algorithms have several advantages over the existing major single-microphone methods, including a reduction in both early and late reverberation, speech enhancement even at very high reverberation times, robustness to additive background noise, and the introduction of only a few minor artifacts. 
Room impulse responses equalized by the proposed algorithms have shorter reverberation times, which means that inverse filtering by the proposed algorithms is more successful in dereverberating the speech signal. For short, medium, and high reverberation times, the signal-to-reverberation ratio of the proposed technique is significantly higher than that of the existing major algorithms. The waveforms and spectrograms of the inverse-filtered and fully processed signals indicate the superiority of the proposed algorithms. Assessment of the overall quality of the processed speech signals by automatic speech recognition and the perceptual evaluation of speech quality test also confirms that the proposed technique yields higher scores in most cases; where it does not, the difference is less significant than in the other aspects of the performance evaluation. Finally, the robustness of the proposed algorithms to background noise is investigated and compared with that of the benchmark algorithms, showing that the proposed algorithms maintain rather stable performance for contaminated speech signals at SNR levels as low as 0 dB
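
    The statistics that drive the inverse filter in the abstract above, kurtosis and skewness, can be computed as in the following sketch. It only shows the two measures and a toy comparison; it does not implement the adaptive inverse filter or the spectral subtraction stage. The signals `dry` and `echo` are made-up illustrations, and the assumption behind the comparison is the usual motivation for such methods: a peaky, sparse residual has a higher kurtosis than the same residual smeared by echoes.

```python
def central_moment(samples, order):
    """Central moment of the given order (population form, divides by N)."""
    mean = sum(samples) / len(samples)
    return sum((x - mean) ** order for x in samples) / len(samples)


def skewness(samples):
    """Third standardized moment."""
    return central_moment(samples, 3) / central_moment(samples, 2) ** 1.5


def kurtosis(samples):
    """Fourth standardized moment (not excess kurtosis; a Gaussian gives about 3)."""
    return central_moment(samples, 4) / central_moment(samples, 2) ** 2


def convolve(signal, impulse_response):
    """Direct-form convolution, truncated to the length of `signal`."""
    out = [0.0] * len(signal)
    for n in range(len(signal)):
        for k, h in enumerate(impulse_response):
            if n - k >= 0:
                out[n] += h * signal[n - k]
    return out


# Toy example (hypothetical data): one sharp pulse, then the same pulse
# smeared by a short decaying echo pattern.
dry = [1.0] + [0.0] * 15
echo = [1.0, 0.8, 0.6, 0.4]
reverberant = convolve(dry, echo)
```

    Here `kurtosis(dry)` is larger than `kurtosis(reverberant)`, which is the direction an inverse filter would try to push the residual back in when it maximizes kurtosis.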

    A Study on Acoustic Communication Techniques Using Chirp Signals

    Doctoral dissertation -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, August 2014. 최성현. Today's smart devices, such as smartphones and tablet/wearable PCs, are equipped with a voice user interface (UI) to support intuitive command input from users. The speakers and microphones of the voice UI are generally used to play and record human voice and/or environmental sound, respectively. Accordingly, various aerial acoustic communication techniques have been introduced to use the voice UI as an additional communication interface beyond WiFi and/or Bluetooth. Smart devices are especially suitable for aerial acoustic communication, since the application processor (AP) of a smart device can process the sound to embed information in it or extract information from it. That is, smart devices work similarly to a software-defined radio platform. Aerial acoustic communication is also very versatile, as any audio interface can be used as a communication interface. In this dissertation, we propose an aerial acoustic communication technique using an inaudible chirp signal, as well as a corresponding receiver architecture for smart devices. We additionally introduce applications of the proposed communication technique in indoor environments. We begin the receiver design by measuring the characteristics of the indoor acoustic channel, which is composed of the speaker, the air medium, and the microphone. Our experiments reveal that the indoor acoustic channel typically has a long delay spread (approximately 40 msec) and is very frequency-selective due to the frequency response of the audio interfaces. We also show that legacy physical layer (PHY) modulation schemes such as phase/frequency shift keying (PSK/FSK) are likely to fail in this indoor acoustic channel, especially in long-range communication scenarios, mainly because of the instability of the local oscillator and the frequency selectivity of the audio interfaces. To resolve these problems, we use chirp signals for aerial acoustic communication. 
The proposed acoustic receiver supports long-range communication independent of the device characteristics over the severely frequency-selective acoustic channel with large delay spread. A chirp signal has a time-varying frequency with a specific frequency sweep rate. Chirp signals have been widely used in radar applications because of their capability to resolve multi-path propagation. However, this dissertation is the first study to adopt chirp signals in aerial acoustic communication for smart devices. The proposed receiver architecture for chirp binary orthogonal keying (BOK) can be easily implemented via the fast Fourier transform (FFT) in the application layer of smart devices. Through extensive experiments, we verify that the proposed chirp signal can deliver data at 16 bps over distances of up to 25 m in typical indoor environments, a drastic extension over the few meters reported in previous research. A data rate of 16 bps is enough to deliver a short identification (ID) in indoor environments. Example applications of this short ID are multimedia content recognition and indoor location tracking. The low data rate, however, might be a hurdle to applying the proposed system to services that require a high data rate. We design a backend server architecture to compensate for the low data rate and widen the range of applications of the proposed receiver. The smart devices can send queries to the backend server for additional information related to the received ID. We also propose an energy-efficient recording and processing method for acoustic signal detection. Note that a smart device would consume a huge amount of energy if it sensed the acoustic signal continuously for 24 hours. The smart devices instead control the sensing (i.e., recording) timing so that it is activated only when a chirp signal is present. This can drastically extend battery lifetime by removing unnecessary signal processing. 
We also present two application examples of the proposed receiver, namely, (1) TV content recognition, and (2) indoor location tracking, including technical discussions on their implementations. Experiments and field tests validate the feasibility of the proposed aerial acoustic communication in practical environments.
Contents:
1 Introduction: 1.1 Acoustic communication (1.1.1 Underwater acoustic communication; 1.1.2 Aerial acoustic communication); 1.2 Overview of Existing Approaches (1.2.1 Indoor Location Tracking; 1.2.2 Data Communication using Acoustic Signal; 1.2.3 Commercial Services; 1.2.4 Limitations of Previous Work); 1.3 Main Contributions (1.3.1 Acoustic Channel and PHY Analysis; 1.3.2 Receiver Design for Acoustic Chirp BOK; 1.3.3 Applications of Chirp BOK Receiver); 1.4 Organization of the Dissertation
2 Acoustic Channel and PHY Analysis: 2.1 Introduction; 2.2 Characteristics of Indoor Acoustic Channel (2.2.1 Hearing Threshold of Human; 2.2.2 Frequency Response of Various Audio Interfaces; 2.2.3 Delay Spread of Acoustic Channel); 2.3 Revisit of Existing Modulation Schemes (2.3.1 Case Study: Phase Shift Keying; 2.3.2 Case Study: Frequency Shift Keying; 2.3.3 Chirp Binary Orthogonal Keying (BOK)); 2.4 Performance Evaluation of PHY Modulation Schemes (2.4.1 Experimental Environment; 2.4.2 PSK Demodulator; 2.4.3 FSK Demodulator; 2.4.4 BER of PHY Modulation Schemes); 2.5 Summary
3 Receiver Design for Acoustic Chirp BOK: 3.1 Introduction; 3.2 Chirp Signals and Matched Filter Receiver (3.2.1 Notation of Chirp Signals; 3.2.2 Matched Filter and FFT; 3.2.3 Envelope Detection of Chirp Auto Correlation); 3.3 System Design and Receiver Architecture (3.3.1 Frame and Symbol Design; 3.3.2 Signal Reception Process; 3.3.3 Receiver Architecture; 3.3.4 Symbol combining for BER enhancement); 3.4 Performance Evaluation of Chirp BOK Receiver (3.4.1 Experimental Environment; 3.4.2 Transmission Range in Indoor Environment; 3.4.3 Multi-path Resolution Capability of Chirp Signal; 3.4.4 Symbol Sampling and Doppler Shift; 3.4.5 Selective combining); 3.5 Summary
4 Applications of Chirp BOK Receiver: 4.1 Introduction; 4.2 Backend Server Architecture (4.2.1 Implementation of Backend Server; 4.2.2 Operation of Backend Server); 4.3 Low Power Operation for Smart Devices (4.3.1 Reception Process of Chirp BOK receiver; 4.3.2 Revisit of Signal Detection in Wireless Communications; 4.3.3 Chirp Signal Detection using PSD; 4.3.4 Performance Evaluation of Signal Detection Algorithm); 4.4 Applications of Chirp BOK Receiver and Feasibility Test (4.4.1 TV Content Recognition; 4.4.2 Indoor Location Tracking in Seoul Subway; 4.4.3 Device to Device Communication); 4.5 Summary
5 Conclusion and Future Work: 5.1 Research Contributions; 5.2 Future Work and Concluding Remark
Abstract (In Korean)
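
    Chirp binary orthogonal keying, as named in the abstract above, can be sketched as follows: one bit value is sent as an up-chirp, the other as a down-chirp, and the receiver picks whichever reference chirp correlates more strongly with the received symbol. This is a minimal illustration, not the dissertation's receiver. It correlates directly in the time domain instead of using the FFT, it assumes the receiver already knows the symbol boundaries, and the sample rate, band, and symbol duration are hypothetical values that are not taken from the dissertation.

```python
import math

# Hypothetical parameters, chosen only for illustration.
SAMPLE_RATE = 48000      # samples per second
F_LOW = 18000.0          # lower edge of the sweep in Hz
F_HIGH = 20000.0         # upper edge of the sweep in Hz
SYMBOL_DURATION = 0.02   # seconds per symbol


def chirp(f_start, f_end, duration, sample_rate):
    """Linear chirp whose frequency sweeps from f_start to f_end."""
    n = int(round(duration * sample_rate))
    rate = (f_end - f_start) / duration   # sweep rate in Hz per second
    samples = []
    for i in range(n):
        t = i / sample_rate
        samples.append(math.cos(2.0 * math.pi * (f_start * t + 0.5 * rate * t * t)))
    return samples


UP = chirp(F_LOW, F_HIGH, SYMBOL_DURATION, SAMPLE_RATE)     # represents bit 1
DOWN = chirp(F_HIGH, F_LOW, SYMBOL_DURATION, SAMPLE_RATE)   # represents bit 0


def modulate(bits):
    """Concatenate one up-chirp or down-chirp per bit."""
    signal = []
    for bit in bits:
        signal.extend(UP if bit else DOWN)
    return signal


def demodulate(signal):
    """Decide each symbol by comparing its correlation with both reference chirps."""
    n = len(UP)
    bits = []
    for start in range(0, len(signal) - n + 1, n):
        symbol = signal[start:start + n]
        corr_up = abs(sum(s * c for s, c in zip(symbol, UP)))
        corr_down = abs(sum(s * c for s, c in zip(symbol, DOWN)))
        bits.append(1 if corr_up > corr_down else 0)
    return bits
```

    For example, `demodulate(modulate([1, 0, 1]))` recovers `[1, 0, 1]`. A practical receiver would also have to find the symbol timing and cope with echoes, which is where the matched filter, envelope detection, and symbol combining listed in the contents come in.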

    31st International Conference on Information Modelling and Knowledge Bases

    Information modelling is becoming an increasingly important topic for researchers, designers, and users of information systems. The amount and complexity of information itself, the number of abstraction levels of information, and the size of databases and knowledge bases are continuously growing. Conceptual modelling is one of the sub-areas of information modelling. The aim of this conference is to bring together experts from different areas of computer science and other disciplines who have a common interest in understanding and solving problems in information modelling and knowledge bases, as well as in applying the results of research to practice. We also aim to recognize and study new areas of modelling and knowledge bases to which more attention should be paid. Philosophy and logic, cognitive science, knowledge management, linguistics, and management science are therefore relevant areas, too. The conference will have three categories of presentations: full papers, short papers, and position papers

    Investigation of the Jet Noise Prediction Theory and Application Utilizing the PAO Formulation

    Application of the Phillips theory to engineering calculations of rocket and high speed jet noise radiation is reported. Presented are a detailed derivation of the theory, the composition of the numerical scheme, and discussions of the practical problems arising in the application of the present noise prediction method. The present method still contains some empirical elements, yet it provides a unified approach in the prediction of sound power, spectrum, and directivity

    Space transportation system and associated payloads: Glossary, acronyms, and abbreviations

    A collection of some of the acronyms and abbreviations now in everyday use in the shuttle world is presented. It is a combination of lists that were prepared at Marshall Space Flight Center and Kennedy and Johnson Space Centers, places where intensive shuttle activities are being carried out. This list is intended as a guide or reference and should not be considered to have the status and sanction of a dictionary

    Space Transportation System and associated payloads: Glossary, acronyms, and abbreviations

    A collection of acronyms in everyday use concerning shuttle activities is presented. A glossary of terms pertaining to the Space Transportation System is included