Search CORE

708 research outputs found

A modulation property of time-frequency derivatives of filtered phase and its application to aperiodicity and fo estimation

Author: Banno Hideki
Kawahara Hideki
Morise Masanori
Sakakibara Ken-Ichi
Toda Tomoki
Publication venue: 'International Speech Communication Association'
Publication date: 09/06/2017
Field of study

We introduce a simple and linear SNR (strictly speaking, periodic to random power ratio) estimator (0dB to 80dB without additional calibration/linearization) for providing reliable descriptions of aperiodicity in speech corpus. The main idea of this method is to estimate the background random noise level without directly extracting the background noise. The proposed method is applicable to a wide variety of time windowing functions with very low sidelobe levels. The estimate combines the frequency derivative and the time-frequency derivative of the mapping from filter center frequency to the output instantaneous frequency. This procedure can replace the periodicity detection and aperiodicity estimation subsystems of recently introduced open source vocoder, YANG vocoder. Source code of MATLAB implementation of this method will also be open sourced.Comment: 8 pages 9 figures, Submitted and accepted in Interspeech201

arXiv.org e-Print Archive

Crossref

New Digital Audio Watermarking Algorithms for Copyright Protection

Author: Wang Jian
Publication venue
Publication date: 01/09/2011
Field of study

This thesis investigates the development of digital audio watermarking in addressing issues such as copyright protection. Over the past two decades, many digital watermarking algorithms have been developed, each with its own advantages and disadvantages. The main aim of this thesis was to develop a new watermarking algorithm within an existing Fast Fourier Transform framework. This resulted in the development of a Complex Spectrum Phase Evolution based watermarking algorithm. In this new implementation, the embedding positions were generated dynamically thereby rendering it more difficult for an attacker to remove, and watermark information was embedded by manipulation of the spectral components in the time domain thereby reducing any audible distortion. Further improvements were attained when the embedding criteria was based on bin location comparison instead of magnitude, thereby rendering it more robust against those attacks that interfere with the spectral magnitudes. However, it was discovered that this new audio watermarking algorithm has some disadvantages such as a relatively low capacity and a non-consistent robustness for different audio files. Therefore, a further aim of this thesis was to improve the algorithm from a different perspective. Improvements were investigated using an Singular Value Decomposition framework wherein a novel observation was discovered. Furthermore, a psychoacoustic model was incorporated to suppress any audible distortion. This resulted in a watermarking algorithm which achieved a higher capacity and a more consistent robustness. The overall result was that two new digital audio watermarking algorithms were developed which were complementary in their performance thereby opening more opportunities for further research

MURAL - Maynooth University Research Archive Library

An overview of nearly a half century of microembolic signal processing techniques

Author: Charara Jamal
Geryes Maroun
Girault Jean Marc
Ménigot Sébastien
Publication venue: HAL CCSD
Publication date: 17/10/2019
Field of study

International audienceMicro-emboli detection for patients with high risk of strokes has been performed with transcranial Doppler (TCD) systems since 1969. As a consequence, instrumentation of TCD systems progressed with the introduction of multigate systems, power mode systems and robotized probes, to name but a few. These new types of TCD have increase the chance of robust detections of quite big micro-emboli and at the same time increased the efficiency of artefact rejection.For a couple of years now, it is now possible to prevent cerebrovascular accidents (CVA) by detecting very small micro-emboli, the latter being precursor signs of strokes. For this sake, a new generation of transcranial Doppler (TCD) systems (holter) is used to record examinations of long duration. In an attempt to detect the smallest possible micro-emboli, offline softwares based on recent signal processing techniques complete advantageously these holter systems.In this communication, an overview of fifty years of research developments in embolic signal processing is proposed. What is interesting during this adventure of a half century is that detection methods were inspired as signal processing discoveries coming from speech processing to econometric. With the advent of the artificial intelligence, new challenges are being drawn up

Hybrid Multiresolution Analysis Of ‘Punch’ In Musical Signals

Author: Fenton Steven
Lee Hyunkook
Wakefield Jonathan P.
Publication venue
Publication date: 01/01/2015
Field of study

This paper presents a hybrid multi-resolution technique for the extraction and measurement of attributes contained within a musical signal. Decomposing music into simpler percussive, harmonic and noise components is useful when detailed extraction of signal attributes is required. The key parameter of interest in this paper is that of punch. A methodology is explored that decomposes the musical signal using a critically sampled constant-Q filterbank of quadrature mirror filters (QMF) before adaptive windowed short term Fourier transforms (STFT). The proposed hybrid method offers accuracy in both the time and frequency domains. Following the decomposition transform process, attributes are analyzed. It is shown that analysis of these components may yield parameters that would be of use in both mixing/mastering and also audio transcription and retrieval

University of Huddersfield Repository

Huddersfield Research Portal

Hardware Implementation of Audio Watermarking Based on DWT Transform

Author: Joshi Amit M.
Publication venue: 'IntechOpen'
Publication date: 27/11/2019
Field of study

Presently, the duplicate copy of an audio can be generated with great ease using some smart devices, and transmitted over the internet which raises concern over copyright and privacy. Digital audio watermarking is a procedure to insert some data bits known as watermark into audio signal. Then the audio with watermark is to be transmitted to end user or made public. The proposed algorithm is used to insert a binary watermark image into a detailed coefficient of the Daubechies 9/7-based DWT transform. A watermark is dispersed consistently in low frequencies, which builds the robustness and inaudibility of the watermark data. Further, the watermark is embedded into an audio signal to have robust system against audio attacks and inaudible performance. The algorithm is verified using MATLAB and subsequently implemented on FPGA hardware to verify the real-time performance. Hardware implementation helps to embed the watermark at the same instance when audio is being captured. The results show promising application for real-time audio applications

IntechOpen

Crossref

Frequency-warped autoregressive modeling and filtering

Author: Härmä Aki
Publication venue: Teknillinen korkeakoulu
Publication date: 25/05/2001
Field of study

This thesis consists of an introduction and nine articles. The articles are related to the application of frequency-warping techniques to audio signal processing, and in particular, predictive coding of wideband audio signals. The introduction reviews the literature and summarizes the results of the articles. Frequency-warping, or simply warping techniques are based on a modification of a conventional signal processing system so that the inherent frequency representation in the system is changed. It is demonstrated that this may be done for basically all traditional signal processing algorithms. In audio applications it is beneficial to modify the system so that the new frequency representation is close to that of human hearing. One of the articles is a tutorial paper on the use of warping techniques in audio applications. Majority of the articles studies warped linear prediction, WLP, and its use in wideband audio coding. It is proposed that warped linear prediction would be particularly attractive method for low-delay wideband audio coding. Warping techniques are also applied to various modifications of classical linear predictive coding techniques. This was made possible partly by the introduction of a class of new implementation techniques for recursive filters in one of the articles. The proposed implementation algorithm for recursive filters having delay-free loops is a generic technique. This inspired to write an article which introduces a generalized warped linear predictive coding scheme. One example of the generalized approach is a linear predictive algorithm using almost logarithmic frequency representation.reviewe

Maastricht University Research Portal

Aaltodoc Publication Archive

Statistical Properties and Applications of Empirical Mode Decomposition

Author: Al-Badrawi Mahdi Hameed
Publication venue: University of New Hampshire Scholars\u27 Repository
Publication date: 01/01/2017
Field of study

Signal analysis is key to extracting information buried in noise. The decomposition of signal is a data analysis tool for determining the underlying physical components of a processed data set. However, conventional signal decomposition approaches such as wavelet analysis, Wagner-Ville, and various short-time Fourier spectrograms are inadequate to process real world signals. Moreover, most of the given techniques require \emph{a prior} knowledge of the processed signal, to select the proper decomposition basis, which makes them improper for a wide range of practical applications. Empirical Mode Decomposition (EMD) is a non-parametric and adaptive basis driver that is capable of breaking-down non-linear, non-stationary signals into an intrinsic and finite components called Intrinsic Mode Functions (IMF). In addition, EMD approximates a dyadic filter that isolates high frequency components, e.g. noise, in higher index IMFs. Despite of being widely used in different applications, EMD is an ad hoc solution. The adaptive performance of EMD comes at the expense of formulating a theoretical base. Therefore, numerical analysis is usually adopted in literature to interpret the behavior. This dissertation involves investigating statistical properties of EMD and utilizing the outcome to enhance the performance of signal de-noising and spectrum sensing systems. The novel contributions can be broadly summarized in three categories: a statistical analysis of the probability distributions of the IMFs and a suggestion of Generalized Gaussian distribution (GGD) as a best fit distribution; a de-noising scheme based on a null-hypothesis of IMFs utilizing the unique filter behavior of EMD; and a novel noise estimation approach that is used to shift semi-blind spectrum sensing techniques into fully-blind ones based on the first IMF. These contributions are justified statistically and analytically and include comparison with other state of art techniques

UNH Scholars' Repository

Signal processing techniques for mobile multimedia systems

Author: Athanasiadis T
Publication venue: RMIT University
Publication date: 01/01/2007
Field of study

Recent trends in wireless communication systems show a significant demand for the delivery of multimedia services and applications over mobile networks - mobile multimedia - like video telephony, multimedia messaging, mobile gaming, interactive and streaming video, etc. However, despite the ongoing development of key communication technologies that support these applications, the communication resources and bandwidth available to wireless/mobile radio systems are often severely limited. It is well known, that these bottlenecks are inherently due to the processing capabilities of mobile transmission systems, and the time-varying nature of wireless channel conditions and propagation environments. Therefore, new ways of processing and transmitting multimedia data over mobile radio channels have become essential which is the principal focus of this thesis. In this work, the performance and suitability of various signal processing techniques and transmission strategies in the application of multimedia data over wireless/mobile radio links are investigated. The proposed transmission systems for multimedia communication employ different data encoding schemes which include source coding in the wavelet domain, transmit diversity coding (space-time coding), and adaptive antenna beamforming (eigenbeamforming). By integrating these techniques into a robust communication system, the quality (SNR, etc) of multimedia signals received on mobile devices is maximised while mitigating the fast fading and multi-path effects of mobile channels. To support the transmission of high data-rate multimedia applications, a well known multi-carrier transmission technology known as Orthogonal Frequency Division Multiplexing (OFDM) has been implemented. As shown in this study, this results in significant performance gains when combined with other signal-processing techniques such as spa ce-time block coding (STBC). To optimise signal transmission, a novel unequal adaptive modulation scheme for the communication of multimedia data over MIMO-OFDM systems has been proposed. In this system, discrete wavelet transform/subband coding is used to compress data into their respective low-frequency and high-frequency components. Unlike traditional methods, however, data representing the low-frequency data are processed and modulated separately as they are more sensitive to the distortion effects of mobile radio channels. To make use of a desirable subchannel state, such that the quality (SNR) of the multimedia data recovered at the receiver is optimized, we employ a lookup matrix-adaptive bit and power allocation (LM-ABPA) algorithm. Apart from improving the spectral efficiency of OFDM, the modified LM-ABPA scheme, sorts and allocates subcarriers with the highest SNR to low-frequency data and the remaining to the least important data. To maintain a target system SNR, the LM-ABPA loading scheme assigns appropriate signal constella tion sizes and transmit power levels (modulation type) across all subcarriers and is adapted to the varying channel conditions such that the average system error-rate (SER/BER) is minimised. When configured for a constant data-rate load, simulation results show significant performance gains over non-adaptive systems. In addition to the above studies, the simulation framework developed in this work is applied to investigate the performance of other signal processing techniques for multimedia communication such as blind channel equalization, and to examine the effectiveness of a secure communication system based on a logistic chaotic generator (LCG) for chaos shift-keying (CSK)

RMIT Research Repository