59 research outputs found

    Improved time-frequency de-noising of acoustic signals for underwater detection system

    The capability to communicate and perform target localization efficiently in an underwater environment is important in many applications. Sound waves are more suitable than electromagnetic waves for underwater communication and target localization because electromagnetic waves are strongly attenuated in water. Sound waves, however, are subject to underwater acoustic noise (UWAN), which is either man-made or natural. Optimum signal detection in UWAN can be achieved with knowledge of the noise statistics. The assumption of additive white Gaussian noise (AWGN) allows the use of a linear correlation (LC) detector; however, the non-Gaussian nature of UWAN results in poor performance of such a detector. This research presents an empirical model of the characteristics of UWAN in shallow waters. Data were measured in Tanjung Balau, Johor, Malaysia on 5 November 2013, and the analysis showed that the UWAN has a non-Gaussian distribution with characteristics similar to 1/f noise. A complete detection system based on the noise models is proposed, consisting of a broadband hydrophone, a time-frequency distribution, a de-noising method, and detection. In this research, the S-transform and the wavelet transform were used to generate the time-frequency representation before soft thresholding with a modified universal threshold estimate was applied. A Gaussian noise injection detector (GNID) was used to overcome the non-Gaussianity of the UWAN, and its performance was compared with other nonlinear detectors, namely the locally optimal (LO) detector, the sign correlation (SC) detector, and the more conventional matched filter (MF) detector. The system was evaluated on two types of signals: fixed-frequency and linear frequency modulated signals. For de-noising, the S-transform outperformed the wavelet transform in terms of signal-to-noise ratio and root-mean-square error by 4 dB and 3 dB, respectively. The detectors were evaluated based on the energy-to-noise ratio (ENR) required to achieve a detection probability of 90% at a false alarm probability of 0.01. For the time-varying signal, the ENR of the GNID using S-transform de-noising, the LO detector, the SC detector, and the MF detector were 8.89 dB, 10.66 dB, 12.7 dB, and 12.5 dB, respectively. Among the four detectors, the proposed GNID achieved the best performance, whereas the LC detector showed the weakest performance in the presence of UWAN.
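    The de-noising stage pairs a time-frequency representation with soft thresholding driven by a modified universal threshold. As a rough illustration of the underlying idea only (the classical universal threshold on wavelet coefficients, not the thesis' modified estimator or its S-transform variant), the following sketch uses PyWavelets; the wavelet choice, decomposition depth, and test signal are assumptions.

```python
import numpy as np
import pywt

def universal_soft_denoise(x, wavelet="db8", level=4):
    """Soft-threshold wavelet de-noising with the classical universal threshold.

    A generic VisuShrink-style sketch, not the modified threshold estimator
    described in the thesis.
    """
    coeffs = pywt.wavedec(x, wavelet, level=level)
    # Robust noise estimate from the finest detail coefficients.
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thr = sigma * np.sqrt(2.0 * np.log(len(x)))  # universal threshold
    denoised = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(denoised, wavelet)[: len(x)]

# Example: a fixed-frequency pulse buried in heavy (here Gaussian) noise.
fs = 8000
t = np.arange(0, 1.0, 1.0 / fs)
clean = np.sin(2 * np.pi * 440 * t) * ((t > 0.4) & (t < 0.6))
noisy = clean + 0.8 * np.random.randn(len(t))
recovered = universal_soft_denoise(noisy)
```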

    Efficient Multiband Algorithms for Blind Source Separation

    The problem of blind source separation refers to recovering the original signals, called source signals, from mixed signals, called observation signals, in a reverberant environment. The mixture is a function of a sequence of original speech signals mixed in a reverberant room. The objective is to separate the mixed signals and obtain the original signals without degradation and without prior information about the features of the sources. The strategy used to achieve this objective is to use multiple bands that work at a lower rate, with lower computational cost and quicker convergence than the conventional scheme. Our motivation is the competitive convergence speed reported for unequal-passbands schemes. The objective of this research is to improve unequal-passbands schemes by increasing the speed of convergence and reducing the computational cost. The first proposed work is a novel maximally decimated unequal-passbands scheme. This scheme uses multiple bands that allow it to work at a reduced sampling rate and with low computational cost. An adaptation approach is derived with an adaptation step that improves the convergence speed. The performance of the proposed scheme was measured in several ways. First, the mean square errors of the various bands were measured and the results compared to a maximally decimated equal-passbands scheme, which is currently the best performing method; the results show that the proposed scheme has a faster convergence rate. Second, when the scheme is tested with white and coloured inputs using a small number of bands, it does not yield good results, but when the number of bands is increased, the speed of convergence improves. Third, the scheme is tested under quick changes, and its performance is shown to be similar to that of the equal-passbands scheme. Fourth, the scheme is tested in a stationary state, and the experimental results confirm the theoretical work. For more challenging scenarios, an unequal-passbands scheme with over-sampled decimation is proposed; the greater the number of bands, the more efficient the separation. The results are compared to the currently best performing method, and an experimental comparison is also made between the proposed multiband scheme and the conventional scheme. The results show that the convergence speed and the signal-to-interference ratio of the proposed scheme are higher than those of the conventional scheme, while its computational cost is lower.
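    The scheme's efficiency comes from splitting the fullband problem into decimated subbands, each adapted at a lower rate. The sketch below shows only that analysis/decimation stage with equal-width bands (the thesis uses unequal passbands, and the adaptive separation itself is omitted); the filter length, band count, and sampling rate are assumptions.

```python
import numpy as np
from scipy.signal import firwin, lfilter

def analysis_bank(x, n_bands=4, fs=16000, numtaps=129):
    """Split a signal into critically decimated subbands (equal-width bands).

    The thesis proposes unequal passbands; equal bands and plain downsampling
    are used here only to keep the sketch short. In a maximally decimated bank
    the aliasing left by downsampling is handled by the overall
    analysis/synthesis design, which is not shown.
    """
    edges = np.linspace(0, fs / 2, n_bands + 1)
    subbands = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        if lo == 0:
            h = firwin(numtaps, hi, fs=fs)                   # lowpass for the first band
        elif hi >= fs / 2:
            h = firwin(numtaps, lo, fs=fs, pass_zero=False)  # highpass for the last band
        else:
            h = firwin(numtaps, [lo, hi], fs=fs, pass_zero=False)  # bandpass
        band = lfilter(h, 1.0, x)
        subbands.append(band[::n_bands])                     # critical decimation
    return subbands

# Each decimated subband can now be adapted independently at 1/n_bands of the rate.
x = np.random.randn(16000)
bands = analysis_bank(x)
```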

    Improving Maternal and Fetal Cardiac Monitoring Using Artificial Intelligence

    Early diagnosis of possible risks in the physiological status of the fetus and mother during pregnancy and delivery is critical and can reduce mortality and morbidity. For example, early detection of life-threatening congenital heart disease may increase the survival rate and reduce morbidity while allowing parents to make informed decisions. To study cardiac function, a variety of signals must be collected. In practice, several heart monitoring methods, such as electrocardiogram (ECG) and photoplethysmography (PPG), are commonly performed. Although there are several methods for monitoring fetal and maternal health, research is currently underway to enhance the mobility, accuracy, automation, and noise resistance of these methods so that they can be used widely, even at home. Artificial Intelligence (AI) can help to design a precise and convenient monitoring system. To achieve these goals, the following objectives are defined in this research. The first step for a signal acquisition system is to obtain high-quality signals. As the first objective, a signal processing scheme is explored to improve the signal-to-noise ratio (SNR) of signals and extract the desired signal from a noisy one with negative SNR (i.e., the noise power is greater than the signal power). ECG and PPG signals are sensitive to noise from a variety of sources, increasing the risk of misinterpretation and interfering with the diagnostic process. The noise typically arises from power line interference, white noise, electrode contact noise, muscle contraction, baseline wander, instrument noise, motion artifacts, and electrosurgical noise. Even a slight variation in the obtained ECG waveform can impair the understanding of the patient's heart condition and affect the treatment procedure. Recent solutions, such as adaptive and blind source separation (BSS) algorithms, still have drawbacks, such as the need for a noise or desired-signal model, tuning and calibration, and inefficiency when dealing with excessively noisy signals. Therefore, the goal of this step is to develop a robust algorithm that can estimate noise, even when the SNR is negative, using the BSS method and remove it with an adaptive filter. The second objective concerns monitoring maternal and fetal ECG. Previous non-invasive methods used the maternal abdominal ECG (MECG) to extract the fetal ECG (FECG). These methods need to be calibrated to generalize well: for each new subject, calibration with a trusted device is required, which is difficult, time-consuming, and prone to errors. We explore deep learning (DL) models for domain mapping, such as Cycle-Consistent Adversarial Networks (CycleGAN), to map MECG to FECG and vice versa. The advantage of the proposed DL method over state-of-the-art approaches, such as adaptive filters or blind source separation, is that it generalizes well to unseen subjects, does not need calibration, is not sensitive to the heart rate variability of the mother and fetus, and can handle low-SNR conditions. Thirdly, an AI-based system that can measure continuous systolic blood pressure (SBP) and diastolic blood pressure (DBP) with minimal sensor requirements is explored. The most common method of measuring blood pressure uses cuff-based equipment, which cannot monitor blood pressure continuously, requires calibration, and is difficult to use.
Other solutions use a synchronized ECG and PPG combination, which is still inconvenient and challenging to synchronize. The proposed method overcomes those issues and, unlike other solutions, uses only the PPG signal. Using only PPG for blood pressure estimation is more convenient since it requires a single sensor on the finger, whose acquisition is more resilient to movement-induced errors. The fourth objective is to detect anomalies in FECG data. The requirement of thousands of manually annotated samples is a concern for state-of-the-art detection systems, especially for FECG, where few publicly available datasets are annotated for each FECG beat. Therefore, we utilize active learning and transfer learning to train an FECG anomaly detection system with the fewest training samples and high accuracy. In this part, a model is first trained to detect ECG anomalies in adults and is later adapted to detect anomalies in FECG. We select only the more influential samples from the training set, which allows training with the least effort. Because of physician shortages and rural geography, remote monitoring might improve pregnant women's access to prenatal care, especially where that access is limited. Increased compliance with prenatal treatment and linked care among various providers are two possible benefits of remote monitoring. If the recorded signals are transmitted correctly, maternal and fetal remote monitoring can be effective. Therefore, the last objective is to design a compression algorithm that can compress signals (such as ECG) at a higher ratio than the state of the art and perform fast decompression without distortion. The proposed compression is fast thanks to its time-domain B-spline approach, and the compressed data can be used for visualization and monitoring without decompression owing to B-spline properties. Moreover, a stochastic optimization is designed to retain signal quality, so that the signal is not distorted for diagnostic purposes while a high compression ratio is achieved. In summary, the components for creating an end-to-end system for day-to-day maternal and fetal cardiac monitoring can be envisioned as a combination of all the tasks listed above. PPG and ECG recorded from the mother can be denoised using the deconvolution strategy. Compression can then be employed to transmit the signals. The trained CycleGAN model can be used to extract FECG from MECG, and the model trained with active transfer learning can detect anomalies in both MECG and FECG. Simultaneously, maternal blood pressure is retrieved from the PPG signal. This information can be used to monitor the cardiac status of mother and fetus and to fill in reports such as the partogram.
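    Of the components above, the B-spline compression step is the most self-contained to illustrate. The sketch below fits a smoothing cubic B-spline and keeps only its knots and coefficients, which conveys the general idea but is not the thesis' method: the stochastic optimization of the quality/ratio trade-off is not reproduced, and the smoothing factor, sampling rate, and test signal are assumptions.

```python
import numpy as np
from scipy.interpolate import splrep, splev

def bspline_compress(signal, fs, smooth):
    """Fit a smoothing cubic B-spline and keep only knots + coefficients.

    'smooth' trades compression ratio against fidelity; the thesis tunes this
    trade-off with a stochastic optimization, which is not reproduced here.
    """
    t = np.arange(len(signal)) / fs
    tck = splrep(t, signal, s=smooth, k=3)     # (knots, coefficients, degree)
    return tck, t

def bspline_decompress(tck, t):
    """Evaluate the stored spline on the original time grid."""
    return splev(t, tck)

# Hypothetical 1-second ECG-like trace sampled at 250 Hz.
fs = 250
t = np.arange(0, 1.0, 1.0 / fs)
ecg_like = np.sin(2 * np.pi * 1.2 * t) + 0.3 * np.sin(2 * np.pi * 15 * t)
tck, grid = bspline_compress(ecg_like, fs, smooth=0.05)
reconstructed = bspline_decompress(tck, grid)
stored = len(tck[0]) + len(tck[1])             # knots + coefficients retained
ratio = len(ecg_like) / stored                 # rough compression ratio
```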

    Blind source separation for interference cancellation in CDMA systems

    Communication is the science of "reliable" transfer of information between two parties, in the sense that the information reaches the intended party with as few errors as possible. Modern wireless systems have many interfering sources that hinder reliable communication, and receiver performance deteriorates severely in the presence of unknown or unaccounted-for interference. The goal of a receiver is then to combat these sources of interference in a robust manner while optimizing the trade-off between gain and computational complexity. Conventional methods mitigate these sources of interference by taking into account all available information and at times seeking additional information, e.g., channel characteristics or direction of arrival, which usually costs bandwidth. This thesis examines the development of mitigation algorithms that use little or no prior information about the nature of the interference; such methods are semi-blind in the former case and blind in the latter. Blind source separation (BSS) involves solving a source separation problem with very little prior information, and a popular framework for solving the BSS problem is independent component analysis (ICA). This thesis combines ICA techniques with conventional signal detection to cancel out unaccounted-for sources of interference. Adding an ICA element to standard techniques enables a robust and computationally efficient structure. The thesis proposes switching techniques based on BSS/ICA to combat interference effectively. Additionally, a structure based on a generalized framework termed denoising source separation (DSS) is presented. In cases where more is known about the nature of the interference, it is natural to incorporate this knowledge into the separation process, so finally the thesis looks at the use of such prior knowledge in these techniques; in the simplest case, the advantage of using priors should at least lead to faster algorithms.
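    To make the ICA element concrete, the sketch below blindly unmixes a two-branch observation of a desired binary signal plus a strong interferer with FastICA, then selects the recovered component by correlating against a short pilot segment. The mixing matrix, pilot length, and interference model are illustrative assumptions; the thesis' switching and DSS structures are not reproduced.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
n = 4000

# Hypothetical baseband snapshot: a desired BPSK-like user plus a strong
# interferer, observed on two receive branches.
desired = rng.choice([-1.0, 1.0], size=n)
interferer = 3.0 * np.sign(np.sin(2 * np.pi * 0.01 * np.arange(n)))
A = np.array([[1.0, 0.8],
              [0.6, 1.0]])                       # unknown mixing (channel) matrix
X = np.stack([desired, interferer], axis=1) @ A.T + 0.1 * rng.standard_normal((n, 2))

# Blindly unmix, then pick the component that best matches a short pilot.
ica = FastICA(n_components=2, random_state=0)
S = ica.fit_transform(X)                         # estimated independent components
pilot = desired[:200]
scores = [abs(np.corrcoef(S[:200, k], pilot)[0, 1]) for k in range(2)]
recovered = S[:, int(np.argmax(scores))]         # sign/scale remain ambiguous
```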

    Efficient Blind Source Separation Algorithms with Applications in Speech and Biomedical Signal Processing

    Blind source separation/extraction (BSS/BSE) is a powerful signal processing method that has been applied extensively in many fields, such as biomedical sciences and speech signal processing, to extract a set of unknown input sources from a set of observations. Different BSS algorithms have been proposed in the literature, and they require further investigation with respect to the extraction approach, computational complexity, convergence speed, domain (time or frequency), mixture properties, and extraction performance. This work presents three new BSS/BSE algorithms based on computing new transformation matrices used to extract the unknown signals. The types of signals considered in this dissertation are speech, Gaussian, and ECG signals. The first algorithm, named the BSE parallel linear predictor filter (BSE-PLP), computes a transformation matrix from the covariance matrix of the whitened data and then uses the matrix as input to linear predictor filters whose coefficients are the unknown sources. The algorithm converges very quickly, within two iterations. Simulation results using speech, Gaussian, and ECG signals show that the model is capable of extracting the unknown source signals and removing noise when the input signal-to-noise ratio is varied from -20 dB to 80 dB. The second algorithm, named the BSE idempotent transformation matrix (BSE-ITM), computes its transformation matrix in iterative form with lower computational complexity. The proposed method is tested using speech, Gaussian, and ECG signals, and simulation results show that it separates the source signals with better performance measures than the other approaches considered in the dissertation. The third algorithm, named the null space idempotent transformation matrix (NSITM), is designed using the principle of the null space of the ITM to separate the unknown sources. Simulation results show that the method successfully separates speech, Gaussian, and ECG signals from their mixtures. The algorithm has also been used to estimate the average FECG heart rate, and the results indicate considerable improvement in estimating the peaks over the other algorithms used in this work.
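    All three algorithms start from transformation matrices computed on whitened observations. The sketch below shows only that common whitening (sphering) step via an eigen-decomposition of the observation covariance; the PLP/ITM/NSITM matrices themselves are not reproduced, and the three-channel mixture is an illustrative assumption.

```python
import numpy as np

def whiten(X, eps=1e-10):
    """Sphere zero-mean observations so their covariance becomes the identity."""
    Xc = X - X.mean(axis=1, keepdims=True)
    cov = Xc @ Xc.T / Xc.shape[1]
    d, E = np.linalg.eigh(cov)                      # eigen-decomposition of covariance
    W = E @ np.diag(1.0 / np.sqrt(d + eps)) @ E.T   # symmetric whitening matrix
    return W @ Xc, W

# Hypothetical 3-channel instantaneous mixture of three sources.
rng = np.random.default_rng(1)
S = np.vstack([np.sin(2 * np.pi * 5 * np.linspace(0, 1, 2000)),
               rng.laplace(size=2000),
               rng.standard_normal(2000)])
A = rng.standard_normal((3, 3))                     # unknown mixing matrix
Z, W = whiten(A @ S)
print(np.allclose(Z @ Z.T / Z.shape[1], np.eye(3), atol=1e-6))  # ~identity covariance
```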

    Multichannel Online Dereverberation based on Spectral Magnitude Inverse Filtering

    This paper addresses the problem of multichannel online dereverberation. The proposed method is carried out in the short-time Fourier transform (STFT) domain, for each frequency band independently. In the STFT domain, the time-domain room impulse response is approximately represented by the convolutive transfer function (CTF). The multichannel CTFs are adaptively identified based on the cross-relation method, using the recursive least squares criterion. Instead of the complex-valued CTF convolution model, we use a nonnegative convolution model between the STFT magnitude of the source signal and the CTF magnitude, which is only a coarse approximation of the former model but is shown to be more robust against CTF perturbations. Based on this nonnegative model, we propose an online STFT magnitude inverse filtering method. The inverse filters of the CTF magnitude are formulated based on the multiple-input/output inverse theorem (MINT) and adaptively estimated by gradient descent. Finally, the inverse filtering is applied to the STFT magnitude of the microphone signals, giving an estimate of the STFT magnitude of the source signal. Experiments on both speech enhancement and automatic speech recognition demonstrate that the proposed method can effectively suppress reverberation, even in the difficult case of a moving speaker.
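    At the core of the method is a MINT-style inverse: per frequency band, find inverse filters whose per-channel convolutions with the CTF magnitudes sum to a unit impulse. The sketch below solves that objective offline by least squares for a single band; the paper instead estimates the filters adaptively by gradient descent on identified CTFs, and the filter lengths and random CTF magnitudes here are assumptions.

```python
import numpy as np
from scipy.linalg import toeplitz

def conv_matrix(h, L_g):
    """Toeplitz matrix C such that C @ g equals np.convolve(h, g)."""
    col = np.concatenate([h, np.zeros(L_g - 1)])
    row = np.zeros(L_g)
    row[0] = h[0]
    return toeplitz(col, row)

def mint_inverse(filters, L_g):
    """Least-squares MINT: find g_m with sum_m h_m * g_m ~= a unit impulse."""
    C = np.hstack([conv_matrix(h, L_g) for h in filters])   # stack channel blocks
    d = np.zeros(C.shape[0])
    d[0] = 1.0                                               # target: unit impulse
    g, *_ = np.linalg.lstsq(C, d, rcond=None)
    return g.reshape(len(filters), L_g)

# Hypothetical two-channel CTF magnitudes for one frequency band.
rng = np.random.default_rng(2)
h1 = np.abs(rng.standard_normal(8))
h2 = np.abs(rng.standard_normal(8))
g = mint_inverse([h1, h2], L_g=12)
equalized = np.convolve(h1, g[0]) + np.convolve(h2, g[1])    # ~ [1, 0, 0, ...]
```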

    Single-Microphone Speech Dereverberation based on Multiple-Step Linear Predictive Inverse Filtering and Spectral Subtraction

    Single-channel speech dereverberation is the challenging problem of deconvolving reverberation, produced by the room impulse response, from the speech signal when only one observation of the reverberant signal (one microphone) is available. Although mild reverberation helps in perceiving speech (or any audio) signal, its adverse effect, particularly at high levels, can both degrade the performance of automatic recognition systems and reduce intelligibility for human listeners. Single-microphone speech dereverberation is more challenging than multi-microphone dereverberation, since it does not allow spatial processing of different observations of the signal. A review of recent single-channel dereverberation techniques reveals that those based on LP-residual enhancement are the most promising. On the other hand, spectral subtraction has also been used effectively for dereverberation, particularly when long reflections are involved. Using LP residuals and spectral subtraction as two promising tools for dereverberation, a new dereverberation technique is proposed. The first stage of the proposed technique consists of pre-whitening followed by delayed long-term LP filtering, in which the kurtosis or skewness of the LP residuals is maximized to control the weight updates of the inverse filter. The second stage consists of nonlinear spectral subtraction. The proposed two-stage dereverberation scheme leads to two separate algorithms, depending on whether kurtosis or skewness maximization is used to establish the feedback function for the weight updates of the adaptive inverse filter. It is shown that the proposed algorithms have several advantages over the existing major single-microphone methods, including a reduction in both early and late reverberation, speech enhancement even in the case of very high reverberation times, robustness to additive background noise, and the introduction of only a few minor artifacts. Room impulse responses equalized by the proposed algorithms have shorter reverberation times, which means the inverse filtering is more successful in dereverberating the speech signal. For short, medium, and high reverberation times, the signal-to-reverberation ratio of the proposed technique is significantly higher than that of the existing major algorithms. The waveforms and spectrograms of the inverse-filtered and fully-processed signals indicate the superiority of the proposed algorithms. Assessment of the overall quality of the processed speech by automatic speech recognition and by the perceptual evaluation of speech quality test also confirms that in most cases the proposed technique yields higher scores, and where it does not, the difference is not as significant as in the other aspects of the performance evaluation. Finally, the robustness of the proposed algorithms against background noise is investigated and compared to that of the benchmark algorithms, showing that the proposed algorithms maintain a rather stable performance for contaminated speech signals with SNR levels as low as 0 dB.
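    The second stage is a nonlinear spectral subtraction aimed at long reflections. The sketch below shows only a plain magnitude spectral subtraction with over-subtraction and a spectral floor, not the thesis' nonlinear rule; the late-reverberation estimate is naively taken as the average of the first few frames, and the frame length and factors are assumptions.

```python
import numpy as np
from scipy.signal import stft, istft

def spectral_subtract(x, fs, alpha=2.0, floor=0.05, noise_frames=10):
    """Plain magnitude spectral subtraction with over-subtraction and flooring.

    A stand-in for the thesis' nonlinear subtraction rule; the 'noise'
    estimate is simply the average magnitude of the first few frames.
    """
    f, t, X = stft(x, fs=fs, nperseg=512)
    mag, phase = np.abs(X), np.angle(X)
    noise = mag[:, :noise_frames].mean(axis=1, keepdims=True)
    cleaned = np.maximum(mag - alpha * noise, floor * mag)   # subtract, then floor
    _, y = istft(cleaned * np.exp(1j * phase), fs=fs, nperseg=512)
    return y
```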

    Natural Image Coding in V1: How Much Use is Orientation Selectivity?

    Orientation selectivity is the most striking feature of simple-cell coding in V1, and it has been shown to emerge from the reduction of higher-order correlations in natural images in a large variety of statistical image models. The most parsimonious among these models is linear Independent Component Analysis (ICA), whereas second-order decorrelation transformations such as Principal Component Analysis (PCA) do not yield oriented filters. Because of this finding, it has been suggested that the emergence of orientation selectivity may be explained by higher-order redundancy reduction. In order to assess the tenability of this hypothesis, it is an important empirical question how much more redundancy can be removed with ICA than with PCA or other second-order decorrelation methods. This question has not yet been settled, as over the last ten years contradictory results have been reported, ranging from less than five to more than one hundred percent extra gain for ICA. Here, we aim at resolving this conflict by presenting a very careful and comprehensive analysis using three evaluation criteria related to redundancy reduction: in addition to the multi-information and the average log-loss, we compute, for the first time, complete rate-distortion curves for ICA in comparison with PCA. Without exception, we find that the advantage of the ICA filters is surprisingly small. Furthermore, we show that a simple spherically symmetric distribution with only two parameters can fit the data even better than the probabilistic model underlying ICA. Since spherically symmetric models are agnostic with respect to the specific filter shapes, we conclude that orientation selectivity is unlikely to play a critical role in redundancy reduction.
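    The contrast the paper builds on (PCA yields global, unoriented components while ICA yields localized, oriented ones) can be reproduced with a small patch experiment. The sketch below learns 8x8 filters from a bundled sample image using scikit-learn; it only produces the two filter sets and does not compute the paper's multi-information, log-loss, or rate-distortion criteria, and the patch size and counts are assumptions.

```python
import numpy as np
from sklearn.datasets import load_sample_image
from sklearn.decomposition import PCA, FastICA
from sklearn.feature_extraction.image import extract_patches_2d

# Any grayscale natural image works; the bundled sample image is used only so
# the sketch runs without external files.
img = load_sample_image("china.jpg").mean(axis=2)
patches = extract_patches_2d(img, (8, 8), max_patches=20000, random_state=0)
X = patches.reshape(len(patches), -1)
X = X - X.mean(axis=1, keepdims=True)              # remove each patch's DC component

pca = PCA(n_components=64).fit(X)
ica = FastICA(n_components=64, random_state=0, max_iter=1000).fit(X)

pca_filters = pca.components_.reshape(-1, 8, 8)    # global, unoriented components
ica_filters = ica.components_.reshape(-1, 8, 8)    # localized, oriented (Gabor-like)
```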

    Example-based audio editing

    Traditionally, audio recordings are edited through digital audio workstations (DAWs), which give users access to different tools and parameters through a graphical user interface (GUI) without requiring prior knowledge of coding or signal processing. The complexity of working with DAWs and the undeniable need for strong listening skills have made audio editing unpopular among novice users and time-consuming for professionals. We propose an intelligent, example-based audio editor (EBAE) that automates major audio editing routines using an example sound and efficiently provides users with high-quality results. EBAE first extracts meaningful information from an example sound that already contains the desired effects and then applies those effects to a desired recording by employing signal processing and machine learning techniques.
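    The abstract does not detail which effects EBAE transfers, so the sketch below shows just one plausible signal-processing ingredient of example-based editing: imposing an example recording's long-term average magnitude spectrum on a target recording through per-bin gains. The function name, frame size, and gain cap are assumptions, and this is not EBAE's actual pipeline.

```python
import numpy as np
from scipy.signal import stft, istft

def match_average_spectrum(target, example, fs, nperseg=1024, max_gain=10.0):
    """Impose the example's long-term average magnitude spectrum on the target.

    EQ matching is only one plausible ingredient of example-based editing;
    EBAE's actual feature extraction and learning stages are not reproduced.
    """
    _, _, T = stft(target, fs=fs, nperseg=nperseg)
    _, _, E = stft(example, fs=fs, nperseg=nperseg)
    gain = np.abs(E).mean(axis=1) / (np.abs(T).mean(axis=1) + 1e-12)
    gain = np.clip(gain, 0.0, max_gain)[:, None]     # cap extreme per-bin boosts
    _, y = istft(T * gain, fs=fs, nperseg=nperseg)
    return y
```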

    Noise reduction in industry based on virtual instrumentation

    This paper discusses the reduction of background noise in an industrial environment to extend human-machine interaction. In the Industry 4.0 era, the mass deployment of voice control (speech recognition) in various industrial applications is possible, especially in connection with augmented reality (such as hands-free control via voice commands). As Industry 4.0 relies heavily on radio-frequency technologies, some brief insight into this problem is provided, including the Internet of Things (IoT) and 5G deployment. The study was carried out in cooperation with the industrial partner Brose CZ spol. s.r.o., where sound recordings were made to produce a dataset. The experimental environment comprised three workplaces with background noise above 100 dB, consisting of a laser welder, a magnetic welder, and a press. A virtual device was developed from the dataset in order to test selected commands with a commercial speech recognizer from Microsoft. We tested a hybrid algorithm for noise reduction and its impact on voice command recognition efficiency. Using the virtual devices, the study was carried out with a large group of speakers: 20 participants (10 men and 10 women), with a large number of repetitions (100 for each command under different noise conditions). Statistical results confirmed the efficiency of the tested algorithms. Recognition efficiency in the laser welding environment was 27% before filtering, 76% using the least mean squares (LMS) algorithm, and 79% using LMS plus independent component analysis (ICA). In the magnetic welding environment, efficiency was 24% before filtering, 70% with LMS, and 75% with LMS + ICA. At the press workplace, no commands were recognized before filtering, while efficiency reached 52% with LMS and 54% with LMS + ICA.
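    The LMS stage of the hybrid algorithm can be illustrated with a classic adaptive noise canceller, sketched below in a normalized-LMS form for stability. The assumption of a second microphone providing a noise reference, as well as the filter length and step size, are illustrative choices; the ICA stage the paper combines with LMS is omitted.

```python
import numpy as np

def nlms_cancel(primary, reference, n_taps=32, mu=0.5, eps=1e-8):
    """Adaptive noise canceller: subtract a filtered noise reference (NLMS update)."""
    w = np.zeros(n_taps)
    out = np.zeros(len(primary))
    for n in range(n_taps, len(primary)):
        x = reference[n - n_taps:n][::-1]       # most recent reference samples
        y = w @ x                               # estimate of the noise at the primary mic
        e = primary[n] - y                      # error = cleaned speech sample
        w += mu * e * x / (x @ x + eps)         # normalized LMS weight update
        out[n] = e
    return out

# Hypothetical use: 'primary' = speech + machine noise, 'reference' = noise-only pickup.
rng = np.random.default_rng(3)
noise = rng.standard_normal(16000)
speech = np.sin(2 * np.pi * 200 * np.arange(16000) / 16000)
primary = speech + np.convolve(noise, [0.5, 0.3, 0.2], mode="same")
cleaned = nlms_cancel(primary, noise)
```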