Gaussian filter based à-trous algorithm for image fusion
Image fusion integrates complementary information from multiple perspectives in order to provide a meaningful interpretation of useful features and textures in multisource images. Here, we present a multiresolution algorithm based on the Stationary Wavelet Transform (SWT) for the fusion of two test images of the same size. The algorithm applies a Gaussian low-pass filter to the high-frequency subbands of the SWT decomposition. The new approach gave sharper edges and better structural enhancement than region-based approaches that compute the energy around salient features. The key feature of Gaussian filtering is the flexibility to use filters with different standard deviations depending on the application and the range of detail required for processing.
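The filtering-and-fusion idea can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it substitutes a simple Gaussian high-pass residual for the SWT detail subbands, and the function and parameter names (`fuse_images`, `sigma`) are hypothetical.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fuse_images(img_a, img_b, sigma=1.5):
    """Illustrative two-image fusion sketch (hypothetical names).

    Each image is split into a Gaussian low-pass approximation and a
    high-frequency residual (a stand-in for SWT detail subbands). The
    residuals are Gaussian-smoothed, then fused by taking the detail
    with the larger magnitude at each pixel; approximations are averaged.
    """
    low_a = gaussian_filter(img_a, sigma)
    low_b = gaussian_filter(img_b, sigma)
    det_a = img_a - low_a
    det_b = img_b - low_b
    # Gaussian smoothing of the detail bands; the standard deviation is
    # application-dependent, as the abstract notes.
    det_a = gaussian_filter(det_a, sigma=0.5)
    det_b = gaussian_filter(det_b, sigma=0.5)
    fused_low = 0.5 * (low_a + low_b)
    fused_det = np.where(np.abs(det_a) >= np.abs(det_b), det_a, det_b)
    return fused_low + fused_det
```

Varying `sigma` trades off how much fine detail is preserved versus smoothed, mirroring the flexibility the abstract attributes to Gaussian filtering.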
Music Source Separation Based on a Lightweight Deep Learning Framework (DTTNET: DUAL-PATH TFC-TDF UNET)
Music source separation (MSS) aims to extract 'vocals', 'drums', 'bass' and
'other' tracks from a piece of mixed music. While deep learning methods have
shown impressive results, there is a trend toward larger models. In our paper,
we introduce a novel and lightweight architecture called DTTNet, which is based
on Dual-Path Module and Time-Frequency Convolutions Time-Distributed
Fully-connected UNet (TFC-TDF UNet). DTTNet achieves 10.12 dB cSDR on 'vocals'
compared to 10.01 dB reported for Bandsplit RNN (BSRNN) but with 86.7% fewer
parameters. We also assess pattern-specific performance and model
generalization for intricate audio patterns.
Comment: Submitted to ICASSP 202
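For reference, the cSDR figures quoted above are typically computed as the signal-to-distortion ratio per fixed-length chunk, aggregated by the median. A sketch under that assumption follows; the chunk length, sample rate, and exact aggregation used by the paper are not stated here, so treat these values as placeholders.

```python
import numpy as np

def sdr(reference, estimate, eps=1e-8):
    # Signal-to-distortion ratio in dB (assumed definition): power of the
    # reference signal over power of the residual error.
    num = np.sum(reference ** 2)
    den = np.sum((reference - estimate) ** 2) + eps
    return 10.0 * np.log10(num / den + eps)

def chunked_sdr(reference, estimate, sr=44100, chunk_sec=1.0):
    # cSDR sketch: median SDR over non-overlapping fixed-length chunks,
    # a common convention in music source separation evaluation.
    n = int(sr * chunk_sec)
    chunks = [sdr(reference[i:i + n], estimate[i:i + n])
              for i in range(0, len(reference) - n + 1, n)]
    return float(np.median(chunks))
```

A higher cSDR means less residual distortion in the separated track, which is how the 10.12 dB vs. 10.01 dB comparison should be read.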
Dementia Detection from Speech Using Machine Learning and Deep Learning Architectures
Dementia affects the patient’s memory and leads to language impairment. Research has demonstrated that speech and language deterioration is often a clear indication of dementia and plays a crucial role in the recognition process. Although earlier studies have used speech features to recognize subjects suffering from dementia, these features are often combined with linguistic features obtained from transcriptions. This study explores significant standalone speech features for recognizing dementia. The primary contribution of this work is to identify a compact set of speech features that aid the dementia recognition process. The secondary contribution is to leverage machine learning (ML) and deep learning (DL) models for the recognition task. Speech samples from the Pitt corpus in DementiaBank are used in the present study. A critical speech feature set of prosodic, voice quality and cepstral features is proposed for the task. The experimental results demonstrate the superiority of machine learning (87.6 percent) over deep learning (85 percent) models for recognizing dementia using the compact speech feature combination, along with lower time and memory consumption. The results obtained with the proposed approach are promising compared with existing work on dementia recognition using speech.
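As a rough illustration of the pipeline described (standalone acoustic features fed to an ML classifier), the sketch below derives crude cepstral coefficients from a waveform and fits a random-forest classifier on synthetic stand-in data. The feature extraction, the data, and all names here are hypothetical simplifications, not the paper's actual feature set or corpus.

```python
import numpy as np
from scipy.fft import dct
from sklearn.ensemble import RandomForestClassifier

def cepstral_features(signal, n_coeffs=13):
    # Crude cepstral coefficients: DCT of the log power spectrum.
    # A stand-in for the paper's prosodic/voice-quality/cepstral set.
    spectrum = np.abs(np.fft.rfft(signal)) ** 2 + 1e-10
    return dct(np.log(spectrum), norm='ortho')[:n_coeffs]

rng = np.random.default_rng(0)
# Synthetic stand-in data: 40 "speech samples" with binary dementia labels.
X = np.array([cepstral_features(rng.standard_normal(1024)) for _ in range(40)])
y = rng.integers(0, 2, size=40)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
```

In the paper's setting, `X` would instead hold the compact prosodic, voice-quality and cepstral features extracted from Pitt-corpus recordings, with the ML model compared against a DL counterpart on accuracy, time, and memory.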