
    Learning Front-end Filter-bank Parameters using Convolutional Neural Networks for Abnormal Heart Sound Detection

    Automatic heart sound abnormality detection can play a vital role in the early diagnosis of heart diseases, particularly in low-resource settings. The state-of-the-art algorithms for this task utilize a set of Finite Impulse Response (FIR) band-pass filters as a front-end followed by a Convolutional Neural Network (CNN) model. In this work, we propose a novel CNN architecture that integrates the front-end band-pass filters within the network using time-convolution (tConv) layers, which makes the FIR filter-bank parameters learnable. Different initialization strategies for the learnable filters, including random parameters and a set of predefined FIR filter-bank coefficients, are examined. Using the proposed tConv layers, we add constraints to the learnable FIR filters to ensure linear and zero phase responses. Experimental evaluations are performed on a balanced 4-fold cross-validation task prepared using the PhysioNet/CinC 2016 dataset. Results demonstrate that the proposed models yield superior performance compared to the state-of-the-art system, with the linear-phase FIR filter-bank method providing an absolute improvement of 9.54% over the baseline in terms of overall accuracy.
    Comment: 4 pages, 6 figures, IEEE International Engineering in Medicine and Biology Conference (EMBC)
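    The abstract describes learnable FIR filters realized as 1-D convolutions with a linear-phase (symmetric impulse response) constraint. Below is a minimal PyTorch sketch of one way such a tConv layer could be built; the class name, kernel length, and random initialization are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearPhaseTConv(nn.Module):
    """Learnable FIR filter-bank front-end (tConv). Only half of each
    kernel is learned and then mirrored, so the impulse response is
    symmetric and the phase response stays linear."""

    def __init__(self, num_filters: int, kernel_size: int = 61):
        super().__init__()
        assert kernel_size % 2 == 1, "odd length keeps a well-defined center tap"
        # Left half plus center tap; the right half is a mirror image.
        self.half = nn.Parameter(0.01 * torch.randn(num_filters, 1, kernel_size // 2 + 1))
        self.kernel_size = kernel_size

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, samples) raw PCG waveform.
        kernel = torch.cat([self.half, self.half.flip(-1)[..., 1:]], dim=-1)
        return F.conv1d(x, kernel, padding=self.kernel_size // 2)

# Usage: a 4-band learnable front-end over 2.5 s of hypothetical 1 kHz PCG.
pcg = torch.randn(8, 1, 2500)
bands = LinearPhaseTConv(num_filters=4)(pcg)   # -> (8, 4, 2500)
```

    The paper also examines initializing such kernels from predefined FIR band-pass coefficients instead of random values; swapping the initializer of `self.half` would cover that case.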

    An Ensemble of Transfer, Semi-supervised and Supervised Learning Methods for Pathological Heart Sound Classification

    In this work, we propose an ensemble of classifiers to distinguish between various degrees of heart abnormality using Phonocardiogram (PCG) signals acquired with digital stethoscopes in a clinical setting, for the INTERSPEECH 2018 Computational Paralinguistics (ComParE) Heart Beats Sub-Challenge. Our primary classification framework is a convolutional neural network with 1D time-convolution (tConv) layers, which uses features transferred from a model trained on the 2016 PhysioNet Heart Sound Database. We also employ a Representation Learning (RL) approach to generate features in an unsupervised manner using Deep Recurrent Autoencoders, paired with Support Vector Machine (SVM) and Linear Discriminant Analysis (LDA) classifiers. Finally, we utilize an SVM classifier on high-dimensional segment-level features extracted by applying various functionals to short-term acoustic features, i.e., Low-Level Descriptors (LLDs). An ensemble of the three approaches provides a relative improvement of 11.13% over our best single sub-system in terms of the Unweighted Average Recall (UAR) metric on the evaluation dataset.
    Comment: 5 pages, 5 figures, Interspeech 2018 accepted manuscript
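    The ensemble fuses three heterogeneous sub-systems. The abstract does not state the fusion rule, so the sketch below shows one common choice, weighted averaging of class posteriors, purely as an illustration; the array values and equal weights are hypothetical.

```python
import numpy as np

def fuse_posteriors(posteriors, weights=None):
    """Late-fuse class posteriors from several sub-systems by a
    (weighted) average. `posteriors` is a list of (n_samples, n_classes)
    arrays that share a common class ordering."""
    stacked = np.stack(posteriors)                    # (n_systems, n, c)
    if weights is None:
        weights = np.ones(len(posteriors)) / len(posteriors)
    fused = np.tensordot(weights, stacked, axes=1)    # (n, c)
    return fused.argmax(axis=1)

# Hypothetical posteriors from the three sub-systems over 3 classes.
cnn     = np.array([[0.7, 0.2, 0.1], [0.3, 0.4, 0.3]])
rl_svm  = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
lld_svm = np.array([[0.5, 0.3, 0.2], [0.1, 0.6, 0.3]])
print(fuse_posteriors([cnn, rl_svm, lld_svm]))        # -> [0 1]
```

    Score-level fusion of this kind helps when the sub-systems make decorrelated errors, which is the usual motivation for combining transfer, unsupervised, and hand-crafted feature pipelines.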

    A Large Multi-Target Dataset of Common Bengali Handwritten Graphemes

    Latin script has historically led state-of-the-art handwritten optical character recognition (OCR) research. Adapting existing systems from Latin to alpha-syllabary languages is particularly challenging due to a sharp contrast between their orthographies. The segmentation of graphical constituents corresponding to characters becomes significantly harder due to a cursive writing system and frequent use of diacritics in the alpha-syllabary family of languages. We propose a labeling scheme based on graphemes (linguistic segments of word formation) that makes segmentation inside alpha-syllabary words linear, and we present the first dataset of Bengali handwritten graphemes that are commonly used in an everyday context. The dataset contains 411k curated samples of 1295 unique commonly used Bengali graphemes. Additionally, the test set contains 900 uncommon Bengali graphemes for out-of-dictionary performance evaluation. The dataset is open-sourced as part of a public Handwritten Grapheme Classification Challenge on Kaggle to benchmark vision algorithms for multi-target grapheme classification. The unique graphemes in this dataset are selected based on their commonality in the Google Bengali ASR corpus. From the competition proceedings, we see that deep-learning methods can generalize to a large span of out-of-dictionary graphemes absent during training. Dataset and starter code at www.kaggle.com/c/bengaliai-cv19.
    Comment: 15 pages, 12 figures, 6 tables, Submitted to CVPR-2
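    The Kaggle challenge referenced above frames recognition as multi-target classification over grapheme constituents. The PyTorch sketch below shows a generic shared-backbone, three-head design of that kind; the head sizes follow the competition's target counts (168 grapheme roots, 11 vowel diacritics, 7 consonant diacritics), while the backbone and input size are placeholder assumptions.

```python
import torch
import torch.nn as nn

class MultiTargetGraphemeNet(nn.Module):
    """Shared feature extractor with one classification head per
    grapheme constituent (root, vowel diacritic, consonant diacritic)."""

    def __init__(self, backbone: nn.Module, feat_dim: int,
                 n_roots: int = 168, n_vowels: int = 11, n_consonants: int = 7):
        super().__init__()
        self.backbone = backbone
        self.root_head = nn.Linear(feat_dim, n_roots)
        self.vowel_head = nn.Linear(feat_dim, n_vowels)
        self.consonant_head = nn.Linear(feat_dim, n_consonants)

    def forward(self, x):
        feats = self.backbone(x)
        return (self.root_head(feats),
                self.vowel_head(feats),
                self.consonant_head(feats))

# Placeholder backbone; a real system would use a CNN over the images.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(128 * 128, 256), nn.ReLU())
model = MultiTargetGraphemeNet(backbone, feat_dim=256)
root, vowel, cons = model(torch.randn(4, 1, 128, 128))
# Training would minimize the sum of three cross-entropy losses, one per head.
```

    Predicting constituents rather than whole graphemes is what lets such models compose labels for out-of-dictionary graphemes never seen in training.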

    Abugida Normalizer and Parser for Unicode texts

    This paper proposes two libraries to address common and uncommon issues with Unicode-based writing schemes for Indic languages. The first is a normalizer (https://pypi.org/project/bnunicodenormalizer/) that corrects inconsistencies caused by the encoding scheme. The second is a grapheme parser for Abugida text (https://pypi.org/project/indicparser/). Both tools are more efficient and effective than previously available tools. We report a 400% increase in speed and significantly better performance on various language-model-based downstream tasks.
    Comment: 3 pages, 1 figure
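    To illustrate the two problems these libraries target, here is a minimal generic sketch using only Python's standard `unicodedata` and the third-party `regex` package; it is not the API of either library. Canonical recomposition handles simple encoding inconsistencies, and extended grapheme clusters approximate abugida grapheme segmentation; the libraries add Indic-specific rules on top of both steps.

```python
import unicodedata
import regex  # third-party: pip install regex (supports \X)

def normalize(text: str) -> str:
    # NFC recomposes canonically equivalent codepoint sequences so that
    # visually identical strings share a single encoding. Indic-specific
    # fixes (e.g., reordering misplaced diacritics) need extra rules.
    return unicodedata.normalize("NFC", text)

def graphemes(text: str) -> list:
    # \X matches extended grapheme clusters, i.e., user-perceived
    # characters, a first approximation of abugida grapheme parsing.
    return regex.findall(r"\X", normalize(text))

print(graphemes("বাংলা"))  # clusters of base letters plus their diacritics
```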

    BaDLAD: A Large Multi-Domain Bengali Document Layout Analysis Dataset

    While strides have been made in deep-learning-based Bengali Optical Character Recognition (OCR) over the past decade, the absence of large Document Layout Analysis (DLA) datasets has hindered the application of OCR to document transcription, e.g., transcribing historical documents and newspapers. Moreover, the rule-based DLA systems currently employed in practice are not robust to domain variations and out-of-distribution layouts. To this end, we present BaDLAD, the first large multi-domain Bengali Document Layout Analysis Dataset. It contains 33,695 human-annotated document samples from six domains: i) books and magazines, ii) public-domain government documents, iii) liberation war documents, iv) newspapers, v) historical newspapers, and vi) property deeds, with 710K polygon annotations for four unit types: text-box, paragraph, image, and table. Through preliminary experiments benchmarking existing state-of-the-art deep-learning architectures for English DLA, we demonstrate the efficacy of our dataset in training deep-learning-based Bengali document digitization models.
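    As a quick orientation to a polygon-annotated DLA dataset like this, the sketch below tallies annotations per unit type, assuming a COCO-style JSON annotation file; the file name and the COCO layout are assumptions to verify against the actual BaDLAD release.

```python
import json
from collections import Counter

# Hypothetical path; assumes a COCO-style annotation file, a common
# distribution format for layout-analysis datasets.
with open("badlad_train.json", encoding="utf-8") as f:
    coco = json.load(f)

id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
counts = Counter(id_to_name[a["category_id"]] for a in coco["annotations"])

# Expect the four unit types: text-box, paragraph, image, and table.
for name, n in counts.most_common():
    print(f"{name}: {n} polygons")
```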