Acoustic Analysis of Infant Cry Signals
Crying is an infant's first means of communication, through which it expresses its physiological and psychological needs. Infant cry analysis is the investigation of infant cry vocalizations in order to extract social and communicative information about infant behavior, as well as diagnostic information about infant health. This thesis is part of a larger study whose objective is to analyze the acoustic properties of infant cry signals and use them for early assessment of neurological developmental issues in infants.
This thesis deals with two research problems in the context of infant cry signals: audio segmentation of cry recordings in order to extract the relevant acoustic parts, and fundamental frequency (F0) estimation within the extracted acoustic regions. The extracted regions are used to derive parameters that can be correlated with the infants' developmental outcomes. Fundamental frequency (F0) is one such potentially useful parameter, whose variation has been found to correlate with neurological insults in infants. The cry recordings are captured in realistic hospital environments in varied contexts, such as the infant crying out of hunger or pain. A hidden Markov model (HMM) based audio segmentation system is proposed. The performance of the system is evaluated for different numbers of HMM states, different numbers of component Gaussians, and different combinations of audio features. A frame-based accuracy of 88.5% is achieved. The YIN algorithm, a popular F0 estimation method, is used for the fundamental frequency estimation problem, and a method for discarding unreliable F0 estimates is suggested. Statistics of the distribution of F0 estimates for the different components of the cry signals are reported.
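The YIN procedure mentioned above can be sketched in a few lines. The following is a minimal, illustrative implementation, not the thesis code: it computes YIN's difference function and cumulative mean normalized difference for one frame, and uses the absolute threshold as a crude stand-in for discarding unreliable estimates. The function name, default parameter values, and the omission of parabolic interpolation are all simplifications.

```python
import numpy as np

def yin_f0(frame, sr, fmin=75.0, fmax=600.0, threshold=0.1):
    """Minimal YIN-style F0 estimate for one audio frame (illustrative)."""
    tau_min, tau_max = int(sr / fmax), int(sr / fmin)
    n = len(frame)
    # Step 1: difference function d(tau) = sum_t (x[t] - x[t+tau])^2
    d = np.array([np.sum((frame[:n - tau] - frame[tau:]) ** 2)
                  for tau in range(tau_max + 1)])
    # Step 2: cumulative mean normalized difference d'(tau)
    cmnd = np.ones_like(d)
    cmnd[1:] = d[1:] * np.arange(1, tau_max + 1) / np.maximum(
        np.cumsum(d[1:]), 1e-12)
    # Step 3: first dip below the absolute threshold, walked down to its
    # local minimum; frames with no such dip are treated as unreliable.
    tau = tau_min
    while tau < tau_max:
        if cmnd[tau] < threshold:
            while tau + 1 <= tau_max and cmnd[tau + 1] < cmnd[tau]:
                tau += 1
            return sr / tau
        tau += 1
    return None

sr = 16000
t = np.arange(1024) / sr
f0 = yin_f0(np.sin(2 * np.pi * 220.0 * t), sr)  # close to 220 Hz
```

Cry signals often have a high and rapidly varying F0, which is why the search range and the reliability check matter more than they would for adult speech.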
This work will be followed up by searching for meaningful correlations between the extracted F0 estimates and the infants' developmental outcomes. Other acoustic parameters will also be investigated for the same purpose.
Deep Neural Network Based Low-Latency Speech Separation with Asymmetric Analysis-Synthesis Window Pair
Time-frequency masking or spectrum prediction computed via short symmetric windows is commonly used in low-latency deep neural network (DNN) based source separation. In this paper, we propose the use of an asymmetric analysis-synthesis window pair, which allows training with targets of better frequency resolution while retaining low latency during inference, suitable for real-time speech enhancement or assisted hearing applications. To assess our approach across model types and datasets, we evaluate it with both a speaker-independent deep clustering (DC) model and a speaker-dependent mask inference (MI) model. We report an improvement in separation performance of up to 1.5 dB in terms of source-to-distortion ratio (SDR) while maintaining an algorithmic latency of 8 ms.
Comment: Accepted to EUSIPCO-202
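The core idea, a long analysis window for frequency resolution paired with a short synthesis window that sets the latency, can be illustrated with a small sketch. The window shapes, lengths, and construction below are illustrative assumptions, not the pair used in the paper: the synthesis window is chosen so that the analysis-synthesis product overlap-adds to one, giving perfect reconstruction under an identity mask while only the last 8 ms of each frame contribute to the output.

```python
import numpy as np

SR = 16000
HOP = 64      # 4 ms frame shift
L_SYN = 128   # synthesis window support: 8 ms -> the algorithmic latency
L_ANA = 512   # 32 ms analysis window -> finer frequency resolution

# Asymmetric analysis window: long slow rise, short strictly positive tail
rise = np.sin(0.5 * np.pi * np.arange(L_ANA - L_SYN) / (L_ANA - L_SYN))
tail = np.linspace(1.0, 0.2, L_SYN)
w_ana = np.concatenate([rise, tail])

# Choose the synthesis window so that w_ana * w_syn equals a periodic Hann
# of length 2*HOP at the frame tail; hopped by HOP, that product sums to
# exactly one, which gives perfect reconstruction with an identity mask.
n = np.arange(L_SYN)
prod = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / L_SYN)
w_syn = np.zeros(L_ANA)
w_syn[-L_SYN:] = prod / w_ana[-L_SYN:]

# Identity-mask round trip: analysis windowing, (STFT / mask / iSTFT
# omitted), then weighted overlap-add with the short synthesis window.
x = np.random.default_rng(0).standard_normal(SR)
y = np.zeros_like(x)
for start in range(0, len(x) - L_ANA + 1, HOP):
    frame = x[start:start + L_ANA] * w_ana
    y[start:start + L_ANA] += frame * w_syn
```

Because only the last L_SYN samples of each frame are weighted into the output, the output at a given time is final once the frame ending there has been processed, so the synthesis window length, not the analysis window length, determines the algorithmic latency.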
Dynamic Processing Neural Network Architecture For Hearing Loss Compensation
This paper proposes neural networks for compensating sensorineural hearing
loss. The aim of the hearing loss compensation task is to transform a speech
signal to increase speech intelligibility after further processing by a person
with a hearing impairment, which is modeled by a hearing loss model. We propose
an interpretable model called dynamic processing network, which has a structure
similar to band-wise dynamic compressor. The network is differentiable, and
therefore allows to learn its parameters to maximize speech intelligibility.
More generic models based on convolutional layers were tested as well. The
performance of the tested architectures was assessed using spectro-temporal
objective index (STOI) with hearing-threshold noise and hearing aid speech
intelligibility (HASPI) metrics. The dynamic processing network gave a
significant improvement of STOI and HASPI in comparison to popular compressive
gain prescription rule Camfit. A large enough convolutional network could
outperform the interpretable model with the cost of larger computational load.
Finally, a combination of the dynamic processing network with convolutional
neural network gave the best results in terms of STOI and HASPI
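The band-wise dynamic compressor that the dynamic processing network resembles can be sketched as a conventional single-band feed-forward compressor; a band-wise version would run one such unit per frequency band and sum the outputs. All parameter values and the helper name below are illustrative, not taken from the paper.

```python
import numpy as np

def compress(x, sr, threshold_db=-30.0, ratio=3.0,
             attack_ms=5.0, release_ms=50.0):
    """Single-band feed-forward dynamic range compressor (illustrative)."""
    a_att = np.exp(-1.0 / (sr * attack_ms / 1000.0))
    a_rel = np.exp(-1.0 / (sr * release_ms / 1000.0))
    env = 0.0
    y = np.empty_like(x)
    for i, s in enumerate(x):
        mag = abs(s)
        # Envelope follower: fast attack, slow release
        a = a_att if mag > env else a_rel
        env = a * env + (1.0 - a) * mag
        level_db = 20.0 * np.log10(max(env, 1e-9))
        # Static curve: cut (1 - 1/ratio) dB per dB above the threshold
        over_db = max(0.0, level_db - threshold_db)
        gain_db = over_db * (1.0 / ratio - 1.0)
        y[i] = s * 10.0 ** (gain_db / 20.0)
    return y

sr = 16000
t = np.arange(8000) / sr
y_loud = compress(0.5 * np.sin(2 * np.pi * 440 * t), sr)  # gain reduced
x_quiet = 0.001 * np.sin(2 * np.pi * 440 * t)
y_quiet = compress(x_quiet, sr)  # below threshold: passed through
```

What makes the paper's network learnable is that every step of such a structure (smoothing, level detection, static gain curve) is differentiable, so threshold, ratio, and time constants can be optimized per band against an intelligibility objective instead of being prescribed by a rule like Camfit.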
Stochastic landslide vulnerability modeling in space and time in a part of the northern Himalayas, India
A Competing Voices Test for Hearing-Impaired Listeners Applied to Spatial Separation and Ideal Time-Frequency Masks
People with hearing impairment find competing-voices scenarios challenging, both with respect to switching attention from one talker to the other and with respect to maintaining attention. The Danish competing voices test (CVT) presented here assesses these dual-attention skills. The CVT provides sentences spoken by three male and three female talkers, played in sentence pairs. The task of the listener is to repeat the target sentence from the sentence pair, cued either before or after playback. One potential way of assisting the segregation of two talkers is to take advantage of spatial unmasking by presenting one talker per ear after applying time-frequency masks to separate the mixture. Using the CVT, this study evaluated four spatial conditions in 14 moderate-to-severely hearing-impaired listeners to establish benchmark results for this type of algorithm applied to hearing-impaired listeners. The four spatial conditions were: summed (diotic), separate, the ideal ratio mask, and the ideal binary mask. The results show that the test is sensitive to the change in spatial condition. The temporal position of the cue has a large impact: cueing the target talker before playback focuses attention on the target, whereas cueing after playback requires equal attention to the two talkers, which is more difficult. Furthermore, both ideal masks yield test scores very close to those of the ideal separate spatial condition, suggesting that this technique will be useful for future separation algorithms using estimated rather than ideal masks.
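The two ideal masks used in the study are standard constructions from the target and interferer spectrograms. A minimal sketch (the function name and the toy arrays are illustrative, not from the study):

```python
import numpy as np

def ideal_masks(S, N):
    """Ideal binary mask (IBM) and ideal ratio mask (IRM) computed from
    the STFTs of the target (S) and the interfering talker (N)."""
    ps, pn = np.abs(S) ** 2, np.abs(N) ** 2
    ibm = (ps > pn).astype(float)          # 1 where the target dominates
    irm = ps / np.maximum(ps + pn, 1e-12)  # soft power-ratio weighting
    return ibm, irm

# Toy 1x2 "spectrogram": target dominates the first time-frequency cell,
# the interferer dominates the second.
S = np.array([[3.0, 1.0]])
N = np.array([[1.0, 2.0]])
ibm, irm = ideal_masks(S, N)
```

In the study's setup, each mask would be applied to the mixture STFT and the two masked signals presented one per ear, so that spatial unmasking can help the listener segregate the talkers.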