6 research outputs found
Rancang Bangun Aplikasi Pendeteksi Suara Tangisan Bayi
An infant's cry is a sign that the infant is experiencing a problem, but not everyone can recognize what a cry means. Several studies on infant cry detection have been carried out, yet there is currently no research that produces a web-based infant cry detection application. In this study, an application was built to help users recognize infant cries based on the Dunstan Baby Language. The method applied consists of feature extraction from the cry signal with the Mel-Frequency Cepstrum Coefficient (MFCC) algorithm, normalization of the extracted features, and K-nearest Neighbor classification. From the various tests performed, it can be concluded that the best average accuracy of 75.95% is achieved with an MFCC wintime parameter of 0.08 seconds, a split of 85% training data and 15% test data per class, Standard Deviation Normalization of the extracted features, and K-nearest Neighbor classification with k=1. When the application is tested on all the data, an average accuracy of 96.57% is achieved with an MFCC wintime of 0.08 seconds, 85% training data per class, Standard Deviation Normalization, and K-nearest Neighbor classification with k=1.
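The wintime parameter above controls the length of each MFCC analysis window. As a rough illustration of what that setting means (the function names and the 8 kHz sample rate are assumptions for this sketch, not taken from the paper), a signal can be split into 0.08 s frames, and each frame's frequencies are mapped onto the mel scale that MFCC builds on:

```python
import math

def hz_to_mel(f_hz):
    # Standard mel-scale mapping used in MFCC pipelines.
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def frame_signal(samples, sample_rate, wintime=0.08, hoptime=0.04):
    # Split a 1-D signal into overlapping analysis windows.
    # wintime=0.08 s matches the best-performing setting reported above.
    win = int(round(wintime * sample_rate))
    hop = int(round(hoptime * sample_rate))
    return [samples[i:i + win]
            for i in range(0, len(samples) - win + 1, hop)]

# Hypothetical 1-second signal at 8 kHz: 0.08 s windows of 640 samples.
signal = [0.0] * 8000
frames = frame_signal(signal, 8000)
print(len(frames), len(frames[0]))  # 24 frames of 640 samples each
```

A longer wintime averages over more of the cry per frame, which is presumably why the reported accuracy is sensitive to this parameter.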
Application of Pattern Recognition Techniques to the Classification of Full-Term and Preterm Infant Cry
Objectives: Scientific and clinical advances in perinatology and neonatology have enhanced the chances of survival of preterm and very low weight neonates. Infant cry analysis is a suitable noninvasive complementary tool for assessing the neurologic state of infants, particularly important in the case of preterm neonates. This article aims at exploiting differences between full-term and preterm infant cry with robust automatic acoustical analysis and data mining techniques. Study design: Twenty-two acoustical parameters are estimated in more than 3000 cry units from cry recordings of 28 full-term and 10 preterm newborns. Methods: Feature extraction is performed through the BioVoice dedicated software tool, developed at the Biomedical Engineering Lab, University of Firenze, Italy. Classification and pattern recognition are based on genetic algorithms for the selection of the best attributes. Training is performed by comparing four classifiers: Logistic Curve, Multilayer Perceptron, Support Vector Machine, and Random Forest, and three different testing options: full training set, 10-fold cross-validation, and 66% split. Results: The best feature set is made up of 10 parameters capable of assessing differences between preterm and full-term newborns with about 87% accuracy. Best results are obtained with the Random Forest method (receiver operating characteristic area, 0.94). Conclusions: These 10 cry features might convey important additional information to assist the clinical specialist in the diagnosis and follow-up of possible delays or disorders in neurologic development due to premature birth in this extremely vulnerable population of patients. The proposed approach is a first step toward an automatic infant cry recognition system for fast and proper identification of risk in preterm babies.
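The three testing options mentioned can be sketched as data-partitioning routines (a plain-Python illustration; BioVoice and the four classifiers themselves are not reproduced here, and the function names are invented for this sketch). The "full training set" option simply trains and evaluates on all samples; the other two are:

```python
import random

def k_fold_indices(n, k=10, seed=0):
    # Partition n sample indices into k disjoint folds for
    # k-fold cross-validation (k=10 as in the study design).
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def percentage_split(n, train_frac=0.66, seed=0):
    # The "66% split" option: one random train/test partition.
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(n * train_frac)
    return idx[:cut], idx[cut:]

# Roughly the scale of the study: 3000 cry units.
folds = k_fold_indices(3000, k=10)
train, test = percentage_split(3000)
print(sum(len(f) for f in folds), len(train), len(test))
```

In 10-fold cross-validation each fold serves once as the test set while the other nine are used for training, so every cry unit is tested exactly once.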
Rancang Bangun Aplikasi Pendeteksi Suara Tangisan Bayi
An infant's cry is a sign that the infant is experiencing a problem. The cry can be used to identify that problem, such as hunger, pain, drowsiness, discomfort, feeling cold or hot, and others. However, not everyone can recognize the meaning of an infant's cry.
Several studies on infant cry detection have been carried out by various researchers, but there is currently no research that develops a web-based infant cry detection application. In this final project, an application was developed to help users identify infant cries based on the Dunstan Baby Language. The application was built with the R programming language, version 3.3.1, using the tuneR package for Mel-Frequency Cepstrum Coefficient (MFCC) feature extraction. The method applied in the application consists of MFCC feature extraction from the cry signal, normalization of the extracted features, and K-nearest Neighbor classification.
From the various tests performed, it can be concluded that the best average accuracy of 75.95% is achieved with an MFCC wintime parameter of 0.08 seconds, a split of 85% training data and 15% test data per class, Standard Deviation Normalization of the extracted features, and K-nearest Neighbor classification with k=1. This result is better than those of other classification methods such as Naive Bayes, Neural Network, and Support Vector Machine. Choosing the amount of training data in proportion to the size of each cry class yields better classification accuracy than using a balanced number of training samples per class.
When the application is tested on all the data, an average accuracy of 96.57% is achieved with an MFCC wintime of 0.08 seconds, 85% training data per class, Standard Deviation Normalization, and K-nearest Neighbor classification with k=1. From this test, it can be concluded that the application performs well when classifying the full infant cry dataset.
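The thesis implements this pipeline in R 3.3.1 with the tuneR package; the normalization and classification stages it describes can be sketched in Python as follows (the toy feature vectors, labels, and helper names are illustrative assumptions, not the thesis code):

```python
import math

def sd_normalize(matrix):
    # Standard Deviation Normalization: per-feature z-score,
    # (x - mean) / sd, computed column-wise over the training set.
    n, d = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n for j in range(d)]
    sds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in matrix) / n) or 1.0
           for j in range(d)]
    norm = [[(row[j] - means[j]) / sds[j] for j in range(d)] for row in matrix]
    return norm, means, sds

def knn_predict(train_x, train_y, x, k=1):
    # k-nearest neighbor by Euclidean distance; k=1 as in the best run.
    dists = sorted((math.dist(row, x), y) for row, y in zip(train_x, train_y))
    votes = [y for _, y in dists[:k]]
    return max(set(votes), key=votes.count)

# Toy 2-D feature vectors standing in for MFCC features (illustrative only).
X = [[1.0, 2.0], [1.1, 1.9], [8.0, 9.0], [8.2, 8.8]]
y = ["hungry", "hungry", "sleepy", "sleepy"]
Xn, means, sds = sd_normalize(X)
# A query must be normalized with the training-set statistics.
query = [(v - m) / s for v, m, s in zip([8.1, 9.1], means, sds)]
print(knn_predict(Xn, y, query, k=1))  # sleepy
```

Normalizing the query with the training means and standard deviations, rather than its own statistics, is what keeps the distance comparison consistent.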
Infant Cry Signal Processing, Analysis, and Classification with Artificial Neural Networks
As a special type of speech and environmental sound, infant cry has been a growing research area covering infant cry reason classification, pathological infant cry identification, and infant cry detection in the past two decades. In this dissertation, we build a new dataset, explore new feature extraction methods, and propose novel classification approaches, to improve the infant cry classification accuracy and identify diseases by learning infant cry signals.
We propose a method of generating weighted prosodic features combined with acoustic features for a deep learning model to improve the performance of asphyxiated infant cry identification. The combined feature matrix captures the diversity of variations within infant cries, and the result outperforms all other related studies on asphyxiated baby crying classification. We propose a non-invasive, fast method of using infant cry signals with convolutional neural network (CNN) based age classification to diagnose abnormality of infant vocal tract development as early as 4 months of age. Experiments reveal the pattern and tendency of vocal tract changes and predict abnormality of the infant vocal tract by classifying cry signals into a younger age category. We propose an approach of generating a hybrid feature set and using prior knowledge in a multi-stage CNN model for robust infant sound classification. The dominant and auxiliary features within the set help enlarge the coverage while keeping a good resolution for modeling the diversity of variations within infant sound, and the experimental results give encouraging improvements on two related databases. We propose an approach of graph convolutional networks (GCN) with transfer learning for robust infant cry reason classification. Non-fully connected graphs based on the similarities among the relevant nodes are built to consider the short-term and long-term effects of infant cry signals related to inner-class and inter-class messages. With as little as 20% labeled training data, our model outperforms the CNN model with 80% labeled training data in both supervised and semi-supervised settings. Lastly, we apply mel-spectrogram decomposition to infant cry classification and propose a fusion method to further improve infant cry classification performance.
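The non-fully connected graphs described above can be illustrated by connecting only those nodes whose feature similarity clears a threshold (cosine similarity and the 0.9 threshold are assumptions for this sketch; the dissertation's exact graph construction may differ):

```python
import math

def cosine_sim(a, b):
    # Cosine similarity between two feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def similarity_graph(features, threshold=0.9):
    # Adjacency list: connect a pair of nodes only when their
    # similarity exceeds the threshold, so the resulting graph
    # is not fully connected.
    n = len(features)
    adj = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if cosine_sim(features[i], features[j]) > threshold:
                adj[i].append(j)
                adj[j].append(i)
    return adj

# Toy embeddings: nodes 0 and 1 point in similar directions; node 2 does not.
feats = [[1.0, 0.1], [0.9, 0.2], [0.0, 1.0]]
print(similarity_graph(feats))  # {0: [1], 1: [0], 2: []}
```

A GCN layer then propagates information only along these edges, which is how inner-class and inter-class relations between cry segments enter the model.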
Non Invasive Tools for Early Detection of Autism Spectrum Disorders
Autism Spectrum Disorders (ASDs) describe a set of neurodevelopmental disorders. ASD represents a significant public health problem. Currently, ASDs are not diagnosed before the 2nd year of life, but early identification of ASDs would be crucial, as early interventions are much more effective than specific therapies started in later childhood. To this aim, cheap and contact-less automatic approaches have recently aroused great clinical interest. Among them, the cry and the movements of the newborn, both involving the central nervous system, are proposed as possible indicators of neurological disorders. This PhD work is a first step towards solving this challenging problem.
An integrated system is presented enabling the recording of audio (crying) and video (movements) data of the newborn, their automatic analysis with innovative techniques for the extraction of clinically relevant parameters, and their classification with data mining techniques. New robust algorithms were developed for the selection of the voiced parts of the cry signal, the estimation of acoustic parameters based on the wavelet transform, and the analysis of the infant's general movements (GMs) through a new body model for segmentation and 2D reconstruction. In addition to a thorough literature review, this thesis presents the state of the art on these topics, which shows that no studies exist concerning normative ranges for newborn infant cry in the first 6 months of life, nor the correlation between cry and movements.
Through the new automatic methods, a population of control infants ("low-risk", LR) was compared to a group of "high-risk" (HR) infants, i.e. siblings of children already diagnosed with ASD. A subset of LR infants clinically confirmed as having Typical Development (TD) and one infant affected by ASD were also compared. The results show that the selected acoustic parameters allow good differentiation between the two groups. This result provides new perspectives, both diagnostic and therapeutic.
Vocal behaviour as an indicator of lamb vigour
The viability and survival of the neonate lamb relies on its ability to communicate and maintain a strong attachment with its dam. To date there has been little concise information available about the role of the lamb's behaviour, and in particular the importance of acoustic cues, in this relationship as greater attention has been focused on maternal attributes important in facilitating the maternal-young bond. In human and rodent neonates, acoustic features of the distress vocalisation are used as indices of neurological deficit and integrity both at birth and in infant acoustic cry analysis. The aim of this thesis was to investigate potential behavioural indicators of lamb vigour, with a particular focus on vocal behaviour, within the first 12 hours of life. Such measures could provide valuable information for development of reproductive breeding objectives, and provide clarity regarding the role of the lamb in failed maternal-young interactions. Delayed vocalisation initiation in response to a separation stimulus was found to be associated with poor vigour-related behaviour reflecting the capacity of the lamb to reunite and follow the dam over 12 hours postpartum. Vocalisation delay was also associated with risk factors related to poor lamb survival including longer parturition duration, male sex, first parity, heavier birth weight and sire-related conformational attributes likely to result in a more difficult birth. Blood assay markers reflecting fetal distress including poor blood oxygenation, and elevated plasma glucose and lactate levels sampled at birth were also demonstrated to be correlated with vocalisation latency. These associations were concluded to reflect impacts on the lamb's neurological system rather than genetic influences because of evidence provided by within-litter comparisons, and to demonstrate neuroregenerative processes over a 12 hour measurement period. 
An analysis of lamb distress signals modelled on acoustic cry analysis of the human neonate was also undertaken to compare vocalisation characteristics of lambs with delayed responses to those with rapid responses indicating vigour. Signal features of delayed response lambs were more likely to demonstrate acoustic parameters reflecting glottal instability, lower amplitude and reduced repetition rate. These lambs were more likely to emit inefficient or inappropriate signals in the context of isolation. A significantly higher fundamental frequency, an indicator of pathology in the human infant, was not clearly demonstrated to be associated with compromised lambs in this study. It was also found in a two-choice test, where sheep dams were required to demonstrate a preference for signals of their own co-twins, that ewes preferred acoustic signals of lambs correlated with rapid vocalisation response, higher pitch and greater signal stability. The results indicate that delayed vocalisation responsiveness and other acoustic measures are associated with fetal compromise in the neonate lamb, as shown in the human and rodent models. It was concluded that delayed vocal initiation is a marker for poor postnatal outcome characterised by diminished responsiveness to a distress condition. This research has important implications for understanding failed maternal-young relationships and the consequences for survival in mammalian neonates