25 research outputs found

    Two-Dimensional Convolutional Recurrent Neural Networks for Speech Activity Detection

    Speech Activity Detection (SAD) plays an important role in mobile communications and automatic speech recognition (ASR). Developing efficient SAD systems for real-world applications is a challenging task due to the presence of noise. We propose a new approach to SAD where we treat it as a two-dimensional multilabel image classification problem. To classify the audio segments, we compute their Short-time Fourier Transform spectrograms and classify them with a Convolutional Recurrent Neural Network (CRNN), traditionally used in image recognition. Our CRNN uses a sigmoid activation function, max-pooling in the frequency domain, and a convolutional operation as a moving average filter to remove misclassified spikes. On the development set of Task 1 of the 2019 Fearless Steps Challenge, our system achieved a decision cost function (DCF) of 2.89%, a 66.4% improvement over the baseline. Moreover, it achieved a DCF score of 3.318% on the evaluation dataset of the challenge, ranking first among all submissions
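
The spike-removal step described in this abstract can be illustrated with a minimal sketch. It smooths frame-level speech probabilities with a centered moving average and then thresholds them. The function name `smooth_predictions`, the 5-frame window, and the 0.5 threshold are assumptions made for illustration, not values from the paper, where the filter is implemented as a convolutional operation inside the network.

```python
def smooth_predictions(probs, window=5, threshold=0.5):
    """Smooth frame-level speech probabilities with a centered moving
    average, then threshold them into 0/1 speech decisions."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(probs)):
        # Average the frames inside the window; the window is truncated
        # at the edges of the sequence.
        lo = max(0, i - half)
        hi = min(len(probs), i + half + 1)
        smoothed.append(sum(probs[lo:hi]) / (hi - lo))
    return [1 if p >= threshold else 0 for p in smoothed]


# Illustrative input: one isolated high-probability frame in a non-speech region.
frame_probs = [0.1, 0.1, 0.1, 0.9, 0.1, 0.1, 0.1]
decisions = smooth_predictions(frame_probs)
```

With this input the isolated spike is averaged down below the threshold, so every frame is labelled as non-speech.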

    Comparing CNN and Human Crafted Features for Human Activity Recognition

    Deep learning techniques such as Convolutional Neural Networks (CNNs) have shown good results in activity recognition. One advantage of these methods is their ability to generate features automatically. This ability greatly simplifies feature extraction, which usually requires domain-specific knowledge, especially when using big data, where data-driven approaches can lead to anti-patterns. Despite this advantage, very little work has been undertaken on analyzing the quality of the extracted features, and more specifically on how model architecture and parameters affect the ability of those features to separate activity classes in the final feature space. This work focuses on identifying the optimal parameters for recognition of simple activities, applying this approach to signals from both inertial and audio sensors. The paper provides the following contributions: (i) a comparison of automatically extracted CNN features with gold-standard Human Crafted Features (HCF), and (ii) a comprehensive analysis of how architecture and model parameters affect the separation of target classes in the feature space. Results are evaluated using publicly available datasets. In particular, we achieved a 93.38% F-Score on the UCI-HAR dataset, using 1D CNNs with 3 convolutional layers and a kernel size of 32, and a 90.5% F-Score on the DCASE 2017 development dataset, simplified to three classes (indoor, outdoor and vehicle), using 2D CNNs with 2 convolutional layers and a 2x2 kernel size.

    Audio-based Event Recognition System for Smart Homes

    Building an acoustic-based event recognition system for smart homes is a challenging task due to the lack of high-level structures in environmental sounds. In particular, the selection of effective features is still an open problem. We make an important step toward this goal by showing that the combination of Mel-Frequency Cepstral Coefficients, Zero-Crossing Rate, and Discrete Wavelet Transform features can achieve an F1 score of 96.5% and a recognition accuracy of 97.8% with a gradient boosting classifier for ambient sounds recorded in a kitchen environment.
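
Of the three feature types named in this abstract, the zero-crossing rate is the simplest to show. The sketch below computes it for a single audio frame as the fraction of adjacent sample pairs whose signs differ. The helper `tone`, the 8000 Hz sample rate, and the test frequencies are hypothetical choices for illustration and are not taken from the paper.

```python
import math


def zero_crossing_rate(frame):
    """Return the fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def tone(freq_hz, sample_rate=8000, n_samples=800):
    """Generate a synthetic sine tone (illustrative test signal only)."""
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate)
        for n in range(n_samples)
    ]


# A higher-frequency tone changes sign more often per sample.
zcr_low = zero_crossing_rate(tone(100))
zcr_high = zero_crossing_rate(tone(1000))
```

A noisy or high-pitched sound yields a higher rate than a low-pitched one, which is why the feature can help separate different kinds of ambient sound.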

    Energy-based decision engine for household human activity recognition

    We propose a framework for energy-based human activity recognition in a household environment. We apply machine learning techniques to infer the state of household appliances from their energy consumption data and use rule-based scenarios that exploit these states to detect human activity. Our decision engine achieved a 99.1% accuracy for real-world data collected in the kitchens of two smart homes.
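
The two-stage idea in this abstract (appliance states first, then rules over those states) can be sketched as follows. This is a simplified illustration under stated assumptions: the paper infers appliance states with machine learning, whereas this sketch substitutes fixed power thresholds. The appliance names, thresholds, and rules are all hypothetical.

```python
# Hypothetical power thresholds (watts) above which an appliance counts as on.
# The paper infers states with machine learning; thresholds are a stand-in.
ON_THRESHOLDS_W = {"kettle": 1000.0, "microwave": 600.0, "toaster": 500.0}

# Hypothetical rules: each activity requires a set of appliances to be on.
# More specific rules are listed first so they take priority.
ACTIVITY_RULES = [
    ("preparing breakfast", {"kettle", "toaster"}),
    ("making a hot drink", {"kettle"}),
    ("heating food", {"microwave"}),
]


def appliance_states(power_w):
    """Map each known appliance to True (on) or False (off)."""
    return {
        name: power_w.get(name, 0.0) >= threshold
        for name, threshold in ON_THRESHOLDS_W.items()
    }


def detect_activity(power_w):
    """Return the first activity whose required appliances are all on."""
    states = appliance_states(power_w)
    on = {name for name, is_on in states.items() if is_on}
    for activity, required in ACTIVITY_RULES:
        if required <= on:  # subset test: every required appliance is on
            return activity
    return "no activity detected"


reading = {"kettle": 1800.0, "toaster": 900.0}
activity = detect_activity(reading)
```

Ordering the rules from most to least specific keeps a combined scenario from being shadowed by a rule that needs only one of its appliances.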

    Correlation and co-estimation of US indices of cardiac function with MRI measurements of iron deposition in the heart of patients with thalassemia

    Background and objectives: Despite advances in survival in patients with thalassemia major (TM), the most common cause of death is cardiac disease. Regular cardiac follow-up is imperative in order to identify and reverse pathology. Cardiac Magnetic Resonance (CMR) and Echocardiography (US) are applied in parallel to TM patients for cardiac evaluation and ongoing monitoring. Moreover, cardiac iron load can be indirectly quantified by CMR's T2*. However, CMR accessibility is limited, whereas US is relatively inexpensive and readily available. The objectives were to find US parameters that may be useful for predicting cardiac iron and to assess the accuracy and reliability of the two methods, with a particular focus on routine US application. Design and methods: We correlated a number of parameters derived from US with CMR's T2* score in 142 TM patients and compared common volumetric measurements between the two techniques. Results: All patients with decreased left ventricular (LV) shortening fraction (LVSF ≤ 30%) had cardiac iron overload (T2* ≤ 15 ms). After removing these patients from the analysis, a total diameter index (Tdi) >5.57 cm/m2, a left atrial diameter index >2.41 cm/m2, and a diastolic parameter E/A >1.96 were highly specific (91.4%, 97.1% and 96.9% respectively), but had low sensitivity (31.8%, 20.5% and 21.8%), in predicting iron load. A right ventricular index >1.47 cm/m2, an LV systolic index >2.26 cm/m2 or a Tdi >6.26 cm/m2 discriminated patients with severe load from those with no to moderate cardiac iron load, with specificity of 91%, 98.5%, and 98.5%, respectively, but again with low sensitivity. In addition, when comparing the common volumetric parameters between CMR and US (Teichholz's M-mode formula), we found that the correlation for the ejection fraction (EF) was acceptable (r=0.60) without a statistically significant difference (p=0.37), and the Bland & Altman plot range was narrow (25.8%). Interpretation and conclusions: US parameters for cardiac iron overload prediction have limited value, whereas CMR is essential in assessing cardiac iron. However, patients with decreased LVSF (≤ 30%) should be considered a priori as having cardiac iron overload (T2* ≤ 15 ms), and chelation therapy should be intensified. This also applies to patients who meet the above-described US criterion values, whenever CMR is not available. Once a patient is found by CMR to have cardiac iron overload, the aforementioned US criterion values may be useful for ongoing monitoring. US also offers an adequate EF estimation with Teichholz's M-mode formula for routine use, especially when monitoring gross alterations in cardiac function over time, and is easy to perform compared to the high-resolution CMR technique.
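
The sensitivity and specificity figures quoted in this abstract follow from a simple count over a cutoff criterion. The sketch below shows the calculation for a single "index above cutoff" rule. The input values are synthetic and invented purely for illustration; they are not patient data from the study, and the function name is hypothetical.

```python
def sensitivity_specificity(values, has_overload, cutoff):
    """Score the rule 'value > cutoff predicts iron overload'.

    Returns (sensitivity, specificity) as fractions between 0 and 1.
    """
    tp = fn = tn = fp = 0
    for value, positive in zip(values, has_overload):
        predicted = value > cutoff
        if positive and predicted:
            tp += 1
        elif positive:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Synthetic index values (cm/m2) and reference labels, for illustration only.
tdi = [5.0, 5.2, 5.4, 5.8, 6.0, 5.1, 5.3, 5.9]
overload = [False, False, False, False, True, True, True, True]
sens, spec = sensitivity_specificity(tdi, overload, cutoff=5.57)
```

A criterion with high specificity and low sensitivity, as reported here, rarely flags unaffected patients but misses many affected ones.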

    Comparison of echocardiographic (US) volumetry with cardiac magnetic resonance (CMR) imaging in transfusion dependent thalassemia major (TM)

    Background: Despite advances in survival in patients with thalassemia major (TM), the most common cause of death is cardiac disease. Regular cardiac follow-up is imperative in order to identify and reverse pathology. Cardiac Magnetic Resonance (CMR) and Echocardiography (US) are applied in parallel to TM patients for cardiac evaluation and ongoing monitoring. A comparison between mutual features would be useful in order to assess the accuracy and reliability of the two methods, with a particular focus on routine US application. TM's special attributes offer an excellent opportunity for cardiac imaging research that has universal, general-purpose applications. Methods: 135 TM patients underwent US (Teichholz's M-mode formula, a rapidly accessible means of measuring volumes and ejection fraction) and CMR volumetry. A paired-samples t-test, Passing & Bablok regression and a Bland & Altman plot were used to compare the common parameters between CMR and US. Results: We found that the US volumes were underestimated, especially the end-diastolic volume (p ...). Conclusion: In cases where cardiac wall movement abnormalities are absent, the US Teichholz's M-mode formula for volume measurements, though less sophisticated than the high-resolution CMR technique, offers an adequate ejection fraction estimation for routine use, especially when monitoring gross alterations in cardiac function over time, and is easy to perform.

    Feature learning for Human Activity Recognition using Convolutional Neural Networks: A case study for Inertial Measurement Unit and Audio data

    The use of Convolutional Neural Networks (CNNs) as a feature learning method for Human Activity Recognition (HAR) is becoming more and more common. Unlike conventional machine learning methods, which require domain-specific expertise, CNNs can extract features automatically. On the other hand, CNNs require a training phase, making them prone to the cold-start problem. In this work, a case study is presented where the use of a pre-trained CNN feature extractor is evaluated under realistic conditions. The case study consists of two main steps: (1) different topologies and parameters are assessed to identify the best candidate models for HAR, thus obtaining a pre-trained CNN model; (2) the pre-trained model is then employed as a feature extractor, and its use is evaluated with a large-scale real-world dataset. Two CNN applications were considered: Inertial Measurement Unit (IMU) and audio-based HAR. For the IMU data, balanced accuracy was 91.98% on the UCI-HAR dataset, and 67.51% on the real-world Extrasensory dataset. For the audio data, the balanced accuracy was 92.30% on the DCASE 2017 dataset, and 35.24% on the Extrasensory dataset.
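
Balanced accuracy, the metric reported in this abstract, is the mean of the per-class recalls, so a rare class counts as much as a common one. The sketch below shows the calculation. The class labels and predictions are invented for illustration and do not come from the datasets named above.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, prediction in zip(y_true, y_pred):
        total[truth] += 1
        if truth == prediction:
            correct[truth] += 1
    if not total:
        raise ValueError("y_true must not be empty")
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Imbalanced toy example: 8 "walk" samples and 2 "sit" samples.
y_true = ["walk"] * 8 + ["sit"] * 2
y_pred = ["walk"] * 8 + ["walk", "sit"]
score = balanced_accuracy(y_true, y_pred)
```

In this toy example plain accuracy is 90%, while balanced accuracy is 75% because half of the minority class is misclassified.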