13 research outputs found

    Towards Arabic Alphabet and Numbers Sign Language Recognition

    This paper proposes a new Arabic sign language recognition system using Restricted Boltzmann Machines and a direct use of tiny images. Restricted Boltzmann Machines are able to code images as a superposition of a limited number of features taken from a larger alphabet. Repeating this process in a deep architecture (Deep Belief Networks) leads to an efficient sparse representation of the initial data in the feature space. A complex classification problem in the input space is thus transformed into an easier one in the feature space. After appropriate coding, a softmax regression in the feature space is sufficient to recognize a hand sign from the input image. To our knowledge, this is the first attempt to show that tiny-image feature extraction using a deep architecture is a simpler alternative approach for Arabic sign language recognition, one that deserves to be considered and investigated.
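
    A minimal sketch of the pipeline the abstract describes, assuming scikit-learn: two stacked Restricted Boltzmann Machines play the role of the deep coding stage (a greedy layer-wise approximation of a Deep Belief Network), and a multinomial logistic (softmax) regression classifies in the feature space. The layer sizes and learning rates are illustrative assumptions, not the paper's settings.

```python
# Hedged sketch: stacked RBMs + softmax regression on flattened tiny images.
# Inputs are assumed to be tiny grayscale images scaled to [0, 1],
# e.g. 32x32 -> 1024-dimensional vectors.
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

dbn_like = Pipeline([
    # Greedy layer-wise training: each RBM is fit on the previous layer's codes.
    ("rbm1", BernoulliRBM(n_components=512, learning_rate=0.05, n_iter=20)),
    ("rbm2", BernoulliRBM(n_components=256, learning_rate=0.05, n_iter=20)),
    # Multinomial logistic regression = a softmax classifier in feature space.
    ("softmax", LogisticRegression(max_iter=1000)),
])
# Usage: dbn_like.fit(X_train, y_train); dbn_like.score(X_test, y_test)
```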

    Semantic Place Recognition for Robots Based on Deep Architecture and a Direct Use of Tiny Images

    No full text
    Human beings are usually able to quickly distinguish between different places solely from their visual appearance, because they can organize their space into discrete units. These units, called "semantic places", are characterized by their spatial extent and their functional unity. Such a semantic category can thus be used as contextual information that fosters object detection and recognition. Recent works in semantic place recognition seek to endow robots with similar capabilities. Contrary to classical localization and mapping work, this problem is usually addressed as a supervised learning problem. Semantic place recognition in robotics - the ability to recognize the semantic category of the place to which a scene belongs - is therefore a major requirement for the future of autonomous robotics. An autonomous service robot must be able to recognize the environment in which it lives and to easily learn the organization of this environment in order to operate and interact successfully. To achieve that goal, different methods have already been proposed: some based on the identification of objects as a prerequisite to the recognition of scenes, and some based on a direct description of scene characteristics. If we hypothesize that objects are more easily recognized when the scene in which they appear is identified, the second approach seems more suitable. It is, however, strongly dependent on the nature of the image descriptors used, which are usually derived empirically from general considerations on image coding. In contrast to these many proposals, another approach to image coding, based on a more theoretical point of view, has emerged in the last few years. Energy-based models of feature extraction, founded on the principle of minimizing an energy function according to the quality of the reconstruction of the image, have led to Restricted Boltzmann Machines (RBMs), which are able to code an image as the superposition of a limited number of features taken from a larger alphabet. It has also been shown that this process can be repeated in a deep architecture, leading to a sparse and efficient representation of the initial data in the feature space. A complex classification problem in the input space is thus transformed into an easier one in the feature space. This approach has been successfully applied to the identification of tiny images from the 80 million images database of MIT. In the present work, we demonstrate that semantic place recognition can be achieved on the basis of tiny images instead of conventional Bag-of-Words (BoW) methods, using Deep Belief Networks (DBNs) for image coding. We show that, after appropriate coding, a softmax regression in the projection space is sufficient to achieve promising classification results. To our knowledge, this approach has not yet been investigated for scene recognition in autonomous robotics. We compare our methods with state-of-the-art algorithms using a standard robot localization database, study the influence of system parameters, and compare different conditions on the same dataset. These experiments show that our proposed model, while being very simple, leads to state-of-the-art results on a semantic place recognition task.
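
    For concreteness, a hedged illustration of the "tiny image" input assumed above: each frame is converted to grayscale, shrunk to a small fixed size, and flattened, so the whole image (rather than local Bag-of-Words descriptors) feeds the deep coder. The 32x32 size and Pillow-based loading are assumptions, not the paper's exact settings.

```python
# Hedged sketch: turn a camera frame into a flattened tiny-image vector.
import numpy as np
from PIL import Image

def to_tiny_vector(path: str, size: int = 32) -> np.ndarray:
    """Grayscale, downscale, and flatten an image to values in [0, 1]."""
    img = Image.open(path).convert("L").resize((size, size), Image.BILINEAR)
    return np.asarray(img, dtype=np.float32).ravel() / 255.0
```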

    A Recommendation System for Selecting the Appropriate Undergraduate Program at Higher Education Institutions Using Graduate Student Data

    No full text
    Selecting the appropriate undergraduate program is a critical decision for students. Many elements influence this choice for secondary-school students, including financial, social, demographic, and cultural factors. A poor choice has implications for the student's academic life as well as their professional life: having to change major, which delays graduation; earning a low grade-point average (GPA) in the chosen major, which causes difficulties in finding a job; or even dropping out of university. In this paper, various supervised machine learning techniques, including Decision Tree, Random Forest, and Support Vector Machine, were investigated to predict undergraduate majors. The input features were related to the student's academic history and the job market. Based on previous data and experience, we were able to recommend, for Master of Business Administration (MBA) students, the program that guarantees both a high academic degree and employment. This research builds on a published study, uses the same dataset, and aims to improve its results by applying hyperparameter tuning, which was absent from the previous work. The obtained results show that our work outperforms the published study: random forest exceeded the other classification techniques and reached an accuracy of 97.70%, compared to 75.00% in the published research. Feature importance was also investigated, and the degree percentage, MBA percentage, and entry test result were found to be the top contributing features to the model.
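
    A hedged sketch of the hyperparameter tuning step this work adds over the baseline, using scikit-learn's GridSearchCV with a random forest; the grid values and scoring choice are assumptions, not the paper's actual search space.

```python
# Hedged sketch: cross-validated hyperparameter search for a random forest.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

param_grid = {  # illustrative grid, not the paper's
    "n_estimators": [100, 300, 500],
    "max_depth": [None, 10, 20],
    "min_samples_leaf": [1, 2, 5],
}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,
    scoring="accuracy",
)
# Usage: search.fit(X_train, y_train); search.best_params_, search.best_score_
```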

    Infant Cry Signal Diagnostic System Using Deep Learning and Fused Features

    No full text
    Early diagnosis of medical conditions in infants is crucial for ensuring timely and effective treatment. However, infants are unable to verbalize their symptoms, making it difficult for healthcare professionals to accurately diagnose their conditions. Crying is often the only way for infants to communicate their needs and discomfort. In this paper, we propose a medical diagnostic system for interpreting infants' cry audio signals (CAS) using a combination of different audio-domain features and deep learning (DL) algorithms. The proposed system utilizes a dataset of labeled audio signals from infants with specific pathologies. The dataset includes two infant pathologies with high mortality rates, neonatal respiratory distress syndrome (RDS) and sepsis, as well as healthy cries. The system employs the harmonic ratio (HR) as a prosodic feature, Gammatone frequency cepstral coefficients (GFCCs) as cepstral features, and image-based features extracted from the spectrogram with a pretrained convolutional neural network (CNN) model; these are fused with the other features so that multiple domains contribute to improving the classification rate and the accuracy of the model. Different combinations of the fused features are then fed into multiple machine learning algorithms, including random forest (RF), support vector machine (SVM), and deep neural network (DNN) models. Evaluation of the system using accuracy, precision, recall, F1-score, the confusion matrix, and the receiver operating characteristic (ROC) curve showed promising results for the early diagnosis of medical conditions in infants based on crying signals alone: the system achieved its highest accuracy of 97.50% using the combination of the spectrogram, HR, and GFCC features through the deep learning process. The findings demonstrate the importance of fusing different audio features, especially the spectrogram, through the learning process rather than by simple concatenation, and of using deep learning algorithms to extract sparsely represented features for the subsequent classification problem, which improves the separation between different infant pathologies. The results outperform the published benchmark paper by extending the task to multiclass classification (RDS, sepsis, and healthy), investigating a new feature type (the spectrogram), and applying a new fusion technique: fusion through the learning process of the deep learning model.
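
    A hedged sketch of the fusion idea under stated assumptions: the spectrogram is embedded with a pretrained CNN (ResNet-18 here stands in for the paper's unnamed pretrained model) and concatenated with a hand-crafted feature vector; MFCCs stand in for the GFCC/HR features, which have no off-the-shelf librosa implementation.

```python
# Hedged sketch: fuse a CNN spectrogram embedding with hand-crafted features.
import numpy as np
import librosa
import torch
import torchvision.models as models

def spectrogram_embedding(wave: np.ndarray, sr: int) -> np.ndarray:
    """Embed a mel spectrogram with a pretrained ResNet-18 (assumed stand-in)."""
    mel = librosa.feature.melspectrogram(y=wave, sr=sr, n_mels=128)
    mel_db = librosa.power_to_db(mel, ref=np.max)
    # Replicate the single channel to the 3 channels the CNN expects.
    x = torch.tensor(np.stack([mel_db] * 3), dtype=torch.float32).unsqueeze(0)
    cnn = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    cnn.fc = torch.nn.Identity()          # drop the classification head
    cnn.eval()
    with torch.no_grad():
        return cnn(x).squeeze(0).numpy()  # 512-dimensional embedding

def fused_features(wave: np.ndarray, sr: int) -> np.ndarray:
    # MFCCs as a stand-in for the paper's GFCC/HR features.
    mfcc = librosa.feature.mfcc(y=wave, sr=sr, n_mfcc=20).mean(axis=1)
    return np.concatenate([spectrogram_embedding(wave, sr), mfcc])
# The fused vectors can then feed an RF, SVM, or DNN classifier.
```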

    Newborn Cry-Based Diagnostic System to Distinguish between Sepsis and Respiratory Distress Syndrome Using Combined Acoustic Features

    No full text
    Crying is the only means of communication for a newborn baby with its surrounding environment, but it also provides significant information about the newborn's health, emotions, and needs. The cries of newborn babies have long been known as a biomarker for the diagnosis of pathologies. However, to the best of our knowledge, discriminating between two pathology groups by means of cry signals is unprecedented. This study therefore aimed to distinguish septic newborns from those with Neonatal Respiratory Distress Syndrome (RDS) by employing the Machine Learning (ML) methods of Multilayer Perceptron (MLP) and Support Vector Machine (SVM). The cry signal was analyzed from two different perspectives: (1) a musical perspective, through the spectral feature set of the Harmonic Ratio (HR), and (2) a speech-processing perspective, through the short-term feature set of Gammatone Frequency Cepstral Coefficients (GFCCs). To assess the role of employing features from both short-term and spectral modalities in distinguishing the two pathology groups, they were fused into one feature set, named the combined features. The hyperparameters (HPs) of the implemented ML approaches were fine-tuned to fit each experiment. By normalizing and fusing the features originating from the two modalities, the overall performance of the proposed design improved across all evaluation measures, achieving accuracies of 92.49% and 95.3% with the MLP and SVM classifiers, respectively. The MLP classifier was outperformed on all evaluation measures presented in this study except the Area Under the Receiver Operating Characteristic Curve (AUC-ROC), which signifies the proposed design's ability to separate the classes. The achieved results highlight the value of combining features from different levels and modalities for a more powerful analysis of cry signals, as well as of including a neural network (NN)-based classifier. The 95.3% accuracy attained for separating the two entangled pathology groups of RDS and sepsis shows promising potential for further studies with larger datasets and more pathology groups.
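
    A minimal sketch of the combined-features experiment, assuming per-modality normalization before concatenation and an RBF-kernel SVM; the stand-in data, dimensionalities, and variable names are assumptions, not the study's pipeline.

```python
# Hedged sketch: normalize each feature modality, fuse, and classify with SVM.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
gfcc = rng.normal(size=(100, 13))   # stand-in short-term (GFCC) features
hr = rng.normal(size=(100, 4))      # stand-in spectral (HR) features
y = rng.integers(0, 2, size=100)    # 0 = RDS, 1 = sepsis

combined = np.hstack([              # fuse after per-modality normalization
    StandardScaler().fit_transform(gfcc),
    StandardScaler().fit_transform(hr),
])
clf = SVC(kernel="rbf", C=1.0).fit(combined, y)
```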

    Deep Learning Approach for Automatic Classification of Ocular and Cardiac Artifacts in MEG Data

    No full text
    We propose an artifact classification scheme based on a combined deep and convolutional neural network (DCNN) model to automatically identify cardiac and ocular artifacts in neuromagnetic data, without the need for additional electrocardiogram (ECG) and electrooculogram (EOG) recordings. From independent components, the model uses both the spatial and the temporal information of the decomposed magnetoencephalography (MEG) data. In total, 7122 samples were used after data augmentation; task-related and non-task-related MEG recordings from 48 subjects served as the database for this study. Artifact rejection was applied using the combined model, which achieved a sensitivity of 91.8% and a specificity of 97.4%. The overall accuracy of the model was validated using a cross-validation test and revealed a median accuracy of 94.4%, indicating high reliability of DCNN-based artifact removal in task-related and non-task-related MEG experiments. The major advantages of the proposed method are as follows: (1) it is a fully automated and user-independent workflow for artifact classification in MEG data; (2) once the model is trained, there is no need for auxiliary signal recordings; and (3) the flexibility in model design and training allows for various modalities (MEG/EEG) and various sensor types.
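
    A hedged PyTorch sketch of the combined spatial-temporal idea (not the authors' published architecture): one branch convolves over an independent component's sensor topography, the other over its time course, and the two embeddings are concatenated before classification. All layer sizes and class labels are assumptions.

```python
# Hedged sketch: two-branch network over an IC's spatial map and time course.
import torch
import torch.nn as nn

class ICArtifactNet(nn.Module):
    def __init__(self, n_classes: int = 3):  # e.g. cardiac / ocular / neural
        super().__init__()
        # Spatial branch: 2-D CNN over the interpolated sensor topography.
        self.spatial = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Temporal branch: 1-D CNN over the IC time course.
        self.temporal = nn.Sequential(
            nn.Conv1d(1, 16, 7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(16, 32, 7, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32 + 32, n_classes)

    def forward(self, topo, ts):
        # Concatenate the two branch embeddings, then classify.
        return self.head(torch.cat([self.spatial(topo), self.temporal(ts)], dim=1))

# Usage: topography image (batch, 1, 64, 64) and time course (batch, 1, 1000).
logits = ICArtifactNet()(torch.randn(2, 1, 64, 64), torch.randn(2, 1, 1000))
```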

    Wearable Devices and Explainable Unsupervised Learning for COVID-19 Detection and Monitoring

    No full text
    Despite declining COVID-19 cases, global healthcare systems still face significant challenges due to ongoing infections, especially among fully vaccinated individuals, including adolescents and young adults (AYA). To tackle this issue, cost-effective alternatives utilizing technologies such as Artificial Intelligence (AI) and wearable devices have emerged for disease screening, diagnosis, and monitoring. However, many AI solutions in this context rely heavily on supervised learning techniques, which pose challenges such as the reliability of human labeling and time-consuming data annotation. In this study, we propose an unsupervised framework that leverages smartwatch data to detect and monitor COVID-19 infections. We utilize longitudinal data, including heart rate (HR), heart rate variability (HRV), and physical activity measured via step count, collected through the continuous monitoring of volunteers. Our goal is to offer effective and affordable solutions for COVID-19 detection and monitoring. Our unsupervised framework forms interpretable clusters of normal and abnormal measures, facilitating the detection of disease progression. Additionally, we enhance result interpretation by leveraging the language model Davinci GPT-3 to gain deeper insights into the underlying data patterns and relationships. Our results demonstrate the effectiveness of unsupervised learning, achieving a Silhouette score of 0.55. Furthermore, validation using supervised learning techniques yields high accuracy (0.884 ± 0.005), precision (0.80 ± 0.112), and recall (0.817 ± 0.037). These promising findings indicate the potential of unsupervised techniques for identifying inflammatory markers, contributing to the development of efficient and reliable COVID-19 detection and monitoring methods. Our study shows the capabilities of AI and wearables in the pursuit of low-cost, accessible solutions for health challenges related to inflammatory diseases, opening new avenues for scalable and widely applicable health monitoring solutions.
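
    A hedged sketch of the unsupervised step: clustering per-day wearable summaries into normal/abnormal groups and scoring the partition with the silhouette coefficient the abstract reports. The stand-in data and the choice of k-means are assumptions, not the study's exact method.

```python
# Hedged sketch: cluster wearable summaries and compute a silhouette score.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Stand-in data: [resting HR, HRV (RMSSD), daily step count] per day.
X = rng.normal([65.0, 45.0, 8000.0], [8.0, 12.0, 2500.0], size=(200, 3))

X_std = StandardScaler().fit_transform(X)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_std)
print("silhouette:", silhouette_score(X_std, labels))
```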