
    An IoT System for Converting Handwritten Text to Editable Format via Gesture Recognition

    The evolution of the traditional classroom has led to the electronic classroom, i.e. e-learning. This growth does not stop at e-learning or distance learning: the next step after the electronic classroom is the smart classroom. One of the most popular features of the electronic classroom is capturing video or photos of lecture content and extracting the handwriting for note-taking. Numerous techniques have been implemented to extract handwriting from video or photos of a lecture, but several of them have deficiencies which, if resolved, could turn the electronic classroom into a smart classroom. In this thesis, we present a real-time IoT system that converts handwritten text into an editable format by implementing hand gesture recognition (HGR) with a Raspberry Pi and a camera. HGR is built using an edge detection algorithm and is used to reduce the computational complexity of previous systems, i.e. the removal of redundant images and of the lecturer's body from the image, and the recovery of text from previous images to fill the area from which the lecturer's body has been removed. The Raspberry Pi is used to capture and recognize hand gestures and to build an IoT-based smart classroom. Handwritten images are converted into an editable format using OpenCV and machine learning algorithms. Text conversion covers recognition of uppercase and lowercase letters, numbers, special characters, mathematical symbols, equations, graphs and figures, together with recognition of words, lines, blocks, and paragraphs. With the help of the Raspberry Pi and IoT, the editable lecture notes are delivered to students through a desktop application, which lets them edit notes and images as needed.
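    The abstract does not say which edge detection algorithm the HGR stage uses. As a rough illustration only, the sketch below assumes a Sobel operator with a fixed threshold on a grayscale image stored as a list of rows; the function name and the threshold value are hypothetical and are not taken from the thesis.

```python
def sobel_edges(img, threshold):
    """Return a 0/1 edge map for a 2D list of grayscale values.

    Border pixels are left at 0 because the 3x3 Sobel kernels
    need a full neighbourhood.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                out[y][x] = 1
    return out


# A dark-to-bright vertical boundary: interior pixels next to it are flagged.
step = [[0, 0, 255, 255] for _ in range(4)]
edges = sobel_edges(step, threshold=100)
```

    On a Raspberry Pi, a library routine such as an OpenCV edge detector would likely replace this loop; the pure-Python version is only meant to show what the operation computes.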

    Sewer-ML: A Multi-Label Sewer Defect Classification Dataset and Benchmark

    Perhaps surprisingly, sewerage infrastructure is one of the most costly infrastructures in modern society. Sewer pipes are manually inspected to determine whether the pipes are defective. However, this process is limited by the number of qualified inspectors and the time it takes to inspect a pipe. Automatization of this process is therefore of high interest. So far, the success of computer vision approaches for sewer defect classification has been limited when compared to the success in other fields, mainly due to the lack of public datasets. To this end, in this work we present a large, novel, and publicly available multi-label classification dataset for image-based sewer defect classification called Sewer-ML. The Sewer-ML dataset consists of 1.3 million images annotated by professional sewer inspectors from three different utility companies across nine years. Together with the dataset, we also present a benchmark algorithm and a novel metric for assessing performance. The benchmark algorithm is a result of evaluating 12 state-of-the-art algorithms, six from the sewer defect classification domain and six from the multi-label classification domain, and combining the best performing algorithms. The novel metric is a class-importance weighted F2 score, $\text{F}2_{\text{CIW}}$, reflecting the economic impact of each class, used together with the normal pipe F1 score, $\text{F}1_{\text{Normal}}$. The benchmark algorithm achieves an $\text{F}2_{\text{CIW}}$ score of 55.11% and an $\text{F}1_{\text{Normal}}$ score of 90.94%, leaving ample room for improvement on the Sewer-ML dataset. The code, models, and dataset are available at the project page https://vap.aau.dk/sewer-ml/. Comment: CVPR 2021.
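    The abstract describes the metric only as a class-importance weighted F2 score. A minimal sketch follows, assuming the metric is the importance-weighted mean of per-class F2 scores normalized by the sum of the weights; the class names, counts, and weights are made up for illustration and are not the values used in the paper.

```python
def f_beta(tp, fp, fn, beta):
    """F-beta score from raw counts; returns 0.0 when it is undefined."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def f2_ciw(counts, weights):
    """Weighted mean of per-class F2 scores (assumed form of F2_CIW).

    counts:  {class_name: (tp, fp, fn)}
    weights: {class_name: importance weight}, hypothetical values here
    """
    total = sum(weights[c] for c in counts)
    return sum(weights[c] * f_beta(*counts[c], beta=2) for c in counts) / total


# Hypothetical defect classes with made-up confusion counts and weights.
counts = {"crack": (8, 2, 2), "roots": (5, 5, 0)}
weights = {"crack": 1.0, "roots": 0.5}
score = f2_ciw(counts, weights)
```

    With beta = 2, recall counts four times as much as precision in the denominator, which suits inspection tasks where a missed defect is costlier than a false alarm.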

    Feature Space Augmentation: Improving Prediction Accuracy of Classical Problems in Cognitive Science and Computer Vision

    Prediction accuracy in many classical problems across multiple domains has risen since computational tools such as multi-layer neural nets and complex machine learning algorithms became widely accessible to the research community. In this research, we take a step back and examine the feature space in two problems from very different domains. We show that novel augmentation of the feature space yields higher performance. Emotion Recognition in Adults from a Control Group: The objective is to quantify the emotional state of an individual at any time using data collected by wearable sensors. We define emotional state as a mixture of amusement, anger, disgust, fear, sadness, anxiety and neutral, with their respective levels at any time. The generated model predicts an individual's dominant state and generates an emotional spectrum, a 1x7 vector indicating the level of each emotional state, including anxiety. We present an iterative learning framework that alters the feature space to match an individual's emotion perception and predicts the emotional state using the individual-specific feature space. Hybrid Feature Space for Image Classification: The objective is to improve the accuracy of existing image recognition by leveraging text features from the images. As humans, we perceive objects using colors, dimensions, geometry and any textual information we can gather. Current image recognition algorithms rely exclusively on the first three and do not use the textual information. This study develops and tests an approach that trains a classifier on a hybrid text-based feature space and achieves accuracy comparable to state-of-the-art CNNs while being significantly less expensive computationally. Moreover, when combined with CNNs, the approach yields a statistically significant boost in accuracy. Both models are validated using cross-validation and holdout validation, and are evaluated against the state of the art.

    Road Quality Classification

    Automated evaluation of road quality can be helpful to authorities and also to road users who seek high-quality roads to maximize their driving pleasure. This thesis proposes a model which classifies road images into five qualitative categories based on overall appearance. We present a new manually annotated dataset, collected from Google Street View. The dataset classes were designed for motorcyclists, but they are also applicable to other road users. We experimented with Convolutional Neural Networks, involving both custom architectures and pre-trained networks such as MobileNet or DenseNet. We also ran many experiments with preprocessing methods such as shadow removal and contrast-limited adaptive histogram equalization (CLAHE). Our proposed classification model uses CLAHE and achieves 71% accuracy on a test set. A visual check showed that the model is applicable for its designed purpose despite the modest accuracy, since the image data are often controversial and hard to label even for humans.

    Linear feature selection and classification using PNN and SFAM neural networks for a nearly online diagnosis of bearing naturally progressing degradations.

    In this work, an effort is made to characterize seven bearing states based on the energy entropy of the Intrinsic Mode Functions (IMFs) resulting from Empirical Mode Decomposition (EMD). Three run-to-failure bearing vibration signals, representing different degraded or failing components (roller, inner race and outer race), together with the healthy state, lead to the seven bearing states under study. Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) are used for feature reduction. Then, six classification scenarios are processed via a Probabilistic Neural Network (PNN) and a Simplified Fuzzy Adaptive Resonance Theory Map (SFAM) neural network. In other words, the three extracted feature databases (EMD, PCA and LDA features) are processed first with SFAM and second with a PNN-SFAM combination. The classification accuracy and scattering criterion computed for each scenario show that the EMD-LDA-PNN-SFAM combination is the most suitable strategy for online bearing fault diagnosis. The proposed methodology shows better generalization capability than previous works and is validated by an online bearing fault diagnosis. The proposed strategy can be applied to decision making for several assets.
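    The abstract does not give the energy entropy formula. A minimal sketch follows, assuming the common definition: each IMF's energy is the sum of its squared samples, the energies are normalized into proportions, and the Shannon entropy of those proportions is the feature. The IMFs are assumed to come from a separate EMD routine, which is not shown; the function name is hypothetical.

```python
import math


def energy_entropy(imfs):
    """Shannon entropy (natural log) of the energy distribution over IMFs.

    imfs: list of equal-length sample sequences, one per IMF,
          assumed to be produced by an EMD step performed elsewhere.
    """
    energies = [sum(x * x for x in imf) for imf in imfs]
    total = sum(energies)
    if total == 0:
        return 0.0
    probs = [e / total for e in energies]
    # Zero-energy IMFs contribute nothing, by the convention 0 * log(0) = 0.
    return -sum(p * math.log(p) for p in probs if p > 0)


# Two toy "IMFs" with equal energy give the maximum entropy for two modes.
h = energy_entropy([[1.0, -1.0], [1.0, 1.0]])
```

    Energy concentrated in a single IMF gives an entropy near zero, while energy spread evenly over n IMFs gives log(n), so the value changes as the way a fault distributes vibration energy across modes changes.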

    Defect detection in infrared thermography by deep learning algorithms

    Non-destructive evaluation (NDE) is a field concerned with identifying all types of structural damage in an object of interest without causing any permanent damage or modification. This field has been intensively investigated for many years. Infrared (IR) thermography is an NDE technology for inspecting, characterizing and analyzing defects based on infrared images (sequences) recorded from infrared light emission and reflection, used to evaluate non-self-heating objects for quality control and safety assurance. In recent years, the deep learning field of artificial intelligence has made remarkable progress in image processing applications and has shown its ability to overcome most of the disadvantages of previously existing approaches in a great number of applications. However, owing to insufficient training data, deep learning algorithms remain largely unexplored here, and only a few publications report their application to thermography non-destructive evaluation (TNDE). Intelligent and highly automated deep learning algorithms could be coupled with infrared thermography to identify defects (damage) in composites, steel, etc. with high confidence and accuracy. Among the topics in the TNDE research field, supervised and unsupervised machine learning techniques are the most innovative and challenging tasks for defect detection analysis. In this project, we construct integrated frameworks for processing raw infrared thermography data using deep learning algorithms. The highlights of the proposed methodologies are the following: 1. Automatic defect identification and segmentation by deep learning algorithms in infrared thermography. Pre-trained convolutional neural networks (CNNs) are introduced to capture defect features in infrared thermal images and to implement CNN-based models for the detection of structural defects in samples made of composite materials (fault diagnosis). Several deep CNN alternatives for defect detection in infrared thermography are considered, and the performance of automatic defect detection and segmentation is compared across different deep learning detection methods: (i) instance segmentation (Center-mask; Mask-RCNN); (ii) object detection (Yolo-v3; Faster-RCNN); (iii) semantic segmentation (Unet; Res-unet); 2. A data augmentation technique based on synthetic data generation, which reduces the high cost of collecting original infrared data in composites (aircraft components) and enriches the training data for feature learning in TNDE; 3. Generative adversarial networks (deep convolutional GAN and Wasserstein GAN) are introduced to infrared thermography in association with partial least squares thermography (PLST) (PLS-GANs network) for visible feature extraction of defects and for enhancing defect visibility by removing noise in pulsed thermography; 4. Automatic defect depth estimation (the characterization issue) from simulated infrared data using a simplified recurrent neural network, the Gated Recurrent Unit (GRU), through supervised regression learning.

    Automatic vision based fault detection on electricity transmission components using very highresolution

    Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies. Electricity is indispensable to the day-to-day operations of modern governments and citizens. Fault identification is one of the most significant bottlenecks faced by electricity transmission and distribution utilities in developing countries in delivering credible services to customers and ensuring proper asset audit and management for network optimization and load forecasting. This is due to data scarcity, asset inaccessibility and insecurity, the complexity and untimeliness of ground surveys, and general human cost. In this context, we exploit oblique drone imagery with a high spatial resolution to monitor the condition of four major electric power transmission network (EPTN) components through a fine-tuned deep learning approach, i.e., Convolutional Neural Networks (CNNs). This study explored the capability of the Single Shot Multibox Detector (SSD), a one-stage object detection model, on electric transmission power line imagery to localize, classify and inspect the faults present. The component faults considered include the broken insulator plate, missing insulator plate, missing knob, and rusty clamp. The adopted network used a CNN based on a multiscale layer feature pyramid network (FPN) using aerial image patches and ground truth to localise and detect faults via a one-phase procedure. The SSD Rest50 architecture variation performed the best, with a mean Average Precision of 89.61%. All the developed SSD-based models achieve a high precision rate and a low recall rate in detecting the faulty components, thus achieving an acceptable balance in terms of F1-score and representation. Finally, in line with other works within this domain, deep learning will improve the timeliness of EPTN inspection and component fault mapping in the long run, provided that these deep learning architectures are widely understood, that adequate training samples exist to represent multiple fault characteristics, and that the effects of augmenting available datasets, balancing intra-class heterogeneity, and small-scale datasets are clearly understood.