
    Coping with Data Scarcity in Deep Learning and Applications for Social Good

    Recent years have seen an extremely fast evolution of the Computer Vision and Machine Learning fields: several application domains benefit from the newly developed technologies, and industries are investing a growing amount of money in Artificial Intelligence. Convolutional Neural Networks and Deep Learning have substantially contributed to the rise and diffusion of AI-based solutions, creating the potential for many disruptive new businesses. The effectiveness of Deep Learning models is grounded in the availability of a huge amount of training data. Unfortunately, data collection and labeling are extremely expensive in terms of both time and cost; moreover, they frequently require the collaboration of domain experts. In the first part of the thesis, I will investigate some methods for reducing the cost of data acquisition for Deep Learning applications in the relatively constrained industrial scenarios related to visual inspection. I will first assess the effectiveness of Deep Neural Networks in comparison with several classical Machine Learning algorithms that require a smaller amount of data to be trained. I will then introduce a hardware-based data augmentation approach, which leads to a considerable performance boost by taking advantage of a novel illumination setup designed for this purpose. Finally, I will investigate the situation in which acquiring a sufficient number of training samples is not possible, and in particular the most extreme case: zero-shot learning (ZSL), the problem of multi-class classification when no training data is available for some of the classes. Visual features designed for image classification and trained offline have been shown to be useful for ZSL to generalize towards classes not seen during training. Nevertheless, I will show that recognition performance on unseen classes can be sharply improved by learning ad hoc semantic embeddings (the pre-defined lists of present and absent attributes that represent a class) and visual features, to increase the correlation between the two geometrical spaces and ease the metric learning process for ZSL. In the second part of the thesis, I will present some successful applications of state-of-the-art Computer Vision, Data Analysis and Artificial Intelligence methods. I will illustrate some solutions developed during the 2020 Coronavirus Pandemic for controlling the disease evolution and for reducing virus spreading. I will describe the first publicly available dataset for the analysis of face-touching behavior that we annotated and distributed, and I will illustrate an extensive evaluation of several computer vision methods applied to the produced dataset. Moreover, I will describe the privacy-preserving solution we developed for estimating the “Social Distance” and its violations, given a single uncalibrated image in unconstrained scenarios. I will conclude the thesis with a Computer Vision solution developed in collaboration with the Egyptian Museum of Turin for digitally unwrapping mummies by analyzing their CT scans, to support archaeologists during mummy analysis and to avoid the devastating and irreversible process of physically unwrapping the bandages to remove amulets and jewels from the body.
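
    As a minimal illustration of the attribute-based zero-shot learning setup described above, the sketch below projects visual features into the semantic (attribute) space and classifies unseen classes by the nearest class attribute vector. The feature and attribute dimensions, the linear projection head, and the training loop are assumptions for illustration only, not the thesis implementation.

        # Hedged sketch of attribute-based ZSL: a linear projection from visual
        # features to the attribute space, trained on seen classes and applied
        # to unseen classes by nearest attribute vector (sizes are placeholders).
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class AttributeProjector(nn.Module):
            """Projects visual features into the semantic (attribute) space."""
            def __init__(self, feat_dim=2048, attr_dim=85):
                super().__init__()
                self.proj = nn.Linear(feat_dim, attr_dim)

            def forward(self, x):
                # L2-normalise so that scoring reduces to cosine similarity
                return F.normalize(self.proj(x), dim=-1)

        def zsl_logits(model, feats, class_attributes):
            """Score each sample against every class attribute vector."""
            emb = model(feats)                              # (B, attr_dim)
            attrs = F.normalize(class_attributes, dim=-1)   # (C, attr_dim)
            return emb @ attrs.t()                          # (B, C)

        # Training on seen classes (one illustrative step with random data)
        model = AttributeProjector()
        optim = torch.optim.Adam(model.parameters(), lr=1e-4)
        seen_attrs = torch.rand(40, 85)      # placeholder seen-class attributes
        feats = torch.randn(32, 2048)        # placeholder visual features
        labels = torch.randint(0, 40, (32,))
        loss = F.cross_entropy(zsl_logits(model, feats, seen_attrs), labels)
        loss.backward()
        optim.step()

        # Inference on unseen classes: the nearest attribute vector wins
        unseen_attrs = torch.rand(10, 85)    # placeholder unseen-class attributes
        pred = zsl_logits(model, feats, unseen_attrs).argmax(dim=1)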

    Detection and Localization of Root Damages in Underground Sewer Systems using Deep Neural Networks and Computer Vision Techniques

    The maintenance of a healthy sewer infrastructure is a major challenge due to root damage from nearby plants that grow through pipe cracks or loose joints, which may lead to serious pipe blockages and collapse. Traditional inspections based on video surveillance to identify and localize root damages within such complex sewer networks are inefficient, laborious, and error-prone. Therefore, this study aims to develop a robust and efficient approach to automatically detect root damages and localize their circumferential and longitudinal positions in CCTV inspection videos by applying deep neural networks and computer vision techniques. With twenty inspection videos collected from various sources, keyframes were extracted from each video according to the frame difference in the LUV color space with a selection of local maxima. To recognize distance information from video subtitles, OCR models such as Tesseract and CRNN-CTC were implemented and achieved a recognition accuracy of 90%. In addition, a pre-trained segmentation model was applied to detect root damages, but it produced many false positive predictions. By applying a well-tuned YoloV3 model to the detection of pipe joints and leveraging the Convex Hull Overlap (CHO) feature, we were able to achieve a 20% improvement in the reliability and accuracy of damage identification. Moreover, an end-to-end deep learning pipeline involving the Triangle Similarity Theorem (TST) was successfully designed to predict the longitudinal position of each identified root damage, with a prediction error of less than 1.0 foot.
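
    A minimal sketch of the keyframe-extraction step described above (frame differencing in the LUV color space followed by local-maxima selection) could look like the following; the window size and the mean-absolute-difference metric are assumptions for illustration, not the study's exact settings.

        # Hedged sketch: keyframe selection by LUV frame differencing with
        # local-maxima selection (window size and metric are assumptions).
        import cv2
        import numpy as np

        def keyframes_by_luv_difference(video_path, window=15):
            cap = cv2.VideoCapture(video_path)
            diffs, prev = [], None
            while True:
                ok, frame = cap.read()
                if not ok:
                    break
                luv = cv2.cvtColor(frame, cv2.COLOR_BGR2LUV).astype(np.float32)
                if prev is not None:
                    diffs.append(np.mean(np.abs(luv - prev)))  # mean LUV change
                prev = luv
            cap.release()

            # Keep frames whose difference is a local maximum within +/- window
            keyframe_ids = []
            for i, d in enumerate(diffs):
                lo, hi = max(0, i - window), min(len(diffs), i + window + 1)
                if d == max(diffs[lo:hi]):
                    keyframe_ids.append(i + 1)  # diff i compares frames i and i+1
            return keyframe_ids

        # Example call (hypothetical file name):
        # print(keyframes_by_luv_difference("inspection_video.mp4"))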

    Surface and Sub-Surface Analyses for Bridge Inspection

    The development of bridge inspection solutions has been discussed in the recent past. In this dissertation, significant developments and improvements on the state-of-the-art in bridge inspection using multiple sensors (e.g., ground penetrating radar (GPR) and visual sensors) are proposed. In the first part of this research (discussed in chapter 3), the focus is on developing effective and novel methods for the sub-surface detection and localization of steel rebars. The data was collected using a GPR sensor on real bridge decks. In this regard, a number of different approaches have been successively developed that continue to improve the state-of-the-art in this particular research area. The second part of this research (discussed in chapter 4) deals with the development of an automated steel bridge defect detection system using a Multi-Directional Bicycle Robot. The training data was acquired from actual bridges in Vietnam, and validation was performed on data collected with the Bicycle Robot on an actual bridge on Highway 80 near Lovelock, Nevada, USA. A number of proposed methods are discussed in chapter 4. The final chapter of the dissertation concludes the findings from the different parts and discusses ways of improving on the existing work in the near future.

    Dashcam-Enabled Deep Learning Applications for Airport Runway Pavement Distress Detection

    Pavement distress detection plays a vital role in ensuring the safety and longevity of runway infrastructure. This project presents a comprehensive approach to automating distress detection and geolocation on runway pavement using state-of-the-art deep learning techniques. A Faster R-CNN model is trained to accurately identify and classify various distress types, including longitudinal and transverse cracking, weathering, rutting, and depression. The developed model is deployed on a dataset of high-resolution dashcam images captured along the runway, allowing for real-time detection of distresses. Geolocation techniques are employed to accurately map the distresses onto the runway pavement in real-world coordinates. The system implementation and deployment are discussed, emphasizing the importance of seamless integration into existing infrastructure. The developed distress detection system offers significant benefits to the Utah Department of Transportation (UDOT) by enabling proactive maintenance planning, optimizing resource allocation, and enhancing runway management capabilities. The future potential for advanced distress analysis, integration with other data sources, and continuous model improvement is also explored. The project showcases the potential of low-cost dashcam solutions combined with deep learning for efficient and cost-effective runway distress detection and management.
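
    For context, fine-tuning an off-the-shelf Faster R-CNN for a set of distress classes can be sketched with torchvision as below; the class list, image size, and optimizer settings are illustrative assumptions, not the project's actual configuration.

        # Hedged sketch: adapting torchvision's Faster R-CNN to distress classes
        # (class names, image size, and hyper-parameters are illustrative only).
        import torch
        import torchvision
        from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

        DISTRESS_CLASSES = ["background", "longitudinal_crack", "transverse_crack",
                            "weathering", "rutting", "depression"]

        def build_model(num_classes=len(DISTRESS_CLASSES)):
            model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
            in_feats = model.roi_heads.box_predictor.cls_score.in_features
            # Swap the classification head for one sized to the distress classes
            model.roi_heads.box_predictor = FastRCNNPredictor(in_feats, num_classes)
            return model

        model = build_model()
        optimizer = torch.optim.SGD([p for p in model.parameters() if p.requires_grad],
                                    lr=0.005, momentum=0.9, weight_decay=5e-4)

        # One illustrative training step with a dummy image and box annotation
        images = [torch.rand(3, 720, 1280)]
        targets = [{"boxes": torch.tensor([[100., 150., 400., 300.]]),
                    "labels": torch.tensor([1])}]
        losses = model(images, targets)          # dict of RPN and ROI-head losses
        sum(losses.values()).backward()
        optimizer.step()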

    Multi-Scale Attention Networks for Pavement Defect Detection

    Pavement defects such as cracks, net cracks, and pit slots can cause traffic safety problems. Their timely detection and identification play a key role in reducing the harm caused by various pavement defects. In particular, recent developments in deep learning-based CNNs have shown competitive performance in image detection and classification. To detect pavement defects automatically and improve detection performance, a multi-scale mobile attention-based network, which we term MANet, is proposed. MANet uses an encoder-decoder architecture, where the encoder adopts MobileNet as the backbone network to extract pavement defect features. Instead of the original 3×3 convolutions, multi-scale convolution kernels are utilized in the depth-wise separable convolution layers of the network. Further, a hybrid attention mechanism is incorporated separately into the encoder and decoder modules to infer the significance of spatial positions and inter-channel relationships in the intermediate feature maps. The proposed approach achieves state-of-the-art performance on two publicly available benchmark datasets, i.e., Crack500 (500 crack images with 2,000×1,500 pixels) and CFD (118 crack images with 480×320 pixels). The mean intersection over union (MIoU) of the proposed approach on these two datasets reaches 0.7219 and 0.7788, respectively. Ablation experiments show that the multi-scale convolution and hybrid attention modules effectively help the model extract high-level feature representations and generate more accurate pavement crack segmentation results. We further test the model on locally collected pavement crack images (131 images with 1024×768 pixels), where it achieves a satisfactory MIoU of 0.6514 and outperforms the compared baseline methods. The experimental findings demonstrate the validity and feasibility of the proposed approach, which provides a viable solution for pavement crack detection in practical application scenarios. Our code is available at https://github.com/xtu502/pavement-defects.
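
    The core building block described above (multi-scale depth-wise separable convolutions combined with a hybrid channel/spatial attention) can be sketched roughly as follows; the layer sizes, kernel scales, and attention design are assumptions in the spirit of the description, not the published MANet code (see the repository linked above for the actual implementation).

        # Hedged sketch of a multi-scale depth-wise separable block with hybrid
        # (channel + spatial) attention; sizes and layout are assumptions.
        import torch
        import torch.nn as nn

        class MultiScaleDWBlock(nn.Module):
            def __init__(self, channels, scales=(3, 5, 7)):
                super().__init__()
                # One depth-wise branch per kernel size, fused by a 1x1 conv
                self.branches = nn.ModuleList(
                    nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels)
                    for k in scales)
                self.pointwise = nn.Conv2d(channels * len(scales), channels, 1)
                # Channel attention (squeeze-and-excitation style)
                self.channel_att = nn.Sequential(
                    nn.AdaptiveAvgPool2d(1),
                    nn.Conv2d(channels, channels // 4, 1), nn.ReLU(inplace=True),
                    nn.Conv2d(channels // 4, channels, 1), nn.Sigmoid())
                # Spatial attention over channel-pooled maps
                self.spatial_att = nn.Sequential(
                    nn.Conv2d(2, 1, 7, padding=3), nn.Sigmoid())

            def forward(self, x):
                y = self.pointwise(torch.cat([b(x) for b in self.branches], dim=1))
                y = y * self.channel_att(y)
                pooled = torch.cat([y.mean(1, keepdim=True),
                                    y.max(1, keepdim=True).values], dim=1)
                return y * self.spatial_att(pooled)

        feat = torch.randn(1, 64, 120, 80)
        out = MultiScaleDWBlock(64)(feat)   # same channels and spatial size as input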

    Investigation of Computer Vision Concepts and Methods for Structural Health Monitoring and Identification Applications

    This study presents a comprehensive investigation of methods and technologies for developing a computer vision-based framework for Structural Health Monitoring (SHM) and Structural Identification (St-Id) of civil infrastructure systems, with particular emphasis on various types of bridges. SHM has been implemented on various structures over the last two decades; yet, there are issues such as considerable cost, field implementation time, and excessive labor needs for sensor instrumentation, cable wiring work, and possible interruptions during implementation. These issues make SHM viable only when major investments are warranted for decision making. For other cases, a practical and effective solution is needed, and a computer vision-based framework can be a viable alternative. Computer vision-based SHM has been explored over the last decade. Unlike most vision-based structural identification studies and practices, which focus either on structural input (vehicle location) estimation or on structural output (structural displacement and strain response) estimation, the proposed framework combines the vision-based structural input and the structural output from non-contact sensors to overcome the limitations given above. First, this study develops a series of computer vision-based displacement measurement methods for structural response (structural output) monitoring, which can be applied to different infrastructure such as grandstands, stadiums, towers, footbridges, small/medium span concrete bridges, railway bridges, and long span bridges, under different loading cases such as human crowds, pedestrians, wind, and vehicles. Structural behavior, modal properties, load carrying capacities, structural serviceability, and performance are investigated using vision-based methods and validated by comparison with conventional SHM approaches. Some of the most famous landmark structures, such as long span bridges, are utilized as case studies, and the serviceability status of structures is also investigated using computer vision-based methods. Subsequently, issues and considerations for computer vision-based measurement in field applications are discussed, and recommendations are provided for better results. This study also proposes a robust vision-based method for displacement measurement using spatio-temporal context learning and Taylor approximation to overcome the difficulties of vision-based monitoring under adverse environmental factors such as fog and illumination change. In addition, it is shown that the external load distribution on structures (structural input) can be estimated using visual tracking, and afterward the load rating of a bridge can be determined using the load distribution factors extracted with computer vision-based methods. By combining the structural input and output results, the unit influence line (UIL) of a structure is extracted during daily traffic using only cameras, and the external loads can then be estimated from the extracted UIL. Finally, condition assessment at the global structural level can be achieved using the structural input and output, both obtained from computer vision approaches, which gives a normalized response irrespective of the type and/or load configuration of the vehicles or human loads.
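
    A generic, non-contact displacement measurement of the kind discussed above can be sketched with simple template matching: track a high-contrast target on the structure across video frames and convert the pixel motion into physical displacement with a known scale factor. This is only an illustration of the idea; the dissertation's methods (e.g., spatio-temporal context learning with Taylor approximation) are more sophisticated, and the scale factor and target ROI below are assumed inputs.

        # Hedged sketch: displacement monitoring via template matching of a
        # target patch; the scale factor (mm per pixel) is assumed known.
        import cv2
        import numpy as np

        def track_displacement(video_path, roi, mm_per_pixel):
            """roi = (x, y, w, h) of the target patch in the first frame."""
            cap = cv2.VideoCapture(video_path)
            ok, first = cap.read()
            if not ok:
                raise IOError("cannot read video")
            x, y, w, h = roi
            template = cv2.cvtColor(first[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)

            displacements = []
            while True:
                ok, frame = cap.read()
                if not ok:
                    break
                gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
                res = cv2.matchTemplate(gray, template, cv2.TM_CCOEFF_NORMED)
                _, _, _, (px, py) = cv2.minMaxLoc(res)   # best-match location
                # Pixel offset from the initial position, scaled to millimetres
                displacements.append(((px - x) * mm_per_pixel,
                                      (py - y) * mm_per_pixel))
            cap.release()
            return np.array(displacements)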

    A Routine and Post-disaster Road Corridor Monitoring Framework for the Increased Resilience of Road Infrastructures

    Auto-Classifier: A Robust Defect Detector Based on an AutoML Head

    The dominant approach for surface defect detection is the use of hand-crafted feature-based methods. However, this approach falls short when the conditions that affect the extracted images vary. In this paper, we therefore sought to determine how well several state-of-the-art Convolutional Neural Networks perform in the task of surface defect detection. Moreover, we propose two methods: CNN-Fusion, which fuses the predictions of all the networks into a final one, and Auto-Classifier, a novel method that improves a Convolutional Neural Network by modifying its classification component using AutoML. We carried out experiments to evaluate the proposed methods in the task of surface defect detection using different datasets from DAGM2007. We show that the use of Convolutional Neural Networks achieves better results than traditional methods, and that Auto-Classifier outperforms all other methods, achieving 100% accuracy and 100% AUC across all the datasets. Comment: 12 pages, 2 figures. Published in ICONIP 2020; proceedings published in Springer's Lecture Notes in Computer Science series.
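
    The Auto-Classifier idea of replacing a CNN's classification component can be roughly sketched as follows: a pretrained CNN serves as a frozen feature extractor, and a separately trained classifier takes over the final decision. A scikit-learn random forest stands in here for the AutoML-selected component; the paper itself uses AutoML to choose and tune this head, so everything below is an illustrative assumption rather than the authors' pipeline.

        # Hedged sketch: frozen CNN backbone as feature extractor, with a
        # scikit-learn classifier standing in for the AutoML-chosen head.
        import torch
        import torchvision
        from sklearn.ensemble import RandomForestClassifier

        backbone = torchvision.models.resnet50(weights="DEFAULT")
        backbone.fc = torch.nn.Identity()     # drop the original classification head
        backbone.eval()

        @torch.no_grad()
        def extract_features(images):
            """images: float tensor (N, 3, H, W), normalised as the backbone expects."""
            return backbone(images).numpy()

        # Illustrative data: random stand-ins for defect / no-defect image crops
        train_imgs, train_labels = torch.rand(16, 3, 224, 224), [0, 1] * 8
        test_imgs = torch.rand(4, 3, 224, 224)

        head = RandomForestClassifier(n_estimators=200)
        head.fit(extract_features(train_imgs), train_labels)
        pred = head.predict(extract_features(test_imgs))   # 0 = no defect, 1 = defect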