12 research outputs found

    Robust Head-Pose Estimation Based on Partially-Latent Mixture of Linear Regressions

    Head-pose estimation has many applications, such as social event analysis, human-robot and human-computer interaction, driving assistance, and so forth. Head-pose estimation is challenging because it must cope with changing illumination conditions, variability in face orientation and appearance, partial occlusions of facial landmarks, as well as bounding-box-to-face alignment errors. We propose to use a mixture of linear regressions with partially-latent output. This regression method learns to map high-dimensional feature vectors (extracted from bounding boxes of faces) onto the joint space of head-pose angles and bounding-box shifts, such that they are robustly predicted in the presence of unobservable phenomena. We describe in detail the mapping method, which combines the merits of unsupervised manifold learning techniques and of mixtures of regressions. We validate our method on three publicly available datasets and thoroughly benchmark four variants of the proposed algorithm against several state-of-the-art head-pose estimation methods.
    Comment: 12 pages, 5 figures, 3 tables
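    As a rough illustration of how such a model predicts at test time, the sketch below blends per-component affine predictions by posterior responsibilities. It is a generic mixture-of-linear-regressions predictor, not the paper's exact partially-latent (GLLiM-style) formulation; all names and shapes are illustrative assumptions.

```python
# A minimal sketch of prediction with a mixture of K linear regressions:
# each component k has an affine map A_k x + b_k, and predictions are blended
# by the posterior probability that the input x belongs to component k.
import numpy as np
from scipy.stats import multivariate_normal

def mixture_regression_predict(x, pis, means, covs, As, bs):
    """x: (D,) feature vector; pis: (K,) component priors; means/covs:
    per-component Gaussians over the input space; As: (K, P, D); bs: (K, P)."""
    # Responsibilities r_k proportional to pi_k * N(x; mu_k, Sigma_k)
    r = np.array([pi * multivariate_normal.pdf(x, m, c)
                  for pi, m, c in zip(pis, means, covs)])
    r /= r.sum()
    # Blend the per-component affine predictions
    preds = np.stack([A @ x + b for A, b in zip(As, bs)])  # (K, P)
    return r @ preds  # (P,), e.g. [yaw, pitch, roll, dx, dy]
```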

    Human Attention Detection Using AM-FM Representations

    Human activity detection from digital videos presents many challenges to the computer vision and image processing communities. Recently, many methods have been developed to detect human activities with varying degrees of success. Yet the general human activity detection problem remains very challenging, especially when the methods need to work “in the wild” (e.g., without precise control over the imaging geometry). The thesis explores phase-based solutions for (i) detecting faces, (ii) detecting the backs of heads, (iii) jointly detecting faces and backs of heads, and (iv) determining whether a head is turned to the left or the right, using standard video cameras without any control over the imaging geometry. The proposed phase-based approach rests on simple and robust methods that rely on Amplitude-Modulation Frequency-Modulation (AM-FM) models. The approach is validated using video frames extracted from the Advancing Out-of-School Learning in Mathematics and Engineering (AOLME) project. The dataset consisted of 13,265 images from ten students looking at the camera and 6,122 images from five students looking away from the camera. For the students facing the camera, the method correctly classified 97.1% of those looking to the left and 95.9% of those looking to the right. For the students facing away from the camera, the method correctly classified 87.6% of those looking to the left and 93.3% of those looking to the right. The results indicate that AM-FM-based methods hold great promise for analyzing human activity videos.
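    For intuition, a minimal sketch of an AM-FM decomposition on a 1-D signal via the Hilbert transform is shown below. The thesis uses multi-scale 2-D AM-FM image representations, so this only illustrates the amplitude/phase split such models rely on.

```python
# A minimal sketch of extracting AM and FM components from a 1-D signal
# using the analytic signal; not the thesis's 2-D multi-scale decomposition.
import numpy as np
from scipy.signal import hilbert

def am_fm_1d(signal):
    analytic = hilbert(signal)                    # analytic signal s + j*H{s}
    amplitude = np.abs(analytic)                  # AM component (envelope)
    phase = np.unwrap(np.angle(analytic))         # instantaneous phase
    inst_freq = np.diff(phase) / (2.0 * np.pi)    # FM component (cycles/sample)
    return amplitude, inst_freq

t = np.linspace(0, 1, 1000)
am, fm = am_fm_1d(np.cos(2 * np.pi * 50 * t) * (1 + 0.5 * np.sin(2 * np.pi * 3 * t)))
```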

    Explainable machine learning for precise fatigue crack tip detection

    Data-driven models based on deep learning have led to tremendous breakthroughs in classical computer vision tasks and have recently made their way into the natural sciences. However, the absence of domain knowledge in their inherent design significantly hinders the understanding and acceptance of these models. Nevertheless, explainability is crucial to justify the use of deep learning tools in safety-relevant applications such as aircraft component design, service, and inspection. In this work, we train convolutional neural networks for crack tip detection in fatigue crack growth experiments using full-field displacement data obtained by digital image correlation. For this, we introduce the novel architecture ParallelNets, a network combining segmentation and regression of the crack tip coordinates, and compare it with a classical U-Net-based architecture. Aiming for explainability, we use the Grad-CAM interpretability method to visualize the neural attention of several models. Attention heatmaps show that ParallelNets is able to focus on physically relevant areas like the crack tip field, which explains its superior performance in terms of accuracy, robustness, and stability.
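    A minimal PyTorch sketch of the combined segmentation-plus-regression idea follows. It is not the authors' ParallelNets implementation; the layer sizes and the two-channel displacement input are illustrative assumptions.

```python
# A minimal sketch of a network sharing one encoder between a segmentation
# head and a crack-tip coordinate regression head (hypothetical architecture).
import torch
import torch.nn as nn

class SegAndRegressNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),   # 2-ch displacement field
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.seg_head = nn.Conv2d(32, 1, 1)              # per-pixel tip mask logits
        self.reg_head = nn.Sequential(                   # global (x, y) coordinates
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 2),
        )

    def forward(self, x):
        feats = self.encoder(x)
        return self.seg_head(feats), self.reg_head(feats)

mask, coords = SegAndRegressNet()(torch.rand(1, 2, 64, 64))
```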

    Automated production of synthetic point clouds of truss bridges for semantic and instance segmentation using deep learning models

    The cost of obtaining large volumes of bridge data with technologies like laser scanners hinders the training of deep learning models. To address this, the paper introduces a new method for creating synthetic point clouds of truss bridges and demonstrates the effectiveness of a deep learning approach for semantic and instance segmentation of these point clouds. The method generates point clouds by specifying the dimensions and components of the bridge, resulting in high variability in the generated dataset. A deep learning model, an adapted version of JSNet, is trained on the generated point clouds; its accuracy surpasses previous heuristic methods. The proposed methodology has significant implications for the development of automated inspection and monitoring systems for truss bridges. Furthermore, the success of the deep learning approach suggests its potential for semantic and instance segmentation of complex point clouds beyond truss bridges.
    Funding: Agencia Estatal de Investigación | Ref. PID2021-124236OB-C33; Agencia Estatal de Investigación | Ref. RYC2021-033560-I; Universidade de Vigo/CISU
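    The sketch below conveys the parametric idea in miniature: sampling points along the members of a truss defined by a few dimensions. The actual generator is far richer (3-D geometry, sensor noise, per-point semantic and instance labels); the member layout and parameter names here are illustrative.

```python
# A minimal sketch of parametric point-cloud generation for a simple 2-D truss:
# build node positions from span/height/panel count, then sample points
# uniformly along each member (chords, verticals/diagonals).
import numpy as np

def truss_points(span=20.0, height=3.0, panels=5, pts_per_member=200):
    xs = np.linspace(0.0, span, panels + 1)
    nodes_bot = [(x, 0.0) for x in xs]
    nodes_top = [(x, height) for x in xs[1:-1]]          # no top nodes at supports
    members = []
    members += list(zip(nodes_bot[:-1], nodes_bot[1:]))  # bottom chord
    members += list(zip(nodes_top[:-1], nodes_top[1:]))  # top chord
    for i, top in enumerate(nodes_top):                  # diagonals to each top node
        members += [(nodes_bot[i], top), (nodes_bot[i + 2], top)]
    cloud = []
    for (x0, y0), (x1, y1) in members:
        t = np.random.rand(pts_per_member)
        cloud.append(np.stack([x0 + t * (x1 - x0), y0 + t * (y1 - y0)], axis=1))
    return np.concatenate(cloud)  # (N, 2); a real generator works in 3-D
```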

    A review of deep learning algorithms for computer vision systems in livestock.

    In livestock operations, systematically monitoring animal body weight, biometric body measurements, animal behavior, feed bunk, and other difficult-to-measure phenotypes is manually unfeasible due to labor, costs, and animal stress. Applications of computer vision are growing in importance in livestock systems due to their ability to generate real-time, non-invasive, and accurate animal-level information. However, the development of a computer vision system requires sophisticated statistical and computational approaches for efficient data management and appropriate data mining, as it involves massive datasets. This article aims to provide an overview of how deep learning has been implemented in computer vision systems used in livestock, and how such implementation can be an effective tool to predict animal phenotypes and to accelerate the development of predictive modeling for precise management decisions. First, we reviewed the most recent milestones achieved with computer vision systems and their respective deep learning algorithms implemented in Animal Science studies. Second, we reviewed the published research studies in Animal Science that used deep learning algorithms as the primary analytical strategy for image classification, object detection, object segmentation, and feature extraction. The large number of reviewed articles published in the last few years demonstrates the high interest in, and rapid development of, deep learning algorithms in computer vision systems across livestock species. Deep learning algorithms for computer vision systems, such as Mask R-CNN, Faster R-CNN, YOLO (v3 and v4), DeepLab v3, U-Net, and others, have been used in Animal Science research studies. Additionally, network architectures such as ResNet, Inception, Xception, and VGG16 have been implemented in several studies across livestock species. The strong performance of these deep learning algorithms suggests improved predictive ability in livestock applications and faster inference. However, only a few articles fully described the deep learning algorithms and their implementation; information regarding hyperparameter tuning, pre-trained weights, deep learning backbones, and hierarchical data structure was often missing. We summarized peer-reviewed articles by computer vision task (image classification, object detection, and object segmentation), deep learning algorithm, species, and phenotype, including animal identification and behavior, feed intake, animal body weight, and many others. Understanding the principles of computer vision and the algorithms used for each application is crucial to developing efficient systems in livestock operations. Such development will potentially have a major impact on the livestock industry by predicting real-time and accurate phenotypes, which could be used in the future to improve farm management decisions, breeding programs through high-throughput phenotyping, and optimized data-driven interventions.
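    As a concrete example of the class of off-the-shelf tools the review covers, the sketch below runs a COCO-pretrained Mask R-CNN from torchvision on a single frame; a real livestock system would fine-tune such a model on labelled, species-specific farm imagery.

```python
# A minimal sketch of inference with a pretrained instance-segmentation model;
# the random tensor stands in for a normalized camera frame.
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

model = maskrcnn_resnet50_fpn(weights="DEFAULT").eval()  # COCO-pretrained
image = torch.rand(3, 480, 640)          # stand-in for a real frame in [0, 1]
with torch.no_grad():
    out = model([image])[0]              # dict with boxes, labels, scores, masks
keep = out["scores"] > 0.8               # keep only confident detections
print(out["boxes"][keep], out["labels"][keep])
```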