777 research outputs found

    Camera-Based Heart Rate Extraction in Noisy Environments

    Remote photoplethysmography (rPPG) is a non-invasive technique that uses video to measure vital signs such as heart rate (HR). In rPPG estimation, noise can introduce artifacts that distort the rPPG signal and jeopardize accurate HR measurement. Because most rPPG studies have been conducted in lab-controlled environments, the issue of noise under realistic conditions remains open. This thesis examines the challenges of noise in rPPG estimation in realistic scenarios, specifically investigating the effect of noise arising from illumination variation and motion artifacts on the predicted rPPG HR. To mitigate the impact of noise, a modular rPPG measurement framework is developed, comprising data preprocessing, region-of-interest (RoI) selection, signal extraction, signal preparation, signal processing, and HR extraction. The proposed pipeline is tested on the public LGI-PPGI-Face-Video-Database dataset, which covers four candidates in real-life scenarios. In the RoI module, raw rPPG signals were extracted from the dataset using three machine-learning-based face detectors in parallel, namely Haarcascade, Dlib, and MediaPipe. Subsequently, the collected signals underwent preprocessing, independent component analysis, denoising, and frequency-domain conversion for peak detection. Overall, the Dlib face detector yields the most successful HR estimates for the majority of scenarios: in 50% of all scenarios and candidates, the average predicted HR for Dlib is either in line with or very close to the average reference HR. The HRs extracted with the Haarcascade and MediaPipe architectures account for 31.25% and 18.75% of plausible results, respectively. The analysis highlights the importance of stable facial landmarks for collecting quality raw data and reducing noise.
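    To make the stages of such a pipeline concrete, the following is a minimal illustrative sketch (not the thesis implementation): it assumes a per-frame mean-RGB trace from a detected face ROI is already available, and chains ICA, band-pass denoising, and an FFT peak search into an HR estimate. The function name, the 0.7-4 Hz pass band, and the filter order are assumptions made for illustration.

```python
# Minimal rPPG sketch: face-ROI colour trace -> ICA -> band-pass -> FFT peak -> HR.
# Illustrative only; names and parameters are assumptions, not the thesis code.
import numpy as np
from scipy.signal import butter, filtfilt
from sklearn.decomposition import FastICA

def estimate_hr(rgb_trace: np.ndarray, fps: float) -> float:
    """rgb_trace: (n_frames, 3) mean RGB values of the face ROI per frame."""
    # Detrend and normalise each colour channel.
    x = (rgb_trace - rgb_trace.mean(axis=0)) / (rgb_trace.std(axis=0) + 1e-8)

    # Blind source separation: the pulse signal is one of the ICA components.
    ica = FastICA(n_components=3, random_state=0)
    sources = ica.fit_transform(x)

    # Band-pass to the plausible HR range (0.7-4 Hz, i.e. roughly 42-240 bpm).
    b, a = butter(3, [0.7, 4.0], btype="band", fs=fps)
    filtered = filtfilt(b, a, sources, axis=0)

    # Pick the component whose spectrum has the strongest in-band peak.
    freqs = np.fft.rfftfreq(len(filtered), d=1.0 / fps)
    spectra = np.abs(np.fft.rfft(filtered, axis=0))
    band = (freqs >= 0.7) & (freqs <= 4.0)
    best = np.argmax(spectra[band].max(axis=0))
    peak_freq = freqs[band][np.argmax(spectra[band][:, best])]
    return peak_freq * 60.0  # beats per minute
```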

    Investigating the potential for detecting Oak Decline using Unmanned Aerial Vehicle (UAV) Remote Sensing

    This PhD project develops methods for the assessment of forest condition using modern remote sensing technologies, in particular optical imagery from unmanned aerial systems processed with Structure from Motion photogrammetry. The research focuses on health threats to the UK's native oak trees, specifically Chronic Oak Decline (COD) and Acute Oak Decline (AOD). The data requirements and methods to identify these complex diseases are investigated using RGB and multispectral imagery with very high spatial resolution, as well as crown textural information. These image data are produced photogrammetrically from multitemporal unmanned aerial vehicle (UAV) flights, collected during different seasons to assess the influence of phenology on the ability to detect oak decline. Particular attention is given to the identification of oak decline within the context of semi-natural forests and heterogeneous stands, since semi-natural forest environments pose challenges regarding naturally occurring variability. The studies investigate the potential and practical implications of UAV remote sensing approaches for detecting oak decline under these conditions. COD is studied at Speculation Cannop, a section of the Forest of Dean dominated by 200-year-old oaks, where decline symptoms have been present for the last decade. Monks Wood, a semi-natural woodland in Cambridgeshire, is the study site for AOD, where trees exhibit active decline symptoms. Field surveys at these sites are designed and carried out to produce highly accurate differential GNSS positional information for symptomatic and control oak trees, allowing the UAV data to be related to COD or AOD symptoms and enabling validation of model predictions. Random Forest modelling is used to determine the explanatory value of remote sensing-derived metrics for distinguishing trees affected by COD or AOD from control trees. Spectral and textural variables are extracted from the remote sensing data using an object-based approach, adopting circular plots around crown centres at individual tree level. Furthermore, the acquired UAV imagery is used to generate a species distribution map, improving on the number of detectable species and the spatial resolution of a previous classification based on multispectral data from a piloted aircraft. In producing the map, parameters relevant for classification accuracy, and for the identification of oak in particular, are assessed, including the effects of plot size, sample size and data combinations. With optimised parameters for species classification, the updated species map is subsequently employed to perform a wall-to-wall prediction of individual oak tree condition, evaluating the potential of a full-inventory detection of declined health. UAV-acquired data showed potential for discriminating declined trees from control trees for both COD and AOD. The greatest potential for detecting declined oak condition was demonstrated with narrowband multispectral imagery, whereas broadband RGB imagery was found to be unsuitable for a robust distinction between declined and control trees. The greatest explanatory power was found in remotely sensed spectra related to photosynthetic activity, indicated by the high feature importance of near-infrared spectra and the vegetation indices NDRE and NDVI. High feature importance was also produced by texture metrics that describe structural variations within the crown. The findings indicate that the remotely sensed explanatory variables hold significant information about changes in leaf chemistry and crown morphology that relate to the chlorosis, defoliation and dieback occurring in the course of the decline. In the case of COD, symptomatic trees were distinguished from control trees with 75% accuracy. Models developed for AOD detection yielded AUC scores of up to 0.98 when validated on independent sample data. Classification of oak presence was achieved with a User's accuracy of 97%, and the produced species map reached 95% overall accuracy across the eight species within the study area in the north-east of Monks Wood. Despite these encouraging results, it was shown that generalisation of the models is not yet feasible and many challenges remain. A wall-to-wall prediction of decline status confirmed this inability to generalise, yielding unrealistic results with a high number of trees predicted as declined. The identified weaknesses of the developed models point to complexity arising from the natural variability of heterogeneous forests combined with the diverse symptoms of oak decline. Specific to the presented studies, additional limitations were attributed to limited ground truth and consequent overfitting, the binary classification of oak health status, and uncertainty in UAV-acquired reflectance values. Suggestions for future work are given and involve extending the field sampling with a non-binary dependent variable to reflect the severity of oak-decline-induced stress. Further technical research on the quality and reliability of UAV remote sensing data is also required.
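    As an illustration of the modelling step described above, the sketch below shows how per-tree NDVI/NDRE statistics could feed a Random Forest classifier of symptomatic versus control trees. The band names, feature choices and hyperparameters are assumptions for illustration, not the configuration used in the thesis.

```python
# Illustrative sketch: per-crown NDVI/NDRE features feeding a Random Forest
# classifier of declined vs. control oaks. Not the thesis code.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def ndvi(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
    return (nir - red) / (nir + red + 1e-8)

def ndre(nir: np.ndarray, red_edge: np.ndarray) -> np.ndarray:
    return (nir - red_edge) / (nir + red_edge + 1e-8)

def crown_features(nir, red, red_edge):
    """Aggregate pixel values inside one circular crown plot into a feature vector."""
    v_ndvi, v_ndre = ndvi(nir, red), ndre(nir, red_edge)
    return np.array([v_ndvi.mean(), v_ndvi.std(),   # spectral condition
                     v_ndre.mean(), v_ndre.std(),
                     nir.std()])                     # crude stand-in for crown texture

# X: one row of crown_features per surveyed tree; y: 1 = symptomatic, 0 = control.
# rf = RandomForestClassifier(n_estimators=500, random_state=0)
# print(cross_val_score(rf, X, y, cv=5, scoring="roc_auc"))
```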

    Utilization and experimental evaluation of occlusion aware kernel correlation filter tracker using RGB-D

    Unlike deep learning, which requires large training datasets, correlation filter-based trackers such as the Kernelized Correlation Filter (KCF) exploit implicit properties of the tracked images (circulant matrices) for training in real time. Despite their practical application in tracking, there remains a need for a better theoretical, mathematical, and experimental understanding of the fundamentals of KCF. This thesis first details a working prototype of the tracker and investigates its effectiveness in real-time applications, with supporting visualizations. We further address some of the drawbacks of the tracker in cases of occlusion, scale changes, object rotation, out-of-view targets, and model drift with our novel RGB-D Kernel Correlation tracker. We also study the use of a particle filter to improve the tracker's accuracy. Our results are evaluated experimentally using (a) a standard dataset and (b) real-time tracking with a Microsoft Kinect V2 sensor. We believe this work will set the basis for better understanding the effectiveness of kernel-based correlation filter trackers and for further defining some of their possible advantages in tracking.
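    For readers unfamiliar with the circulant-matrix property the abstract refers to, the following is a minimal one-dimensional sketch of the Fourier-domain ridge regression at the core of KCF, using a linear kernel. It is illustrative only; the full tracker adds a Gaussian kernel, cosine windowing, two-dimensional image patches, and online model updates.

```python
# Minimal 1-D sketch of the circulant-matrix trick at the core of KCF
# (linear kernel, single channel). Illustrative, not the thesis implementation.
import numpy as np

def kcf_train(x: np.ndarray, y: np.ndarray, lam: float = 1e-4) -> np.ndarray:
    """Ridge regression over all cyclic shifts of x, solved in the Fourier domain."""
    x_hat, y_hat = np.fft.fft(x), np.fft.fft(y)
    k_xx_hat = np.conj(x_hat) * x_hat           # linear-kernel autocorrelation
    return y_hat / (k_xx_hat + lam)             # dual coefficients alpha_hat

def kcf_detect(alpha_hat: np.ndarray, x: np.ndarray, z: np.ndarray) -> int:
    """Correlate the learned model with a new patch z; return the shift of the peak."""
    k_xz_hat = np.conj(np.fft.fft(x)) * np.fft.fft(z)
    response = np.real(np.fft.ifft(k_xz_hat * alpha_hat))
    return int(np.argmax(response))             # estimated displacement of the target

# Toy usage: train on a template, then detect it in a cyclically shifted copy.
n = 128
rng = np.random.default_rng(0)
x = rng.standard_normal(n)                       # template signal
dist = np.minimum(np.arange(n), n - np.arange(n))
y = np.exp(-0.5 * (dist / 2.0) ** 2)             # Gaussian label peaked at zero shift
z = np.roll(x, 17)                               # "new frame": template shifted by 17
print(kcf_detect(kcf_train(x, y), x, z))         # prints 17
```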

    Detection of deformable objects in a non-stationary scene

    Image registration is the process of determining a mapping between points of interest on separate images to achieve a correspondence. It is fundamental to many problems in computer vision, including object recognition and motion tracking. This research focuses on applying image registration to identify differences between frames in non-stationary video scenes for the purpose of motion tracking. The major stages of the image registration process are point detection, image correspondence, and an affine transformation. After image registration is applied to spatially align the frames and detect areas of motion, segmentation is applied to extract the moving deformable objects in the non-stationary scenes. In this paper, specific techniques are reviewed to implement image registration. First, I present related work on feature point extraction, image correspondence, and spatial transformations, followed by a discussion of deformable object recognition and a detailed description of the methods developed for this research and their implementation. Included is a discussion of the Harris corner detection operator, which identifies key points on separate frames by detecting areas with strong contrasts in intensity values that can be labelled as corners; these corners are the feature points compared between frames. Next, point correspondences between two separate video frames are found using ordinal and orientation measures. Once correspondences are established, the data from the correspondence calculations are used to apply a translation that aligns the video frames. With these methods, two frames of video can be properly aligned and then subtracted to detect deformable objects. Finally, areas of motion are segmented using histograms in the HSV color space. The algorithms are implemented using Intel's open computer vision library, OpenCV. The results demonstrate that this approach is successful at detecting deformable objects in non-stationary scenes.
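    A compact sketch of such a pipeline in OpenCV is given below. It substitutes pyramidal Lucas-Kanade tracking for the ordinal/orientation correspondence measures described above, and the thresholds and parameter values are illustrative assumptions rather than the values used in this work.

```python
# Sketch: Harris-style corners, translation estimate between frames,
# frame differencing, then a rough HSV mask. Illustrative only.
import cv2
import numpy as np

def moving_object_mask(prev_bgr, curr_bgr):
    prev_gray = cv2.cvtColor(prev_bgr, cv2.COLOR_BGR2GRAY)
    curr_gray = cv2.cvtColor(curr_bgr, cv2.COLOR_BGR2GRAY)

    # Harris-scored corners in the previous frame.
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=300, qualityLevel=0.01,
                                  minDistance=7, useHarrisDetector=True)

    # Track them into the current frame and keep successful correspondences.
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, pts, None)
    good_prev = pts[status.ravel() == 1].reshape(-1, 2)
    good_next = nxt[status.ravel() == 1].reshape(-1, 2)

    # Dominant camera motion approximated by the median translation.
    dx, dy = np.median(good_next - good_prev, axis=0)

    # Warp the previous frame so the backgrounds line up, then subtract.
    m = np.float32([[1, 0, dx], [0, 1, dy]])
    aligned_prev = cv2.warpAffine(prev_gray, m, prev_gray.shape[::-1])
    diff = cv2.absdiff(curr_gray, aligned_prev)
    _, motion = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)

    # Keep only moving pixels with sufficient saturation/value in HSV,
    # a rough counterpart to the histogram-based segmentation described above.
    hsv = cv2.cvtColor(curr_bgr, cv2.COLOR_BGR2HSV)
    colour = cv2.inRange(hsv, (0, 40, 40), (179, 255, 255))
    return cv2.bitwise_and(motion, colour)
```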

    Deep sensor fusion architecture for point-cloud semantic segmentation

    This master's thesis (Magíster en Ingeniería de Sistemas y Computación) develops a complete approach to data analysis and processing for better decision making, presenting a multimodal CNN-based neural architecture, explaining the systems it integrates, and evaluating its behaviour in the target environment. Self-driving systems are built from highly complex pipelines in which perceiving the vehicle's surroundings is a key source of information for real-time manoeuvre decisions. Semantic segmentation of LiDAR sensor data has played a major role in consolidating a dense understanding of the surrounding objects and events. Although great advances have been made on this task, we believe sensor fusion strategies remain under-exploited. We present a multimodal neural architecture, based on CNNs, that consumes 2D input signals from LiDAR and camera, computes a deep representation that leverages the strengths of both sensors, and predicts a label mapping for the 3D point-wise segmentation problem. We evaluate the proposed architecture on a dataset derived from the KITTI vision benchmark suite, covering common semantic classes (i.e. car, pedestrian and cyclist). Our model outperforms existing methods and shows improved refinement of the segmentation masks.
    Table of Contents: Abstract; List of Figures; 1 Introduction (1.1 Problem statement; 1.2 Goals; 1.3 Contributions; 1.4 Outline); 2 Autonomous vehicle perception systems (2.1 Semantic segmentation; 2.2 Autonomous vehicles sensing: 2.2.1 Camera, 2.2.2 LiDAR, 2.2.3 Radar, 2.2.4 Ultrasonic; 2.3 Point clouds semantic segmentation: 2.3.1 Raw pointcloud, 2.3.2 Voxelization of pointclouds, 2.3.3 Point cloud projections, 2.3.4 Outlook); 3 Deep multimodal learning for semantic segmentation (3.1 Method overview; 3.2 Point cloud transformation; 3.3 Multimodal fusion: 3.3.1 RGB modality, 3.3.2 LiDAR modality, 3.3.3 Fusion step, 3.3.4 Decoding part, 3.3.5 Optimization statement); 4 Evaluation (4.1 KITTI dataset; 4.2 Evaluation metric; 4.3 Experimental setup; 4.4 Results; 4.4.1 Discussion); 5 Conclusions; Bibliography.
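    As a rough illustration of the kind of two-branch fusion network described above (not the thesis architecture), the sketch below concatenates camera and projected-LiDAR feature maps and decodes them into per-pixel class logits; layer sizes, channel counts, and class names are assumptions.

```python
# Minimal two-branch fusion sketch: camera + 2-D LiDAR projection in,
# per-pixel class logits out (mappable back to the projected 3-D points).
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1),
                         nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

class FusionSegNet(nn.Module):
    def __init__(self, num_classes=4):           # e.g. background, car, pedestrian, cyclist
        super().__init__()
        self.rgb_branch = conv_block(3, 32)       # camera modality
        self.lidar_branch = conv_block(2, 32)     # e.g. range + intensity channels
        self.fuse = conv_block(64, 64)            # feature-level fusion after concat
        self.decode = nn.Conv2d(64, num_classes, 1)

    def forward(self, rgb, lidar):
        feats = torch.cat([self.rgb_branch(rgb), self.lidar_branch(lidar)], dim=1)
        return self.decode(self.fuse(feats))      # (B, num_classes, H, W) logits

# Usage: both modalities are assumed projected onto the same image grid beforehand.
net = FusionSegNet()
logits = net(torch.randn(1, 3, 64, 512), torch.randn(1, 2, 64, 512))
print(logits.shape)   # torch.Size([1, 4, 64, 512])
```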