
    Multi-camera Torso Pose Estimation using Graph Neural Networks

    Estimating the location and orientation of humans is an essential skill for service and assistive robots. To achieve reliable estimation over a wide area such as an apartment, multiple RGBD cameras are frequently used. However, such setups are relatively expensive, and they seldom perform effective data fusion of the multiple camera sources at an early stage of the processing pipeline; occlusions and partial views make early fusion especially relevant in these scenarios. The proposal presented in this paper uses graph neural networks to merge the information acquired from multiple camera sources, achieving a mean absolute error below 125 mm for the location and 10 degrees for the orientation using low-resolution RGB images. The experiments, conducted in an apartment with three cameras, benchmarked two different graph neural network implementations and a third architecture based on fully connected layers. The software used has been released as open source in a public repository.
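    The core idea of merging per-camera estimates with a graph neural network can be illustrated with a single mean-aggregation message-passing round. This is a minimal sketch, not the authors' architecture: the node features, edge set, and blending weights below are all hypothetical.

    ```python
    def gnn_fuse(node_feats, edges, w_self=0.5, w_msg=0.5):
        # Build neighbour lists from undirected edges.
        n = len(node_feats)
        neigh = {i: [] for i in range(n)}
        for i, j in edges:
            neigh[i].append(j)
            neigh[j].append(i)
        out = []
        for i in range(n):
            feats = node_feats[i]
            if neigh[i]:
                # Mean-aggregate neighbour features (the "message").
                msg = [sum(node_feats[j][k] for j in neigh[i]) / len(neigh[i])
                       for k in range(len(feats))]
            else:
                msg = feats
            # Blend each node's own estimate with the aggregated message.
            out.append([w_self * a + w_msg * b for a, b in zip(feats, msg)])
        return out

    # Three cameras, each contributing a noisy (x, y, theta) torso estimate;
    # a fully connected camera graph lets each view correct the others.
    cams = [[1.00, 2.00, 0.10], [1.10, 2.05, 0.12], [0.95, 1.95, 0.08]]
    fused = gnn_fuse(cams, [(0, 1), (1, 2), (0, 2)])
    ```

    A trained GNN would learn the aggregation and blending instead of fixing them; the structure of the computation (per-node messages over a camera graph) is what this sketch shows.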

    Design of a Specialized UAV Platform for the Discharge of a Fire Extinguishing Capsule

    This thesis deals with the design of an unmanned multirotor aircraft system specialized for the autonomous detection and localization of fires from onboard sensors, and for the task of fast and effective fire extinguishment. Fires are extinguished by automatically discharging an ampoule filled with a fire extinguishant into the source of the fire from an onboard launcher. The main part of this thesis focuses on the detection of fires in thermal images and their localization in the world using an onboard depth camera. The localized fires are used to optimally position the unmanned aircraft so that the ampoule can be discharged effectively into the fire source. The developed methods are analyzed in detail and their performance is evaluated in simulation scenarios as well as in real-world experiments. The included quantitative and qualitative analysis verifies the feasibility and robustness of the system.
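    The detection-then-localization step described above can be sketched as thresholding the thermal image for a hotspot and back-projecting that pixel into 3D using the depth camera and a pinhole model. This is an illustrative simplification, not the thesis' pipeline; the threshold value and intrinsics below are made up.

    ```python
    def locate_fire(thermal, depth, fx, fy, cx, cy, thresh=600.0):
        # Scan the thermal image for the hottest pixel above the threshold.
        best = None
        for v, row in enumerate(thermal):
            for u, t in enumerate(row):
                if t >= thresh and (best is None or t > best[0]):
                    best = (t, u, v)
        if best is None:
            return None                      # no fire detected
        _, u, v = best
        z = depth[v][u]                      # metres, from the depth camera
        # Pinhole back-projection into the camera frame.
        return ((u - cx) * z / fx, (v - cy) * z / fy, z)

    thermal = [[300, 300, 300],
               [300, 800, 300],
               [300, 300, 300]]              # the centre pixel is burning
    depth = [[2.0] * 3 for _ in range(3)]
    fire_xyz = locate_fire(thermal, depth, fx=100.0, fy=100.0, cx=1.0, cy=1.0)
    ```

    The resulting camera-frame point would then be transformed into the world frame using the aircraft's pose before positioning for the launch.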

    Segmentation approaches for diabetic foot disorders

    Thermography enables non-invasive, accessible, and easily repeated foot temperature measurements for diabetic patients, promoting early detection and regular monitoring protocols that limit the incidence of disabling conditions associated with diabetic foot disorders. Establishing this application within standard diabetic care protocols requires overcoming technical issues, particularly foot sole segmentation. In this work we implemented and evaluated several segmentation approaches, including both conventional and deep learning methods. Multimodal images, consisting of registered visual-light, infrared, and depth images, were acquired for 37 healthy subjects. The segmentation methods explored were based on visual-light as well as infrared images, and were optimized using the spatial information provided by the depth images. Furthermore, a ground truth was established from manual segmentations performed by two independent researchers. Overall, the performance of all the implemented approaches was satisfactory. The best performance, in terms of spatial overlap, accuracy, and precision, was obtained by the Skin and U-Net approaches optimized with the spatial information; of these, the U-Net approach is preferred for its robustness. This research was funded by the IACTEC Technological Training program, grant number TF INNOVA 2016–2021. This work was completed while Abián Hernández was the beneficiary of a pre-doctoral grant from the "Agencia Canaria de Investigación, Innovación y Sociedad de la Información (ACIISI)" of the "Consejería de Economía, Industria, Comercio y Conocimiento" of the "Gobierno de Canarias", which is partly financed by the European Social Fund (FSE) (POC 2014–2020, Eje 3 Tema Prioritario 74 (85%)).
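    One way the depth channel can "optimize" a segmentation, as described above, is to reject mask pixels that lie off the foot plane. The following is a minimal sketch of that idea, not the paper's method; the tolerance and the median-plane assumption are illustrative.

    ```python
    import statistics

    def refine_mask_with_depth(mask, depth, tol=0.05):
        # Median depth of the currently masked pixels approximates the
        # foot-sole plane.
        vals = [depth[v][u] for v, row in enumerate(mask)
                for u, m in enumerate(row) if m]
        if not vals:
            return mask
        med = statistics.median(vals)
        # Keep only pixels close to that plane; background leaks (e.g. the
        # floor, at a different depth) are dropped.
        return [[bool(m) and abs(depth[v][u] - med) <= tol
                 for u, m in enumerate(row)]
                for v, row in enumerate(mask)]

    mask = [[1, 1],
            [1, 0]]                     # one background pixel leaked in
    depth = [[0.50, 0.51],
             [0.90, 0.00]]              # metres; row 1 is the floor
    refined = refine_mask_with_depth(mask, depth)
    ```

    In the multimodal setup described, the same kind of spatial check can post-process either a conventional or a U-Net segmentation, since the three modalities are registered.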

    Sparse ellipsometry: portable acquisition of polarimetric SVBRDF and shape with unstructured flash photography

    Ellipsometry techniques measure the polarization information of materials, requiring precise rotations of optical components under different configurations of lights and sensors. This results in cumbersome capture devices, carefully calibrated in lab conditions, and in very long acquisition times, usually on the order of a few days per object. Recent techniques can capture polarimetric spatially-varying reflectance information, but are limited either to a single view, or to covering all view directions only for spherical objects made of a single homogeneous material. We present sparse ellipsometry, a portable polarimetric acquisition method that captures both polarimetric SVBRDF and 3D shape simultaneously. Our handheld device consists of off-the-shelf, fixed optical components. Instead of days, the total acquisition time varies between twenty and thirty minutes per object. We develop a complete polarimetric SVBRDF model that includes diffuse and specular components, as well as single scattering, and devise a novel polarimetric inverse rendering algorithm with data augmentation of specular reflection samples via generative modeling. Our results show a strong agreement with a recent ground-truth dataset of captured polarimetric BRDFs of real-world objects.
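    For readers unfamiliar with ellipsometry's building blocks: polarization states are commonly described by Stokes vectors, and optical elements by Mueller matrices. The sketch below shows the textbook Mueller matrix of an ideal linear polarizer acting on unpolarized light; it illustrates the formalism only and is not part of the paper's model.

    ```python
    import math

    def polarizer_mueller(theta):
        # Mueller matrix of an ideal linear polarizer at angle theta (radians).
        c, s = math.cos(2 * theta), math.sin(2 * theta)
        return [[0.5,     0.5 * c,     0.5 * s,     0.0],
                [0.5 * c, 0.5 * c * c, 0.5 * c * s, 0.0],
                [0.5 * s, 0.5 * c * s, 0.5 * s * s, 0.0],
                [0.0,     0.0,         0.0,         0.0]]

    def apply_mueller(M, stokes):
        # Matrix-vector product: output Stokes vector of the filtered light.
        return [sum(M[i][j] * stokes[j] for j in range(4)) for i in range(4)]

    unpolarized = [1.0, 0.0, 0.0, 0.0]       # Stokes vector (I, Q, U, V)
    out = apply_mueller(polarizer_mueller(0.0), unpolarized)
    # Half the intensity passes; the light leaves horizontally polarized.
    ```

    Classical ellipsometry rotates such elements through many configurations to recover a material's Mueller matrix, which is why conventional capture is so slow; the paper's contribution is avoiding those rotations with fixed components.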

    Efficient Distortion-Free Neural Projector Deblurring in Dynamic Projection Mapping

    Kageyama Y., Iwai D., Sato K. Efficient Distortion-Free Neural Projector Deblurring in Dynamic Projection Mapping. IEEE Transactions on Visualization and Computer Graphics (2024); https://doi.org/10.1109/TVCG.2024.3354957. Dynamic Projection Mapping (DPM) necessitates geometric compensation of the projection image based on the position and orientation of moving objects. Additionally, the projector's shallow depth of field results in pronounced defocus blur even with minimal object movement. Achieving delay-free DPM with high image quality requires real-time implementation of geometric compensation and projector deblurring. To meet this demand, we propose a framework comprising two neural components: one for geometric compensation and another for projector deblurring. The former component warps the image by detecting the optical flow of each pixel in both the projection and captured images. The latter component performs real-time sharpening as needed. Ideally, our network's parameters should be trained on data acquired in an actual environment. However, training the network from scratch while executing DPM, which demands real-time image generation, is impractical. Therefore, the network must undergo pre-training. Unfortunately, there are no publicly available large real datasets for DPM due to the diverse image quality degradation patterns. To address this challenge, we propose a realistic synthetic data generation method that numerically models geometric distortion and defocus blur in real-world DPM. Through exhaustive experiments, we have confirmed that the model trained on the proposed dataset achieves projector deblurring in the presence of geometric distortions with a quality comparable to state-of-the-art methods.
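    The geometric-compensation component described above warps the projection image according to a per-pixel optical flow. The sketch below shows the basic backward-warp operation such a component performs once the flow is known; it is a nearest-neighbour toy version, not the paper's neural implementation.

    ```python
    def warp_by_flow(img, flow):
        # Backward warp: output pixel (u, v) samples img at (u + du, v + dv),
        # nearest-neighbour, zero-filled outside the frame.
        h, w = len(img), len(img[0])
        out = [[0.0] * w for _ in range(h)]
        for v in range(h):
            for u in range(w):
                du, dv = flow[v][u]
                su, sv = int(round(u + du)), int(round(v + dv))
                if 0 <= su < w and 0 <= sv < h:
                    out[v][u] = img[sv][su]
        return out

    img = [[1.0, 2.0],
           [3.0, 4.0]]
    flow = [[(1, 0), (1, 0)],
            [(1, 0), (1, 0)]]          # uniform one-pixel shift to the right
    warped = warp_by_flow(img, flow)
    ```

    In DPM the flow itself must be estimated in real time from the projection and captured images, which is what the neural component learns; a real pipeline would also use bilinear rather than nearest-neighbour sampling.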

    A Dataset of Multi-Illumination Images in the Wild

    Collections of images under a single, uncontrolled illumination have enabled the rapid advancement of core computer vision tasks like classification, detection, and segmentation. But even with modern learning techniques, many inverse problems involving lighting and material understanding remain too severely ill-posed to be solved with single-illumination datasets. To fill this gap, we introduce a new multi-illumination dataset of more than 1000 real scenes, each captured under 25 lighting conditions. We demonstrate the richness of this dataset by training state-of-the-art models for three challenging applications: single-image illumination estimation, image relighting, and mixed-illuminant white balance. Comment: ICCV 2019.

    Multimodal Sensing Interface for Haptic Interaction

    This paper investigates the integration of a multimodal sensing system for exploring the limits of vibrotactile haptic feedback when interacting with 3D representations of real objects. In this study, the spatial locations of the objects are mapped to the work volume of the user using a Kinect sensor. The position of the user's hand is obtained using marker-based visual processing. The depth information is used to build a vibrotactile map on a haptic glove enhanced with vibration motors. The users can perceive the location and dimensions of remote objects by moving their hand inside a scanning region. A marker-detection camera provides the location and orientation of the user's hand (glove) to map the corresponding tactile message. A preliminary study was conducted to explore how different users perceive such haptic experiences. Factors such as the total number of objects detected, object separation resolution, and dimension-based and shape-based discrimination were evaluated. The preliminary results showed that the localization and counting of objects can be attained with a high degree of success. The users were able to classify groups of objects of different dimensions based on the perceived haptic feedback.
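    A vibrotactile map of the kind described typically converts the distance between the hand and an object into a motor drive level. The sketch below shows one plausible clamped linear mapping; the distance range and PWM scale are assumptions, not values from the paper.

    ```python
    def depth_to_vibration(distance_m, d_min=0.3, d_max=2.0, pwm_max=255):
        # Closer objects vibrate harder: nothing beyond d_max, full power
        # at or inside d_min, linear in between.
        if distance_m >= d_max:
            return 0
        if distance_m <= d_min:
            return pwm_max
        frac = (d_max - distance_m) / (d_max - d_min)
        return int(round(pwm_max * frac))
    ```

    Per-motor versions of this mapping, driven by the depth values under each fingertip, would produce the spatial "tactile message" the glove renders as the hand sweeps the scanning region.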

    Signals in the Soil: Subsurface Sensing

    In this chapter, novel subsurface soil sensing approaches are presented for monitoring and real-time decision support system applications. The methods, materials, and operational feasibility aspects of soil sensors are explored. The soil sensing techniques covered in this chapter include aerial sensing, in-situ sensing, proximal sensing, and remote sensing, and the underlying mechanisms used for sensing are also examined. Sensor selection and calibration techniques are described in detail. The chapter concludes with a discussion of soil sensing challenges.
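    Sensor calibration, mentioned above, is often a simple two-point affair in practice: record the raw reading in a dry reference and a saturated reference, then interpolate. This sketch assumes a hypothetical soil-moisture probe whose ADC counts decrease with wetness; none of the numbers come from the chapter.

    ```python
    def linear_calibration(raw_dry, raw_wet, vwc_dry=0.0, vwc_wet=0.45):
        # Two-point calibration: map raw ADC counts to volumetric water
        # content (VWC, m^3/m^3), clamped to the calibrated range.
        slope = (vwc_wet - vwc_dry) / (raw_wet - raw_dry)
        lo, hi = min(vwc_dry, vwc_wet), max(vwc_dry, vwc_wet)
        def to_vwc(raw):
            return max(lo, min(hi, vwc_dry + slope * (raw - raw_dry)))
        return to_vwc

    # Readings taken with the probe in oven-dry soil and in saturated soil.
    cal = linear_calibration(raw_dry=3200, raw_wet=1400)
    moisture = cal(2300)     # reading halfway between the two references
    ```

    Real soil sensors often need per-soil-type calibration curves (and sometimes higher-order fits), but the two-point linear form is the usual starting point.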

    Enhancing 3D Visual Odometry with Single-Camera Stereo Omnidirectional Systems

    We explore low-cost solutions for efficiently improving the 3D pose estimation of a single camera moving in an unfamiliar environment. The visual odometry (VO) task -- as it is called when using computer vision to estimate egomotion -- is of particular interest to mobile robots as well as to humans with visual impairments. The payload capacity of small robots like micro-aerial vehicles (drones) requires the use of portable perception equipment, which is constrained by size, weight, energy consumption, and processing power. Using a single camera as the passive sensor for the VO task satisfies these requirements and motivates the solutions presented in this thesis. To deliver on the portability goal with a single off-the-shelf camera, we take two approaches. The first, and the most extensively studied here, revolves around an unorthodox camera-mirrors configuration (catadioptrics) achieving a stereo omnidirectional system (SOS). The second relies on expanding the visual features from the scene into higher dimensionalities to track the pose of a conventional camera in a photogrammetric fashion. The first goal has many interdependent challenges, which we address as part of this thesis: SOS design, projection model, adequate calibration procedure, and application to VO. We show several practical advantages of the single-camera SOS due to its complete 360-degree stereo views, which other conventional 3D sensors lack because of their limited field of view. Since our omnidirectional stereo (omnistereo) views are captured by a single camera, a truly instantaneous pair of panoramic images is possible for 3D perception tasks. Finally, we address the VO problem as a direct multichannel tracking approach, which increases the pose estimation accuracy of the baseline method (i.e., using only grayscale or color information), with photometric error minimization at the heart of the "direct" tracking algorithm. Currently, this solution has been tested on standard monocular cameras, but it could also be applied to an SOS. We believe the challenges we attempted to solve have not previously been considered with the level of detail needed to successfully perform VO with a single camera, as the ultimate goal in both real-life and simulated scenes.
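    The essence of direct multichannel tracking is picking the motion that minimises a photometric error summed over all channels. The toy below does this for a 1-D, two-channel "image" over integer shifts; real direct VO minimises over a 6-DoF pose with gradient-based optimisation, so treat this purely as an illustration of the objective.

    ```python
    def photometric_error(ref, cur, shift):
        # Sum of squared per-channel differences after an integer shift.
        err = 0.0
        for i in range(len(ref)):
            j = i + shift
            if 0 <= j < len(cur):
                for a, b in zip(ref[i], cur[j]):
                    err += (a - b) ** 2
        return err

    def track_shift(ref, cur, search=3):
        # Direct tracking: the estimated motion is the shift that minimises
        # the photometric error across all channels jointly.
        return min(range(-search, search + 1),
                   key=lambda s: photometric_error(ref, cur, s))

    # A two-channel 1-D signal and a copy translated by one pixel.
    ref = [[float(i), float(i * i % 7)] for i in range(8)]
    cur = [[0.0, 0.0]] + ref[:-1]
    est = track_shift(ref, cur)
    ```

    Using multiple channels (e.g. color or gradient channels instead of grayscale alone) adds constraints to the same objective, which is the accuracy gain the multichannel formulation targets.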