Investigations of closed source registration method of depth sensor technologies for human-robot collaboration
Productive teaming is a new form of human-robot interaction. Multimodal 3D imaging plays a key role here, both in gaining a more comprehensive understanding of the production system and in enabling trustful collaboration within the teams. Complete scene capture requires the registration of the image modalities. Currently, low-cost RGB-D sensors are often used, but they ship with a closed-source registration function. To provide an efficient and freely available method for arbitrary sensors, we have developed a new method called Triangle-Mesh-Rasterization-Projection (TMRP). To verify its performance, we compare it with the closed-source projection function of the Azure Kinect sensor (Microsoft). The qualitative comparison shows that both methods produce almost identical results; minimal differences at the edges indicate that our TMRP interpolation is more accurate. Our method thus provides a freely available open-source registration approach that can be applied to almost any multimodal 3D/2D image dataset and is not, like the Microsoft SDK, optimized for Microsoft products.
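The abstract does not disclose TMRP's internals beyond its name, so the following is only a minimal sketch of the generic depth-to-color registration step that such projection functions perform: back-projecting depth pixels to 3D, transforming them into the color camera's frame, and projecting them onto the color image. All camera parameters (`K_d`, `K_c`, `R`, `t`) are hypothetical placeholders, not values from the paper.

```python
# Generic depth-to-color registration sketch (NOT the TMRP algorithm);
# intrinsics K_d/K_c and extrinsics R/t are hypothetical placeholders.
import numpy as np

def register_depth_to_color(depth, K_d, K_c, R, t):
    """Reproject a depth image (meters) from the depth camera into the
    color camera's image plane; returns per-pixel color coordinates."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Back-project every depth pixel to a 3D point in the depth frame.
    X = (u - K_d[0, 2]) * depth / K_d[0, 0]
    Y = (v - K_d[1, 2]) * depth / K_d[1, 1]
    pts = np.stack([X, Y, depth], axis=-1).reshape(-1, 3)
    # Rigid transform into the color camera frame, then pinhole-project.
    pts_c = pts @ R.T + t
    z = pts_c[:, 2:3]
    uv_c = (pts_c @ K_c.T)[:, :2] / np.where(z > 0, z, np.nan)
    return uv_c.reshape(h, w, 2)
```

A per-pixel projection like this leaves holes and occlusion artifacts in the target image; judging by its name, TMRP instead rasterizes a triangle mesh over the projected points, which would explain the cleaner interpolation at edges reported above.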
Event-based sensor fusion in human-machine teaming
Realizing intelligent production systems in which machines and human workers team up seamlessly demands a yet unreached level of situational awareness. The machines' leverage for reaching such awareness is to amalgamate a wide variety of sensor modalities through multisensor data fusion. A particularly promising direction toward establishing human-like collaboration lies in neuro-inspired sensing and computing technologies, owing to their resemblance to human cognitive processing. This note discusses the concept of integrating neuromorphic sensing modalities into classical sensor fusion frameworks by exploiting event-based fusion and filtering methods that combine time-periodic process models with event-triggered sensor data. Event-based sensor fusion hence adopts the operating principles of event-based sensors and even exhibits the ability to extract information from absent data. Thereby, it can be an enabler to harness the full information potential of the intrinsic spiking nature of event-driven sensors.
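As an illustration of the filtering scheme described (not the note's actual implementation), the sketch below runs a Kalman filter with a time-periodic prediction step and an update that fires only when an event-based sensor delivers a measurement. The 1D constant-velocity model and all matrices are hypothetical placeholders.

```python
# Event-triggered Kalman filtering sketch: periodic prediction, update
# only on sensor events. Models below are illustrative assumptions.
import numpy as np

F = np.array([[1.0, 0.1], [0.0, 1.0]])   # process model, dt = 0.1 s
Q = 1e-3 * np.eye(2)                      # process noise
H = np.array([[1.0, 0.0]])                # events report position only
R = np.array([[1e-2]])                    # measurement noise

def step(x, P, z=None):
    """One fusion cycle. z is a scalar measurement from the event
    sensor, or None if no event occurred during this period."""
    x = F @ x                              # time-periodic prediction
    P = F @ P @ F.T + Q
    if z is not None:                      # event-triggered update
        y = np.array([[z]]) - H @ x        # innovation
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)     # Kalman gain
        x = x + K @ y
        P = (np.eye(2) - K @ H) @ P
    return x, P

x, P = np.zeros((2, 1)), np.eye(2)
for z in [None, 0.12, None, None, 0.45]:   # sparse event stream
    x, P = step(x, P, z)
```

Note that this sketch treats absent events as uninformative; the note's point about extracting information from absent data would require an additional update encoding that no event implies the state stayed within the sensor's triggering threshold.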
FPGA-based multi-view stereo system with flexible measurement setup
In recent years, stereoscopic image processing algorithms have gained importance for a variety of applications. To capture larger measurement volumes, multiple stereo systems are combined into a multi-view stereo (MVS) system. To reduce the amount of data and the data rate, calculation steps close to the sensors are outsourced to Field Programmable Gate Arrays (FPGAs) as upstream computing units; these steps include lens distortion correction, rectification, and stereo matching. In this paper, an FPGA-based MVS system with a flexible camera arrangement and partly overlapping fields of view is presented. The system consists of four FPGA-based passive stereoscopic systems (Xilinx Zynq-7000 7020 SoC, EV76C570 CMOS sensor) and a downstream processing unit (Zynq UltraScale+ ZU9EG SoC), which synchronizes the sensor-near processing modules and receives the disparity maps with the corresponding left camera images via HDMI. The downstream unit then computes a coherent 3D point cloud. Our FPGA-based 3D measurement system captures a large measurement volume at 24 fps by combining multiple views from eight cameras (using Semi-Global Matching at an image size of 640 px × 460 px, up to 256 px disparity range, and costs aggregated over 4 directions). The capabilities and limitations of the system are shown by an application example with an optically non-cooperative surface.
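The abstract does not detail how the downstream unit fuses the disparity maps, but the per-camera step is the standard stereo triangulation Z = f·B/d, X = (u − cx)·Z/f, Y = (v − cy)·Z/f. The sketch below shows that conversion; focal length and baseline are placeholders, and the principal point merely assumes the 640 × 460 px image size quoted above.

```python
# Hedged sketch of disparity-to-point-cloud conversion; f, B, cx, cy
# are assumed values, not the paper's calibration.
import numpy as np

def disparity_to_points(disp, f=1000.0, B=0.1, cx=320.0, cy=230.0):
    """disp: HxW disparity map in pixels; returns HxWx3 points (meters).
    Zero/negative disparities are marked invalid as NaN."""
    h, w = disp.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    with np.errstate(divide="ignore", invalid="ignore"):
        Z = np.where(disp > 0, f * B / disp, np.nan)
    X = (u - cx) * Z / f
    Y = (v - cy) * Z / f
    return np.stack([X, Y, Z], axis=-1)
```

Merging the four resulting clouds into one coherent cloud would additionally require the extrinsic transforms between the stereo heads, which the paper obtains from the flexible-setup calibration.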
Suitability study for real-time depth map generation using stereo matchers in OpenCV and Python
Stereo imaging provides an easy and cost-effective method to measure 3D surfaces, especially given the availability of extensive free program libraries such as OpenCV. Here, the application was extended to the field of forestry within a project to capture the elevation profile of forest roads by means of stereo imaging. For this purpose, the methods contained in OpenCV for generating depth maps were analyzed. The program sections comprised reading the image stream, correcting the images on the basis of calibrations carried out in advance, and generating the disparity maps with the stereo matchers; the disparity maps are then converted back into depth maps and stored in suitable memory formats. A data set of 30 stereo image pairs with an image size of 1280 × 864 pixels was used. The aim was to design an evaluation program that processes the described steps for all 30 image pairs within one second. With sequential processing of all steps on the test system and a local stereo matcher, a processing time of 4.37 s was measured. Steps to reduce the processing time included parallelizing the image preparation of the two frames of each image pair. Further reductions in total processing time were achieved by processing multiple image pairs simultaneously and by using storage formats without compression. A total processing time of 0.8 s was achieved by outsourcing the stereo matching to the graphics card. However, this method did not achieve the desired resolution in depth or in the image plane; that was only reached with semi-global matchers, which are up to 10 times slower but significantly more accurate and were therefore used for the further investigations of the forest road profile.
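For reference, a minimal OpenCV/Python sketch of the two matcher families the study compares: a fast local block matcher and the slower but more accurate semi-global matcher. Parameters and file names are illustrative assumptions, not the study's actual configuration, and the images are assumed to be rectified already.

```python
# Local (StereoBM) vs. semi-global (StereoSGBM) matching in OpenCV;
# parameters and file names are illustrative placeholders.
import cv2

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Local block matcher: fast, suited to tight real-time budgets.
bm = cv2.StereoBM_create(numDisparities=128, blockSize=15)
disp_bm = bm.compute(left, right).astype("float32") / 16.0  # fixed-point -> px

# Semi-global matcher: slower (up to 10x per the abstract), more accurate.
sgbm = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=128,          # must be divisible by 16
    blockSize=5,
    P1=8 * 5 * 5,                # penalty for small disparity changes
    P2=32 * 5 * 5,               # penalty for large disparity jumps
    mode=cv2.STEREO_SGBM_MODE_SGBM,
)
disp_sgbm = sgbm.compute(left, right).astype("float32") / 16.0
```

The GPU offload that reached 0.8 s would rely on OpenCV's CUDA stereo modules, which are omitted here since they require a contrib build with CUDA support.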