730 research outputs found

    RGBD Datasets: Past, Present and Future

    Since the launch of the Microsoft Kinect, scores of RGBD datasets have been released. These have propelled advances in areas from reconstruction to gesture recognition. In this paper we explore the field, reviewing datasets across eight categories: semantics, object pose estimation, camera tracking, scene reconstruction, object tracking, human actions, faces and identification. By extracting relevant information in each category we help researchers to find appropriate data for their needs, and we consider which datasets have succeeded in driving computer vision forward and why. Finally, we examine the future of RGBD datasets. We identify key areas which are currently underexplored, and suggest that future directions may include synthetic data and dense reconstructions of static and dynamic scenes. Comment: 8 pages excluding references (CVPR style)

    Human action recognition from RGB-D frames

    The aim of this work is to create an easily usable client-server system for video surveillance using the Omnidome® camera. AJAX, PHP, HTML and C++ technologies are used to build a booking and control interface that is as simple and intuitive as possible, with automated management of the user queue. Access: embargoed for reasons of confidentiality and ownership of the results and sensitive information.

    Intelligent surveillance of indoor environments based on computer vision and 3D point cloud fusion

    A real-time detection algorithm for intelligent surveillance is presented. The system, based on 3D change detection with respect to a complex scene model, allows intruder monitoring and detection of added and missing objects under different illumination conditions. The proposed system has two independent stages. First, a mapping application provides an accurate, wide 3D model of the scene, using a view registration approach. This registration is based on computer vision and 3D point clouds. Fusion of visual features with 3D descriptors is used to identify corresponding points in two consecutive views. The matching of these two views is first estimated by a pre-alignment stage, based on the tilt movement of the sensor; they are then accurately aligned by an Iterative Closest Point variant (Levenberg-Marquardt ICP), whose performance has been improved by a preceding filter based on geometrical assumptions. The second stage provides accurate intruder and object detection by means of a 3D change detection approach, based on an Octree volumetric representation, followed by a cluster analysis. The whole scene is continuously scanned, and each captured view is compared with the corresponding part of the wide model thanks to the previous analysis of the sensor movement parameters. For this purpose a tilt-axis calibration method has been developed. Tests performed show the reliable performance of the system under real conditions and the improvements provided by each stage independently. Moreover, the main goal of this application, reliable intruder detection, has been enhanced by tilting the sensor with its built-in motor to increase the size of the monitored area. (C) 2015 Elsevier Ltd. All rights reserved. This work was supported by the Spanish Government through the CICYT projects (TRA2013-48314-C3-1-R) and (TRA2011-29454-C03-02).
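
The change-detection idea in the abstract above, comparing a newly captured view against a stored scene model to find added and missing objects, can be illustrated with a small sketch. It rests on stated assumptions: it uses a flat voxel grid (a set of integer cell indices) instead of the Octree representation named in the abstract, it assumes both point sets are already registered in the same frame, and it omits the cluster analysis. The names `voxelize`, `detect_changes` and `voxel_size` are hypothetical and do not come from the paper.

```python
import math
from typing import Iterable, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def voxelize(points: Iterable[Point], voxel_size: float) -> Set[Voxel]:
    """Map each 3D point to the integer index of the voxel that contains it."""
    return {
        (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        for x, y, z in points
    }


def detect_changes(
    model_points: Iterable[Point],
    scan_points: Iterable[Point],
    voxel_size: float = 0.1,
) -> Tuple[Set[Voxel], Set[Voxel]]:
    """Return (added, missing) voxels of a scan relative to the scene model.

    Assumes the scan covers the same region as the model; otherwise
    unobserved model voxels would be reported as missing.
    """
    model = voxelize(model_points, voxel_size)
    scan = voxelize(scan_points, voxel_size)
    added = scan - model      # occupied now, empty in the model
    missing = model - scan    # occupied in the model, empty now
    return added, missing
```

In a fuller pipeline the `added` voxels would then be grouped into clusters so that isolated noisy cells can be discarded before an intruder or object is reported.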

    RGB-D datasets using Microsoft Kinect or similar sensors: a survey

    RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It combines the advantages of the color image, which provides appearance information about an object, with those of the depth image, which is immune to variations in color, illumination, rotation angle and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high-quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, which are of great importance for benchmarking the state of the art. In this paper, we systematically survey popular RGB-D datasets for different applications including object recognition, scene classification, hand gesture recognition, 3D simultaneous localization and mapping, and pose estimation. We provide insights into the characteristics of each important dataset, and compare the popularity and the difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description of the available RGB-D datasets and thus to guide researchers in the selection of suitable datasets for evaluating their algorithms.

    3D Sensor Placement and Embedded Processing for People Detection in an Industrial Environment

    Papers I, II and III are extracted from the dissertation and uploaded as separate documents to meet post-publication requirements for self-archiving of IEEE conference papers. At a time when autonomy is being introduced in more and more areas, computer vision plays a very important role. In an industrial environment, the ability to create a real-time virtual version of a volume of interest provides a broad range of possibilities, including safety-related systems such as vision-based anti-collision and personnel tracking. In an offshore environment, where such systems are not common, the task is challenging due to rough weather and environmental conditions, but the result of introducing such safety systems could potentially be lifesaving, as personnel work close to heavy, huge, and often poorly instrumented moving machinery and equipment. This thesis presents research on important topics related to enabling computer vision systems in industrial and offshore environments, including a review of the most important technologies and methods. A prototype 3D sensor package is developed, consisting of different sensors and a powerful embedded computer. This, together with a novel, highly scalable point cloud compression and sensor fusion scheme, makes it possible to create a real-time 3D map of an industrial area. The question of where to place the sensor packages in an environment where occlusions are present is also investigated. The result is a set of algorithms for automatic sensor placement optimisation, where the goal is to place sensors in a way that maximises the covered volume of interest, with as few occluded zones as possible. The method also includes redundancy constraints where important sub-volumes can be defined to be viewed by more than one sensor. Lastly, a people detection scheme using a merged point cloud from six different sensor packages as input is developed.
Using a combination of point cloud clustering, flattening and convolutional neural networks, the system successfully detects multiple people in an outdoor industrial environment, providing real-time 3D positions. The sensor packages and methods are tested and verified at the Industrial Robotics Lab at the University of Agder, and the people detection method is also tested in a relevant outdoor, industrial testing facility. The experiments and results are presented in the papers attached to this thesis.
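
The sensor placement goal described above, covering as much of the volume of interest as possible with a limited number of sensors, can be illustrated with a simple greedy heuristic. This is a swapped-in textbook technique (greedy maximum coverage), not the optimisation method of the thesis, and it ignores the redundancy constraints. The function `greedy_placement`, the `visible` mapping from candidate poses to the cells they see, and the `budget` parameter are all hypothetical.

```python
from typing import Dict, Hashable, List, Set, Tuple


def greedy_placement(
    visible: Dict[str, Set[Hashable]],
    budget: int,
) -> Tuple[List[str], Set[Hashable]]:
    """Pick up to `budget` candidate sensor poses by greedy maximum coverage.

    `visible[name]` is the set of volume cells that candidate `name` can see
    without occlusion (assumed to be precomputed, e.g. by ray casting).
    """
    remaining = dict(visible)
    chosen: List[str] = []
    covered: Set[Hashable] = set()
    for _ in range(budget):
        if not remaining:
            break
        # Candidate that adds the most cells not yet covered.
        best = max(remaining, key=lambda name: len(remaining[name] - covered))
        if not remaining[best] - covered:
            break  # no candidate improves coverage any further
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered
```

For example, with candidates `{"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6, 7}}` and a budget of two, the sketch first picks `C` and then `A`, because `B` would add only one new cell after `C` has been chosen.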

    Real-time computation of distance to dynamic obstacles with multiple depth sensors

    We present an efficient method to evaluate distances between dynamic obstacles and a number of points of interest (e.g., placed on the links of a robot) when using multiple depth cameras. A depth-space oriented discretization of the Cartesian space is introduced that best represents the workspace monitored by a depth camera, including occluded points. A depth grid map can be initialized offline from the arrangement of the multiple depth cameras, and its particular search characteristics allow the information given by the multiple sensors to be fused online in a very simple and fast way. The real-time performance of the proposed approach is shown by means of collision avoidance experiments where two Kinect sensors monitor a human-robot coexistence task.

    Characterization of multiphase flows integrating X-ray imaging and virtual reality

    Multiphase flows are used in a wide variety of industries, from energy production to pharmaceutical manufacturing. However, because of the complexity of the flows and difficulty measuring them, it is challenging to characterize the phenomena inside a multiphase flow. To help overcome this challenge, researchers have used numerous types of noninvasive measurement techniques to record the phenomena that occur inside the flow. One technique that has shown much success is X-ray imaging. While capable of high spatial resolutions, X-ray imaging generally has poor temporal resolution. This research improves the characterization of multiphase flows in three ways. First, an X-ray image intensifier is modified to use a high-speed camera to push the temporal limits of what is possible with current tube source X-ray imaging technology. Using this system, sample flows were imaged at 1000 frames per second without a reduction in spatial resolution. Next, the sensitivity of X-ray computed tomography (CT) measurements to changes in acquisition parameters is analyzed. While in theory CT measurements should be stable over a range of acquisition parameters, previous research has indicated otherwise. The analysis of this sensitivity shows that, while raw CT values are strongly affected by changes to acquisition parameters, if proper calibration techniques are used, acquisition parameters do not significantly influence the results for multiphase flow imaging. Finally, two algorithms are analyzed for their suitability to reconstruct an approximate tomographic slice from only two X-ray projections. These algorithms increase the spatial error in the measurement, as compared to traditional CT; however, they allow for very high temporal resolutions for 3D imaging. The only limit on the speed of this measurement technique is the image intensifier-camera setup, which was shown to be capable of imaging at a rate of at least 1000 FPS. 
While advances in measurement techniques for multiphase flows are one part of improving multiphase flow characterization, the challenge extends beyond measurement techniques. For improved measurement techniques to be useful, the data must be accessible to scientists in a way that maximizes the comprehension of the phenomena. To this end, this work also presents a system for using the Microsoft Kinect sensor to provide natural, non-contact interaction with multiphase flow data. Furthermore, this system is constructed so that it is trivial to add natural, non-contact interaction to immersive visualization applications. Therefore, multiple visualization applications can be built that are optimized to specific types of data, but all leverage the same natural interaction. Finally, the research concludes by proposing a system that integrates the improved X-ray measurements, the Kinect interaction system, and a CAVE automatic virtual environment (CAVE) to present scientists with the multiphase flow measurements in an intuitive and inherently three-dimensional manner.

    RGB-D people tracking by detection for a mobile robot

    In this work, we propose a fast and robust multi-people long-term tracking algorithm for mobile platforms equipped with RGB-D sensors. Our approach clusters the scene using 3D information and applies a reliable HOG classifier to identify people among these clusters. For each detected person, we instantiate a Kalman filter to maintain and predict their location, and a classifier trained online to recover the track even after full occlusions. We also perform tests in a challenging real-world indoor environment and evaluate the results with the CLEAR MOT metrics. Our algorithm proved to correctly track 96% of people with very limited ID switches and few false positives, with an average frame rate of more than 25 fps. Moreover, its applicability to robot-people following tasks has been tested and discussed.
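
The per-person Kalman filter mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a constant-velocity motion model along a single axis and position-only measurements; the paper does not state its state vector or noise settings, so the class `Track1D` and the parameters `q` (process noise) and `r` (measurement noise) are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Track1D:
    """Constant-velocity Kalman filter for one person along one axis."""

    pos: float
    vel: float = 0.0
    p00: float = 1.0   # variance of position
    p01: float = 0.0   # covariance of position and velocity
    p11: float = 1.0   # variance of velocity
    q: float = 0.01    # process noise added to each variance per step
    r: float = 0.1     # measurement noise variance

    def predict(self, dt: float) -> float:
        """Propagate the state by dt seconds: x = F x, P = F P F^T + Q."""
        self.pos += self.vel * dt
        p00 = self.p00 + 2.0 * dt * self.p01 + dt * dt * self.p11 + self.q
        p01 = self.p01 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p11 = p00, p01, p11
        return self.pos

    def update(self, measured_pos: float) -> None:
        """Correct the state with a position measurement (H = [1, 0])."""
        s = self.p00 + self.r          # innovation variance
        k0 = self.p00 / s              # gain for position
        k1 = self.p01 / s              # gain for velocity
        residual = measured_pos - self.pos
        self.pos += k0 * residual
        self.vel += k1 * residual
        # P = (I - K H) P, written out for the symmetric 2x2 case.
        p00 = (1.0 - k0) * self.p00
        p01 = (1.0 - k0) * self.p01
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p11 = p00, p01, p11
```

In use, `predict` would be called once per frame and `update` only when a detection is associated with the track, so that during an occlusion the track keeps moving with its last estimated velocity.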

    Occlusion-Aware Multi-View Reconstruction of Articulated Objects for Manipulation

    The goal of this research is to develop algorithms using multiple views to automatically recover complete 3D models of articulated objects in unstructured environments and thereby enable a robotic system to facilitate further manipulation of those objects. First, an algorithm called Procrustes-Lo-RANSAC (PLR) is presented. Structure-from-motion techniques are used to capture 3D point cloud models of an articulated object in two different configurations. Procrustes analysis, combined with a locally optimized RANSAC sampling strategy, facilitates a straightforward geometric approach to recovering the joint axes, as well as classifying them automatically as either revolute or prismatic. The algorithm does not require prior knowledge of the object, nor does it make any assumptions about the planarity of the object or scene. Second, with such a resulting articulated model, a robotic system is then able to manipulate the object either along its joint axes at a specified grasp point in order to exercise its degrees of freedom or move its end effector to a particular position even if the point is not visible in the current view. This is one of the main advantages of the occlusion-aware approach, because the models capture all sides of the object, meaning that the robot has knowledge of parts of the object that are not visible in the current view. Experiments with a PUMA 500 robotic arm demonstrate the effectiveness of the approach on a variety of real-world objects containing both revolute and prismatic joints. Third, we improve the proposed approach by using an RGBD sensor (Microsoft Kinect) that yields a depth value for each pixel directly from the sensor itself rather than requiring correspondence to establish depth. The KinectFusion algorithm is applied to produce a single high-quality, geometrically accurate 3D model from which rigid links of the object are segmented and aligned, allowing the joint axes to be estimated using the geometric approach.
The improved algorithm does not require artificial markers attached to objects, yields much denser 3D models and reduces the computation time.