2,102 research outputs found

    RGB-D datasets using Microsoft Kinect or similar sensors: a survey

    RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It combines the advantages of the color image, which provides appearance information of an object, with those of the depth image, which is immune to variations in color, illumination, rotation angle and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, which are of great importance for benchmarking the state of the art. In this paper, we systematically survey popular RGB-D datasets for different applications including object recognition, scene classification, hand gesture recognition, 3D-simultaneous localization and mapping, and pose estimation. We provide insights into the characteristics of each important dataset, and compare the popularity and the difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description of the available RGB-D datasets and thus to guide researchers in the selection of suitable datasets for evaluating their algorithms.

    3D Sensor Placement and Embedded Processing for People Detection in an Industrial Environment

    Papers I, II and III are extracted from the dissertation and uploaded as separate documents to meet post-publication requirements for self-archiving of IEEE conference papers. At a time when autonomy is being introduced in more and more areas, computer vision plays a very important role. In an industrial environment, the ability to create a real-time virtual version of a volume of interest provides a broad range of possibilities, including safety-related systems such as vision-based anti-collision and personnel tracking. In an offshore environment, where such systems are not common, the task is challenging due to rough weather and environmental conditions, but the result of introducing such safety systems could potentially be lifesaving, as personnel work close to heavy, huge, and often poorly instrumented moving machinery and equipment. This thesis presents research on important topics related to enabling computer vision systems in industrial and offshore environments, including a review of the most important technologies and methods. A prototype 3D sensor package is developed, consisting of different sensors and a powerful embedded computer. This, together with a novel, highly scalable point cloud compression and sensor fusion scheme, makes it possible to create a real-time 3D map of an industrial area. The question of where to place the sensor packages in an environment where occlusions are present is also investigated. The result is a set of algorithms for automatic sensor placement optimisation, where the goal is to place sensors in a way that maximises the covered volume of interest, with as few occluded zones as possible. The method also includes redundancy constraints, where important sub-volumes can be defined to be viewed by more than one sensor. Lastly, a people detection scheme using a merged point cloud from six different sensor packages as input is developed.
Using a combination of point cloud clustering, flattening and convolutional neural networks, the system successfully detects multiple people in an outdoor industrial environment, providing real-time 3D positions. The sensor packages and methods are tested and verified at the Industrial Robotics Lab at the University of Agder, and the people detection method is also tested in a relevant outdoor, industrial testing facility. The experiments and results are presented in the papers attached to this thesis.

    Tele-media-art: web-based inclusive teaching of body expression

    International conference held in Olhão, Algarve, 26-28 April 2018. The Tele-Media-Art project aims to improve online distance learning and artistic teaching in two test scenarios, the doctorate in digital art-media and the lifelong learning course "The Experience of Diversity", by exploiting multimodal telepresence facilities encompassing diversified visual, auditory and sensory channels, as well as rich forms of gestural and body interaction. To this end, a telepresence system was developed to be installed at Palácio Ceia, in Lisbon, Portugal, headquarters of the Portuguese Open University, from which methodologies of artistic teaching in a mixed regime (face-to-face and online distance) that are inclusive of blind and partially sighted students can be delivered. This system has already been tested with a group of subjects, including blind people. Although positive results were achieved, more development and further tests will be carried out in the future. This project was financed by the Calouste Gulbenkian Foundation under Grant number 142793.

    Action intention recognition for proactive human assistance in domestic environments

    This Master’s Thesis in Automatics, Control and Robotics covers the development and implementation of an Action Intention Recognition algorithm for proactive human assistance in domestic environments. The proposed solution is based on data provided by a real-time RGBD Object Recognition process, which captures object state changes inside a defined region of interest of the domestic environment setup. A background analysis reviews state-of-the-art approaches to both real-time RGBD object recognition and action intention recognition. This preliminary analysis serves as the basis for the proposal of a new volume descriptor for object categorization and an improved formalism for Activation Spreading Networks in the context of action intention recognition. Several tests are performed to study the performance of the proposed solution, and their results are analyzed to draw the conclusions of the project and propose future work. Finally, the project budget and environmental impact as well as the project schedule are presented and briefly discussed.

    Fast, Accurate Thin-Structure Obstacle Detection for Autonomous Mobile Robots

    Safety is paramount for mobile robotic platforms such as self-driving cars and unmanned aerial vehicles. This work is devoted to a task that is indispensable for safety yet was largely overlooked in the past: detecting obstacles with very thin structures, such as wires, cables and tree branches. This is a challenging problem, as thin objects can be problematic for active sensors such as lidar and sonar and even for stereo cameras. In this work, we propose to use video sequences for thin obstacle detection. We represent obstacles with edges in the video frames, and reconstruct them in 3D using efficient edge-based visual odometry techniques. We provide both a monocular camera solution and a stereo camera solution. The former incorporates Inertial Measurement Unit (IMU) data to resolve scale ambiguity, while the latter enjoys a novel, purely vision-based solution. Experiments demonstrated that the proposed methods are fast and able to detect thin obstacles robustly and accurately under various conditions. Comment: Appeared at IEEE CVPR 2017 Workshop on Embedded Vision.

    Workplace Posture Assessment and Biofeedback with Kinect

    With the prevalence of computing, many workers today are confined to a desk within an office. By sitting in these positions for long periods of time, workers are prone to developing one of many musculoskeletal disorders (MSDs), such as carpal tunnel syndrome. In order to prevent MSDs in the long term, workers must adopt good sitting habits. One promising method to ensure good workplace posture is camera monitoring. To date, camera systems have been used to determine posture in a clean environment. However, an occluded and cluttered background, which is typical in an office setting, poses a great challenge for a computer vision system trying to detect the desired objects. In this thesis, we design and propose components that assess good posture using information gathered from a Microsoft Kinect camera. To do so, we generate a data set of posture captures for training and testing, applying crowd-sourced voting to determine ratings for a subset of these captures. Leveraging this data set, we apply machine learning to develop a classification tool. Finally, we explore and compare the usage of depth information in conjunction with a traditional RGB sensor array and present novel implementations of a wrist locating method.

    Sensing of complex buildings and reconstruction into photo-realistic 3D models

    The 3D reconstruction of indoor and outdoor environments has attracted interest only recently, as companies began to recognize that using reconstructed models is a way to generate revenue through location-based services and advertisements. A great amount of research has been done in the field of 3D reconstruction, and one of the latest and most promising applications is Kinect Fusion, which was developed by Microsoft Research. Its strong points are the real-time intuitive 3D reconstruction, interactive frame rate, the level of detail in the models, and the availability of the hardware and software for researchers and enthusiasts. A representative effort towards 3D reconstruction is the Point Cloud Library (PCL). PCL is a large-scale, open project for 2D/3D image and point cloud processing. In December 2011, PCL made available an implementation of Kinect Fusion, namely KinFu. KinFu emulates the functionality provided in Kinect Fusion. However, both implementations have two major limitations: 1. The real-time reconstruction takes place only within a cube with a size of 3 meters per axis. The cube's position is fixed at the start of execution, and any object outside of this cube is not integrated into the reconstructed model. Therefore the volume that can be scanned is always limited by the size of the cube. It is possible to manually align many small-size cubes into a single large model; however, this is a time-consuming and difficult task, especially when the meshes have complex topologies and high polygon counts, as is the case with the meshes obtained from KinFu. 2. The output mesh does not have any color textures. There are some attempts to add color to the output point cloud; however, the resulting effect is not photo-realistic. Applying photo-realistic textures to a model can enhance the user experience, even when the model has a simple topology.
The main goal of this project is to design and implement a system that captures large indoor environments and generates large photo-realistic 3D indoor models in real time. This report describes an extended version of the KinFu system. The extensions overcome the scalability and texture reconstruction limitations using commodity hardware and open-source software. The complete hardware setup used in this project is worth €2,000, which is comparable to the cost of a single professional laser scanner. The software is released under the BSD license, which makes it completely free to use and commercialize. The system has been integrated into the open-source PCL project. The immediate benefits are three-fold: the system becomes a potential industry standard, it is maintained and extended by many developers around the world at no additional cost to the VCA group, and it can reduce application development time by reusing numerous state-of-the-art algorithms.