
    RGB-D datasets using Microsoft Kinect or similar sensors: a survey

    RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It combines the advantages of the color image, which provides appearance information about an object, with those of the depth image, which is immune to variations in color, illumination, rotation angle, and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially designed for gaming and later became a popular device for computer vision, high-quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, and these are of great importance for benchmarking the state of the art. In this paper, we systematically survey popular RGB-D datasets for different applications, including object recognition, scene classification, hand gesture recognition, 3D simultaneous localization and mapping, and pose estimation. We provide insights into the characteristics of each important dataset and compare the popularity and difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description of the available RGB-D datasets and thus to guide researchers in the selection of suitable datasets for evaluating their algorithms.
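
    As background for how such RGB-D data is typically consumed, the sketch below back-projects a depth image into a camera-frame point cloud using pinhole intrinsics. It is a minimal illustration only; the intrinsic values are Kinect-v1-like placeholders, not taken from any particular dataset in the survey.

        import numpy as np

        def depth_to_point_cloud(depth, fx, fy, cx, cy):
            """Back-project a depth image (meters) into camera-frame 3D points."""
            h, w = depth.shape
            u, v = np.meshgrid(np.arange(w), np.arange(h))
            x = (u - cx) * depth / fx
            y = (v - cy) * depth / fy
            points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
            return points[points[:, 2] > 0]  # drop invalid (zero-depth) pixels

        # Synthetic depth frame and illustrative Kinect-v1-like intrinsics.
        depth = np.random.uniform(0.5, 4.0, (480, 640)).astype(np.float32)
        cloud = depth_to_point_cloud(depth, fx=525.0, fy=525.0, cx=319.5, cy=239.5)
        print(cloud.shape)  # (N, 3)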

    Interactive Spaces: Natural interfaces supporting gestures and manipulations in interactive spaces

    This doctoral dissertation focuses on the development of interactive spaces through the use of natural interfaces based on gestures and manipulative actions. In the real world, people use their senses to perceive the external environment, and they use manipulations and gestures to explore the world around them, communicate, and interact with other individuals. From this perspective, the use of natural interfaces that exploit human sensorial and explorative abilities helps fill the gap between the physical and digital worlds. In the first part of this thesis we describe the work done to improve interfaces and devices for tangible, multi-touch, and free-hand interaction. The idea is to design devices that work even in uncontrolled environments, and in situations where control is mostly physical, so that even the least experienced users can express their manipulative exploration and gesture communication abilities. We also analyze how these techniques can be combined to create an interactive space specifically designed for teamwork, in which the natural interfaces are distributed in order to encourage collaboration. We then give some examples of how these interactive scenarios can host various types of applications, facilitating, for instance, the exploration of 3D models, the enjoyment of multimedia contents, and social interaction. Finally, we discuss our results and put them in a wider context, focusing our attention particularly on how the proposed interfaces actually improve people’s lives and activities, and on how the interactive spaces become places of aggregation where we can pursue objectives that are both personal and shared with others.

    Computational interaction techniques for 3D selection, manipulation and navigation in immersive VR

    3D interaction provides a natural interplay for HCI. Many techniques involving diverse sets of hardware and software components have been proposed, which has generated an explosion of Interaction Techniques (ITes), Interactive Tasks (ITas), and input devices, thus increasing the heterogeneity of tools in 3D User Interfaces (3DUIs). Moreover, most of those techniques are based on general formulations that fail to fully exploit human capabilities for interaction. This is because while 3D interaction enables naturalness, it also produces complexity and limitations when using 3DUIs. In this thesis, we aim to generate approaches that better exploit human capabilities for interaction by combining human factors, mathematical formalizations, and computational methods. Our approach focuses on the exploration of the close coupling between specific ITes and ITas while addressing common issues of 3D interaction. We specifically focus on the stages of interaction within Basic Interaction Tasks (BITas), i.e., data input, manipulation, navigation, and selection. Common limitations of these tasks are: (1) the complexity of mapping generation for input devices, (2) fatigue in mid-air object manipulation, (3) space constraints in VR navigation; and (4) low accuracy in 3D mid-air selection. Along with two chapters of introduction and background, this thesis presents five main works. Chapter 3 focuses on the design of mid-air gesture mappings based on human tacit knowledge. Chapter 4 presents a solution to address user fatigue in mid-air object manipulation. Chapter 5 addresses space limitations in VR navigation. Chapter 6 describes an analysis and a correction method for the drift effects involved in scale-adaptive VR navigation; and Chapter 7 presents a hybrid 3D/2D technique that allows for precise selection of virtual objects in highly dense environments (e.g., point clouds). Finally, we conclude by discussing how the contributions obtained from this exploration provide techniques and guidelines for designing more natural 3DUIs.
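
    To make the mid-air selection problem concrete: a common baseline for picking a point in a dense cloud is ray-casting from the hand or controller, keeping candidates inside a narrow cone around the pointing ray and choosing the nearest one. The sketch below is a generic illustration of that baseline, not the hybrid 3D/2D technique from Chapter 7; the cone angle and data are assumptions.

        import numpy as np

        def pick_point(cloud, ray_origin, ray_dir, max_angle_deg=2.0):
            """Ray-cast selection: index of the cloud point closest to a pointing ray.

            Candidates are limited to a narrow cone around the ray; among them the
            point nearest to the origin wins, a common disambiguation heuristic
            for dense point clouds.
            """
            d = ray_dir / np.linalg.norm(ray_dir)
            rel = cloud - ray_origin                      # vectors origin -> points
            along = rel @ d                               # distance along the ray
            perp = np.linalg.norm(rel - np.outer(along, d), axis=1)
            in_cone = (along > 0) & (perp < along * np.tan(np.radians(max_angle_deg)))
            if not in_cone.any():
                return None
            idx = np.where(in_cone)[0]
            return idx[np.argmin(along[idx])]             # nearest candidate wins

        cloud = np.random.rand(10000, 3) * 2 - 1
        sel = pick_point(cloud, ray_origin=np.zeros(3), ray_dir=np.array([0.0, 0.0, 1.0]))
        print(sel)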

    A Survey of Applications and Human Motion Recognition with Microsoft Kinect

    Microsoft Kinect, a low-cost motion sensing device, enables users to interact with computers or game consoles naturally through gestures and spoken commands, without any other peripheral equipment. As such, it has attracted intense interest in research and development on the Kinect technology. In this paper, we present a comprehensive survey of Kinect applications and of the latest research and development on motion recognition using data captured by the Kinect sensor. On the applications front, we review the applications of the Kinect technology in a variety of areas, including healthcare, education and performing arts, robotics, sign language recognition, retail services, workplace safety training, and 3D reconstruction. On the technology front, we provide an overview of the main features of both versions of the Kinect sensor together with the depth sensing technologies used, and review the literature on human motion recognition techniques used in Kinect applications. We provide a classification of motion recognition techniques to highlight the different approaches used in human motion recognition. Furthermore, we compile a list of publicly available Kinect datasets. These datasets are valuable resources for researchers investigating better methods for human motion recognition and lower-level computer vision tasks such as segmentation, object detection, and human pose estimation.
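
    On the technique side, many of the surveyed motion-recognition methods start from per-frame skeleton features such as joint angles. A minimal sketch of that preprocessing step, with made-up joint coordinates standing in for a Kinect skeleton frame:

        import numpy as np

        def joint_angle(a, b, c):
            """Angle at joint b (degrees) formed by 3D points a-b-c."""
            v1, v2 = a - b, c - b
            cosang = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
            return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

        # Hypothetical shoulder-elbow-wrist positions from one skeleton frame (meters).
        shoulder = np.array([0.00, 1.40, 2.00])
        elbow    = np.array([0.25, 1.15, 2.00])
        wrist    = np.array([0.45, 1.35, 2.00])
        print(f"elbow flexion: {joint_angle(shoulder, elbow, wrist):.1f} deg")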

    Application-driven visual computing towards Industry 4.0 (2018)

    The thesis gathers contributions in three fields: 1. Interactive Virtual Agents (IVAs): autonomous, modular, scalable, ubiquitous, and attractive to the user. These IVAs can interact with users in a natural way. 2. Immersive VR/AR environments: VR in production planning, product design, process simulation, testing, and verification. The Virtual Operator shows how VR and co-bots can work in a safe environment; in the Augmented Operator, AR shows relevant information to the worker in a non-intrusive way. 3. Interactive management of 3D models: online management and visualization of multimedia CAD models, through automatic conversion of CAD models to the Web. Web3D technology enables the visualization of and interaction with these models on low-power mobile devices. In addition, these contributions have made it possible to analyze the challenges posed by Industry 4.0; the thesis has contributed a proof of concept for some of those challenges in human factors, simulation, visualization, and model integration.

    Computer Vision-Based Hand Tracking and 3D Reconstruction as a Human-Computer Input Modality with Clinical Application

    The recent pandemic has impeded patients with hand injuries from connecting in person with their therapists. To address this challenge and improve hand telerehabilitation, we propose two computer vision-based technologies, photogrammetry and augmented reality, as alternative and affordable solutions for visualization and remote monitoring of hand trauma without costly equipment. In this thesis, we extend the application of 3D rendering and a virtual reality-based user interface to hand therapy. We compare the performance of four popular photogrammetry software packages in reconstructing a 3D model of a synthetic human hand from videos captured with a smartphone, comparing the visual quality, reconstruction time, and geometric accuracy of the output meshes. Reality Capture produces the best result, with the output mesh having the lowest error (1 mm) and a total reconstruction time of 15 minutes. We developed an augmented reality app using MediaPipe algorithms that extracts hand key points, finger joint coordinates, and angles in real time from hand images or live stream media. We conducted a study to investigate its input variability and validity as a reliable tool for remote assessment of finger range of motion. The intraclass correlation coefficient between DIGITS and in-person measurements is 0.767–0.81 for finger extension and 0.857–0.958 for finger flexion. Finally, we developed and surveyed the usability of a mobile application that collects patient data (medical history, self-reported pain levels, and hand 3D models) and transfers them to therapists. These technologies can improve hand telerehabilitation, aid clinicians in monitoring hand conditions remotely, and support decisions on appropriate therapy, medication, and hand orthoses.
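
    For context on what such an app computes, the sketch below uses MediaPipe's hand-landmark solution (the Python mediapipe package, legacy solutions API) to estimate one finger joint angle from a single image. This is only an illustration of the underlying API, not the DIGITS app itself; the input file name is hypothetical.

        import cv2
        import mediapipe as mp
        import numpy as np

        def xyz(lm):
            return np.array([lm.x, lm.y, lm.z])

        def angle(a, b, c):
            """Angle at joint b (degrees) formed by points a-b-c."""
            v1, v2 = a - b, c - b
            cos = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
            return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

        hands = mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1)
        image = cv2.imread("hand.jpg")  # hypothetical input image
        results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            lm = results.multi_hand_landmarks[0].landmark
            # Index-finger PIP angle from landmarks 5 (MCP), 6 (PIP), 7 (DIP).
            pip = angle(xyz(lm[5]), xyz(lm[6]), xyz(lm[7]))
            print(f"index PIP angle: {pip:.1f} deg")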

    Pictures in Your Mind: Using Interactive Gesture-Controlled Reliefs to Explore Art

    Tactile reliefs offer many benefits over the more classic raised-line drawings or tactile diagrams, as depth, 3D shape, and surface textures are directly perceivable. Although often created for blind and visually impaired (BVI) people, a wider range of people may benefit from such multimodal material. However, some reliefs are still difficult to understand without proper guidance or accompanying verbal descriptions, hindering autonomous exploration. In this work, we present a gesture-controlled interactive audio guide (IAG) based on recent low-cost depth cameras that can be operated directly with the hands on relief surfaces during tactile exploration. The interactively explorable, location-dependent verbal and captioned descriptions promise rapid tactile accessibility to 2.5D spatial information in home or education settings, to online resources, or as kiosk installations in public places. We present a working prototype, discuss design decisions, and present the results of two evaluation studies: the first with 13 BVI test users, and a follow-up study with 14 test users across a wide range of people with differences and difficulties associated with perception, memory, cognition, and communication. The participant-led research method of this latter study prompted new, significant, and innovative developments.
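
    One plausible way a depth-camera audio guide of this kind can localize a touch is to compare each live depth frame against a pre-captured reference depth map of the relief and flag pixels where a fingertip rests just above the surface. The sketch below illustrates that idea; the threshold values and synthetic frames are assumptions, not details from the paper.

        import numpy as np

        def detect_touch(depth, surface, touch_mm=10, min_pixels=25):
            """Find where a live depth frame sits just above a reference depth map
            of the relief, i.e. a fingertip resting on the surface."""
            above = surface - depth                   # distance in front of the relief
            touching = (above > 0) & (above < touch_mm)
            if touching.sum() < min_pixels:           # reject sensor noise
                return None
            ys, xs = np.nonzero(touching)
            return int(xs.mean()), int(ys.mean())     # centroid = touch location

        # Hypothetical frames in millimeters; a real system streams from the camera.
        surface = np.full((480, 640), 800.0)
        frame = surface.copy()
        frame[200:210, 300:310] -= 5.0                # fingertip 5 mm above the relief
        print(detect_touch(frame, surface))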

    tCAD: a 3D modeling application on a depth-enhanced tabletop computer

    Tabletop computers featuring multi-touch input and object tracking are a common platform for research on Tangible User Interfaces (also known as Tangible Interaction). However, such systems are confined to sensing activity on the tabletop surface, disregarding the rich and relatively unexplored interaction canvas above the tabletop. This dissertation contributes tCAD, a 3D modeling tool combining fiducial marker tracking, finger tracking, and depth sensing in a single system. It presents the technical details of how these features were integrated, attesting to their viability through the design, development, and early evaluation of the tCAD application. A key aspect of this work is a description of the interaction techniques enabled by merging tracked objects with direct user input on and above a table surface.
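
    Fiducial marker tracking of the kind tCAD relies on is now easy to prototype; the sketch below detects ArUco markers in a tabletop camera frame and reports their centers. It assumes OpenCV >= 4.7 with the contrib modules, which postdates the original tCAD system, so treat it as a generic illustration rather than the dissertation's actual pipeline.

        import cv2

        # OpenCV >= 4.7 ArUco API; requires opencv-contrib-python.
        dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
        detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())

        frame = cv2.imread("tabletop.png")            # hypothetical tabletop frame
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        corners, ids, _ = detector.detectMarkers(gray)

        if ids is not None:
            for marker_id, quad in zip(ids.flatten(), corners):
                center = quad[0].mean(axis=0)         # quad: (1, 4, 2) corner array
                print(f"marker {marker_id} at {center}")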

    Object recognition for an autonomous wheelchair equipped with an RGB-D camera

    This thesis was carried out within a project of the AR Lab (Autonomous Robot Laboratory) and the IAS-Lab (Intelligent Autonomous Systems Lab) of Shanghai Jiao Tong University and the University of Padua, respectively. The project aims to create a system that recognizes and localizes multiple object classes for an autonomous wheelchair called JiaoLong and, more generally, for a mobile robot. The main objective of the thesis was the creation of an object recognition and localization system for indoor environments using an RGB-D sensor. The approach we followed recognizes the object using a 2D algorithm and identifies its location and size using 3D information. This helps obtain robust performance in the recognition step and accurate estimates in the localization step, so that the behavior of the robot can change in accordance with the class and the location of the object in the room. The thesis is mainly based on two aspects:
    • the creation of a 2D module to recognize and detect the object in an RGB image;
    • the creation of a 3D module to filter the point cloud and estimate the pose and size of the object.
    We used the Bag of Features algorithm for object recognition and a variation of the Constellation Method algorithm for detection; the 3D data are processed with several filtering algorithms that enable a 3D analysis of the object, and the intrinsic information of the point cloud is then used for pose and size estimation. We also analyze the performance of the algorithm and propose some improvements aimed at increasing the overall performance of the system, as well as research directions that this project could lead to.
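
    Since the 2D module is built on Bag of Features, a compact sketch of that pipeline may help: local descriptors are clustered into a visual vocabulary, each image becomes a word-frequency histogram, and a classifier is trained on those histograms. ORB, k-means, and a linear SVM below are stand-ins (the thesis does not specify these exact components), with placeholder random images so the sketch runs end to end.

        import cv2
        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.svm import LinearSVC

        orb = cv2.ORB_create(nfeatures=500)

        def descriptors(img):
            """ORB descriptors of one image (empty array if none are found)."""
            _, desc = orb.detectAndCompute(img, None)
            return desc.astype(np.float32) if desc is not None else np.zeros((0, 32), np.float32)

        def bof_histogram(img, vocab):
            """Quantize descriptors against the visual vocabulary; return a
            normalized word-frequency histogram (the image's BoF signature)."""
            desc = descriptors(img)
            if len(desc) == 0:
                return np.zeros(vocab.n_clusters)
            hist = np.bincount(vocab.predict(desc), minlength=vocab.n_clusters).astype(float)
            return hist / hist.sum()

        # Placeholder data; a real system trains on labeled object crops.
        rng = np.random.default_rng(0)
        train_images = [rng.integers(0, 255, (128, 128), dtype=np.uint8) for _ in range(6)]
        train_labels = [0, 0, 0, 1, 1, 1]

        vocab = KMeans(n_clusters=50, n_init=4).fit(
            np.vstack([descriptors(im) for im in train_images]))
        X = np.array([bof_histogram(im, vocab) for im in train_images])
        clf = LinearSVC().fit(X, train_labels)
        print(clf.predict([bof_histogram(train_images[0], vocab)]))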