
    RGBD Datasets: Past, Present and Future

    Since the launch of the Microsoft Kinect, scores of RGBD datasets have been released. These have propelled advances in areas from reconstruction to gesture recognition. In this paper we explore the field, reviewing datasets across eight categories: semantics, object pose estimation, camera tracking, scene reconstruction, object tracking, human actions, faces and identification. By extracting relevant information in each category we help researchers to find appropriate data for their needs, and we consider which datasets have succeeded in driving computer vision forward and why. Finally, we examine the future of RGBD datasets. We identify key areas which are currently underexplored, and suggest that future directions may include synthetic data and dense reconstructions of static and dynamic scenes. (Comment: 8 pages excluding references, CVPR style.)

    Multimodal human hand motion sensing and analysis - a review


    Fusion of pose and head tracking data for immersive mixed-reality application development

    This work addresses the creation of a development framework where application developers can create, in a natural way, immersive physical activities in which users experience a 3D first-person perception of full-body control. The proposed framework is based on commercial motion sensors and a Head-Mounted Display (HMD), and uses Unity 3D as a unifying environment where user pose, the virtual scene and immersive visualization functions are coordinated. Our proposal is exemplified by the development of a toy application showing its practical use.
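
    A minimal sketch of the coordination step such a framework performs, written in Python rather than the Unity 3D/C# environment the paper targets: a body-tracking sensor supplies the head position, the HMD supplies head orientation, and the two are fused into one first-person camera pose. All names and the data layout are assumptions for illustration.

    import numpy as np

    def fuse_first_person_pose(head_joint_pos, hmd_quaternion, eye_offset=0.08):
        """Place the virtual camera at the tracked head joint, oriented by the HMD.

        head_joint_pos: (3,) head position from the body-tracking sensor, metres.
        hmd_quaternion: (4,) unit quaternion (w, x, y, z) from the HMD.
        eye_offset:     forward offset from head joint to eye midpoint, metres.
        """
        w, x, y, z = hmd_quaternion
        # Rotation matrix corresponding to the HMD orientation quaternion.
        R = np.array([
            [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
            [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
            [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
        ])
        forward = R @ np.array([0.0, 0.0, 1.0])   # HMD forward axis
        camera_pos = np.asarray(head_joint_pos) + eye_offset * forward
        return camera_pos, R

    # Each frame: read both devices, then pose the virtual camera accordingly.
    pos, rot = fuse_first_person_pose([0.0, 1.7, 0.0], [1.0, 0.0, 0.0, 0.0])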

    An advanced virtual dance performance evaluator

    The ever-increasing availability of high-speed Internet access has led to a leap in technologies that support real-time, realistic interaction between humans in online virtual environments. In the context of this work, we wish to realise the vision of an online dance studio where a dance class is provided by an expert dance teacher and delivered to online students via the web. In this paper we study some of the technical issues that need to be addressed in this challenging scenario. In particular, we describe an automatic dance analysis tool that can be used to evaluate a student's performance and provide him/her with meaningful feedback to aid improvement.
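
    The abstract does not detail the evaluation pipeline, but a common way to score a student's motion against a teacher's reference is dynamic time warping over per-frame skeleton vectors; the Python sketch below illustrates that alignment problem, not the paper's actual method.

    import numpy as np

    def dtw_distance(student, teacher):
        """student, teacher: (T, D) arrays of flattened joint coordinates."""
        n, m = len(student), len(teacher)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = np.linalg.norm(student[i - 1] - teacher[j - 1])
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return D[n, m] / (n + m)   # length-normalised alignment cost

    # Lower scores mean the student's motion tracks the teacher's more closely,
    # even when the two performances drift out of tempo with each other.
    score = dtw_distance(np.random.rand(120, 45), np.random.rand(100, 45))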

    Jester: A Device Abstraction and Data Fusion API for Skeletal Tracking

    Humans naturally interact with the world in three dimensions. Traditionally, personal computers have relied on 2D mice for input because 3D user-tracking systems were cumbersome and expensive. Recently, 3D input hardware has become accurate and affordable enough to be marketed to average consumers and integrated into niche applications. Presently, 3D application developers must learn a different API for each device their software will support, and there is no simple way to integrate sensor data if the system has multiple 3D input devices. This thesis presents Jester, a library designed to simplify the development and improve the accuracy of 3D input-supported applications by providing an easily extensible set of sensor wrappers that abstract the hardware-specific details of capturing skeletal data, and by fusing sensor data in systems with multiple 3D input devices. Jester's capabilities are demonstrated by creating a toy application that uses a PrimeSense Carmine and a Leap Motion Controller to provide full-body and finger skeletal tracking. Jester was able to fuse the data in real time while using the Carmine's data to compensate for ambiguity in the Leap's tracking.
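
    A Python sketch of the abstraction pattern the thesis describes: hardware-specific wrappers emit skeletons in one shared format, and a fuser merges them, with the body tracker anchoring the skeleton and the finger tracker refining the hand. Class and joint names here are illustrative, not Jester's real API.

    from abc import ABC, abstractmethod

    class SensorWrapper(ABC):
        @abstractmethod
        def poll(self) -> dict:
            """Return {joint_name: (x, y, z)} in a shared world frame."""

    class BodySensor(SensorWrapper):          # e.g. a PrimeSense-style camera
        def poll(self):
            return {"head": (0.0, 1.7, 2.0), "hand_r": (0.3, 1.1, 1.8)}

    class FingerSensor(SensorWrapper):        # e.g. a Leap-style controller
        def poll(self):
            return {"hand_r": (0.31, 1.12, 1.79), "index_r": (0.35, 1.15, 1.75)}

    def fuse(body: SensorWrapper, fingers: SensorWrapper) -> dict:
        """Merge skeletons: the body tracker's hand position resolves ambiguity
        in the finger tracker, mirroring the Carmine/Leap pairing above."""
        skeleton = body.poll()
        finger_data = fingers.poll()
        if "hand_r" in skeleton and "hand_r" in finger_data:
            skeleton.update(finger_data)      # adopt the finer-grained joints
        return skeleton

    print(fuse(BodySensor(), FingerSensor()))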

    Towards the Design of a Natural User Interface for Performing and Learning Musical Gestures

    A large variety of musical instruments, whether acoustic or digital, are based on a keyboard scheme. Keyboard instruments can produce sounds through acoustic means, but in today's music they are increasingly used to control digital sound synthesis processes. Interestingly, for all the different possibilities of sonic outcome, the input remains a musical gesture. In this paper we present the conceptualization of a Natural User Interface (NUI), named the Intangible Musical Instrument (IMI), aiming to support both the learning of expert musical gestures and the performing of music as a unified user experience. The IMI is designed to recognize metaphors of pianistic gestures, focusing on subtle uses of the fingers and upper body. Based on a typology of musical gestures, a gesture vocabulary has been created and hierarchized from basic to complex. These piano-like gestures are then recognized and transformed into sounds.
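
    A minimal Python sketch of a basic-to-complex gesture vocabulary of the kind the abstract describes: complex gestures are defined as sequences of basic ones, and a recognised gesture triggers a sound. The gesture names and recogniser interface are assumptions, not the IMI's actual vocabulary.

    BASIC = {"press", "lift", "glide"}                 # atomic finger gestures
    COMPLEX = {
        "trill":     ["press", "lift", "press", "lift"],
        "glissando": ["press", "glide", "lift"],
    }

    def match_complex(recent):
        """Return the first complex gesture whose sequence just completed."""
        for name, pattern in COMPLEX.items():
            if recent[-len(pattern):] == pattern:
                return name
        return None

    def on_gesture(history, new_gesture, play_sound):
        history.append(new_gesture)
        found = match_complex(history)
        play_sound(found if found else new_gesture)    # complex overrides basic

    history = []
    for g in ["press", "lift", "press", "lift"]:
        on_gesture(history, g, play_sound=print)       # prints 'trill' last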

    MIFTel: a multimodal interactive framework based on temporal logic rules

    Human-computer and multimodal interaction are increasingly used in everyday life. Machines are able to get more from the surrounding world, assisting humans in different application areas. In this context, the correct processing and management of signals provided by the environment is crucial for structuring the data. Different sources and acquisition times can be exploited to improve recognition results. On the basis of these assumptions, we propose a multimodal system that exploits Allen's temporal logic combined with a prediction method. The main objective is to correlate the user's events with the system's reactions. After post-processing incoming data from different signal sources (RGB images, depth maps, sounds, proximity sensors, etc.), the system manages the correlations between recognition/detection results and events in real time to create an interactive environment for the user. To increase recognition reliability, a predictive model is also associated with the proposed method. The modularity of the system allows fully dynamic development and upgrading with custom modules. Finally, a comparison with other similar systems is shown, underlining the high flexibility and robustness of the proposed event-management method.
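
    A Python sketch of the Allen-style temporal reasoning such a system builds on: classifying how two timestamped events relate, so that, for example, a detected sound can be correlated with a gesture it occurred "during". The event representation is an assumption for illustration; the paper's rule engine is not shown.

    from collections import namedtuple

    Interval = namedtuple("Interval", "start end")

    def allen_relation(a: Interval, b: Interval) -> str:
        """Return one of Allen's thirteen interval relations (a relative to b)."""
        if a.end < b.start:   return "before"
        if a.end == b.start:  return "meets"
        if a.start == b.start and a.end == b.end: return "equals"
        if a.start == b.start: return "starts" if a.end < b.end else "started-by"
        if a.end == b.end:    return "finishes" if a.start > b.start else "finished-by"
        if b.start < a.start and a.end < b.end:   return "during"
        if a.start < b.start and b.end < a.end:   return "contains"
        if a.start < b.start: return "overlaps"
        if b.end < a.start:   return "after"
        if b.end == a.start:  return "met-by"
        return "overlapped-by"

    gesture = Interval(2.0, 5.0)           # from the RGB/depth pipeline
    sound   = Interval(3.0, 4.0)           # from the audio pipeline
    print(allen_relation(sound, gesture))  # -> "during"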