66 research outputs found

    Interaction-Driven Active 3D Reconstruction with Object Interiors

    We introduce an active 3D reconstruction method which integrates visual perception, robot-object interaction, and 3D scanning to recover both the exterior and interior, i.e., unexposed, geometries of a target 3D object. Unlike other works in active vision which focus on optimizing camera viewpoints to better investigate the environment, the primary feature of our reconstruction is an analysis of the interactability of various parts of the target object and the ensuing part manipulation by a robot to enable scanning of occluded regions. As a result, an understanding of part articulations of the target object is obtained on top of complete geometry acquisition. Our method operates fully automatically on a Fetch robot with built-in RGBD sensors. It iterates between interaction analysis and interaction-driven reconstruction, scanning and reconstructing detected movable parts one at a time, where both the articulated part detection and mesh reconstruction are carried out by neural networks. In the final step, all the remaining, non-articulated parts, including all the interior structures that had been exposed by prior part manipulations and subsequently scanned, are reconstructed to complete the acquisition. We demonstrate the performance of our method via qualitative and quantitative evaluation, ablation studies, comparisons to alternatives, as well as experiments in a real environment. Comment: Accepted to SIGGRAPH Asia 2023, project page at https://vcc.tech/research/2023/InterReco
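The alternation between interaction analysis and interaction-driven reconstruction can be sketched as a loop. Everything below, the function name, the toy part model, and the stopping rule, is a hypothetical illustration of the described iteration, not the authors' code:

```python
# Hypothetical sketch of the interaction-driven reconstruction loop:
# repeatedly find an unscanned movable part (interaction analysis), actuate
# and reconstruct it, then reconstruct everything that remains, including
# interiors exposed along the way. Parts are modeled as a toy dict here.

def reconstruct(object_parts):
    """object_parts: dict of name -> {"movable": bool, "scanned": bool}."""
    meshes = []
    while True:
        # interaction analysis: pick an unscanned movable part, if any
        movable = [n for n, p in object_parts.items()
                   if p["movable"] and not p["scanned"]]
        if not movable:
            break
        part = movable[0]
        object_parts[part]["scanned"] = True  # actuate + scan + reconstruct
        meshes.append(part)
    # final step: all remaining (non-articulated) geometry in one pass
    remaining = [n for n, p in object_parts.items() if not p["scanned"]]
    meshes.append(remaining)
    return meshes
```

The loop terminates once interaction analysis finds no movable part left, mirroring the paper's "one part at a time" then "final step" structure.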

    From Capture to Display: A Survey on Volumetric Video

    Volumetric video, which offers immersive viewing experiences, is gaining increasing prominence. With its six degrees of freedom, it provides viewers with greater immersion and interactivity than traditional video. Despite this potential, volumetric video services pose significant challenges. This survey conducts a comprehensive review of the existing literature on volumetric video. We first provide a general framework for volumetric video services, followed by a discussion of prerequisites for volumetric video, encompassing representations, open datasets, and quality assessment metrics. We then delve into the current methodologies for each stage of the volumetric video service pipeline, detailing capturing, compression, transmission, rendering, and display techniques. Lastly, we explore various applications enabled by this pioneering technology and present an array of research challenges and opportunities in the domain of volumetric video services. This survey aspires to provide a holistic understanding of this burgeoning field and shed light on potential future research trajectories, aiming to bring the vision of volumetric video to fruition. Comment: Submitted

    Enhanced device-based 3D object manipulation technique for handheld mobile augmented reality

    3D object manipulation is one of the most important tasks for handheld mobile Augmented Reality (AR) to realise its practical potential, especially for real-world assembly support. In this context, the techniques used to manipulate 3D objects are an important research area. This study therefore developed an improved device-based interaction technique within handheld mobile AR interfaces to solve the large-range 3D object rotation problem, as well as issues related to 3D object position and orientation deviations during manipulation. The research first enhanced the existing device-based 3D object rotation technique with an innovative control structure that uses the tilting and skewing amplitudes of the handheld mobile device to determine the rotation axes and directions of the 3D object. Whenever the device is tilted or skewed beyond the threshold amplitudes, the 3D object rotates continuously at a pre-defined angular speed per second, preventing over-rotation of the handheld mobile device. Such over-rotation is a common occurrence when the existing technique is used for large-range 3D object rotations, and it needs to be prevented because it causes 3D object registration errors and a display issue in which the 3D object does not appear consistent within the user's field of view. Secondly, the existing device-based 3D object manipulation technique was restructured by separating the degrees of freedom (DOF) of 3D object translation and rotation, preventing the position and orientation deviations caused by a DOF integration that uses the same control structure for both tasks. The result is an improved device-based interaction technique with better task completion times for 3D object rotation individually and for 3D object manipulation as a whole within handheld mobile AR interfaces.
A pilot test was carried out before the main tests to determine several pre-defined values used in the control structure of the proposed 3D object rotation technique. A series of 3D object rotation and manipulation tasks was then designed and developed as separate experiments to benchmark the proposed 3D object rotation and manipulation techniques against the existing ones on task completion time (s). Two groups of participants aged 19-24, one group per experiment, were recruited, with sixteen participants in each group. Each participant completed twelve trials, for a total of 192 trials per experiment. Repeated-measures analysis was used to analyse the data. The results showed that the developed 3D object rotation technique significantly outperformed the existing technique, with task completion times 2.04 s shorter on easy tasks and 3.09 s shorter on hard tasks, comparing mean times over all successful trials. For the failed trials, the proposed rotation technique was 4.99% more accurate on easy tasks and 1.78% more accurate on hard tasks than the existing technique. Similar results extended to the 3D object manipulation tasks, where the proposed manipulation technique achieved an overall task completion time 9.529 s shorter than the existing technique. Based on these findings, an improved device-based interaction technique has been successfully developed to address the insufficient functionalities of the current technique
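The threshold-based rotation control described above can be sketched as a per-frame update rule. The threshold and angular-speed constants, and all names, are illustrative placeholders, not the thesis's actual pre-defined values:

```python
# Minimal, hypothetical sketch: device tilt/skew amplitudes select the
# rotation axis and direction, and once an amplitude exceeds its threshold
# the 3D object rotates continuously at a fixed pre-defined angular speed.
# Rotating the object at a fixed speed, rather than 1:1 with the device,
# avoids over-rotating the device itself (a cause of registration error).

TILT_THRESHOLD_DEG = 15.0   # amplitude beyond which rotation starts (assumed)
ANGULAR_SPEED_DPS = 30.0    # pre-defined angular speed, degrees/second (assumed)

def rotation_step(tilt_deg, skew_deg, dt):
    """Return (axis, signed angle in degrees) to apply this frame."""
    if abs(tilt_deg) > TILT_THRESHOLD_DEG:
        return ("x", ANGULAR_SPEED_DPS * dt * (1 if tilt_deg > 0 else -1))
    if abs(skew_deg) > TILT_THRESHOLD_DEG:
        return ("y", ANGULAR_SPEED_DPS * dt * (1 if skew_deg > 0 else -1))
    return (None, 0.0)  # below both thresholds: object stays put
```

Calling `rotation_step` once per frame with the current device amplitudes yields the continuous, bounded-speed rotation the control structure describes.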

    To Affinity and Beyond: Interactive Digital Humans as a Human Computer Interface

    The field of human-computer interaction is increasingly exploring the use of more natural, human-like user interfaces to build intelligent agents that aid in everyday life. This is coupled with a move towards people using ever more realistic avatars to represent themselves in their digital lives. As the ability to produce emotionally engaging digital human representations is only now becoming technically possible, there is little research into how to approach such tasks, owing to both technical complexity and operational implementation cost. This is now changing: we are at a nexus point, with new approaches, faster graphics processing, and enabling technologies in machine learning and computer vision becoming available. I articulate what is required for such digital humans to be considered as lying successfully on the far side of the phenomenon known as the Uncanny Valley. My results show that a complex mix of perceived and contextual aspects affects how people make sense of digital humans, and they highlight previously undocumented effects of interactivity on affinity. Users are willing to accept digital humans as a new form of user interface, and they react to them emotionally in previously unanticipated ways. My research shows that it is possible to build an effective interactive digital human that crosses the Uncanny Valley. I directly explore what is required to build a visually realistic digital human as a primary research question, and I examine whether such a realistic face provides sufficient benefit to justify the challenges involved in building it. I conducted a Delphi study to inform the research approaches and then produced a complex digital human character based on these insights. This interactive and realistic digital human avatar represents a major technical undertaking involving multiple teams around the world. Finally, I explored a framework for examining the ethical implications and signpost future research areas

    On Practical Sampling of Bidirectional Reflectance


    State of the Art on Diffusion Models for Visual Computing

    The field of visual computing is rapidly advancing due to the emergence of generative artificial intelligence (AI), which unlocks unprecedented capabilities for the generation, editing, and reconstruction of images, videos, and 3D scenes. In these domains, diffusion models are the generative AI architecture of choice. Within the last year alone, the literature on diffusion-based tools and applications has seen exponential growth, with relevant papers published across the computer graphics, computer vision, and AI communities and new works appearing daily on arXiv. This rapid growth of the field makes it difficult to keep up with all recent developments. The goal of this state-of-the-art report (STAR) is to introduce the basic mathematical concepts of diffusion models and the implementation details and design choices of the popular Stable Diffusion model, and to give an overview of important aspects of these generative AI tools, including personalization, conditioning, and inversion, among others. Moreover, we give a comprehensive overview of the rapidly growing literature on diffusion-based generation and editing, categorized by the type of generated medium, including 2D images, videos, 3D objects, locomotion, and 4D scenes. Finally, we discuss available datasets, metrics, open challenges, and social implications. This STAR provides an intuitive starting point from which researchers, artists, and practitioners alike can explore this exciting topic
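As a taste of the basic mathematical concepts such a report introduces, the DDPM-style forward (noising) process samples x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε with ε ~ N(0, I). A minimal sketch, with an illustrative linear variance schedule (the schedule values are a common choice, not taken from the report):

```python
# Sketch of the diffusion forward process: progressively noise a clean
# sample x_0 toward pure Gaussian noise under a variance schedule betas.

import numpy as np

def forward_diffuse(x0, t, betas, rng):
    """Sample x_t ~ q(x_t | x_0) for a DDPM-style variance schedule."""
    alpha_bar = np.cumprod(1.0 - betas)[t]   # cumulative product of alphas
    eps = rng.standard_normal(x0.shape)      # Gaussian noise
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

betas = np.linspace(1e-4, 0.02, 1000)        # illustrative linear schedule
```

At small t the output stays close to x_0; by the final step ᾱ_t is nearly zero and the sample is essentially pure noise, which is what the learned reverse process is trained to invert.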

    Enabling symmetric collaboration in public spaces through 3D mobile interaction

    © 2018 by the authors. Collaboration has been common in workplaces in various engineering settings and in our daily activities. However, how to effectively engage collaborators with collaborative tasks has long been an issue due to various situational and technical constraints. The research in this paper addresses the issue in a specific scenario, which is how to enable users to interact with public information from their own perspective. We describe a 3D mobile interaction technique that allows users to collaborate with other people by creating a symmetric and collaborative ambience. This in turn can increase their engagement with public displays. In order to better understand the benefits and limitations of this technique, we conducted a usability study with a total of 40 participants. The results indicate that the 3D mobile interaction technique promotes collaboration between users and also improves their engagement with the public displays


    Big Data Computing for Geospatial Applications

    The convergence of big data and geospatial computing has brought both challenges and opportunities to Geographic Information Science with regard to geospatial data management, processing, analysis, modeling, and visualization. This book highlights recent advancements in integrating new computing approaches, spatial methods, and data management strategies to tackle geospatial big data challenges, while also demonstrating opportunities for using big data in geospatial applications. Crucial to the advancements highlighted in this book are the integration of computational thinking with spatial thinking and the transformation of abstract ideas and models into concrete data structures and algorithms

    Let's Walk Up and Play! Design and Evaluation of Collaborative Interactive Musical Experiences for Public Settings

    This thesis focuses on the design and evaluation of interactive music systems that enable non-experts to experience collaborative music-making in public settings, such as museums, galleries and festivals. Although there has been previous research into music systems for non-experts, there is very limited research on how participants engage with collaborative music environments in public settings. Informed by a detailed assessment of related research, an interactive, multi-person music system is developed, which serves as a vehicle to conduct practice-based research in real-world settings. A central focus of the design is supporting each player's individual sense of control, in order to examine how this relates to their overall playing experience. Drawing on approaches from Human-Computer Interaction (HCI) and interactive art research, a series of user studies is conducted in public settings such as art exhibitions and festivals. Taking into account that the user experience and social dynamics around such new forms of interaction are considerably influenced by the context of use, this systematic assessment in real-world contexts contributes to a richer understanding of how people interact and behave in such new creative spaces. This research makes a number of contributions to the fields of HCI, interactive art and New Interfaces for Musical Expression (NIME). It provides a set of design implications to aid designers of future collaborative music systems. These are based on a number of empirical findings that describe and explain aspects of audience behaviour, engagement and mutual interaction around public, interactive multi-person systems. It provides empirical evidence that there is a correlation between participants' perceived level of control and their sense of creative participation and enjoyment. 
This thesis also develops and demonstrates the application of a mixed-method approach for studying technology-mediated collaborative creativity with live audiences. This research was funded by a Doctoral Studentship from Queen Mary University of London. The studies of this thesis were kindly supported by the Centre for Digital Music EPSRC Platform Grant (EP/E045235/1; EP/K009559/1), and by Hunan University, Changsha, China (Study II). The attendance at ACM Creativity & Cognition 2013 was kindly supported by the EPSRC funded DePIC project (EP/J017205/1)
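The reported link between perceived control and enjoyment is a correlation. Purely as an illustration of the measure involved, not of the study's data, a Pearson coefficient over synthetic ratings can be computed as:

```python
# Pearson correlation coefficient between two rating series. The example
# control/enjoyment numbers below are synthetic, chosen only to illustrate
# the kind of positive association the thesis reports evidence for.

def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

control = [1, 2, 3, 4, 5, 5, 6, 7]   # hypothetical perceived-control ratings
enjoy   = [2, 1, 4, 4, 6, 5, 7, 7]   # hypothetical enjoyment ratings
```

A value near +1 would indicate that players who felt more in control also reported more enjoyment; the thesis's actual analysis and effect sizes are in the full text.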