544 research outputs found

    Spatial data acquisition, integration, and modeling for real-time project life-cycle applications

    Get PDF
    Current methods for site modeling employs expensive laser range scanners that produce dense point clouds which require hours or days of post-processing to arrive at a finished model. While these methods produce very detailed models of the scanned scene, useful for obtaining as-built drawings of existing structures, the associated computational time burden precludes the methods from being used onsite for real-time decision-making. Moreover, in many project life-cycle applications, detailed models of objects are not needed. Results of earlier research conducted by the authors demonstrated novel, highly economical methods that reduce data acquisition time and the need for computationally intensive processing. These methods enable complete local area modeling in the order of a minute, and with sufficient accuracy for applications such as advanced equipment control, simple as-built site modeling, and real-time safety monitoring for construction equipment. This paper describes a research project that is investigating novel ways of acquiring, integrating, modeling, and analyzing project site spatial data that do not rely on dense, expensive laser scanning technology and that enable scalability and robustness for real-time, field deployment. Algorithms and methods for modeling objects of simple geometric shape (geometric primitives from a limited number of range points, as well as methods provide a foundation for further development required to address more complex site situations, especially if dynamic site information (motion of personnel and equipment). Field experiments are being conducted to establish performance parameters and validation for the proposed methods and models. Initial experimental work has demonstrated the feasibility of this approach

    Exploring Natural User Abstractions For Shared Perceptual Manipulator Task Modeling & Recovery

    Get PDF
    State-of-the-art domestic robot assistants are essentially autonomous mobile manipulators capable of exerting human-scale precision grasps. To maximize utility and economy, non-technical end-users would need to be nearly as efficient as trained roboticists in control and collaboration of manipulation task behaviors. However, it remains a significant challenge given that many WIMP-style tools require superficial proficiency in robotics, 3D graphics, and computer science for rapid task modeling and recovery. But research on robot-centric collaboration has garnered momentum in recent years; robots are now planning in partially observable environments that maintain geometries and semantic maps, presenting opportunities for non-experts to cooperatively control task behavior with autonomous-planning agents exploiting the knowledge. However, as autonomous systems are not immune to errors under perceptual difficulty, a human-in-the-loop is needed to bias autonomous-planning towards recovery conditions that resume the task and avoid similar errors. In this work, we explore interactive techniques allowing non-technical users to model task behaviors and perceive cooperatively with a service robot under robot-centric collaboration. We evaluate stylus and touch modalities that users can intuitively and effectively convey natural abstractions of high-level tasks, semantic revisions, and geometries about the world. Experiments are conducted with \u27pick-and-place\u27 tasks in an ideal \u27Blocks World\u27 environment using a Kinova JACO six degree-of-freedom manipulator. Possibilities for the architecture and interface are demonstrated with the following features; (1) Semantic \u27Object\u27 and \u27Location\u27 grounding that describe function and ambiguous geometries (2) Task specification with an unordered list of goal predicates, and (3) Guiding task recovery with implied scene geometries and trajectory via symmetry cues and configuration space abstraction. Empirical results from four user studies show our interface was much preferred than the control condition, demonstrating high learnability and ease-of-use that enable our non-technical participants to model complex tasks, provide effective recovery assistance, and teleoperative control

    Advances in Data-Driven Analysis and Synthesis of 3D Indoor Scenes

    Full text link
    This report surveys advances in deep learning-based modeling techniques that address four different 3D indoor scene analysis tasks, as well as synthesis of 3D indoor scenes. We describe different kinds of representations for indoor scenes, various indoor scene datasets available for research in the aforementioned areas, and discuss notable works employing machine learning models for such scene modeling tasks based on these representations. Specifically, we focus on the analysis and synthesis of 3D indoor scenes. With respect to analysis, we focus on four basic scene understanding tasks -- 3D object detection, 3D scene segmentation, 3D scene reconstruction and 3D scene similarity. And for synthesis, we mainly discuss neural scene synthesis works, though also highlighting model-driven methods that allow for human-centric, progressive scene synthesis. We identify the challenges involved in modeling scenes for these tasks and the kind of machinery that needs to be developed to adapt to the data representation, and the task setting in general. For each of these tasks, we provide a comprehensive summary of the state-of-the-art works across different axes such as the choice of data representation, backbone, evaluation metric, input, output, etc., providing an organized review of the literature. Towards the end, we discuss some interesting research directions that have the potential to make a direct impact on the way users interact and engage with these virtual scene models, making them an integral part of the metaverse.Comment: Published in Computer Graphics Forum, Aug 202

    Event-based Vision: A Survey

    Get PDF
    Event cameras are bio-inspired sensors that differ from conventional frame cameras: Instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes, and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in challenging scenarios for traditional cameras, such as low-latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world

    An annotated bibligraphy of multisensor integration

    Get PDF
    technical reportIn this paper we give an annotated bibliography of the multisensor integration literature

    PARIS: Part-level Reconstruction and Motion Analysis for Articulated Objects

    Full text link
    We address the task of simultaneous part-level reconstruction and motion parameter estimation for articulated objects. Given two sets of multi-view images of an object in two static articulation states, we decouple the movable part from the static part and reconstruct shape and appearance while predicting the motion parameters. To tackle this problem, we present PARIS: a self-supervised, end-to-end architecture that learns part-level implicit shape and appearance models and optimizes motion parameters jointly without any 3D supervision, motion, or semantic annotation. Our experiments show that our method generalizes better across object categories, and outperforms baselines and prior work that are given 3D point clouds as input. Our approach improves reconstruction relative to state-of-the-art baselines with a Chamfer-L1 distance reduction of 3.94 (45.2%) for objects and 26.79 (84.5%) for parts, and achieves 5% error rate for motion estimation across 10 object categories. Video summary at: https://youtu.be/tDSrROPCgUcComment: Presented at ICCV 2023. Project website: https://3dlg-hcvc.github.io/paris
    • …
    corecore