
    GRChombo : Numerical Relativity with Adaptive Mesh Refinement

    In this work, we introduce GRChombo: a new numerical relativity code which incorporates full adaptive mesh refinement (AMR) using block-structured Berger-Rigoutsos grid generation. The code supports non-trivial "many-boxes-in-many-boxes" mesh hierarchies and massive parallelism through the Message Passing Interface (MPI). GRChombo evolves the Einstein equation using the standard BSSN formalism, with an option to turn on CCZ4 constraint damping if required. The AMR capability permits the study of a range of new physics which has previously been computationally infeasible in a full 3+1 setting, whilst also significantly simplifying the process of setting up the mesh for these problems. We show that GRChombo can stably and accurately evolve standard spacetimes such as binary black hole mergers and scalar collapses into black holes, demonstrate the performance characteristics of our code, and discuss various physics problems which stand to benefit from the AMR technique.
    Comment: 48 pages, 24 figures
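As a rough illustration of how block-structured AMR decides where to refine, the sketch below tags cells where a field's gradient is steep; the tagged set would then be handed to a Berger-Rigoutsos clustering step. The threshold-on-gradient criterion and all parameters here are hypothetical simplifications, not GRChombo's actual regridding condition.

```python
import numpy as np

def tag_cells_for_refinement(phi, dx, threshold):
    """Tag cells whose field gradient magnitude exceeds a threshold.

    Hypothetical illustration: production AMR codes such as GRChombo use
    richer criteria (e.g. truncation-error estimates) and pass the tagged
    cells to Berger-Rigoutsos clustering to build the box hierarchy.
    """
    gx, gy, gz = np.gradient(phi, dx)
    grad_mag = np.sqrt(gx**2 + gy**2 + gz**2)
    return grad_mag > threshold

# Toy example: a Gaussian "blob" triggers refinement near its steep flanks.
x = np.linspace(-1.0, 1.0, 32)
X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
phi = np.exp(-20.0 * (X**2 + Y**2 + Z**2))
tags = tag_cells_for_refinement(phi, x[1] - x[0], threshold=1.0)
print(tags.sum(), "cells tagged out of", tags.size)
```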

    Affecting Fundamental Transformation in Future Construction Work Through Replication of the Master-Apprentice Learning Model in Human-Robot Worker Teams

    Construction robots continue to be increasingly deployed on construction sites to assist human workers in various tasks to improve safety, efficiency, and productivity. Due to the recent and ongoing growth in robot capabilities and functionalities, humans and robots are now able to work side-by-side and share workspaces. However, due to inherent safety and trust-related concerns, human-robot collaboration is subject to strict safety standards that require robot motion and forces to be sensitive to proximate human workers. In addition, construction robots are required to perform construction tasks in unstructured and cluttered environments. The tasks are quasi-repetitive, and robots need to handle unexpected circumstances arising from loose tolerances and discrepancies between as-designed and as-built work. It is therefore impractical to pre-program construction robots or apply optimization methods to determine robot motion trajectories for the performance of typical construction work. This research first proposes a new taxonomy for human-robot collaboration on construction sites, which includes five levels: Pre-Programming, Adaptive Manipulation, Imitation Learning, Improvisatory Control, and Full Autonomy, and identifies the gaps existing in knowledge transfer between humans and assisting robots. In an attempt to address the identified gaps, this research focuses on three key studies: enabling construction robots to estimate their pose ubiquitously within the workspace (Pose Estimation), robots learning to perform construction tasks from human workers (Learning from Demonstration), and robots synchronizing their work plans with human collaborators in real-time (Digital Twin). First, this dissertation investigates the use of cameras as a novel sensor system for estimating the pose of large-scale robotic manipulators relative to the job sites. 
A deep convolutional network human pose estimation algorithm was adapted and fused with sensor-based poses to provide real-time, uninterrupted 6-DOF pose estimates of the manipulator’s components. The network was trained with image datasets collected from a robotic excavator in the laboratory and conventional excavators on construction sites. The proposed system yielded uninterrupted, centimeter-level-accuracy pose estimation for articulated construction robots. Second, this dissertation investigated Robot Learning from Demonstration (LfD) methods to teach robots how to perform quasi-repetitive construction tasks, such as the ceiling tile installation process. LfD methods have the potential to be used in teaching robots specific tasks through human demonstration, such that the robots can then perform the same tasks under different conditions. A visual LfD method and a trajectory LfD method are developed, incorporating the context translation model, Reinforcement Learning, and the generalized cylinders with orientation approach to generate the control policy for the robot to perform subsequent tasks. The evaluation results in the Gazebo robotics simulator confirm the promise and applicability of the LfD methods in teaching robot apprentices to perform quasi-repetitive tasks on construction sites. Third, this dissertation explores a safe working environment for human workers and robots. Robot simulations in online Digital Twins can be used to extend designed construction models, such as BIM (Building Information Models), to the construction phase for real-time monitoring of robot motion planning and control. A bi-directional communication system was developed to bridge robot simulations and physical robots in construction and digital fabrication.
Through empirical studies, the high accuracy of the pose synchronization between physical and virtual robots demonstrated the potential for ensuring safety during proximate human-robot co-work.
PhD, Civil Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/169666/1/cjliang_1.pd
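The fusion of camera-based and sensor-based poses described above can be illustrated, in a heavily simplified form, by inverse-variance weighting of two position estimates. The variances and values below are hypothetical, and the dissertation's actual 6-DOF fusion (including orientation) is more involved.

```python
import numpy as np

def fuse_poses(p_vision, var_vision, p_sensor, var_sensor):
    """Inverse-variance weighted fusion of two position estimates.

    Minimal sketch of combining a camera-based pose with a sensor-based
    pose: the lower-variance source gets the larger weight.
    """
    w_v = 1.0 / var_vision
    w_s = 1.0 / var_sensor
    return (w_v * p_vision + w_s * p_sensor) / (w_v + w_s)

# A noisy vision fix and a more precise sensor fix combine into one estimate.
fused = fuse_poses(np.array([1.0, 2.0, 0.5]), 0.04,
                   np.array([1.1, 1.9, 0.5]), 0.01)
print(fused)
```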

    Data-driven 3D Reconstruction and View Synthesis of Dynamic Scene Elements

    Our world is filled with living beings and other dynamic elements. It is important to record dynamic things and events for the sake of education, archeology, and culture inheritance. From vintage to modern times, people have recorded dynamic scene elements in different ways, from sequences of cave paintings to frames of motion pictures. This thesis focuses on two key computer vision techniques by which dynamic element representation moves beyond video capture: towards 3D reconstruction and view synthesis. Although previous methods on these two aspects have been adopted to model and represent static scene elements, dynamic scene elements present unique and difficult challenges for the tasks. This thesis focuses on three types of dynamic scene elements, namely 1) dynamic texture with static shape, 2) dynamic shapes with static texture, and 3) dynamic illumination of static scenes. Two research aspects will be explored to represent and visualize them: dynamic 3D reconstruction and dynamic view synthesis. Dynamic 3D reconstruction aims to recover the 3D geometry of dynamic objects and, by modeling the objects’ movements, bring 3D reconstructions to life. Dynamic view synthesis, on the other hand, summarizes or predicts the dynamic appearance change of dynamic objects – for example, the daytime-to-nighttime illumination of a building or the future movements of a rigid body. We first target the problem of reconstructing dynamic textures of objects that have (approximately) fixed 3D shape but time-varying appearance. Examples of such objects include waterfalls, fountains, and electronic billboards. Since the appearance of dynamic-textured objects can be random and complicated, estimating the 3D geometry of these objects from 2D images/video requires novel tools beyond the appearance-based point correspondence methods of traditional 3D computer vision. 
To perform this 3D reconstruction, we introduce a method that simultaneously 1) segments dynamically textured scene objects in the input images and 2) reconstructs the 3D geometry of the entire scene, assuming a static 3D shape for the dynamically textured objects. Compared to dynamic textures, the appearance change of dynamic shapes is due to physically defined motions like rigid body movements. In these cases, assumptions can be made about the object’s motion constraints in order to identify corresponding points on the object at different timepoints. For example, two points on a rigid object have constant distance between them in 3D space, no matter how the object moves. Based on this assumption of local rigidity, we propose a robust method to correctly identify point correspondences in two images viewing the same moving object from different viewpoints and at different times. Dense 3D geometry can then be obtained from the computed point correspondences. We apply this method to unsynchronized video streams, and observe that the number of inlier correspondences found by this method can be used as an indicator of frame alignment among the different streams. To model dynamic scene appearance caused by illumination changes, we propose a framework to find a sequence of images that have similar geometric composition to a single reference image and also show a smooth transition in illumination throughout the day. These images can be registered to visualize patterns of illumination change from a single viewpoint. The final topic of this thesis involves predicting the movements of dynamic shapes in the image domain. Towards this end, we propose deep neural network architectures to predict future views of dynamic motions, such as rigid body movements and flowers blooming. Instead of predicting image pixels directly, our methods predict pixel offsets and iteratively synthesize future views.
Doctor of Philosophy
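The local-rigidity assumption (constant pairwise distances on a rigid object) lends itself to a compact sketch: score each candidate correspondence by how many pairwise distances it preserves across the two timepoints. This toy version omits the robust estimation machinery of the actual method; the point set and the injected outlier are invented for illustration.

```python
import numpy as np

def rigidity_inliers(P1, P2, tol=1e-2):
    """Score candidate correspondences by the local-rigidity constraint.

    P1, P2: (N, 3) arrays of 3D points for the same candidate matches at
    two different times. For a truly rigid object, every pairwise distance
    is preserved. Returns, per point, the fraction of partner points whose
    distance to it is preserved within `tol`.
    """
    D1 = np.linalg.norm(P1[:, None, :] - P1[None, :, :], axis=-1)
    D2 = np.linalg.norm(P2[:, None, :] - P2[None, :, :], axis=-1)
    consistent = np.abs(D1 - D2) < tol
    n = len(P1)
    return (consistent.sum(axis=1) - 1) / (n - 1)  # exclude self-distance

# Rigidly rotate a point set, then corrupt one correspondence.
rng = np.random.default_rng(0)
P1 = rng.normal(size=(6, 3))
theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0, 0, 1]])
P2 = P1 @ R.T
P2[0] += 0.5  # outlier: this match violates rigidity
scores = rigidity_inliers(P1, P2)
print(scores)
```

Low-scoring matches (here, point 0) would be rejected as outliers before dense geometry is computed.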

    Multiple View Geometry For Video Analysis And Post-production

    Multiple view geometry is the foundation of an important class of computer vision techniques for simultaneous recovery of camera motion and scene structure from a set of images. There are numerous important applications in this area. Examples include video post-production, scene reconstruction, registration, surveillance, tracking, and segmentation. In video post-production, which is the topic being addressed in this dissertation, computer analysis of the motion of the camera can replace the currently used manual methods for correctly aligning an artificially inserted object in a scene. However, existing single view methods typically require multiple vanishing points, and therefore would fail when only one vanishing point is available. In addition, current multiple view techniques, making use of either epipolar geometry or trifocal tensor, do not exploit fully the properties of constant or known camera motion. Finally, there does not exist a general solution to the problem of synchronization of N video sequences of distinct general scenes captured by cameras undergoing similar ego-motions, which is the necessary step for video post-production among different input videos. This dissertation proposes several advancements that overcome these limitations. These advancements are used to develop an efficient framework for video analysis and post-production in multiple cameras. In the first part of the dissertation, the novel inter-image constraints are introduced that are particularly useful for scenes where minimal information is available. This result extends the current state-of-the-art in single view geometry techniques to situations where only one vanishing point is available. The property of constant or known camera motion is also described in this dissertation for applications such as calibration of a network of cameras in video surveillance systems, and Euclidean reconstruction from turn-table image sequences in the presence of zoom and focus. 
We then propose a new framework for the estimation and alignment of camera motions, including both simple (panning, tracking and zooming) and complex (e.g. hand-held) camera motions. Accuracy of these results is demonstrated by applying our approach to video post-production applications such as video cut-and-paste and shadow synthesis. As realistic image-based rendering problems, these applications require extreme accuracy in the estimation of camera geometry, the position and the orientation of the light source, and the photometric properties of the resulting cast shadows. In each case, the theoretical results are fully supported and illustrated by both numerical simulations and thorough experimentation on real data.
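For context on the single-vanishing-point setting, a vanishing point can be recovered as the least-squares intersection of the image lines of parallel 3D edges. This is a standard construction shown only for illustration; the segments below are invented, and the dissertation's constraints go beyond this.

```python
import numpy as np

def vanishing_point(segments):
    """Least-squares intersection of image line segments.

    segments: list of ((x1, y1), (x2, y2)). Each segment defines a
    homogeneous line l = p1 x p2; the vanishing point v minimizes
    sum (l_i . v)^2 subject to |v| = 1, i.e. it is the smallest
    singular vector of the stacked line matrix.
    """
    lines = []
    for (x1, y1), (x2, y2) in segments:
        l = np.cross([x1, y1, 1.0], [x2, y2, 1.0])
        lines.append(l / np.linalg.norm(l))
    _, _, Vt = np.linalg.svd(np.array(lines))
    v = Vt[-1]
    return v[:2] / v[2]  # inhomogeneous image coordinates

# Three image lines through (100, 50), as projections of parallel 3D lines.
segs = [((0, 0), (100, 50)), ((0, 25), (200, 75)), ((50, 0), (150, 100))]
vp = vanishing_point(segs)
print(vp)
```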

    Computational Multimedia for Video Self Modeling

    Video self modeling (VSM) is a behavioral intervention technique in which a learner models a target behavior by watching a video of oneself. This is the idea behind the psychological theory of self-efficacy: you can learn to perform certain tasks because you see yourself doing them, which provides the most ideal form of behavior modeling. The effectiveness of VSM has been demonstrated for many different types of disabilities and behavioral problems, ranging from stuttering, inappropriate social behaviors, autism, and selective mutism to sports training. However, there is an inherent difficulty associated with the production of VSM material. Prolonged and persistent video recording is required to capture the rare, if not altogether nonexistent, snippets that can be strung together to form novel video sequences of the target skill. To solve this problem, in this dissertation, we use computational multimedia techniques to facilitate the creation of synthetic visual content for self-modeling that can be used by a learner and his/her therapist with a minimum amount of training data. There are three major technical contributions in my research. First, I developed an Adaptive Video Re-sampling algorithm to synthesize realistic lip-synchronized video with minimal motion jitter. Second, to denoise and complete the depth map captured by structured-light sensing systems, I introduced a layer-based probabilistic model to account for various types of uncertainties in the depth measurement. Third, I developed a simple and robust bundle-adjustment-based framework for calibrating a network of multiple wide-baseline RGB and depth cameras.
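As a crude stand-in for the layer-based probabilistic depth-completion model, the sketch below fills invalid (zero) depth pixels with the median of their valid neighbors. The real model reasons about measurement uncertainties rather than applying a fixed local filter; everything here is a hypothetical simplification.

```python
import numpy as np

def fill_depth_holes(depth, k=1):
    """Fill invalid (zero) depth pixels with the median of valid neighbors.

    Crude single-pass hole filling over a (2k+1)x(2k+1) window, standing
    in for a probabilistic model of structured-light depth uncertainty.
    """
    out = depth.copy()
    for i, j in zip(*np.where(depth == 0)):
        patch = depth[max(0, i - k):i + k + 1, max(0, j - k):j + k + 1]
        valid = patch[patch > 0]
        if valid.size:
            out[i, j] = np.median(valid)
    return out

# One dropped measurement (the zero) surrounded by valid depths.
d = np.array([[1.0, 1.0, 1.1],
              [1.0, 0.0, 1.2],
              [0.9, 1.1, 1.2]])
filled = fill_depth_holes(d)
print(filled)
```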

    OpenPTrack: Open Source Multi-Camera Calibration and People Tracking for RGB-D Camera Networks

    OpenPTrack is open source software for multi-camera calibration and people tracking in RGB-D camera networks. It can track people in large volumes at sensor frame rate and currently supports a heterogeneous set of 3D sensors. In this work, we describe its user-friendly calibration procedure, which consists of simple steps with real-time feedback that yield accurate estimates of the camera poses that are then used for tracking people. On top of a calibration based on moving a checkerboard within the tracking space and on a global optimization of camera and checkerboard poses, a novel procedure which aligns people detections coming from all sensors in an x-y-time space is used for refining camera poses. While people detection is executed locally, on the machines connected to each sensor, tracking is performed by a single node which takes into account detections from all over the network. Here we detail how a cascade of algorithms working on depth point clouds and color, infrared and disparity images is used to perform people detection from different types of sensors and in any indoor lighting condition. We present experiments showing that a considerable improvement can be obtained with the proposed calibration refinement procedure that exploits people detections, and we compare Kinect v1, Kinect v2 and Mesa SR4500 performance for people tracking applications. OpenPTrack is based on the Robot Operating System and the Point Cloud Library and has already been adopted in networks composed of up to ten imagers for interactive arts, education, culture and human-robot interaction applications.
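The idea of refining camera poses by aligning people detections in x-y-time can be sketched as follows: match detections from two sensors by timestamp and use the mean positional residual as a pose correction. This is a hypothetical single-translation simplification; OpenPTrack's actual refinement is a global optimization over all sensors, and all data below is invented.

```python
import numpy as np

def refine_offset(det_a, det_b, max_dt=0.1):
    """Estimate a translation correction between two sensors' detections.

    det_a, det_b: (N, 3) arrays of (x, y, t) people detections in the
    common tracking frame. Detections are matched when their timestamps
    differ by less than `max_dt`; the mean positional residual of the
    matches is the correction to apply to sensor B's pose.
    """
    residuals = []
    for xa, ya, ta in det_a:
        dt = np.abs(det_b[:, 2] - ta)
        j = int(np.argmin(dt))
        if dt[j] < max_dt:
            residuals.append([xa - det_b[j, 0], ya - det_b[j, 1]])
    return np.mean(residuals, axis=0)

# Sensor B reports the same walking track with a constant calibration error.
t = np.arange(0, 1, 0.1)
track = np.c_[t * 2.0, t * 0.5, t]            # person walking
det_b = track - np.array([0.2, -0.1, 0.0])    # miscalibrated sensor B
offset = refine_offset(track, det_b)
print(offset)
```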

    Computational Methods for Cognitive and Cooperative Robotics

    In recent decades, design methods in control engineering have made substantial progress in the areas of robotics and computer animation. Nowadays these methods incorporate the newest developments in machine learning and artificial intelligence. But the problem of flexible, online-adaptive combination of motor behaviors remains challenging for human-like animation and for humanoid robotics. In this context, biologically motivated methods for the analysis and re-synthesis of human motor programs provide new insights into, and models for, anticipatory motion synthesis. This thesis presents the author’s achievements in the areas of cognitive and developmental robotics, cooperative and humanoid robotics, and intelligent and machine learning methods in computer graphics. The first part of the thesis, the chapter “Goal-directed Imitation for Robots”, considers imitation learning in cognitive and developmental robotics. The work presented here details the author’s progress in the development of hierarchical motion recognition and planning inspired by recent discoveries of the functions of mirror-neuron cortical circuits in primates. The overall architecture is capable of ‘learning for imitation’ and ‘learning by imitation’. The complete system includes a low-level, real-time-capable path planning subsystem for obstacle avoidance during arm reaching. The learning-based path planning subsystem is universal for all types of anthropomorphic robot arms, and is capable of knowledge transfer at the level of individual motor acts. Next, the problems of learning and synthesis of motor synergies, the spatial and spatio-temporal combinations of motor features in sequential multi-action behavior, and the problems of task-related action transitions are considered in the second part of the thesis, “Kinematic Motion Synthesis for Computer Graphics and Robotics”.
In this part, a new approach to modeling complex full-body human actions by mixtures of time-shift-invariant motor primitives is presented. The online-capable full-body motion generation architecture, based on dynamic movement primitives driving the time-shift-invariant motor synergies, was implemented as an online-reactive adaptive motion synthesis for computer graphics and robotics applications. The last chapter of the thesis, entitled “Contraction Theory and Self-organized Scenarios in Computer Graphics and Robotics”, is dedicated to optimal control strategies in multi-agent scenarios of large crowds of agents expressing highly nonlinear behaviors. This last part presents new mathematical tools for the stability analysis and synthesis of multi-agent cooperative scenarios.
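Dynamic movement primitives follow a standard spring-damper "transformation system" form. The sketch below integrates a textbook 1-D DMP with zero forcing, so the trajectory simply converges to the goal; the gains and canonical-system decay are conventional defaults, not the thesis's synergy-driven variant.

```python
import numpy as np

def dmp_rollout(y0, goal, forcing, tau=1.0, dt=0.01, alpha=25.0):
    """Integrate a 1-D dynamic movement primitive (transformation system).

    tau * ydd = alpha * (beta * (goal - y) - yd) + f(s), with the usual
    beta = alpha / 4 (critical damping) and a user-supplied forcing term
    f over the phase variable s of the canonical system.
    """
    beta = alpha / 4.0
    y, yd = y0, 0.0
    s = 1.0
    alpha_s = 4.0  # canonical system decay rate
    traj = [y]
    for _ in range(int(1.0 / dt)):
        ydd = (alpha * (beta * (goal - y) - yd) + forcing(s)) / tau
        yd += ydd * dt / tau
        y += yd * dt / tau
        s += -alpha_s * s * dt / tau
        traj.append(y)
    return np.array(traj)

# With zero forcing the DMP converges smoothly from 0 toward the goal 1.
traj = dmp_rollout(0.0, 1.0, forcing=lambda s: 0.0)
print(traj[0], traj[-1])
```

A learned forcing term shapes the path between start and goal while the spring-damper term guarantees convergence.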

    A model-based design flow for embedded vision applications on heterogeneous architectures

    The ability to gather information from images comes naturally to humans and is one of our principal means of understanding the external world. Computer vision (CV) is the process of extracting such knowledge from the visual domain in an algorithmic fashion. The computational power required to process this information is very high. Until recently, the only feasible way to meet non-functional requirements like performance was to develop custom hardware, which is costly, time-consuming, and cannot be reused for general purposes. The recent introduction of low-power and low-cost heterogeneous embedded boards, in which CPUs are combined with heterogeneous accelerators like GPUs, DSPs and FPGAs, can combine the hardware efficiency needed for non-functional requirements with the flexibility of software development. Embedded vision is the term used to identify the application of the aforementioned CV algorithms in the embedded field, which usually requires satisfying not only functional requirements but also non-functional requirements such as real-time performance, power, and energy efficiency. Rapid prototyping, early algorithm parametrization, testing, and validation of complex embedded video applications for such heterogeneous architectures is a very challenging task. This thesis presents a comprehensive framework that: 1) Is based on a model-based paradigm. Differently from the standard approaches at the state of the art, which require designers to manually model the algorithm in a programming language, the proposed approach allows for rapid prototyping, algorithm validation and parametrization in a model-based design environment (i.e., Matlab/Simulink). The framework relies on a multi-level design and verification flow by which the high-level model is semi-automatically refined towards the final automatic synthesis into the target hardware device. 2) Relies on a polyglot parallel programming model.
The proposed model combines different programming languages and environments such as C/C++, OpenMP, PThreads, OpenVX, OpenCV, and CUDA to best exploit different levels of parallelism while guaranteeing a semi-automatic customization. 3) Optimizes the application performance and energy efficiency through a novel algorithm for mapping and scheduling the application tasks on the heterogeneous computing elements of the device. This algorithm, called exclusive earliest finish time (XEFT), takes into consideration the possible multiple implementations of tasks on different computing elements (e.g., a task primitive for CPU and an equivalent parallel implementation for GPU). It introduces and takes advantage of the notion of exclusive overlap between primitives to improve load balancing. This thesis is the result of three years of research activity, during which all the incremental steps made to compose the framework have been tested on real case studies.
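The idea of finish-time-based mapping with multiple task implementations can be sketched with a plain earliest-finish-time heuristic over independent tasks. XEFT itself additionally models task dependencies and the exclusive-overlap notion, which this toy omits; the device names and costs are invented.

```python
def eft_schedule(tasks):
    """Greedy earliest-finish-time mapping of independent tasks.

    Each task is a dict {device_name: run_time} listing its alternative
    implementations (e.g. a CPU primitive and an equivalent GPU kernel).
    Each task goes to whichever device finishes it first.
    """
    free = {}  # device -> time it becomes available
    assignment = []
    for impls in tasks:
        dev, finish = min(
            ((d, free.get(d, 0.0) + cost) for d, cost in impls.items()),
            key=lambda p: p[1])
        free[dev] = finish
        assignment.append((dev, finish))
    return assignment, max(free.values())

# Three tasks, each with a CPU and a GPU implementation (times in ms).
tasks = [{"cpu": 4.0, "gpu": 1.0},
         {"cpu": 2.0, "gpu": 3.0},
         {"cpu": 5.0, "gpu": 2.0}]
plan, makespan = eft_schedule(tasks)
print(plan, makespan)
```

Note how the second task lands on the CPU even though its GPU time is not the worst option in isolation: the GPU is still busy with the first task.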

    TScan: Stationary LiDAR for Traffic and Safety Studies—Object Detection and Tracking

    The ability to accurately measure and cost-effectively collect traffic data at road intersections is needed to improve their safety and operations. This study investigates the feasibility of using laser ranging technology (LiDAR) for this purpose. The proposed technology does not experience some of the problems of the current video-based technology, but less expensive low-end sensors have a limited density of measurement points, which may bring new challenges. A novel LiDAR-based portable traffic scanner (TScan) is introduced in this report to detect and track various types of road users (e.g., trucks, cars, pedestrians, and bicycles). The scope of this study included the development of a signal processing algorithm and a user interface, their implementation on a TScan research unit, and evaluation of the unit’s performance to confirm its practicality for safety and traffic engineering applications. The TScan research unit was developed by integrating a Velodyne HDL-64E laser scanner within the existing Purdue University Mobile Traffic Laboratory, which has a telescoping mast, video cameras, a computer, and an internal communications network. The low-end LiDAR sensor’s limited resolution of data points was further reduced by distance, by light-beam absorption on dark objects, and by reflection away from the sensor on oblique surfaces. The motion of the LiDAR sensor at the top of the mast caused by wind and passing vehicles was accounted for with readings from an inertial sensor mounted atop the LiDAR. These challenges increased the need for an effective signal processing method to extract the maximum useful information. The developed TScan method identifies and extracts the background with a method applied in both spherical and orthogonal coordinates. Moving objects are detected by clustering; the data points are then tracked, first as clusters and then as rectangles fit to these clusters.
After tracking, the individual moving objects are classified into categories such as heavy and non-heavy vehicles, bicycles, and pedestrians. The resulting trajectories of the moving objects are stored for future processing with engineering applications. The developed signal-processing algorithm is supplemented with a convenient user interface for setting up and running the system and for inspecting the results during and after data collection. In addition, one engineering application was developed in this study for counting moving objects at intersections. Another existing application, the Surrogate Safety Analysis Model (SSAM), was interfaced with the TScan method to allow extracting traffic conflicts and collisions from the TScan results. A user manual was also developed to explain the operation of the system and the use of the two engineering applications. Experimentation with the computational load and execution speed of the algorithm implemented on the MATLAB platform indicated that the use of a standard GPU for processing would permit real-time running of the algorithms during data collection. Thus, the post-processing phase of this method is less time-consuming and more practical. TScan performance was evaluated by comparison with the best available method: video frame-by-frame analysis by human observers. The comparison included counting moving objects; estimating the positions of the objects, their speed, and direction of travel; and counting interactions between moving objects. The evaluation indicated that the benchmark method measured vehicle positions and speeds at an accuracy comparable to TScan's performance. It was concluded that TScan's performance is sufficient for measuring traffic volumes, speeds, classifications, and traffic conflicts.
The traffic interactions extracted by SSAM required automatic post-processing to eliminate vehicle interactions at very low speeds and interactions between pedestrians, events that could not be recognized by SSAM. It should be stressed that this post-processing does not require human involvement. Nighttime conditions, light rain, and fog did not reduce the quality of the results. Several improvements of this new method are recommended and discussed in this report. The recommendations include implementing two TScan units at large intersections and adding the ability to collect traffic signal indications during data collection.
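The cluster-then-track pipeline can be illustrated with a minimal Euclidean clustering pass over foreground points. TScan's actual method works on dense LiDAR point clouds after background extraction and fits rectangles to the clusters; the points, radius, and minimum cluster size below are invented for illustration.

```python
import numpy as np

def euclidean_clusters(points, radius=1.0, min_size=3):
    """Group foreground LiDAR returns by distance.

    Simple single-linkage flood fill: points closer than `radius` end up
    in the same cluster; clusters smaller than `min_size` are dropped as
    noise. A toy stand-in for a cluster-then-fit tracking pipeline.
    """
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue, members = [seed], [seed]
        while queue:
            i = queue.pop()
            d = np.linalg.norm(points - points[i], axis=1)
            near = [j for j in list(unvisited) if d[j] < radius]
            for j in near:
                unvisited.remove(j)
            queue.extend(near)
            members.extend(near)
        if len(members) >= min_size:
            clusters.append(sorted(members))
    return clusters

# Two vehicles' returns plus one isolated noise point.
pts = np.array([[0, 0], [0.5, 0.2], [0.3, 0.6],
                [10, 10], [10.4, 10.1], [10.2, 10.5],
                [50, 50]], dtype=float)
clusters = euclidean_clusters(pts)
print(clusters)
```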
