4,949 research outputs found

    Human Motion Trajectory Prediction: A Survey

    Full text link
    With growing numbers of intelligent autonomous systems in human environments, the ability of such systems to perceive, understand and anticipate human behavior becomes increasingly important. Specifically, predicting future positions of dynamic agents and planning considering such predictions are key tasks for self-driving vehicles, service robots and advanced surveillance systems. This paper provides a survey of human motion trajectory prediction. We review, analyze and structure a large selection of work from different communities and propose a taxonomy that categorizes existing methods based on the motion modeling approach and level of contextual information used. We provide an overview of the existing datasets and performance metrics. We discuss limitations of the state of the art and outline directions for further research.Comment: Submitted to the International Journal of Robotics Research (IJRR), 37 page

    Bayesian Inference of Recursive Sequences of Group Activities from Tracks

    Full text link
    We present a probabilistic generative model for inferring a description of coordinated, recursively structured group activities at multiple levels of temporal granularity based on observations of individuals' trajectories. The model accommodates: (1) hierarchically structured groups, (2) activities that are temporally and compositionally recursive, (3) component roles assigning different subactivity dynamics to subgroups of participants, and (4) a nonparametric Gaussian Process model of trajectories. We present an MCMC sampling framework for performing joint inference over recursive activity descriptions and assignment of trajectories to groups, integrating out continuous parameters. We demonstrate the model's expressive power in several simulated and complex real-world scenarios from the VIRAT and UCLA Aerial Event video data sets.Comment: 10 pages, 6 figures, in Proceedings of the 30th AAAI Conference on Artificial Intelligence (AAAI'16), Phoenix, AZ, 201

    Identifying manifolds underlying group motion in Vicsek agents

    Full text link
    Collective motion of animal groups often undergoes changes due to perturbations. In a topological sense, we describe these changes as switching between low-dimensional embedding manifolds underlying a group of evolving agents. To characterize such manifolds, first we introduce a simple mapping of agents between time-steps. Then, we construct a novel metric which is susceptible to variations in the collective motion, thus revealing distinct underlying manifolds. The method is validated through three sample scenarios simulated using a Vicsek model, namely switching of speed, coordination, and structure of a group. Combined with a dimensionality reduction technique that is used to infer the dimensionality of the embedding manifold, this approach provides an effective model-free framework for the analysis of collective behavior across animal species.Comment: 12 pages, 6 figures, journal articl

    SEGMENTATION, RECOGNITION, AND ALIGNMENT OF COLLABORATIVE GROUP MOTION

    Get PDF
    Modeling and recognition of human motion in videos has broad applications in behavioral biometrics, content-based visual data analysis, security and surveillance, as well as designing interactive environments. Significant progress has been made in the past two decades by way of new models, methods, and implementations. In this dissertation, we focus our attention on a relatively less investigated sub-area called collaborative group motion analysis. Collaborative group motions are those that typically involve multiple objects, wherein the motion patterns of individual objects may vary significantly in both space and time, but the collective motion pattern of the ensemble allows characterization in terms of geometry and statistics. Therefore, the motions or activities of an individual object constitute local information. A framework to synthesize all local information into a holistic view, and to explicitly characterize interactions among objects, involves large scale global reasoning, and is of significant complexity. In this dissertation, we first review relevant previous contributions on human motion/activity modeling and recognition, and then propose several approaches to answer a sequence of traditional vision questions including 1) which of the motion elements among all are the ones relevant to a group motion pattern of interest (Segmentation); 2) what is the underlying motion pattern (Recognition); and 3) how two motion ensembles are similar and how we can 'optimally' transform one to match the other (Alignment). Our primary practical scenario is American football play, where the corresponding problems are 1) who are offensive players; 2) what are the offensive strategy they are using; and 3) whether two plays are using the same strategy and how we can remove the spatio-temporal misalignment between them due to internal or external factors. The proposed approaches discard traditional modeling paradigm but explore either concise descriptors, hierarchies, stochastic mechanism, or compact generative model to achieve both effectiveness and efficiency. In particular, the intrinsic geometry of the spaces of the involved features/descriptors/quantities is exploited and statistical tools are established on these nonlinear manifolds. These initial attempts have identified new challenging problems in complex motion analysis, as well as in more general tasks in video dynamics. The insights gained from nonlinear geometric modeling and analysis in this dissertation may hopefully be useful toward a broader class of computer vision applications

    Point Cloud Processing for Environmental Analysis in Autonomous Driving using Deep Learning

    Get PDF
    Autonomous self-driving cars need a very precise perception system of their environment, working for every conceivable scenario. Therefore, different kinds of sensor types, such as lidar scanners, are in use. This thesis contributes highly efficient algorithms for 3D object recognition to the scientific community. It provides a Deep Neural Network with specific layers and a novel loss to safely localize and estimate the orientation of objects from point clouds originating from lidar sensors. First, a single-shot 3D object detector is developed that outputs dense predictions in only one forward pass. Next, this detector is refined by fusing complementary semantic features from cameras and joint probabilistic tracking to stabilize predictions and filter outliers. The last part presents an evaluation of data from automotive-grade lidar scanners. A Generative Adversarial Network is also being developed as an alternative for target-specific artificial data generation.One of the main objectives of leading automotive companies is autonomous self-driving cars. They need a very precise perception system of their environment, working for every conceivable scenario. Therefore, different kinds of sensor types are in use. Besides cameras, lidar scanners became very important. The development in that field is significant for future applications and system integration because lidar offers a more accurate depth representation, independent from environmental illumination. Especially algorithms and machine learning approaches, including Deep Learning and Artificial Intelligence based on raw laser scanner data, are very important due to the long range and three-dimensional resolution of the measured point clouds. Consequently, a broad field of research with many challenges and unsolved tasks has been established. This thesis aims to address this deficit and contribute highly efficient algorithms for 3D object recognition to the scientific community. It provides a Deep Neural Network with specific layers and a novel loss to safely localize and estimate the orientation of objects from point clouds. First, a single shot 3D object detector is developed that outputs dense predictions in only one forward pass. Next, this detector is refined by fusing complementary semantic features from cameras and a joint probabilistic tracking to stabilize predictions and filter outliers. In the last part, a concept for deployment into an existing test vehicle focuses on the semi-automated generation of a suitable dataset. Subsequently, an evaluation of data from automotive-grade lidar scanners is presented. A Generative Adversarial Network is also being developed as an alternative for target-specific artificial data generation. Experiments on the acquired application-specific and benchmark datasets show that the presented methods compete with a variety of state-of-the-art algorithms while being trimmed down to efficiency for use in self-driving cars. Furthermore, they include an extensive set of standard evaluation metrics and results to form a solid baseline for future research.Eines der Hauptziele führender Automobilhersteller sind autonome Fahrzeuge. Sie benötigen ein sehr präzises System für die Wahrnehmung der Umgebung, dass für jedes denkbare Szenario überall auf der Welt funktioniert. Daher sind verschiedene Arten von Sensoren im Einsatz, sodass neben Kameras u. a. auch Lidar Sensoren ein wichtiger Bestandteil sind. Die Entwicklung auf diesem Gebiet ist für künftige Anwendungen von höchster Bedeutung, da Lidare eine genauere, von der Umgebungsbeleuchtung unabhängige, Tiefendarstellung bieten. Insbesondere Algorithmen und maschinelle Lernansätze wie Deep Learning, die Rohdaten über Lernzprozesse direkt verarbeiten können, sind aufgrund der großen Reichweite und der dreidimensionalen Auflösung der gemessenen Punktwolken sehr wichtig. Somit hat sich ein weites Forschungsfeld mit vielen Herausforderungen und ungelösten Problemen etabliert. Diese Arbeit zielt darauf ab, dieses Defizit zu verringern und effiziente Algorithmen zur 3D-Objekterkennung zu entwickeln. Sie stellt ein tiefes Neuronales Netzwerk mit spezifischen Schichten und einer neuartigen Fehlerfunktion zur sicheren Lokalisierung und Schätzung der Orientierung von Objekten aus Punktwolken bereit. Zunächst wird ein 3D-Detektor entwickelt, der in nur einem Vorwärtsdurchlauf aus einer Punktwolke alle Objekte detektiert. Anschließend wird dieser Detektor durch die Fusion von komplementären semantischen Merkmalen aus Kamerabildern und einem gemeinsamen probabilistischen Tracking verfeinert, um die Detektionen zu stabilisieren und Ausreißer zu filtern. Im letzten Teil wird ein Konzept für den Einsatz in einem bestehenden Testfahrzeug vorgestellt, das sich auf die halbautomatische Generierung eines geeigneten Datensatzes konzentriert. Hierbei wird eine Auswertung auf Daten von Automotive-Lidaren vorgestellt. Als Alternative zur zielgerichteten künstlichen Datengenerierung wird ein weiteres generatives Neuronales Netzwerk untersucht. Experimente mit den erzeugten anwendungsspezifischen- und Benchmark-Datensätzen zeigen, dass sich die vorgestellten Methoden mit dem Stand der Technik messen können und gleichzeitig auf Effizienz für den Einsatz in selbstfahrenden Autos optimiert sind. Darüber hinaus enthalten sie einen umfangreichen Satz an Evaluierungsmetriken und -ergebnissen, die eine solide Grundlage für die zukünftige Forschung bilden

    Visuo-Haptic Grasping of Unknown Objects through Exploration and Learning on Humanoid Robots

    Get PDF
    Die vorliegende Arbeit befasst sich mit dem Greifen unbekannter Objekte durch humanoide Roboter. Dazu werden visuelle Informationen mit haptischer Exploration kombiniert, um Greifhypothesen zu erzeugen. Basierend auf simulierten Trainingsdaten wird außerdem eine Greifmetrik gelernt, welche die Erfolgswahrscheinlichkeit der Greifhypothesen bewertet und die mit der größten geschätzten Erfolgswahrscheinlichkeit auswählt. Diese wird verwendet, um Objekte mit Hilfe einer reaktiven Kontrollstrategie zu greifen. Die zwei Kernbeiträge der Arbeit sind zum einen die haptische Exploration von unbekannten Objekten und zum anderen das Greifen von unbekannten Objekten mit Hilfe einer neuartigen datengetriebenen Greifmetrik

    Fourteenth Biennial Status Report: März 2017 - February 2019

    No full text

    Training Physics-based Controllers for Articulated Characters with Deep Reinforcement Learning

    Get PDF
    In this thesis, two different applications are discussed for using machine learning techniques to train coordinated motion controllers in arbitrary characters in absence of motion capture data. The methods highlight the resourcefulness of physical simulations to generate synthetic and generic motion data that can be used to learn various targeted skills. First, we present an unsupervised method for learning loco-motion skills in virtual characters from a low dimensional latent space which captures the coordination between multiple joints. We use a technique called motor babble, wherein a character interacts with its environment by actuating its joints through uncoordinated, low-level (motor) excitation, resulting in a corpus of motion data from which a manifold latent space can be extracted. Using reinforcement learning, we then train the character to learn locomotion (such as walking or running) in the low-dimensional latent space instead of the full-dimensional joint action space. The thesis also presents an end-to-end automated framework for training physics-based characters to rhythmically dance to user-input songs. A generative adversarial network (GAN) architecture is proposed that learns to generate physically stable dance moves through repeated interactions with the environment. These moves are then used to construct a dance network that can be used for choreography. Using DRL, the character is then trained to perform these moves, without losing balance and rhythm, in the presence of physical forces such as gravity and friction

    Detection of bimanual gestures everywhere: why it matters, what we need and what is missing

    Full text link
    Bimanual gestures are of the utmost importance for the study of motor coordination in humans and in everyday activities. A reliable detection of bimanual gestures in unconstrained environments is fundamental for their clinical study and to assess common activities of daily living. This paper investigates techniques for a reliable, unconstrained detection and classification of bimanual gestures. It assumes the availability of inertial data originating from the two hands/arms, builds upon a previously developed technique for gesture modelling based on Gaussian Mixture Modelling (GMM) and Gaussian Mixture Regression (GMR), and compares different modelling and classification techniques, which are based on a number of assumptions inspired by literature about how bimanual gestures are represented and modelled in the brain. Experiments show results related to 5 everyday bimanual activities, which have been selected on the basis of three main parameters: (not) constraining the two hands by a physical tool, (not) requiring a specific sequence of single-hand gestures, being recursive (or not). In the best performing combination of modeling approach and classification technique, five out of five activities are recognized up to an accuracy of 97%, a precision of 82% and a level of recall of 100%.Comment: Submitted to Robotics and Autonomous Systems (Elsevier
    corecore