1,395 research outputs found
"Let’s get ready to bundle!":Crowd-level Human Keypoint Tracking
This work examines the suitability of a state-of-the-art human pose trackingmethod for application within surveillance scenarios and focuses on publicplaces in urban areas that tend to suffer from crowdedness, such as city centers.Starting with a short introduction to motivate keypoint tracking in surveillanceapplications, this report will present details about the adapted method, whichfollows an LSTM-based approach. Afterwards, different changes that had tobe incorporated in order to successfully apply the given method to our targetsetting will be presented. Finally, various experiments will show how the chosenmethod performs, based on experiments with simulated data
Human Motion Trajectory Prediction: A Survey
With growing numbers of intelligent autonomous systems in human environments,
the ability of such systems to perceive, understand and anticipate human
behavior becomes increasingly important. Specifically, predicting future
positions of dynamic agents and planning considering such predictions are key
tasks for self-driving vehicles, service robots and advanced surveillance
systems. This paper provides a survey of human motion trajectory prediction. We
review, analyze and structure a large selection of work from different
communities and propose a taxonomy that categorizes existing methods based on
the motion modeling approach and level of contextual information used. We
provide an overview of the existing datasets and performance metrics. We
discuss limitations of the state of the art and outline directions for further
research.Comment: Submitted to the International Journal of Robotics Research (IJRR),
37 page
Single camera pose estimation using Bayesian filtering and Kinect motion priors
Traditional approaches to upper body pose estimation using monocular vision
rely on complex body models and a large variety of geometric constraints. We
argue that this is not ideal and somewhat inelegant as it results in large
processing burdens, and instead attempt to incorporate these constraints
through priors obtained directly from training data. A prior distribution
covering the probability of a human pose occurring is used to incorporate
likely human poses. This distribution is obtained offline, by fitting a
Gaussian mixture model to a large dataset of recorded human body poses, tracked
using a Kinect sensor. We combine this prior information with a random walk
transition model to obtain an upper body model, suitable for use within a
recursive Bayesian filtering framework. Our model can be viewed as a mixture of
discrete Ornstein-Uhlenbeck processes, in that states behave as random walks,
but drift towards a set of typically observed poses. This model is combined
with measurements of the human head and hand positions, using recursive
Bayesian estimation to incorporate temporal information. Measurements are
obtained using face detection and a simple skin colour hand detector, trained
using the detected face. The suggested model is designed with analytical
tractability in mind and we show that the pose tracking can be
Rao-Blackwellised using the mixture Kalman filter, allowing for computational
efficiency while still incorporating bio-mechanical properties of the upper
body. In addition, the use of the proposed upper body model allows reliable
three-dimensional pose estimates to be obtained indirectly for a number of
joints that are often difficult to detect using traditional object recognition
strategies. Comparisons with Kinect sensor results and the state of the art in
2D pose estimation highlight the efficacy of the proposed approach.Comment: 25 pages, Technical report, related to Burke and Lasenby, AMDO 2014
conference paper. Code sample: https://github.com/mgb45/SignerBodyPose Video:
https://www.youtube.com/watch?v=dJMTSo7-uF
Probabilistic Parametric Curves for Sequence Modeling
This work proposes a probabilistic extension to BĂ©zier curves as a basis for effectively modeling stochastic processes with a bounded index set. The proposed stochastic process model is based on Mixture Density Networks and BĂ©zier curves with Gaussian random variables as control points. A key advantage of this model is given by the ability to generate multi-mode predictions in a single inference step, thus avoiding the need for Monte Carlo simulation
Probabilistic Parametric Curves for Sequence Modeling
Repräsentationen sequenzieller Daten basieren in der Regel auf der Annahme, dass beobachtete Sequenzen Realisierungen eines unbekannten zugrundeliegenden stochastischen Prozesses sind. Die Bestimmung einer solchen Repräsentation wird üblicherweise als Lernproblem ausgelegt und ergibt ein Sequenzmodell. Das Modell muss in diesem Zusammenhang in der Lage sein, die multimodale Natur der Daten zu erfassen, ohne einzelne Modi zu vermischen. Zur Modellierung eines zugrundeliegenden stochastischen Prozesses lernen häufig verwendete, auf neuronalen Netzen basierende Ansätze entweder eine Wahrscheinlichkeitsverteilung zu parametrisieren oder eine implizite Repräsentation unter Verwendung stochastischer Eingaben oder Neuronen. Dabei integrieren diese Modelle in der Regel Monte Carlo Verfahren oder andere Näherungslösungen, um die Parameterschätzung und probabilistische Inferenz zu ermöglichen. Dies gilt sogar für regressionsbasierte Ansätze basierend auf Mixture Density Netzwerken, welche ebenso Monte Carlo Simulationen zur multi-modalen Inferenz benötigen. Daraus ergibt sich eine Forschungslücke für vollständig regressionsbasierte Ansätze zur Parameterschätzung und probabilistischen Inferenz.
Infolgedessen stellt die vorliegende Arbeit eine probabilistische Erweiterung für Bézierkurven (-Kurven) als Basis für die Modellierung zeitkontinuierlicher stochastischer Prozesse mit beschränkter Indexmenge vor. Das vorgestellte Modell, bezeichnet als -Kurven - Modell, basiert auf Mixture Density Netzwerken (MDN) und Bézierkurven, welche Kurvenkontrollpunkte als normalverteilt annehmen. Die Verwendung eines MDN-basierten Ansatzes steht im Einklang mit aktuellen Versuchen, Unsicherheitsschätzung als Regressionsproblem auszulegen, und ergibt ein generisches Modell, welches allgemein als Basismodell für die probabilistische Sequenzmodellierung einsetzbar ist. Ein wesentlicher Vorteil des Modells ist unter anderem die Möglichkeit glatte, multi-modale Vorhersagen in einem einzigen Inferenzschritt zu generieren, ohne dabei Monte Carlo Simulationen zu benötigen. Durch die Verwendung von Bézierkurven als Basis, kann das Modell außerdem theoretisch für beliebig hohe Datendimensionen verwendet werden, indem die Kontrollpunkte in einen hochdimensionalen Raum eingebettet werden. Um die durch den Fokus auf beschränkte Indexmengen existierenden theoretischen Einschränkungen aufzuheben, wird zusätzlich eine konzeptionelle Erweiterung für das -Kurven - Modell vorgestellt, mit der unendliche stochastische Prozesse modelliert werden können. Wesentliche Eigenschaften des vorgestellten Modells und dessen Erweiterung werden auf verschiedenen Beispielen zur Sequenzsynthese gezeigt.
Aufgrund der hinreichenden Anwendbarkeit des -Kurven - Modells auf die meisten Anwendungsfälle, wird dessen Tauglichkeit umfangreich auf verschiedenen Mehrschrittprädiktionsaufgaben unter Verwendung realer Daten evaluiert. Zunächst wird das Modell gegen häufig verwendete probabilistische Sequenzmodelle im Kontext der Vorhersage von Fußgängertrajektorien evaluiert, wobei es sämtliche Vergleichsmodelle übertrifft. In einer qualitativen Auswertung wird das Verhalten des Modells in einem Vorhersagekontext untersucht. Außerdem werden Schwierigkeiten bei der Bewertung probabilistischer Sequenzmodelle in einem multimodalen Setting diskutiert. Darüber hinaus wird das Modell im Kontext der Vorhersage menschlicher Bewegungen angewendet, um die angestrebte Skalierbarkeit des Modells auf höherdimensionale Daten zu bewerten. Bei dieser Aufgabe übertrifft das Modell allgemein verwendete einfache und auf neuronalen Netzen basierende Grundmodelle und ist in verschiedenen Situationen auf Augenhöhe mit verschiedenen State-of-the-Art-Modellen, was die Einsetzbarkeit in diesem höherdimensionalen Beispiel zeigt. Des Weiteren werden Schwierigkeiten bei der Kovarianzschätzung und die Glättungseigenschaften des -Kurven - Modells diskutiert
From visuomotor control to latent space planning for robot manipulation
Deep visuomotor control is emerging as an active research area for robot manipulation. Recent advances in learning sensory and motor systems in an end-to-end manner have achieved remarkable performance across a range of complex tasks. Nevertheless, a few limitations restrict visuomotor control from being more widely adopted as the de facto choice when facing a manipulation task on a real robotic platform. First, imitation learning-based visuomotor control approaches tend to suffer from the inability to recover from an out-of-distribution state caused by compounding errors. Second, the lack of versatility in task definition limits skill generalisability. Finally, the training data acquisition process and domain transfer are often impractical. In this thesis, individual solutions are proposed to address each of these issues.
In the first part, we find policy uncertainty to be an effective indicator of potential failure cases, in which the robot is stuck in out-of-distribution states. On this basis, we introduce a novel uncertainty-based approach to detect potential failure cases and a recovery strategy based on action-conditioned uncertainty predictions. Then, we propose to employ visual dynamics approximation to our model architecture to capture the motion of the robot arm instead of the static scene background, making it possible to learn versatile skill primitives. In the second part, taking inspiration from the recent progress in latent space planning, we propose a gradient-based optimisation method operating within the latent space of a deep generative model for motion planning. Our approach bypasses the traditional computational challenges encountered by established planning algorithms, and has the capability to specify novel constraints easily and handle multiple constraints simultaneously. Moreover, the training data comes from simple random motor-babbling of kinematically feasible robot states. Our real-world experiments further illustrate that our latent space planning approach can handle both open and closed-loop planning in challenging environments such as heavily cluttered or dynamic scenes. This leads to the first, to our knowledge, closed-loop motion planning algorithm that can incorporate novel custom constraints, and lays the foundation for more complex manipulation tasks
Recommended from our members
Video content analysis for automated detection and tracking of humans in CCTV surveillance applications
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.The problems of achieving high detection rate with low false alarm rate for human detection and tracking in video sequence, performance scalability, and improving response time are addressed in this thesis. The underlying causes are the effect of scene complexity, human-to-human interactions, scale changes, and scene background-human interactions. A two-stage processing solution, namely, human detection, and human tracking with two novel pattern classifiers is presented. Scale independent human detection is achieved by processing in the wavelet domain using square wavelet features. These features used to characterise human silhouettes at different scales are similar to rectangular features used in [Viola 2001]. At the detection stage two detectors are combined to improve detection rate. The first detector is based on shape-outline of humans extracted from the scene using a reduced complexity outline extraction algorithm. A Shape mismatch measure is used to differentiate between the human and the background class. The second detector uses rectangular features as primitives for silhouette description in the wavelet domain. The marginal distribution of features collocated at a particular position on a candidate human (a patch of the image) is used to describe statistically the silhouette. Two similarity measures are computed between a candidate human and the model histograms of human and non human classes. The similarity measure is used to discriminate between the human and the non human class. At the tracking stage, a tracker based on joint probabilistic data association filter (JPDAF) for data association, and motion correspondence is presented. Track clustering is used to reduce hypothesis enumeration complexity. Towards improving response time with increase in frame dimension, scene complexity, and number of channels; a scalable algorithmic architecture and operating accuracy prediction technique is presented. A scheduling strategy for improving the response time and throughput by parallel processing is also presented
Dynamic Switching State Systems for Visual Tracking
This work addresses the problem of how to capture the dynamics of maneuvering objects for visual tracking. Towards this end, the perspective of recursive Bayesian filters and the perspective of deep learning approaches for state estimation are considered and their functional viewpoints are brought together
- …