138 research outputs found
Computational Methods for Cognitive and Cooperative Robotics
In recent decades, design methods in control engineering have made substantial progress in
the areas of robotics and computer animation. Nowadays these methods incorporate the
newest developments in machine learning and artificial intelligence. But the problems
of flexible and online-adaptive combinations of motor behaviors remain challenging for
human-like animations and for humanoid robotics. In this context, biologically-motivated
methods for the analysis and re-synthesis of human motor programs provide new insights
into, and models for, anticipatory motion synthesis.
This thesis presents the author's achievements in the areas of cognitive and developmental robotics, cooperative and humanoid robotics and intelligent and machine learning methods in computer graphics. The first part of the thesis in the chapter "Goal-directed Imitation for Robots" considers imitation learning in cognitive and developmental robotics.
The work presented here details the author's progress in the development of hierarchical
motion recognition and planning inspired by recent discoveries of the functions of mirror-neuron cortical circuits in primates. The overall architecture is capable of "learning for
imitation" and "learning by imitation". The complete system includes a low-level real-time
capable path planning subsystem for obstacle avoidance during arm reaching. The learning-based path planning subsystem is universal for all types of anthropomorphic robot arms, and is capable of knowledge transfer at the level of individual motor acts.
Next, the problems of learning and synthesis of motor synergies, the spatial and spatio-temporal combinations of motor features in sequential multi-action behavior, and the
problems of task-related action transitions are considered in the second part of the thesis,
"Kinematic Motion Synthesis for Computer Graphics and Robotics". In this part, a new
approach to modeling complex full-body human actions by mixtures of time-shift invariant
motor primitives is presented. The online-capable full-body motion generation architecture,
based on dynamic movement primitives driving the time-shift invariant motor synergies,
was implemented as an online-reactive adaptive motion synthesis for computer graphics
and robotics applications.
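As a rough illustration of the dynamic movement primitive idea mentioned above, the following is a minimal sketch of a generic one-dimensional discrete DMP: a critically damped point attractor towards a goal, optionally shaped by a phase-dependent forcing term. It is not the thesis's architecture; the function name `rollout_dmp`, the gain values and the explicit Euler integration are all assumptions chosen for illustration.

```python
def rollout_dmp(y0, goal, forcing=None, tau=1.0, dt=0.001, duration=2.0,
                alpha_z=25.0, beta_z=6.25, alpha_x=3.0):
    """Integrate a generic 1-D discrete DMP with explicit Euler steps.

    y is the position, z the scaled velocity and x the phase variable that
    decays from 1 towards 0. `forcing` is a hypothetical callable f(x); when
    omitted, the system is a plain spring-damper pulling y towards `goal`.
    """
    y, z, x = float(y0), 0.0, 1.0
    trajectory = [y]
    for _ in range(int(round(duration / dt))):
        # The forcing term is gated by the phase, so it fades out over time.
        f = forcing(x) * x * (goal - y0) if forcing else 0.0
        dz = (alpha_z * (beta_z * (goal - y) - z) + f) / tau
        dy = z / tau
        dx = -alpha_x * x / tau
        z += dz * dt
        y += dy * dt
        x += dx * dt
        trajectory.append(y)
    return trajectory


# Unforced rollout from 0 towards a goal of 1.
trajectory = rollout_dmp(0.0, 1.0)
print(len(trajectory), trajectory[-1])
```

With `beta_z = alpha_z / 4` the unforced system is critically damped, so the position approaches the goal without overshoot. A learned forcing term would shape the path while leaving the end point unchanged.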
The last chapter of the thesis entitled "Contraction Theory and Self-organized Scenarios
in Computer Graphics and Robotics" is dedicated to optimal control strategies in multi-agent scenarios of large crowds of agents expressing highly nonlinear behaviors. This last
part presents new mathematical tools for stability analysis and synthesis of multi-agent
cooperative scenarios.
Movement Representation Learning for Pain Level Classification
Self-supervised learning has shown value for uncovering informative movement features for human activity recognition. However, there has been minimal exploration of this approach for affect recognition, where availability of large labelled datasets is particularly limited. In this paper, we propose a P-STEMR (Parallel Space-Time Encoding Movement Representation) architecture with the aim of addressing this gap and specifically leveraging the higher availability of human activity recognition datasets for pain-level classification. We evaluated and analyzed the architecture using three different datasets across four sets of experiments. We found a statistically significant increase in average F1 score to 0.84 for pain level classification with two classes based on the architecture, compared with the use of hand-crafted features. This suggests that it is capable of learning movement representations and transferring these from activity recognition based on data captured in lab settings to classification of pain levels with messier real-world data. We further found that the efficacy of transfer between datasets can be undermined by dissimilarities in population groups due to impairments that affect movement behaviour and in motion primitives (e.g. rotation versus flexion). Future work should investigate how the effect of these differences could be minimized so that data from healthy people can be more valuable for transfer learning.
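For readers unfamiliar with the metric, the sketch below shows one common way an average F1 score for a two-class problem can be computed, namely macro averaging over the classes. The abstract does not state which averaging scheme was used, so macro averaging is an assumption here, and the labels are made-up toy values rather than data from the paper.

```python
def f1_for_class(y_true, y_pred, label):
    """F1 score for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy example with two hypothetical pain levels (0 = low, 1 = high).
y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 0]
print(macro_f1(y_true, y_pred))
```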
Probabilistic Models of Motor Production
N. Bernstein defined the ability of the central nervous system (CNS) to control many degrees of freedom of a physical body with all its redundancy and flexibility as the main problem in motor control. He pointed out that man-made mechanisms usually have one, sometimes two degrees of freedom (DOF); when the number of DOF increases further, it becomes prohibitively hard to control them. The brain, however, seems to perform such control effortlessly. He suggested a way the brain might deal with it: when a motor skill is being acquired, the brain artificially limits the degrees of freedom, leaving only one or two. As the skill level increases, the brain gradually "frees" the previously fixed DOF, applying control when needed and in directions which have to be corrected, eventually arriving at a control scheme where all the DOF are "free". This approach of reducing the dimensionality of motor control remains relevant even today.
One of the possible solutions to Bernstein's problem is the hypothesis of motor primitives (MPs): small building blocks that constitute complex movements and facilitate motor learning and task completion. Just as in the visual system, a homogeneous hierarchical architecture built of similar computational elements may be beneficial.
When studying an object as complicated as the brain, it is important to define the level of detail at which one works and the questions one aims to answer. David Marr suggested three levels of analysis: 1. computational, analysing which problem the system solves; 2. algorithmic, questioning which representation the system uses and which computations it performs; 3. implementational, finding how such computations are performed by neurons in the brain. In this thesis we stay at the first two levels, seeking the basic representation of motor output.
In this work we present a new model of motor primitives that comprises multiple interacting latent dynamical systems, and give it a full Bayesian treatment. Modelling within the Bayesian framework, in my opinion, must become the new standard in hypothesis testing in neuroscience. Only the Bayesian framework gives us guarantees when dealing with the inevitable plethora of hidden variables and uncertainty.
The special type of coupling of dynamical systems we proposed, based on the Product of Experts, has many natural interpretations in the Bayesian framework. If the dynamical systems run in parallel, it yields Bayesian cue integration. If they are organized hierarchically through serial coupling, we get hierarchical priors over the dynamics. If one of the dynamical systems represents the sensory state, we arrive at sensory-motor primitives. The compact representation that follows from the variational treatment allows learning of a library of motor primitives. Once the primitives are learned separately, a combined motion can be represented as a matrix of coupling values.
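The link between a Product of Experts and Bayesian cue integration can be seen in the simplest possible case: two one-dimensional Gaussian experts whose densities are multiplied and renormalised. The sketch below is only this textbook special case, not the coupled dynamical model of the thesis; the function name and the numbers are illustrative assumptions.

```python
def product_of_gaussian_experts(means, variances):
    """Combine 1-D Gaussian experts N(mean_i, var_i) by multiplying densities.

    The renormalised product is again Gaussian: its precision is the sum of
    the experts' precisions and its mean is the precision-weighted average.
    """
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    mean = sum(p * m for p, m in zip(precisions, means)) / total_precision
    return mean, 1.0 / total_precision


# Two hypothetical cues about the same quantity: a reliable one (variance 1)
# centred at 0 and a noisier one (variance 3) centred at 4.
mean, variance = product_of_gaussian_experts([0.0, 4.0], [1.0, 3.0])
print(mean, variance)
```

The combined mean of 1 lies closer to the more reliable cue, and the combined variance of 0.75 is smaller than either expert's variance, which is the familiar behaviour of optimal cue integration.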
We performed a set of experiments to compare different models of motor primitives. In a series of 2-alternative forced choice (2AFC) experiments, participants discriminated between natural and synthesised movements, in effect running a graphics Turing test. When available, the Bayesian model score predicted the perceived naturalness of the movements. For simple movements, like walking, Bayesian model comparison and psychophysics tests indicate that one dynamical system is sufficient to describe the data. For more complex movements, like walking and waving, motion can be better represented as a set of coupled dynamical systems. We also experimentally confirmed that a Bayesian treatment of model learning on motion data is superior to a simple point estimate of latent parameters. Experiments with non-periodic movements show that they do not benefit from more complex latent dynamics, despite having high kinematic complexity.
With fully Bayesian models, we could quantitatively disentangle the influence of motion dynamics and pose on the perception of naturalness. We confirmed that rich and correct dynamics are more important than the kinematic representation.
There are numerous further directions of research. In the models we devised for multiple body parts, even though the latent dynamics were factorized into a set of interacting systems, the kinematic parts were completely independent. Thus, interaction between the kinematic parts could be mediated only by the interactions of the latent dynamics. A more flexible model would allow dense interaction on the kinematic level too.
Another important problem relates to the representation of time in Markov chains. Discrete-time Markov chains form an approximation to continuous dynamics. As the time step is assumed to be fixed, we face the problem of time step selection. Time is also not an explicit parameter in Markov chains, which prohibits explicit optimization of time as a parameter and reasoning (inference) about it. For example, in optimal control, boundary conditions are usually set at exact time points, which is not an ecological scenario, where time is usually a parameter of optimization. Making time an explicit parameter in dynamics may alleviate this.
A unifying framework for the identification of motor primitives
Chiovetto E, d'Avella A, Giese MA. A unifying framework for the identification of motor primitives. PLoS ONE. Submitted
Inferring Facial and Body Language
Machine analysis of human facial and body language is a challenging topic in computer
vision, impacting on important applications such as human-computer interaction and visual
surveillance. In this thesis, we present research building towards computational frameworks
capable of automatically understanding facial expression and behavioural body language.
The thesis work commences with a thorough examination of issues surrounding facial
representation based on Local Binary Patterns (LBP). Extensive experiments with different
machine learning techniques demonstrate that LBP features are efficient and effective for
person-independent facial expression recognition, even in low-resolution settings. We then
present and evaluate a conditional mutual information based algorithm to efficiently learn the
most discriminative LBP features, and show the best recognition performance is obtained by
using SVM classifiers with the selected LBP features. However, the recognition is performed
on static images without exploiting temporal behaviors of facial expression.
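To make the Local Binary Patterns representation concrete, here is a minimal sketch of the basic 3x3 LBP operator and the histogram of codes that typically serves as the feature vector. It illustrates only the basic operator, not the feature selection or the specific LBP variant used in the thesis; the neighbour ordering and the function names are assumptions.

```python
def lbp_code(image, row, col):
    """Basic 3x3 LBP code of the pixel at (row, col) in a 2-D list of intensities.

    Each of the 8 neighbours contributes one bit: 1 if its intensity is at
    least the centre intensity, else 0. Neighbours are visited clockwise from
    the top-left pixel (an arbitrary but fixed order).
    """
    center = image[row][col]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


# Toy 3x3 patch: only the top, right and bottom-right neighbours are >= 6.
patch = [[5, 9, 1],
         [4, 6, 7],
         [2, 3, 8]]
print(lbp_code(patch, 1, 1))  # bits 1, 3 and 4 set: 2 + 8 + 16 = 26
```

In practice such histograms are usually computed per image region and concatenated, so that the descriptor keeps some spatial layout of the face.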
Subsequently we present a method to capture and represent temporal dynamics of facial
expression by discovering the underlying low-dimensional manifold. Locality Preserving Projections
(LPP) is exploited to learn the expression manifold in the LBP based appearance
feature space. By deriving a universal discriminant expression subspace using a supervised
LPP, we can effectively align manifolds of different subjects on a generalised expression manifold.
Different linear subspace methods are comprehensively evaluated in expression subspace
learning. We formulate and evaluate a Bayesian framework for dynamic facial expression
recognition employing the derived manifold representation. However, the manifold representation
only addresses temporal correlations of the whole face image and does not consider
spatial-temporal correlations among different facial regions.
We then employ Canonical Correlation Analysis (CCA) to capture correlations among face
parts. To overcome the inherent limitations of classical CCA for image data, we introduce
and formalise a novel Matrix-based CCA (MCCA), which can better measure correlations in
2D image data. We show this technique can provide superior performance in regression and
recognition tasks, whilst requiring significantly fewer canonical factors. All the above work
focuses on facial expressions. However, the face is usually perceived not as an isolated object
but as an integrated part of the whole body, and the visual channel combining facial and
bodily expressions is most informative.
Finally we investigate two understudied problems in body language analysis, gait-based
gender discrimination and affective body gesture recognition. To effectively combine face
and body cues, CCA is adopted to establish the relationship between the two modalities, and
derive a semantic joint feature space for the feature-level fusion. Experiments on large data
sets demonstrate that our multimodal systems achieve superior performance in gender
discrimination and affective state analysis.
Research studentship of Queen Mary, the International Travel Grant of the Royal Academy of Engineering,
and the Royal Society International Joint Project