HP-GAN: Probabilistic 3D human motion prediction via GAN
Predicting and understanding human motion dynamics has many applications,
such as motion synthesis, augmented reality, security, and autonomous vehicles.
Due to the recent success of generative adversarial networks (GANs), there has
been much interest in probabilistic estimation and synthetic data generation
using deep neural network architectures and learning algorithms.
We propose a novel sequence-to-sequence model for probabilistic human motion
prediction, trained with a modified version of improved Wasserstein generative
adversarial networks (WGAN-GP), in which we use a custom loss function designed
for human motion prediction. Our model, which we call HP-GAN, learns a
probability density function of future human poses conditioned on previous
poses. It predicts multiple sequences of possible future human poses, each from
the same input sequence but a different vector z drawn from a random
distribution. Furthermore, to quantify the quality of the non-deterministic
predictions, we simultaneously train a motion-quality-assessment model that
learns the probability that a given skeleton sequence is a real human motion.
We test our algorithm on two of the largest skeleton datasets: NTU RGB+D and
Human3.6M. We train our model on both single and multiple action types. Its
predictive power for long-term motion estimation is demonstrated by generating
multiple plausible futures of more than 30 frames from just 10 frames of input.
We show that most sequences generated from the same input have a greater
than 50% probability of being judged as a real human sequence. We will
release all the code used in this paper on GitHub.
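The core idea of the abstract above, one observed sequence mapped to many plausible futures via different noise vectors z, can be sketched with a toy stand-in for the trained generator. All shapes, the random linear maps, and the function name `generate_future` are illustrative assumptions, not HP-GAN's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained conditional generator G(prior_poses, z).
# The real model is a learned sequence-to-sequence network; a fixed
# random linear map illustrates only the interface.
POSE_DIM, Z_DIM, N_IN, N_OUT = 6, 4, 10, 30
W_prior = rng.normal(size=(N_OUT * POSE_DIM, N_IN * POSE_DIM)) * 0.05
W_z = rng.normal(size=(N_OUT * POSE_DIM, Z_DIM)) * 0.5

def generate_future(prior_poses, z):
    """Map 10 observed poses plus a noise vector z to 30 predicted poses."""
    flat = W_prior @ prior_poses.ravel() + W_z @ z
    return flat.reshape(N_OUT, POSE_DIM)

prior = rng.normal(size=(N_IN, POSE_DIM))  # 10 observed frames
futures = [generate_future(prior, rng.normal(size=Z_DIM)) for _ in range(5)]

# Same input, different z -> different plausible futures.
print(futures[0].shape)                    # (30, 6)
print(np.allclose(futures[0], futures[1])) # False
```

The point is the sampling interface: holding the conditioning sequence fixed and redrawing z yields a distribution over futures rather than a single deterministic prediction.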
Master of Science thesis
Animated avatars are becoming increasingly prevalent in three-dimensional virtual environments thanks to modern motion tracking hardware and its falling cost. As this opens up new possibilities and ways of interacting within such virtual worlds, an important question that arises is how the presence of an avatar alters the perception and performance of an action when a user interacts with an object in a virtual environment through their avatar. This research attempts to answer this question by studying the effects of the presence of an animated self-avatar in an object manipulation task in a virtual environment. Two experiments were conducted as part of this research. In Experiment 1, the feasibility of an interaction system involving animated self-avatars to manipulate objects in a virtual environment was examined. It was observed that the presence of self-avatars had an effect on the performance of a subset of subjects. Male subjects with gaming experience performed similarly across both visual feedback conditions, while female subjects, who also had low gaming experience, performed better in the condition with avatar feedback than in the condition without it. In Experiment 2, we further analyzed the effect of self-avatar visual feedback by looking at the effects of visual immersion in the virtual environment, task difficulty, and individual difference factors such as spatial ability and gaming experience. It was observed that difficult trials were completed significantly faster by subjects in the avatar feedback condition, while in the easy trials there was no significant difference between the performance of subjects in the avatar and sphere feedback conditions. No significant interaction was observed between visual feedback condition and either immersiveness or the individual difference factors.
IMPACT OF MULTIMEDIA MEDIATED BY VIRTUAL REALITY TECHNOLOGY ON COMMUNICATION
This paper describes the impact of multimedia mediated by virtual reality technology on communication. Virtual reality technologies are a set of technologies that enable conversations between people who are physically separated, i.e. spatially dislocated. The goal of such technology is to enable conversation between people located in different geographic areas of the world who are connected by business, family, or friendly relations. In this paper, the term "virtual reality technologies" refers to a set of technologies for videoconferencing, videotelephony, and telerobotics, which carry out communication through a virtual embodiment, i.e. a hologram. Virtual reality technologies provide an inexpensive and fast way to communicate regardless of geographical distance.
From a communication-science perspective, only teleconferencing has made great progress in maintaining continuous, high-quality communication that is easy to establish, but it almost completely eliminates "face-to-face" contact, which can be considered one of the "deficiencies" of its application.
Probabilistic Models of Motor Production
N. Bernstein identified the ability of the central nervous system (CNS) to control the many degrees of freedom of a physical body, with all its redundancy and flexibility, as the main problem in motor control. He pointed out that man-made mechanisms usually have one, sometimes two degrees of freedom (DOF); when the number of DOF increases further, it becomes prohibitively hard to control them. The brain, however, seems to perform such control effortlessly. He suggested how the brain might deal with it: when a motor skill is being acquired, the brain artificially limits the degrees of freedom, leaving only one or two. As the skill level increases, the brain gradually "frees" the previously fixed DOF, applying control when needed and in directions that have to be corrected, eventually arriving at a control scheme where all the DOF are "free". This approach of reducing the dimensionality of motor control remains relevant today.
One of the possible solutions to Bernstein's problem is the hypothesis of motor primitives (MPs): small building blocks that constitute complex movements and facilitate motor learning and task completion. Just as in the visual system, having a homogeneous hierarchical architecture built of similar computational elements may be beneficial.
When studying an object as complicated as the brain, it is important to define the level of detail at which one works and the questions one aims to answer. David Marr suggested three levels of analysis: 1. computational, analysing which problem the system solves; 2. algorithmic, asking which representations the system uses and which computations it performs; 3. implementational, finding out how such computations are performed by neurons in the brain. In this thesis we stay at the first two levels, seeking the basic representation of motor output.
In this work we present a new model of motor primitives that comprises multiple interacting latent dynamical systems, and give it a full Bayesian treatment. Modelling within the Bayesian framework, in my opinion, must become the new standard in hypothesis testing in neuroscience. Only the Bayesian framework gives us guarantees when dealing with the inevitable plethora of hidden variables and uncertainty.
The special type of coupling of dynamical systems we propose, based on the Product of Experts, has many natural interpretations in the Bayesian framework. If the dynamical systems run in parallel, it yields Bayesian cue integration. If they are organized hierarchically through serial coupling, we get hierarchical priors over the dynamics. If one of the dynamical systems represents a sensory state, we arrive at sensory-motor primitives. The compact representation that follows from the variational treatment allows learning of a motor primitive library. Once the primitives are learned separately, a combined motion can be represented as a matrix of coupling values.
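For Gaussian experts, the Product of Experts mentioned above has a well-known closed form: precisions add, and the combined mean is the precision-weighted average of the experts' means. A minimal one-dimensional sketch with illustrative numbers (not the thesis's actual latent model):

```python
import numpy as np

def product_of_gaussian_experts(means, variances):
    """Combine 1-D Gaussian experts: precisions add, and the mean is
    precision-weighted. Returns (mean, variance) of the product density."""
    precisions = 1.0 / np.asarray(variances, dtype=float)
    var = 1.0 / precisions.sum()
    mean = var * (precisions * np.asarray(means, dtype=float)).sum()
    return mean, var

# Two "dynamical systems" predicting the same next latent state:
m, v = product_of_gaussian_experts(means=[1.0, 3.0], variances=[1.0, 1.0])
print(m, v)  # 2.0 0.5 -- equally confident experts average; product is sharper
```

This is also why the parallel case reads as Bayesian cue integration: each expert contributes in proportion to its confidence, and the combined estimate is never less certain than the best single expert.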
We performed a set of experiments to compare different models of motor primitives. In a series of two-alternative forced choice (2AFC) experiments, participants discriminated between natural and synthesised movements, effectively running a graphics Turing test. When available, the Bayesian model score predicted the naturalness of the perceived movements. For simple movements, like walking, Bayesian model comparison and psychophysics tests indicate that one dynamical system is sufficient to describe the data. For more complex movements, like walking and waving, motion is better represented as a set of coupled dynamical systems. We also experimentally confirmed that a Bayesian treatment of model learning on motion data is superior to a simple point estimate of the latent parameters. Experiments with non-periodic movements show that they do not benefit from more complex latent dynamics, despite having high kinematic complexity.
With fully Bayesian models, we could quantitatively disentangle the influence of motion dynamics and pose on the perception of naturalness. We confirmed that rich and correct dynamics are more important than the kinematic representation.
There are numerous further directions of research. In the models we devised for multiple body parts, even though the latent dynamics was factorized into a set of interacting systems, the kinematic parts were completely independent. Thus, interaction between the kinematic parts could be mediated only by the latent dynamics interactions. A more flexible model would allow dense interaction at the kinematic level too.
Another important problem relates to the representation of time in Markov chains. Discrete-time Markov chains form an approximation to continuous dynamics. As the time step is assumed to be fixed, we face the problem of time-step selection. Time is also not an explicit parameter in Markov chains, which prohibits explicit optimization of, and reasoning (inference) about, time as a parameter. For example, in optimal control, boundary conditions are usually set at exact time points, which is not an ecological scenario, where time is usually a parameter of optimization. Making time an explicit parameter of the dynamics may alleviate this.
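The time-step problem can be seen in a two-line Euler discretization of a continuous system dx/dt = a·x: the discrete transition x[k+1] = (1 + a·Δt)·x[k] depends on the chosen Δt, so the same continuous dynamics yields different Markov chains. A generic illustration, not the thesis model:

```python
import numpy as np

def euler_rollout(x0, a, dt, t_end):
    """Discretize dx/dt = a*x with step dt.  The induced discrete
    dynamics x[k+1] = (1 + a*dt) * x[k] change with the choice of dt."""
    steps = int(round(t_end / dt))
    x = x0
    for _ in range(steps):
        x = x + dt * a * x
    return x

exact = np.exp(-1.0)  # closed-form solution of dx/dt = -x at t = 1
print(euler_rollout(1.0, -1.0, 0.5, 1.0))    # 0.25 (coarse step, large error)
print(euler_rollout(1.0, -1.0, 0.001, 1.0))  # close to the exact value
print(exact)                                 # ~0.3679
```

A coarse step gives a chain whose stationary behaviour differs visibly from the continuous system, which is exactly why fixing Δt in advance, rather than treating time explicitly, is a modelling decision with consequences.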
Real-Time Joint Coupling of the Spine for Inverse Kinematics
In this paper we propose a simple model for the coupling behavior of the human spine within an inverse kinematics framework. Our spine model exhibits anatomically correct motions of the vertebrae of virtual mannequins by coupling standard swing and revolute joint models. The adjustment of the joints is made with several simple (in)equality constraints, resulting in a reduction of the solution space dimensionality for the inverse kinematics solver. By reducing the solution space to feasible spine shapes, we prevent the inverse kinematics algorithm from producing infeasible postures for the spine. We further show how to apply these simple constraints to the human spine through a strict decoupling of the swing and torsion motion of the vertebrae. We demonstrate the validity of our approach in various experiments.
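The dimensionality-reduction idea, coupling many vertebra joints to a single controllable parameter through simple (in)equality constraints, can be sketched as follows. The per-vertebra weights and the joint limit are invented for illustration; the paper's actual constraints are anatomically derived:

```python
import numpy as np

# Hypothetical per-vertebra share of the total flexion angle.  Coupling
# all vertebrae to one parameter reduces the IK search space for this
# chain from five independent angles to one.
WEIGHTS = np.array([0.10, 0.15, 0.20, 0.25, 0.30])  # sums to 1.0

def spine_flexion(total_angle_deg, max_per_joint_deg=15.0):
    """Distribute one flexion parameter across coupled vertebra joints,
    clamping each joint with a simple inequality constraint."""
    angles = WEIGHTS * total_angle_deg
    return np.clip(angles, -max_per_joint_deg, max_per_joint_deg)

angles = spine_flexion(40.0)
print(angles)        # [ 4.  6.  8. 10. 12.]
print(angles.sum())  # 40.0 -- constraints satisfied, one DOF exposed
```

The IK solver then optimizes over the single coupled parameter, so every candidate posture it considers is already a feasible spine shape.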
Sentient Matter: Towards Affective Human-Architecture Interaction
Interactive design has been embedded into every aspect of our lives.
Ranging from handy devices to architecturally scaled environments,
these designs have not only shifted the way we facilitate interaction with
other people, but they also actively reconfigure themselves in response
to human stimuli. Following in the wake of interactive experimentation,
sentient matter, the idea that matter embodies the capacity to perceive
and respond to stimuli, attempts to engage in a challenging arena that few
architects and architectural researchers have ventured into: the creation
and simulation of emotive types of interaction between the architectural
environment and its inhabitants.
This ambition is made possible by the collaboration of multiple
disciplines. Cybernetics, specifically the legacy of Pask’s conversation
theory, inspires this thesis with the question of why emotion is needed in
facilitating human–architecture communication; how emotion appraisal
theory (P. Desmet) within psychology supports the feasibility of an
architectural environment eliciting emotional changes in its participants,
as well as the possibility of generating a next-step response by observing
the participants’ emotive behaviors; and how movement notation systems,
especially Laban Movement Analysis (a movement rating scale system),
help us understand how emotions can be identified
by motion elements that signify emotive behavior. Through the process
of decomposing movement into several qualitative and quantitative
factors such as velocity, openness, and smoothness, emotions embodied
in motion can be detected and even manipulated by altering those
movement factors. Moreover, with the employment of a Kinect sensor,
live performance can be analyzed in real time.
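The decomposition described above, reducing movement to qualitative and quantitative factors such as velocity and smoothness, can be sketched as simple statistics over a tracked joint trajectory. The feature definitions here are illustrative simplifications of Laban-style descriptors, not the thesis's actual pipeline, and the synthetic trajectories stand in for Kinect joint data:

```python
import numpy as np

def motion_features(positions, dt=1 / 30):
    """Per-trajectory speed and smoothness (mean squared jerk) from a
    (frames, dims) array of joint positions sampled at 1/dt Hz."""
    vel = np.diff(positions, axis=0) / dt            # frame-to-frame velocity
    jerk = np.diff(vel, n=2, axis=0) / dt**2         # third difference of position
    return {
        "mean_speed": float(np.linalg.norm(vel, axis=1).mean()),
        "mean_sq_jerk": float((jerk ** 2).sum(axis=1).mean()),
    }

t = np.linspace(0, 1, 30)
smooth = np.stack([np.sin(t), np.cos(t)], axis=1)    # calm, flowing gesture
jittery = smooth + 0.05 * np.random.default_rng(1).normal(size=smooth.shape)

# A jittery gesture scores far higher on the jerk-based roughness measure.
print(motion_features(smooth)["mean_sq_jerk"]
      < motion_features(jittery)["mean_sq_jerk"])    # True
```

Thresholding or regressing on such features is one plausible route from raw skeleton streams to the emotive labels the interactive system responds to.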
Based on the above research and inspired by the Kinetic sculptures
of Margolin, the final product of this thesis is the development of a
prototype that translates human movements that are expressive of
emotion into continuous surface transformations, thus making evident
how such emotive states might be transcoded into an architectural form.
In this process, four typical emotive architectural expressions—joy,
anger, excitement, and sadness—are researched. This thesis also documents
three virtual scenarios in order to examine the effect of this interactive
system. Different contexts, kinetic types, and behavioral strategies are
presented so that we may explore their potential applications.
Sentient matter outlines a framework of syntheses, which is built upon
the convergence of embedded computation (intelligence) and physical
counterpart (kinetics). Throughout this process, it treats people’s
participation as the material that fuels the generation of legible
emotional behaviors within an architectural environment. Consequently, there
is potential for an architectural learning capacity coupled with an
evolving data library of human behavioral knowledge. This opens doors
for futuristic designs where the paradigm shifts from “What is that
building?” to “What is that building doing?”
Perceptual thresholds for foot slipping in animated characters
The computer game industry continues to progress toward realistic-looking character motion. However, even in state-of-the-art games, the use of motion capture data in character animation may result in errors such as “foot slipping,” where the feet do not match up with the floor properly during translation. Various algorithms have been proposed to minimize foot slipping, including one which changes limb lengths. While foot slipping decreases the realism of character motion, there must be some threshold below which this error is imperceptible; devoting further processor time in these cases is wasteful. We apply the classical method of perception threshold determination using a set of motion clips with parameterized slipping error. From this experiment, we develop guidelines for acceptable error. Furthermore, we show that introducing simple camera motion may increase the perceptual threshold, and thus could be used to “mask” foot slipping errors.
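Threshold determination of this kind typically fits a psychometric function to per-level detection rates and reads off the stimulus level at some criterion (for example, 75% detection). A minimal sketch with made-up response data; the slip magnitudes, rates, and linear interpolation are illustrative stand-ins for the paper's actual stimuli and fitting procedure:

```python
import numpy as np

# Made-up detection rates: the fraction of trials on which slipping was
# noticed, at increasing slip magnitudes (arbitrary units per step).
slip_levels = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5])
detect_rate = np.array([0.05, 0.10, 0.35, 0.70, 0.90, 0.97])

def threshold(levels, rates, criterion=0.75):
    """Interpolate to the slip magnitude first detected at `criterion`
    rate -- a simple stand-in for a full psychometric-function fit."""
    return float(np.interp(criterion, rates, levels))

print(round(threshold(slip_levels, detect_rate), 3))  # 1.625
```

Errors below the fitted threshold are, by definition, unlikely to be noticed, which is exactly the regime where spending extra processor time on foot-slip correction is wasted.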