Pose and motion from contact
In the absence of vision, grasping an object often relies on tactile feedback from the fingertips. As the finger pushes the object, the fingertip can feel the contact point move. If the object is known in advance, from this motion the finger may infer the location of the contact point on the object and thereby the object pose. This paper primarily investigates the problem of determining the pose (orientation and position) and motion (velocity and angular velocity) of a planar object with known geometry from such contact motion generated by pushing. A dynamic analysis of pushing yields a nonlinear system that relates through contact the object pose and motion to the finger motion. The contact motion on the fingertip thus encodes certain information about the object pose. Nonlinear observability theory is employed to show that such information is sufficient for the finger to "observe" not only the pose but also the motion of the object. Therefore a sensing strategy can be realized as an observer of the nonlinear dynamical system. Two observers are subsequently introduced. The first observer, based on the result of [15], has its "gain" determined by the solution of a Lyapunov-like equation; it can be activated at any time instant during a push. The second observer, based on Newton's method, solves for the initial (motionless) object pose from three intermediate contact points during a push. Under the Coulomb friction model, the paper copes with support friction in the plane and/or contact friction between the finger and the object. Extensive simulations have been done to demonstrate the feasibility of the two observers. Preliminary experiments (with an Adept robot) have also been conducted. A contact sensor has been implemented using strain gauges.
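As a rough illustration of the idea behind the second observer, the sketch below applies Newton's method to recover a planar pose (x, y, theta) from three scalar measurements. It is a toy under stated assumptions, not the paper's observer: the measurement model (world-frame coordinates of two hypothetical body points), the function names `newton_solve` and `toy_residual`, and the finite-difference Jacobian are all introduced here for illustration only.

```python
import numpy as np

def newton_solve(residual, x0, tol=1e-10, max_iter=50, eps=1e-7):
    """Solve residual(x) = 0 with Newton's method and a forward-difference Jacobian."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = residual(x)
        if np.linalg.norm(r) < tol:
            break
        # Build the Jacobian column by column with forward differences.
        J = np.empty((r.size, x.size))
        for j in range(x.size):
            step = np.zeros_like(x)
            step[j] = eps
            J[:, j] = (residual(x + step) - r) / eps
        # Newton update: x <- x - J^{-1} r
        x = x - np.linalg.solve(J, r)
    return x

def toy_measurements(pose):
    """Hypothetical measurement model (an assumption, not from the paper):
    world x and y of body point A = (1, 0), and world x of body point B = (0, 1)."""
    x, y, theta = pose
    return np.array([
        x + np.cos(theta),   # world x of A
        y + np.sin(theta),   # world y of A
        x - np.sin(theta),   # world x of B
    ])

def toy_residual(pose, measured):
    """Difference between predicted and observed measurements."""
    return toy_measurements(pose) - measured

true_pose = np.array([0.3, -0.2, 0.4])
measured = toy_measurements(true_pose)
estimate = newton_solve(lambda p: toy_residual(p, measured), x0=[0.0, 0.0, 0.0])
```

Three scalar equations in three unknowns give a square system, which is why a plain linear solve suffices at each step; convergence still depends on the initial guess, since the toy system has more than one root in theta.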
Analyzing Whole-Body Pose Transitions in Multi-Contact Motions
When executing whole-body motions, humans are able to use a large variety of
support poses which not only utilize the feet, but also hands, knees and elbows
to enhance stability. While there are many works analyzing the transitions
involved in walking, very few works analyze human motion where more complex
supports occur.
In this work, we analyze complex support pose transitions in human motion
involving locomotion and manipulation tasks (loco-manipulation). We have
applied a method for the detection of human support contacts from motion
capture data to a large-scale dataset of loco-manipulation motions involving
multi-contact supports, providing a semantic representation of them. Our
results provide a statistical analysis of the used support poses, their
transitions and the time spent in each of them. In addition, our data partially
validates our taxonomy of whole-body support poses presented in our previous
work.
We believe that this work extends our understanding of human motion for
humanoids, with a long-term objective of developing methods for autonomous
multi-contact motion planning.
Comment: 8 pages, IEEE-RAS International Conference on Humanoid Robots
(Humanoids) 201
D&D: Learning Human Dynamics from Dynamic Camera
3D human pose estimation from a monocular video has recently seen significant
improvements. However, most state-of-the-art methods are kinematics-based,
which are prone to physically implausible motions with pronounced artifacts.
Current dynamics-based methods can predict physically plausible motion but are
restricted to simple scenarios with a static camera view. In this work, we
present D&D (Learning Human Dynamics from Dynamic Camera), which leverages the
laws of physics to reconstruct 3D human motion from in-the-wild videos with
a moving camera. D&D introduces inertial force control (IFC) to explain the 3D
human motion in the non-inertial local frame by considering the inertial forces
of the dynamic camera. To learn the ground contact with limited annotations, we
develop probabilistic contact torque (PCT), which is computed by differentiable
sampling from contact probabilities and used to generate motions. The contact
state can be weakly supervised by encouraging the model to generate correct
motions. Furthermore, we propose an attentive PD controller that adjusts target
pose states using temporal information to obtain smooth and accurate pose
control. Our approach is entirely neural-based and runs without offline
optimization or simulation in physics engines. Experiments on large-scale 3D
human motion benchmarks demonstrate the effectiveness of D&D, where we exhibit
superior performance against both state-of-the-art kinematics-based and
dynamics-based methods. Code is available at https://github.com/Jeffsjtu/DnD
Comment: ECCV 2022 (Oral)
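For readers unfamiliar with PD control, the following minimal sketch shows a standard proportional-derivative law driving joint angles toward a target pose. It is the textbook PD controller, not the attentive variant proposed in D&D; the gains, the unit-inertia point-mass dynamics, and the names `pd_torque` and `simulate` are assumptions made for illustration.

```python
import numpy as np

def pd_torque(q, q_dot, q_target, kp, kd):
    """Standard PD law: tau = kp * (q_target - q) - kd * q_dot."""
    return kp * (q_target - q) - kd * q_dot

def simulate(q0, q_target, kp=50.0, kd=10.0, inertia=1.0, dt=0.01, steps=500):
    """Integrate independent joints with assumed constant inertia under PD control."""
    q = np.asarray(q0, dtype=float)
    q_dot = np.zeros_like(q)
    q_target = np.asarray(q_target, dtype=float)
    for _ in range(steps):
        tau = pd_torque(q, q_dot, q_target, kp, kd)
        # Semi-implicit Euler: update velocity first, then position.
        q_dot = q_dot + dt * tau / inertia
        q = q + dt * q_dot
    return q

final_pose = simulate(q0=[0.0, 1.0], q_target=[0.5, -0.2])
```

With these assumed gains the damping ratio is about 0.7, so each joint settles on its target within the simulated 5 seconds without sustained oscillation.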
QuestEnvSim: Environment-Aware Simulated Motion Tracking from Sparse Sensors
Replicating a user's pose from only wearable sensors is important for many
AR/VR applications. Most existing methods for motion tracking avoid environment
interaction apart from foot-floor contact due to their complex dynamics and
hard constraints. However, in daily life people regularly interact with their
environment, e.g. by sitting on a couch or leaning on a desk. Using
Reinforcement Learning, we show that headset and controller pose, if combined
with physics simulation and environment observations, can generate realistic
full-body poses even in highly constrained environments. The physics simulation
automatically enforces the various constraints necessary for realistic poses,
instead of manually specifying them as in many kinematic approaches. These hard
constraints allow us to achieve high-quality interaction motions without
typical artifacts such as penetration or contact sliding. We discuss three
features, the environment representation, the contact reward and scene
randomization, crucial to the performance of the method. We demonstrate the
generality of the approach through various examples, such as sitting on chairs,
a couch and boxes, stepping over boxes, rocking a chair and turning an office
chair. We believe these are some of the highest-quality results achieved for
motion tracking from sparse sensors with scene interaction.
GRAB: A Dataset of Whole-Body Human Grasping of Objects
Training computers to understand, model, and synthesize human grasping
requires a rich dataset containing complex 3D object shapes, detailed contact
information, hand pose and shape, and the 3D body motion over time. While
"grasping" is commonly thought of as a single hand stably lifting an object, we
capture the motion of the entire body and adopt the generalized notion of
"whole-body grasps". Thus, we collect a new dataset, called GRAB (GRasping
Actions with Bodies), of whole-body grasps, containing full 3D shape and pose
sequences of 10 subjects interacting with 51 everyday objects of varying shape
and size. Given MoCap markers, we fit the full 3D body shape and pose,
including the articulated face and hands, as well as the 3D object pose. This
gives detailed 3D meshes over time, from which we compute contact between the
body and object. This is a unique dataset that goes well beyond existing ones
for modeling and understanding how humans grasp and manipulate objects, how
their full body is involved, and how interaction varies with the task. We
illustrate the practical value of GRAB with an example application; we train
GrabNet, a conditional generative network, to predict 3D hand grasps for unseen
3D object shapes. The dataset and code are available for research purposes at
https://grab.is.tue.mpg.de.
Comment: ECCV 202
Realtime State Estimation with Tactile and Visual sensing. Application to Planar Manipulation
Accurate and robust object state estimation enables successful object
manipulation. Visual sensing is widely used to estimate object poses. However,
in a cluttered scene or in a tight workspace, the robot's end-effector often
occludes the object from the visual sensor. The robot then loses visual
feedback and must fall back on open-loop execution.
In this paper, we integrate both tactile and visual input using a framework
for solving the SLAM problem, incremental smoothing and mapping (iSAM), to
provide a fast and flexible solution. Visual sensing provides global pose
information but is noisy in general, whereas contact sensing is local, but its
measurements are more accurate relative to the end-effector. By combining them,
we aim to exploit their advantages and overcome their limitations. We explore
the technique in the context of a pusher-slider system. We adapt iSAM's
measurement cost and motion cost to the pushing scenario, and use an
instrumented setup to evaluate the estimation quality with different object
shapes, on different surface materials, and under different contact modes.
Embodied Scene-aware Human Pose Estimation
We propose embodied scene-aware human pose estimation where we estimate 3D
poses based on a simulated agent's proprioception and scene awareness, along
with external third-person observations. Unlike prior methods that often resort
to multistage optimization, non-causal inference, and complex contact modeling
to estimate human pose and human scene interactions, our method is one stage,
causal, and recovers global 3D human poses in a simulated environment. Since 2D
third-person observations are coupled with the camera pose, we propose to
disentangle the camera pose and use a multi-step projection gradient defined in
the global coordinate frame as the movement cue for our embodied agent.
Leveraging a physics simulation and prescanned scenes (e.g., 3D mesh), we
simulate our agent in everyday environments (libraries, offices, bedrooms,
etc.) and equip our agent with environmental sensors to intelligently navigate
and interact with scene geometries. Our method also relies only on 2D keypoints
and can be trained on synthetic datasets derived from popular human motion
databases. To evaluate, we use the popular H36M and PROX datasets and, for the
first time, achieve a success rate of 96.7% on the challenging PROX dataset
without ever using PROX motion sequences for training.
Comment: Project website: https://embodiedscene.github.io/embodiedpose/
Zhengyi Luo and Shun Iwase contributed equally