Information and Communication Technology and the Global Flow of Wine: A Gravity Model of ICT in Wine Trade
International Relations/Trade
GANerated Hands for Real-time 3D Hand Tracking from Monocular RGB
We address the highly challenging problem of real-time 3D hand tracking based
on a monocular RGB-only sequence. Our tracking method combines a convolutional
neural network with a kinematic 3D hand model, such that it generalizes well to
unseen data, is robust to occlusions and varying camera viewpoints, and leads
to anatomically plausible as well as temporally smooth hand motions. For
training our CNN we propose a novel approach for the synthetic generation of
training data that is based on a geometrically consistent image-to-image
translation network. To be more specific, we use a neural network that
translates synthetic images to "real" images, such that the so-generated images
follow the same statistical distribution as real-world hand images. For
training this translation network we combine an adversarial loss and a
cycle-consistency loss with a geometric consistency loss in order to preserve
geometric properties (such as hand pose) during translation. We demonstrate
that our hand tracking system outperforms the current state-of-the-art on
challenging RGB-only footage.
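The training objective described above combines three loss terms. A minimal sketch follows, assuming a plain weighted sum of scalar losses; the function name and the weights `w_cycle` and `w_geo` are hypothetical, since the abstract does not state how the terms are weighted or computed.

```python
def translation_loss(adversarial, cycle, geometric, w_cycle=1.0, w_geo=1.0):
    """Combine three scalar loss terms into one training objective.

    The weights are illustrative assumptions, not values from the paper.
    """
    # Adversarial term pushes translated images toward the real distribution;
    # cycle term keeps the translation invertible; geometric term is meant to
    # preserve properties such as hand pose during translation.
    return adversarial + w_cycle * cycle + w_geo * geometric


total = translation_loss(1.0, 2.0, 3.0, w_cycle=0.5, w_geo=2.0)
```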
Single-Shot Multi-Person 3D Pose Estimation From Monocular RGB
We propose a new single-shot method for multi-person 3D pose estimation in
general scenes from a monocular RGB camera. Our approach uses novel
occlusion-robust pose-maps (ORPM) which enable full body pose inference even
under strong partial occlusions by other people and objects in the scene. ORPM
outputs a fixed number of maps which encode the 3D joint locations of all
people in the scene. Body part associations allow us to infer 3D pose for an
arbitrary number of people without explicit bounding box prediction. To train
our approach we introduce MuCo-3DHP, the first large-scale training data set
showing real images of sophisticated multi-person interactions and occlusions.
We synthesize a large corpus of multi-person images by compositing images of
individual people (with ground truth from multi-view performance capture). We
evaluate our method on our new challenging 3D annotated multi-person test set
MuPoTs-3D where we achieve state-of-the-art performance. To further stimulate
research in multi-person 3D pose estimation, we will make our new datasets and
associated code publicly available for research purposes.
Comment: International Conference on 3D Vision (3DV), 201
ARE YOU WILLING TO WAIT LONGER FOR INTERNET PRIVACY?
It is becoming increasingly common for governments, service providers and specialized data aggregators to systematically collect traces of personal communication on the Internet without the user’s knowledge or approval. An analysis of these personal traces by data mining algorithms can reveal sensitive personal information, such as location data, behavioral patterns, or personal profiles including preferences and dislikes. Recent studies show that this information can be used for various purposes, for example by insurance companies or banks to identify potentially risky customers, by governments to observe their citizens, and also by repressive regimes to monitor political opponents. Online anonymity software, such as Tor, can help users protect their privacy, but often comes at the price of low usability, e.g., by causing increased latency during surfing. In this exploratory study, we determine factors that influence the usage of Internet anonymity software. In particular, we show that Internet literacy, Internet privacy awareness and Internet privacy concerns are important antecedents for determining an Internet user’s intention to use anonymity software, and that Internet patience has a positive moderating effect on the intention to use anonymity software, as well as on its perceived usefulness.
Generative Model-Based Loss to the Rescue: A Method to Overcome Annotation Errors for Depth-Based Hand Pose Estimation
We propose to use a model-based generative loss for training hand pose
estimators on depth images based on a volumetric hand model. This additional
loss allows training of a hand pose estimator that accurately infers the entire
set of 21 hand keypoints while only using supervision for 6 easy-to-annotate
keypoints (fingertips and wrist). We show that our partially-supervised method
achieves results that are comparable to those of fully-supervised methods which
enforce articulation consistency. Moreover, for the first time we demonstrate
that such an approach can be used to train on datasets that have erroneous
annotations, i.e. "ground truth" with notable measurement errors, while
obtaining predictions that explain the depth images better than the given
"ground truth".
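The partial supervision described above (6 annotated keypoints out of 21) can be illustrated with a loss that only looks at the annotated subset. This is a minimal sketch, not the paper's method: it does not implement the model-based generative loss, and the keypoint indices in `SUPERVISED` (wrist at 0, fingertips at 4, 8, 12, 16, 20) are a hypothetical ordering chosen for illustration.

```python
# Hypothetical indices of the 6 supervised keypoints (wrist + 5 fingertips)
# within a 21-keypoint hand layout; the real ordering depends on the dataset.
SUPERVISED = (0, 4, 8, 12, 16, 20)


def partial_keypoint_loss(pred, target, supervised=SUPERVISED):
    """Mean squared Euclidean distance over the supervised keypoints only.

    pred and target are sequences of 21 (x, y, z) tuples. Keypoints outside
    `supervised` do not contribute, so their annotations are never used.
    """
    total = 0.0
    for i in supervised:
        total += sum((p - t) ** 2 for p, t in zip(pred[i], target[i]))
    return total / len(supervised)


pred = [(0.0, 0.0, 0.0)] * 21
target = [(1.0, 0.0, 0.0)] * 21
loss = partial_keypoint_loss(pred, target)
```

In the approach the abstract describes, the remaining 15 keypoints would be constrained by the additional generative loss rather than by annotations.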
XNect: Real-time Multi-Person 3D Motion Capture with a Single RGB Camera
We present a real-time approach for multi-person 3D motion capture at over 30
fps using a single RGB camera. It operates successfully in generic scenes which
may contain occlusions by objects and by other people. Our method operates in
subsequent stages. The first stage is a convolutional neural network (CNN) that
estimates 2D and 3D pose features along with identity assignments for all
visible joints of all individuals. We contribute a new architecture for this
CNN, called SelecSLS Net, that uses novel selective long and short range skip
connections to improve the information flow allowing for a drastically faster
network without compromising accuracy. In the second stage, a fully connected
neural network turns the possibly partial (on account of occlusion) 2D pose and
3D pose features for each subject into a complete 3D pose estimate per
individual. The third stage applies space-time skeletal model fitting to the
predicted 2D and 3D pose per subject to further reconcile the 2D and 3D pose,
and enforce temporal coherence. Our method returns the full skeletal pose in
joint angles for each subject. This is a further key distinction from previous
work that does not produce joint angle results of a coherent skeleton in real
time for multi-person scenes. The proposed system runs on consumer hardware at
a previously unseen speed of more than 30 fps given 512x320 images as input
while achieving state-of-the-art accuracy, which we will demonstrate on a range
of challenging real-world scenes.
Comment: To appear in ACM Transactions on Graphics (SIGGRAPH) 202
A thalamocortical pathway for fast rerouting of tactile information to occipital cortex in congenital blindness
In congenitally blind individuals, the occipital cortex responds to various nonvisual inputs. Some animal studies raise the possibility that a subcortical pathway allows fast re-routing of tactile information to the occipital cortex, but this has not been shown in humans. Here we show using magnetoencephalography (MEG) that tactile stimulation produces occipital cortex activations, starting as early as 35 ms in congenitally blind individuals, but not in blindfolded sighted controls. Given our measured thalamic response latencies of 20 ms and a mean estimated lateral geniculate nucleus to primary visual cortex transfer time of 15 ms, we claim that this early occipital response is mediated by a direct thalamo-cortical pathway. We also observed stronger directed connectivity in the alpha band range from posterior thalamus to occipital cortex in congenitally blind participants. Our results strongly suggest the contribution of a fast thalamo-cortical pathway in the cross-modal activation of the occipital cortex in congenitally blind humans.
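The timing argument above rests on simple arithmetic: the thalamic latency plus the thalamus-to-cortex transfer time should match the earliest occipital response. A small sketch of that check, using only the three figures given in the abstract (the function and constant names are illustrative):

```python
THALAMIC_LATENCY_MS = 20   # measured thalamic response latency
TRANSFER_TIME_MS = 15      # mean estimated LGN-to-primary-visual-cortex transfer time
OBSERVED_ONSET_MS = 35     # earliest occipital activation in blind participants


def predicted_occipital_onset(thalamic_ms, transfer_ms):
    """Earliest occipital onset expected from a direct thalamo-cortical route."""
    return thalamic_ms + transfer_ms


predicted = predicted_occipital_onset(THALAMIC_LATENCY_MS, TRANSFER_TIME_MS)
consistent = predicted == OBSERVED_ONSET_MS
```

The predicted onset equals the observed 35 ms, which is the basis for attributing the early response to a direct pathway.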