Hand Keypoint Detection in Single Images using Multiview Bootstrapping
We present an approach that uses a multi-camera system to train fine-grained
detectors for keypoints that are prone to occlusion, such as the joints of a
hand. We call this procedure multiview bootstrapping: first, an initial
keypoint detector is used to produce noisy labels in multiple views of the
hand. The noisy detections are then triangulated in 3D using multiview geometry
or marked as outliers. Finally, the reprojected triangulations are used as new
labeled training data to improve the detector. We repeat this process,
generating more labeled data in each iteration. We derive a result analytically
relating the minimum number of views to achieve target true and false positive
rates for a given detector. The method is used to train a hand keypoint
detector for single images. The resulting keypoint detector runs in real time
on RGB images and has accuracy comparable to methods that use depth sensors.
The single-view detector, triangulated over multiple views, enables 3D
markerless hand motion capture with complex object interactions.
Comment: CVPR 201
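The triangulate-and-filter step at the heart of multiview bootstrapping can be
sketched as follows. This is a minimal illustration using direct linear
transform (DLT) triangulation and a reprojection-error outlier test, not the
authors' implementation; the toy camera matrices and observations below are
hypothetical.

```python
import numpy as np

def triangulate(points_2d, proj_mats):
    """DLT triangulation: recover one 3-D point from its (possibly noisy)
    2-D detections in several calibrated views."""
    rows = []
    for (x, y), P in zip(points_2d, proj_mats):
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    # The homogeneous solution is the right singular vector with the
    # smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    X = vt[-1]
    return X[:3] / X[3]  # dehomogenize

def reprojection_errors(X, points_2d, proj_mats):
    """Error of the triangulated point in each view; detections with large
    error would be marked as outliers and dropped from retraining."""
    errs = []
    for (x, y), P in zip(points_2d, proj_mats):
        p = P @ np.append(X, 1.0)
        errs.append(float(np.hypot(p[0] / p[2] - x, p[1] / p[2] - y)))
    return errs

# Two toy cameras: identity intrinsics, second camera shifted along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), [[-1.0], [0.0], [0.0]]])
obs = [(0.04, -0.02), (-0.16, -0.02)]  # projections of (0.2, -0.1, 5.0)
X = triangulate(obs, [P1, P2])
```

In the bootstrapping loop, inlier triangulations would then be reprojected
into all views to serve as new labels for the next detector iteration.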
DART: Distribution Aware Retinal Transform for Event-based Cameras
We introduce a generic visual descriptor, termed the distribution aware
retinal transform (DART), that encodes the structural context using log-polar
grids for event cameras. The DART descriptor is applied to four different
problems, namely object classification, tracking, detection and feature
matching: (1) The DART features are directly employed as local descriptors in a
bag-of-features classification framework and testing is carried out on four
standard event-based object datasets (N-MNIST, MNIST-DVS, CIFAR10-DVS,
NCaltech-101). (2) Extending the classification system, tracking is
demonstrated using two key novelties: (i) to overcome the low-sample problem
in the one-shot learning of a binary classifier, statistical bootstrapping is
leveraged with online learning; (ii) to achieve tracker robustness, the scale
and rotation equivariance property of the DART descriptors is exploited for
one-shot learning. (3) To solve the long-term object tracking problem, an
object detector is designed using the principle of cluster majority voting. The
detection scheme is then combined with the tracker to result in a high
intersection-over-union score with augmented ground truth annotations on the
publicly available event camera dataset. (4) Finally, the event context encoded
by DART greatly simplifies the feature correspondence problem, especially for
spatio-temporal slices far apart in time, which has not been explicitly tackled
in the event-based vision domain.
Comment: 12 pages, revision submitted to TPAMI in Nov 201
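A log-polar grid of the kind DART builds on can be sketched as below. This is
a simplified, hypothetical illustration that only bins event offsets into a
(rho, theta) histogram; the actual descriptor is distribution-aware and more
elaborate, and all bin counts and radii here are assumptions.

```python
import math

def logpolar_bins(dx, dy, n_rho=4, n_theta=8, r_min=1.0, r_max=32.0):
    """Map an event offset (dx, dy) from the descriptor centre to a
    (rho, theta) cell of a log-polar grid."""
    r = min(max(math.hypot(dx, dy), r_min), r_max)
    theta = math.atan2(dy, dx)  # in [-pi, pi]
    t = int((theta + math.pi) / (2 * math.pi) * n_theta) % n_theta
    # Radial bins grow logarithmically, giving fine resolution near the
    # centre and coarse resolution in the periphery.
    rho = int(math.log(r / r_min) / math.log(r_max / r_min) * n_rho)
    return min(rho, n_rho - 1), t

def dart_like_descriptor(events, centre, n_rho=4, n_theta=8):
    """Accumulate events into a normalised log-polar histogram."""
    hist = [[0.0] * n_theta for _ in range(n_rho)]
    cx, cy = centre
    for (x, y) in events:
        rho, t = logpolar_bins(x - cx, y - cy, n_rho, n_theta)
        hist[rho][t] += 1.0
    total = sum(sum(row) for row in hist) or 1.0
    return [v / total for row in hist for v in row]
```

Rotating the grid cells shifts the theta index cyclically, which is the
property the tracker exploits for rotation equivariance.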
RGB-D-based Action Recognition Datasets: A Survey
Human action recognition from RGB-D (Red, Green, Blue and Depth) data has
attracted increasing attention since the first work reported in 2010. Over this
period, many benchmark datasets have been created to facilitate the development
and evaluation of new algorithms. This raises the question of which dataset to
select and how to use it to provide a fair and objective comparative
evaluation against state-of-the-art methods. To address this issue, this paper
provides a comprehensive review of the most commonly used action recognition
related RGB-D video datasets, including 27 single-view datasets, 10 multi-view
datasets, and 7 multi-person datasets. The detailed information and analysis of
these datasets is a useful resource in guiding insightful selection of datasets
for future research. In addition, the issues with current algorithm evaluation
vis-à-vis limitations of the available datasets and evaluation protocols
are also highlighted, resulting in a number of recommendations for the
collection of new datasets and the use of evaluation protocols.
'Elbows Out' - Predictive Tracking of Partially Occluded Pose for Robot-Assisted Dressing
© 2016 IEEE. Robots that can assist in the activities of daily living, such as
dressing, may support older adults, addressing the needs of an aging population
in the face of a growing shortage of care professionals. Using depth cameras
during robot-assisted dressing can lead to occlusions and loss of user
tracking, which may result in unsafe trajectory planning or prevent the
planning task from proceeding altogether. For the dressing task of putting on
a jacket, which is addressed in this letter, tracking of the arm is lost when
the user's hand enters the jacket, which may lead to unsafe situations for the
user and a poor interaction experience. Using occlusion-free motion tracking
data gathered from a human-human interaction study on an assisted dressing
task, recurrent neural network models were built to predict the elbow position
of a single arm from other features of the user's pose. The best features for
predicting the elbow position were explored using regression trees, which
indicated the hips and shoulder as possible predictors. Engineered features
were also created based on observations of real dressing scenarios, and their
effectiveness was explored. A comparison between position-based and
orientation-based datasets was also included in this study. A 12-fold
cross-validation was performed for each feature set and repeated 20 times to
improve statistical power. Using position-based data, the elbow position could
be predicted with a 4.1 cm error, and adding engineered features reduced the
error to 2.4 cm. Adding orientation information to the data did not improve
the accuracy, and aggregating univariate response models failed to make
significant improvements. The model was evaluated on Kinect data for a robot
dressing task and, although not without issues, demonstrates potential for
this application. Although this has been demonstrated for jacket dressing, the
technique could be applied to a number of different situations during occluded
tracking.
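The repeated k-fold evaluation protocol described above can be sketched as
follows. The data here are synthetic stand-ins, and the linear least-squares
regressor is only a placeholder for the recurrent models used in the paper;
all sizes and noise levels are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: 240 frames, 9 pose features (e.g. hip and
# shoulder coordinates), target = 3-D elbow position plus noise.
X = rng.normal(size=(240, 9))
true_W = rng.normal(size=(9, 3))
Y = X @ true_W + rng.normal(scale=0.01, size=(240, 3))

def repeated_kfold_error(X, Y, k=12, repeats=20, rng=rng):
    """Mean Euclidean prediction error over k folds, repeated to improve
    statistical power (12 folds x 20 repeats, as in the protocol above)."""
    errors = []
    n = len(X)
    for _ in range(repeats):
        idx = rng.permutation(n)
        for fold in np.array_split(idx, k):
            test = np.zeros(n, bool)
            test[fold] = True
            # Least-squares fit on the training folds (with a bias term).
            A = np.hstack([X[~test], np.ones((np.sum(~test), 1))])
            W, *_ = np.linalg.lstsq(A, Y[~test], rcond=None)
            At = np.hstack([X[test], np.ones((np.sum(test), 1))])
            pred = At @ W
            # Mean Euclidean distance between predicted and true positions.
            errors.append(np.linalg.norm(pred - Y[test], axis=1).mean())
    return float(np.mean(errors))

err = repeated_kfold_error(X, Y)
```

With real pose data the error would be reported in centimetres, allowing the
4.1 cm vs. 2.4 cm comparison between feature sets quoted above.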
Teegi: Tangible EEG Interface
We introduce Teegi, a Tangible ElectroEncephaloGraphy (EEG) Interface that
enables novice users to get to know more about something as complex as brain
signals, in an easy, engaging and informative way. To this end, we have
designed a new system based on a unique combination of spatial augmented
reality, tangible interaction and real-time neurotechnologies. With Teegi, a
user can visualize and analyze his or her own brain activity in real time, on
a tangible character that can be easily manipulated, and with which it is
possible to interact. An exploration study has shown that interacting with
Teegi seems to be easy, motivating, reliable and informative. Overall, this
suggests that Teegi is a promising and relevant training and mediation tool
for the general public.
Comment: to appear in UIST-ACM User Interface Software and Technology
Symposium, Oct 2014, Honolulu, United States
Possibilities of man-machine interaction through the perception of human gestures
As machines increasingly interact with people, the need for friendlier
interfaces grows. Human-machine oral communication as a means of natural
language interaction is becoming quite common. The interpretation of human
gestures can, in some applications, complement such communication. This
article describes a gesture-interpretation procedure based on computer vision:
the system detects and tracks a human operator and, from the operator's
movements, interprets a specific set of gestural commands in real time.
Anthropomorphic Robot Design and User Interaction Associated with Motion
Though in its original concept a robot was conceived to have some human-like shape, most robots now in use have specific industrial purposes and do not closely resemble humans. Nevertheless, robots that resemble human form in some way have continued to be introduced. They are called anthropomorphic robots. The fact that the user interface to all robots is now highly mediated means that the form of the user interface is not necessarily connected to the robot's form, human or otherwise. Consequently, the unique way the design of anthropomorphic robots affects their user interaction is through their general appearance and the way they move. These robots' human-like appearance acts as a kind of generalized predictor that gives their operators, and those with whom they may directly work, the expectation that they will behave to some extent like a human. This expectation is especially prominent for interactions with social robots, which are built to enhance it. Interaction with them may often be mainly cognitive because they are not necessarily kinematically intricate enough for complex physical interaction. Their body movement, for example, may be limited to simple wheeled locomotion. An anthropomorphic robot with human form, however, can be kinematically complex and designed, for example, to reproduce the details of human limb, torso, and head movement. Because of the mediated nature of robot control, there remains in general no necessary connection between the specific form of user interface and the anthropomorphic form of the robot. But their anthropomorphic kinematics and dynamics imply that the impact of their design shows up in the way the robot moves. The central finding of this report is that the control of this motion is a basic design element through which the anthropomorphic form can affect user interaction.
In particular, designers of anthropomorphic robots can take advantage of the inherent human-like movement to 1) improve the user's direct manual control over robot limbs and body positions, 2) improve users' ability to detect anomalous robot behavior which could signal malfunction, and 3) enable users to better infer the intent of robot movement. These three benefits of anthropomorphic design are inherent implications of the anthropomorphic form, but they need to be recognized by designers as part of anthropomorphic design and explicitly enhanced to maximize their beneficial impact. Examples of such enhancements are provided in this report. If implemented, these benefits of anthropomorphic design can help reduce the risk of Inadequate Design of Human and Automation Robotic Integration (HARI) associated with the HARI-01 gap by providing efficient and dexterous operator control over robots and by improving operator ability to detect malfunctions and understand the intention of robot movement.