
    Hand Keypoint Detection in Single Images using Multiview Bootstrapping

    We present an approach that uses a multi-camera system to train fine-grained detectors for keypoints that are prone to occlusion, such as the joints of a hand. We call this procedure multiview bootstrapping: first, an initial keypoint detector is used to produce noisy labels in multiple views of the hand. The noisy detections are then triangulated in 3D using multiview geometry or marked as outliers. Finally, the reprojected triangulations are used as new labeled training data to improve the detector. We repeat this process, generating more labeled data in each iteration. We derive an analytical result relating the minimum number of views to the target true and false positive rates for a given detector. The method is used to train a hand keypoint detector for single images. The resulting keypoint detector runs in real time on RGB images and has accuracy comparable to methods that use depth sensors. The single-view detector, triangulated over multiple views, enables 3D markerless hand motion capture with complex object interactions. Comment: CVPR 2017.
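    The bootstrapping loop reduces to a short routine. Below is a minimal NumPy sketch of one iteration for a single keypoint, assuming calibrated 3x4 projection matrices; the fixed reprojection-error gate (`inlier_px`) is an illustrative assumption standing in for the paper's more careful outlier handling.

```python
import numpy as np

def triangulate_dlt(proj_mats, points_2d):
    """Linear (DLT) triangulation of one keypoint from several views."""
    rows = []
    for P, (x, y) in zip(proj_mats, points_2d):
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    _, _, vt = np.linalg.svd(np.asarray(rows))
    X = vt[-1]
    return X[:3] / X[3]  # dehomogenize

def reproject(P, X):
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def bootstrap_iteration(proj_mats, detections, inlier_px=10.0):
    """One iteration: triangulate noisy 2D detections, then keep the
    reprojections in views with small error as new training labels."""
    X = triangulate_dlt(proj_mats, detections)
    new_labels = []
    for view, (P, d) in enumerate(zip(proj_mats, detections)):
        proj = reproject(P, X)
        if np.linalg.norm(proj - np.asarray(d)) < inlier_px:
            new_labels.append((view, proj))  # inlier: reprojection becomes a label
        # high-error views are treated as outliers and dropped
    return X, new_labels
```

    Each accepted reprojection becomes a new training label, so across iterations the detector sees progressively more of the hard, occluded hand configurations.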

    DART: Distribution Aware Retinal Transform for Event-based Cameras

    We introduce a generic visual descriptor, termed the distribution aware retinal transform (DART), that encodes structural context using log-polar grids for event cameras. The DART descriptor is applied to four different problems, namely object classification, tracking, detection and feature matching: (1) The DART features are directly employed as local descriptors in a bag-of-features classification framework, with testing carried out on four standard event-based object datasets (N-MNIST, MNIST-DVS, CIFAR10-DVS, NCaltech-101). (2) Extending the classification system, tracking is demonstrated using two key novelties: (i) to overcome the low-sample problem in the one-shot learning of a binary classifier, statistical bootstrapping is leveraged with online learning; (ii) to achieve tracker robustness, the scale and rotation equivariance of the DART descriptors is exploited in the one-shot learning. (3) To solve the long-term object tracking problem, an object detector is designed using the principle of cluster majority voting. The detection scheme is then combined with the tracker to achieve a high intersection-over-union score with augmented ground-truth annotations on the publicly available event camera dataset. (4) Finally, the event context encoded by DART greatly simplifies the feature correspondence problem, especially for spatio-temporal slices far apart in time, which has not been explicitly tackled in the event-based vision domain. Comment: 12 pages; revision submitted to TPAMI in Nov 2017.
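    To make the log-polar encoding concrete, here is a minimal NumPy sketch that histograms the positions of neighboring events on a log-polar grid around a reference event. The ring and wedge counts and the radii are illustrative assumptions, and the full descriptor models the event distribution rather than raw counts.

```python
import numpy as np

def log_polar_descriptor(events_xy, center, n_rings=8, n_wedges=16,
                         r_min=1.0, r_max=32.0):
    """Histogram neighboring event coordinates on a log-polar grid
    centered at `center`; returns a flattened, L1-normalized descriptor."""
    d = np.asarray(events_xy, float) - np.asarray(center, float)
    r = np.hypot(d[:, 0], d[:, 1])
    theta = np.arctan2(d[:, 1], d[:, 0])          # angle in [-pi, pi]
    keep = (r >= r_min) & (r < r_max)             # ignore the center and far field
    r, theta = r[keep], theta[keep]
    # equal-width bins in log(r) between log(r_min) and log(r_max)
    ring = (np.log(r / r_min) / np.log(r_max / r_min) * n_rings).astype(int)
    wedge = ((theta + np.pi) / (2 * np.pi) * n_wedges).astype(int) % n_wedges
    hist = np.zeros((n_rings, n_wedges))
    np.add.at(hist, (np.clip(ring, 0, n_rings - 1), wedge), 1.0)
    return hist.ravel() / max(hist.sum(), 1.0)
```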

    RGB-D-based Action Recognition Datasets: A Survey

    Human action recognition from RGB-D (Red, Green, Blue and Depth) data has attracted increasing attention since the first work reported in 2010. Over this period, many benchmark datasets have been created to facilitate the development and evaluation of new algorithms. This raises the question of which dataset to select and how to use it to provide a fair and objective comparative evaluation against state-of-the-art methods. To address this issue, this paper provides a comprehensive review of the most commonly used action recognition related RGB-D video datasets, including 27 single-view datasets, 10 multi-view datasets, and 7 multi-person datasets. The detailed information and analysis of these datasets are a useful resource for guiding an insightful selection of datasets for future research. In addition, the issues with current algorithm evaluation vis-à-vis the limitations of the available datasets and evaluation protocols are highlighted, resulting in a number of recommendations for the collection of new datasets and the use of evaluation protocols.
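    Many of the surveyed datasets are evaluated under a cross-subject protocol, in which no subject appears in both the training and the test set, so that models cannot exploit subject identity. A minimal sketch of that convention, with illustrative subject IDs and clip names:

```python
def cross_subject_split(samples, train_subjects):
    """Split (subject_id, clip) pairs so that no subject appears in both
    the training and the test set, avoiding identity leakage."""
    train = [s for s in samples if s[0] in train_subjects]
    test = [s for s in samples if s[0] not in train_subjects]
    return train, test

# Example: subjects 1 and 3 train; everyone else tests.
samples = [(1, "wave_01"), (2, "wave_02"), (3, "sit_01"), (4, "sit_02")]
train, test = cross_subject_split(samples, train_subjects={1, 3})
```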

    'Elbows Out' - Predictive Tracking of Partially Occluded Pose for Robot-Assisted Dressing

    © 2016 IEEE. Robots that can assist in activities of daily living, such as dressing, may support older adults, addressing the needs of an aging population in the face of a growing shortage of care professionals. Using depth cameras during robot-assisted dressing can lead to occlusions and loss of user tracking, which may result in unsafe trajectory planning or prevent the planning task from proceeding altogether. For the dressing task of putting on a jacket, which is addressed in this letter, tracking of the arm is lost when the user's hand enters the jacket, which may lead to unsafe situations for the user and a poor interaction experience. Using occlusion-free motion tracking data gathered from a human-human interaction study on an assisted dressing task, recurrent neural network models were built to predict the elbow position of a single arm from other features of the user's pose. Regression trees were used to explore the best features for predicting the elbow position, indicating the hips and shoulder as possible predictors. Engineered features were also created based on observations of real dressing scenarios, and their effectiveness was explored. A comparison between position-based and orientation-based datasets was also included in this study. A 12-fold cross-validation was performed for each feature set and repeated 20 times to improve statistical power. Using position-based data, the elbow position could be predicted with a 4.1 cm error, and adding engineered features reduced the error to 2.4 cm. Adding orientation information did not improve the accuracy, and aggregating univariate response models failed to make significant improvements. The model was evaluated on Kinect data for a robot dressing task and, although not without issues, demonstrates potential for this application. Although demonstrated here for jacket dressing, the technique could be applied to a number of different situations involving occluded tracking.
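    The prediction setup lends itself to a compact sketch. Below is a minimal PyTorch sketch of regressing the 3-D elbow position from the remaining pose features over time; the GRU architecture, the 9-feature layout (hips plus shoulders), and all sizes are assumptions for illustration, not the authors' exact model.

```python
import torch
import torch.nn as nn

class ElbowPredictor(nn.Module):
    def __init__(self, n_features=9, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 3)   # (x, y, z) elbow position

    def forward(self, x):                  # x: (batch, time, n_features)
        out, _ = self.rnn(x)
        return self.head(out)              # per-timestep elbow estimate

# Illustrative input: hips + shoulders as 3 joints * 3 coordinates = 9 features.
model = ElbowPredictor()
seq = torch.randn(1, 50, 9)               # one 50-frame tracked sequence
pred = model(seq)                          # (1, 50, 3) predicted elbow positions
```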

    Teegi: Tangible EEG Interface

    We introduce Teegi, a Tangible ElectroEncephaloGraphy (EEG) Interface that enables novice users to learn about something as complex as brain signals in an easy, engaging and informative way. To this end, we have designed a new system based on a unique combination of spatial augmented reality, tangible interaction and real-time neurotechnologies. With Teegi, users can visualize and analyze their own brain activity in real time, on a tangible character that can be easily manipulated and interacted with. An exploration study has shown that interacting with Teegi seems to be easy, motivating, reliable and informative. Overall, this suggests that Teegi is a promising and relevant training and mediation tool for the general public. Comment: to appear in UIST (ACM User Interface Software and Technology Symposium), Oct 2014, Honolulu, United States.
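    As a rough illustration of the kind of real-time processing behind such a display, the sketch below estimates the power of one EEG frequency band from a short window and squashes it into a display intensity. The band edges, sampling rate, and mapping are illustrative assumptions, not Teegi's actual pipeline.

```python
import numpy as np

def band_power(window, fs=250.0, band=(8.0, 12.0)):
    """Mean power of one EEG channel in the given band (here, alpha)."""
    spectrum = np.abs(np.fft.rfft(window * np.hanning(len(window)))) ** 2
    freqs = np.fft.rfftfreq(len(window), d=1.0 / fs)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return spectrum[mask].mean()

# Map power to a 0..1 intensity for a visualization projected on the figurine.
power = band_power(np.random.randn(500))   # 2 s of fake data at 250 Hz
intensity = power / (power + 1.0)          # squashing, purely illustrative
```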

    Possibilities of man-machine interaction through the perception of human gestures

    As man-machine interaction grows, there is an increasing need for friendly interfaces. Human-machine oral communication as a means of natural-language interaction is becoming quite common, and in some applications the interpretation of human gestures can complement it. This article describes a gesture interpretation system based on computer vision that detects and tracks a human operator and, from the operator's movements, interprets a specific set of gestural commands in real time.
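    A minimal sketch of the final interpretation step, mapping tracked keypoints to a small command set; the joint names and the rules themselves are illustrative assumptions rather than the system's actual gesture vocabulary:

```python
def interpret_gesture(pose):
    """`pose` maps joint names to (x, y) image coordinates, y growing downward."""
    head_y = pose["head"][1]
    rw, lw = pose["right_wrist"], pose["left_wrist"]
    if rw[1] < head_y and lw[1] < head_y:
        return "STOP"            # both hands raised above the head
    if rw[1] < head_y:
        return "TURN_RIGHT"      # only the right hand raised
    if lw[1] < head_y:
        return "TURN_LEFT"
    return "NONE"

pose = {"head": (320, 80), "right_wrist": (420, 60), "left_wrist": (220, 300)}
print(interpret_gesture(pose))   # -> "TURN_RIGHT"
```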

    Anthropomorphic Robot Design and User Interaction Associated with Motion

    Though in its original concept the robot was conceived to have a somewhat human-like shape, most robots now in use serve specific industrial purposes and do not closely resemble humans. Nevertheless, robots that resemble the human form in some way have continued to be introduced; they are called anthropomorphic robots. Because the user interface to all robots is now highly mediated, the form of the user interface is not necessarily connected to the robot's form, human or otherwise. Consequently, the unique way the design of anthropomorphic robots affects user interaction is through their general appearance and the way they move. These robots' human-like appearance acts as a kind of generalized predictor that gives their operators, and those with whom they may directly work, the expectation that they will behave to some extent like a human. This expectation is especially prominent for interactions with social robots, which are built to enhance it. Interaction with social robots is often mainly cognitive, because they are not necessarily kinematically intricate enough for complex physical interaction; their body movement may be limited, for example, to simple wheeled locomotion. An anthropomorphic robot with human form, however, can be kinematically complex and designed, for example, to reproduce the details of human limb, torso, and head movement. Because of the mediated nature of robot control, there remains in general no necessary connection between the specific form of the user interface and the anthropomorphic form of the robot, but anthropomorphic kinematics and dynamics imply that the impact of the design shows up in the way the robot moves. The central finding of this report is that the control of this motion is a basic design element through which the anthropomorphic form can affect user interaction. In particular, designers of anthropomorphic robots can take advantage of the inherent human-like movement to 1) improve the user's direct manual control over robot limb and body positions, 2) improve users' ability to detect anomalous robot behavior that could signal malfunction, and 3) enable users to better infer the intent of robot movement. These three benefits of anthropomorphic design are inherent implications of the anthropomorphic form, but they need to be recognized by designers as part of anthropomorphic design and explicitly enhanced to maximize their beneficial impact; examples of such enhancements are provided in this report. If implemented, these benefits can help reduce the risk of Inadequate Design of Human and Automation Robotic Integration (HARI) associated with the HARI-01 gap by providing efficient and dexterous operator control over robots and by improving operators' ability to detect malfunctions and understand the intention of robot movement.
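    As an illustration of the second benefit, the sketch below flags limb motion whose joint velocity falls outside an assumed human-like range, which is one simple way software could surface anomalous behavior to an operator; the joints and the velocity bounds are illustrative assumptions.

```python
import numpy as np

HUMAN_LIKE_LIMITS = {"elbow": 6.0, "shoulder": 4.0}   # rad/s, assumed bounds

def flag_anomalies(joint, angles, dt):
    """Return indices of timesteps whose joint velocity exceeds the
    human-like bound assumed for that joint."""
    vel = np.abs(np.diff(np.asarray(angles)) / dt)
    return np.nonzero(vel > HUMAN_LIKE_LIMITS[joint])[0]

angles = [0.0, 0.1, 0.2, 1.5, 1.6]                # a sudden jump at step 2 -> 3
print(flag_anomalies("elbow", angles, dt=0.05))   # -> [2]
```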