
    TossingBot: Learning to Throw Arbitrary Objects with Residual Physics

    We investigate whether a robot arm can learn to pick and throw arbitrary objects into selected boxes quickly and accurately. Throwing has the potential to increase the physical reachability and picking speed of a robot arm. However, precisely throwing arbitrary objects in unstructured settings presents many challenges: from acquiring reliable pre-throw conditions (e.g. initial pose of object in manipulator) to handling varying object-centric properties (e.g. mass distribution, friction, shape) and dynamics (e.g. aerodynamics). In this work, we propose an end-to-end formulation that jointly learns to infer control parameters for grasping and throwing motion primitives from visual observations (images of arbitrary objects in a bin) through trial and error. Within this formulation, we investigate the synergies between grasping and throwing (i.e., learning grasps that enable more accurate throws) and between simulation and deep learning (i.e., using deep networks to predict residuals on top of control parameters predicted by a physics simulator). The resulting system, TossingBot, is able to grasp and throw arbitrary objects into boxes located outside its maximum reach range at 500+ mean picks per hour (600+ grasps per hour with 85% throwing accuracy); and generalizes to new objects and target locations. Videos are available at https://tossingbot.cs.princeton.edu. Comment: Summary Video: https://youtu.be/f5Zn2Up2RjQ Project webpage: https://tossingbot.cs.princeton.ed

    Learning Synergies between Pushing and Grasping with Self-supervised Deep Reinforcement Learning

    Skilled robotic manipulation benefits from complex synergies between non-prehensile (e.g. pushing) and prehensile (e.g. grasping) actions: pushing can help rearrange cluttered objects to make space for arms and fingers; likewise, grasping can help displace objects to make pushing movements more precise and collision-free. In this work, we demonstrate that it is possible to discover and learn these synergies from scratch through model-free deep reinforcement learning. Our method involves training two fully convolutional networks that map from visual observations to actions: one infers the utility of pushes for a dense pixel-wise sampling of end effector orientations and locations, while the other does the same for grasping. Both networks are trained jointly in a Q-learning framework and are entirely self-supervised by trial and error, where rewards are provided from successful grasps. In this way, our policy learns pushing motions that enable future grasps, while learning grasps that can leverage past pushes. During picking experiments in both simulation and real-world scenarios, we find that our system quickly learns complex behaviors amid challenging cases of clutter, and achieves better grasping success rates and picking efficiencies than baseline alternatives after only a few hours of training. We further demonstrate that our method is capable of generalizing to novel objects. Qualitative results (videos), code, pre-trained models, and simulation environments are available at http://vpg.cs.princeton.edu. Comment: To appear at the International Conference On Intelligent Robots and Systems (IROS) 2018. Project webpage: http://vpg.cs.princeton.edu Summary video: https://youtu.be/-OkyX7Zlhi

    Im2Pano3D: Extrapolating 360 Structure and Semantics Beyond the Field of View

    We present Im2Pano3D, a convolutional neural network that generates a dense prediction of 3D structure and a probability distribution of semantic labels for a full 360 panoramic view of an indoor scene when given only a partial observation (<= 50%) in the form of an RGB-D image. To make this possible, Im2Pano3D leverages strong contextual priors learned from large-scale synthetic and real-world indoor scenes. To ease the prediction of 3D structure, we propose to parameterize 3D surfaces with their plane equations and train the model to predict these parameters directly. To provide meaningful training supervision, we use multiple loss functions that consider both pixel level accuracy and global context consistency. Experiments demonstrate that Im2Pano3D is able to predict the semantics and 3D structure of the unobserved scene with more than 56% pixel accuracy and less than 0.52m average distance error, which is significantly better than alternative approaches. Comment: Video summary: https://youtu.be/Au3GmktK-S

    Convergence rate across the Nepal Himalaya and interseismic coupling on the Main Himalayan Thrust: Implications for seismic hazard

    We document geodetic strain across the Nepal Himalaya using GPS time series from 30 stations in Nepal and southern Tibet, in addition to previously published campaign GPS points and leveling data, and determine the pattern of interseismic coupling on the Main Himalayan Thrust fault (MHT). The noise on the daily GPS positions is modeled as a combination of white and colored noise, in order to infer secular velocities at the stations with consistent uncertainties. We then locate the pole of rotation of the Indian plate in the ITRF 2005 reference frame at longitude = −1.34° ± 3.31°, latitude = 51.4° ± 0.3°, with an angular velocity of Ω = 0.5029 ± 0.0072°/Myr. The pattern of coupling on the MHT is computed on a fault dipping 10° to the north and whose strike roughly follows the arcuate shape of the Himalaya. The model indicates that the MHT is locked from the surface to a distance of approximately 100 km down dip, corresponding to a depth of 15 to 20 km. In map view, the transition zone between the locked portion of the MHT and the portion which is creeping at the long term slip rate seems to be at most a few tens of kilometers wide and coincides with the belt of midcrustal microseismicity underneath the Himalaya. According to a previous study based on thermokinematic modeling of thermochronological and thermobarometric data, this transition seems to happen in a zone where the temperature reaches 350°C. The convergence between India and South Tibet proceeds at a rate of 17.8 ± 0.5 mm/yr in central and eastern Nepal and 20.5 ± 1 mm/yr in western Nepal. The moment deficit due to locking of the MHT in the interseismic period accrues at a rate of (6.6 ± 0.4) × 10^19 Nm/yr on the MHT underneath Nepal. For comparison, the moment released by the seismicity over the past 500 years, including 14 M_W ≥ 7 earthquakes with moment magnitudes up to 8.5, amounts to only 0.9 × 10^19 Nm/yr, indicating a large deficit of seismic slip over that period or very infrequent large slow slip events. However, no large slow slip event has been observed over the 20 years covered by geodetic measurements in the Nepal Himalaya. We discuss the magnitude and return period of M > 8 earthquakes required to balance the long term slip budget on the MHT.
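    The moment budget quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes the standard moment magnitude relation M_W = (2/3)(log10 M0 − 9.1), with M0 in N·m, and the helper names (`moment_magnitude`, `seismic_moment`, `recurrence_years`) are hypothetical. It converts the unreleased moment accumulated over 500 years into an equivalent single-event magnitude, and estimates how often an event of a given magnitude would have to recur to match a given accrual rate, under the simplifying assumption that all moment is released by earthquakes of that one size.

```python
import math

# Rates quoted in the abstract, in N·m per year (central values only).
ACCRUAL_RATE = 6.6e19   # moment deficit accrued by interseismic locking
RELEASE_RATE = 0.9e19   # moment released by seismicity over the past 500 years
PERIOD_YEARS = 500


def moment_magnitude(m0: float) -> float:
    """Moment magnitude from seismic moment M0 in N·m (standard relation, assumed here)."""
    return (2.0 / 3.0) * (math.log10(m0) - 9.1)


def seismic_moment(mw: float) -> float:
    """Inverse of moment_magnitude: seismic moment in N·m for a given M_W."""
    return 10.0 ** (1.5 * mw + 9.1)


def recurrence_years(mw: float, rate: float) -> float:
    """Years between events of magnitude mw needed to release `rate` N·m per year.

    Simplifying assumption: every event has the same magnitude.
    """
    return seismic_moment(mw) / rate


# Moment accrued but not released seismically over the 500-year window.
unreleased = (ACCRUAL_RATE - RELEASE_RATE) * PERIOD_YEARS
equivalent_mw = moment_magnitude(unreleased)

if __name__ == "__main__":
    print(f"Unreleased moment over {PERIOD_YEARS} yr: {unreleased:.2e} N·m")
    print(f"Equivalent single-event magnitude: {equivalent_mw:.2f}")
    print(f"Recurrence of M_W 8.5 events at full accrual rate: "
          f"{recurrence_years(8.5, ACCRUAL_RATE):.0f} yr")
```

    The calculation ignores the quoted uncertainties and any aseismic release, so it gives only an order-of-magnitude feel for the budget described in the abstract.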

    Matterport3D: Learning from RGB-D Data in Indoor Environments

    Access to large, diverse RGB-D datasets is critical for training RGB-D scene understanding algorithms. However, existing datasets still cover only a limited number of views or a restricted scale of spaces. In this paper, we introduce Matterport3D, a large-scale RGB-D dataset containing 10,800 panoramic views from 194,400 RGB-D images of 90 building-scale scenes. Annotations are provided with surface reconstructions, camera poses, and 2D and 3D semantic segmentations. The precise global alignment and comprehensive, diverse panoramic set of views over entire buildings enable a variety of supervised and self-supervised computer vision tasks, including keypoint matching, view overlap prediction, normal prediction from color, semantic segmentation, and region classification.

    P2P Email Encryption by An Identity-Based One-Way Group Key Agreement Protocol

    As a result of high-tech companies such as Google, Yahoo, and Microsoft offering free email services, email has become a primary channel of communication. However, email service providers have traditionally offered little in the way of message privacy protection. This has made emails, of which billions are sent around the world on any day, an attractive data source for personal identity information thieves. Google was one of the first companies to provide substantial email privacy protection when they began using the HTTPS always-on option to encrypt messages sent through their email service, Gmail. Unfortunately, Gmail's encryption option does not offer true point-to-point encryption since the encrypted emails are decrypted and stored in plaintext form on Google's servers. This type of approach poses a security vulnerability which is unacceptable to security-minded users such as highly sensitive government agencies and private companies. For these users, true point-to-point encryption is needed. This paper introduces an identity-based one-way group key agreement protocol and describes a point-to-point email encryption scheme based on the protocol. Both the security proofs and the efficiency analysis, with experimental results, of the new scheme are provided.

    Emotion Recognition based on Multimodal Information
