A dataset of 40K naturalistic 6-degree-of-freedom robotic grasp demonstrations
Modern approaches to grasp planning often involve deep learning. However,
there are only a few large datasets of labelled grasping examples on physical
robots, and available datasets involve relatively simple planar grasps with
two-fingered grippers. Here we present: 1) a new human grasp demonstration
method that facilitates rapid collection of naturalistic grasp examples, with
full six-degree-of-freedom gripper positioning; and 2) a dataset of roughly
forty thousand successful grasps on 109 different rigid objects with the
RightHand Robotics three-fingered ReFlex gripper.
Predicting Contextual Sequences via Submodular Function Maximization
Sequence optimization, where the items in a list are ordered to maximize some
reward, has many applications such as web advertisement placement, search, and
control libraries in robotics. Previous work in sequence optimization produces
a static ordering that does not take any features of the item or context of the
problem into account. In this work, we propose a general approach to order the
items within the sequence based on the context (e.g., perceptual information,
environment description, and goals). We take a simple, efficient,
reduction-based approach where the choice and order of the items is established
by repeatedly learning simple classifiers or regressors for each "slot" in the
sequence. Our approach leverages recent work on submodular function
maximization to provide a formal regret reduction from submodular sequence
optimization to simple cost-sensitive prediction. We apply our contextual
sequence prediction algorithm to optimize control libraries and demonstrate
results on two robotics problems: manipulator trajectory prediction and mobile
robot path planning.
Comment: 8 pages
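The slot-wise reduction the abstract describes can be sketched as follows. The greedy per-slot selection mirrors the paper's idea; the scorer functions, context keys, and toy items below are illustrative assumptions, not the paper's actual learned classifiers:

```python
def predict_sequence(context, items, slot_scorers):
    """Greedy slot-wise sequence prediction: for each "slot", a learned
    scorer ranks the remaining items given the context, and the highest
    scoring item is appended. This mirrors the reduction from submodular
    sequence optimization to simple per-slot prediction."""
    chosen, remaining = [], list(items)
    for scorer in slot_scorers:
        best = max(remaining, key=lambda item: scorer(context, item))
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Toy scorers (hypothetical): slot 1 prefers the item matching the goal;
# slot 2 prefers items whose name length matches a budget.
scorers = [
    lambda ctx, it: 1.0 if it == ctx["goal"] else 0.0,
    lambda ctx, it: -abs(len(it) - ctx["budget"]),
]
seq = predict_sequence({"goal": "grasp", "budget": 4}, ["push", "grasp", "lift"], scorers)
print(seq)  # → ['grasp', 'push']
```

Because each slot is filled by an independent, context-conditioned predictor, the resulting ordering adapts to the input rather than being a single static list.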
SGDN: Segmentation-Based Grasp Detection Network For Unsymmetrical Three-Finger Gripper
In this paper, we present the Segmentation-Based Grasp Detection Network (SGDN)
to predict feasible robotic grasps for an unsymmetrical three-finger robotic
gripper using RGB images. The feasible grasps of a target should form a
collection of grasp regions with the same grasp angle and width. In other
words, a simplified planar grasp representation should be pixel-level rather
than region-level, such as the five-dimensional grasp representation. Therefore,
we propose a pixel-level grasp representation, the oriented base-fixed triangle.
It is also better suited to an unsymmetrical three-finger gripper, which cannot
grasp some objects symmetrically; its grasp angle lies in [0, 2π) rather than
the [0, π) of a parallel-plate gripper. To predict the appropriate grasp region
and its corresponding grasp angle and width in the RGB image, SGDN uses
DeepLabv3+ as a feature extractor and a three-channel grasp predictor to
predict the feasible oriented base-fixed triangle grasp representation at each
pixel. On the re-annotated Cornell Grasp Dataset, our model achieves accuracies
of 96.8% and 92.27% on the image-wise and object-wise splits respectively, and
obtains accurate predictions consistent with state-of-the-art methods.
Comment: 9 pages, 8 figures. arXiv admin note: text overlap with
arXiv:1803.02209 by another author
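A minimal sketch of turning a per-pixel, multi-channel grasp prediction into a single grasp. The channel names (quality, angle, width), array shapes, and thresholding rule are assumptions for illustration, not SGDN's exact output heads or decoding:

```python
import numpy as np

def decode_best_grasp(quality, angle, width, threshold=0.5):
    """From three per-pixel maps (grasp quality score, grasp angle in
    [0, 2*pi), gripper width), return the most confident grasp above
    the quality threshold, or None if no pixel qualifies."""
    masked = np.where(quality >= threshold, quality, -np.inf)
    if not np.isfinite(masked).any():
        return None  # no pixel passes the threshold
    row, col = np.unravel_index(np.argmax(masked), quality.shape)
    return {"pixel": (int(row), int(col)),
            "angle": float(angle[row, col]),
            "width": float(width[row, col])}

# Toy 2x2 maps: the top-right pixel has the highest quality.
quality = np.array([[0.2, 0.9], [0.4, 0.6]])
angle = np.array([[0.0, 3.1], [1.0, 2.0]])
width = np.array([[10.0, 25.0], [15.0, 20.0]])
grasp = decode_best_grasp(quality, angle, width)
print(grasp)  # → {'pixel': (0, 1), 'angle': 3.1, 'width': 25.0}
```

The pixel-level formulation is what allows a single forward pass to score every image location instead of evaluating region proposals one by one.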
Test bed experiments for various telerobotic system characteristics and configurations
Dexterous manipulation and grasping in telerobotic systems depends on the integration of high-performance sensors, displays, actuators and controls into systems in which careful consideration has been given to human perception and tolerance. Research underway at the Wisconsin Center for Space Automation and Robotics (WCSAR) has the objective of enhancing the performance of these systems and their components, and quantifying the effects of the many electrical, mechanical, control, and human factors that affect their performance. This will lead to a fundamental understanding of performance issues, which will in turn allow designers to evaluate sensor, actuator, display, and control technologies with respect to generic measures of dexterous performance. As part of this effort, an experimental test bed was developed which has telerobotic components with exceptionally high fidelity in master/slave operation. A Telerobotic Performance Analysis System has also been developed which allows performance to be determined for various system configurations and electro-mechanical characteristics. Both this performance analysis system and the test bed experiments are described.
Machine Vision in the Context of Robotics: A Systematic Literature Review
Machine vision is critical to robotics due to a wide range of applications
which rely on input from visual sensors such as autonomous mobile robots and
smart production systems. To create the smart homes and systems of tomorrow, a
systematic and reproducible overview of current challenges in the research
field would help identify possible further directions. In this work, a
systematic literature review was conducted covering
research from the last 10 years. We screened 172 papers from four databases and
selected 52 relevant papers. While robustness and computation time have
improved greatly, occlusion and lighting variance remain the biggest
problems. From the number of recent publications, we conclude that the
observed field is of relevance and interest to the research community. Further
challenges arise in many areas of the field.
Comment: 10 pages, 5 figures, systematic literature study
End-to-End Learning of Semantic Grasping
We consider the task of semantic robotic grasping, in which a robot picks up
an object of a user-specified class using only monocular images. Inspired by
the two-stream hypothesis of visual reasoning, we present a semantic grasping
framework that learns object detection, classification, and grasp planning in
an end-to-end fashion. A "ventral stream" recognizes object class while a
"dorsal stream" simultaneously interprets the geometric relationships necessary
to execute successful grasps. We leverage the autonomous data collection
capabilities of robots to obtain a large self-supervised dataset for training
the dorsal stream, and use semi-supervised label propagation to train the
ventral stream with only a modest amount of human supervision. We
experimentally show that our approach improves upon grasping systems whose
components are not learned end-to-end, including a baseline method that uses
bounding box detection. Furthermore, we show that jointly training our model
with auxiliary data consisting of non-semantic grasping data, as well as
semantically labeled images without grasp actions, has the potential to
substantially improve semantic grasping performance.
Comment: 14 pages
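One way to read the two-stream design above is as a per-grasp fusion of a class probability (ventral stream) with a grasp-success probability (dorsal stream). The product fusion, function names, and toy probabilities below are a hedged sketch, not the paper's architecture:

```python
def semantic_grasp_scores(grasps, target_class, ventral, dorsal):
    """Score each candidate grasp as P(success | grasp) from the
    "dorsal" stream times P(target_class | grasp) from the "ventral"
    stream (a simple product fusion for illustration)."""
    return [dorsal(g) * ventral(g).get(target_class, 0.0) for g in grasps]

# Toy streams over two candidate grasps (probabilities are made up):
# g1 is mechanically easier, but g2 is far more likely to be the mug.
dorsal = {"g1": 0.9, "g2": 0.6}.get
ventral = {"g1": {"mug": 0.2}, "g2": {"mug": 0.8}}.get
scores = semantic_grasp_scores(["g1", "g2"], "mug", ventral, dorsal)
print(scores)
```

Under this fusion the robot picks g2: a slightly less reliable grasp on the object that actually matches the user-specified class.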
Combining Search, Social Media, and Traditional Data Sources to Improve Influenza Surveillance
We present a machine learning-based methodology capable of providing
real-time ("nowcast") and forecast estimates of influenza activity in the US by
leveraging data from multiple data sources including: Google searches, Twitter
microblogs, nearly real-time hospital visit records, and data from a
participatory surveillance system. Our main contribution consists of combining
multiple influenza-like illnesses (ILI) activity estimates, generated
independently with each data source, into a single prediction of ILI utilizing
machine learning ensemble approaches. Our methodology exploits the information
in each data source and produces accurate weekly ILI predictions for up to four
weeks ahead of the release of CDC's ILI reports. We evaluate the predictive
ability of our ensemble approach during the 2013-2014 (retrospective) and
2014-2015 (live) flu seasons for each of the four weekly time horizons. Our
ensemble approach demonstrates several advantages: (1) our ensemble method's
predictions outperform every prediction using each data source independently,
(2) our methodology can produce predictions one week ahead of GFT's real-time
estimates with comparable accuracy, and (3) our two and three week forecast
estimates have comparable accuracy to real-time predictions using an
autoregressive model. Moreover, our results show that considerable insight is
gained from incorporating disparate data streams, in the form of social media
and crowdsourced data, into influenza predictions across all time horizons.
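The core ensemble idea, combining independently generated per-source ILI estimates into a single prediction, can be sketched with simple least-squares stacking. The paper uses richer machine learning ensembles; the source names and numbers here are toy assumptions:

```python
import numpy as np

def fit_ensemble_weights(source_estimates, ili_targets):
    """Learn linear weights that combine per-source ILI estimates
    into a single prediction (least-squares stacking sketch)."""
    X = np.column_stack(source_estimates)  # one column per data source
    weights, *_ = np.linalg.lstsq(X, ili_targets, rcond=None)
    return weights

# Toy example: two sources; the second tracks observed ILI exactly,
# so stacking should put essentially all weight on it.
search_est = np.array([1.0, 2.0, 3.0, 4.0])
twitter_est = np.array([2.0, 1.0, 4.0, 3.0])
observed_ili = np.array([2.0, 1.0, 4.0, 3.0])
w = fit_ensemble_weights([search_est, twitter_est], observed_ili)
print(w)
```

In practice the weights would be refit as new surveillance reports arrive, letting the ensemble down-weight any source whose signal degrades.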
Object Handovers: a Review for Robotics
This article surveys the literature on human-robot object handovers. A
handover is a collaborative joint action where an agent, the giver, gives an
object to another agent, the receiver. The physical exchange starts when the
receiver first contacts the object held by the giver and ends when the giver
fully releases the object to the receiver. However, important cognitive and
physical processes begin before the physical exchange, including initiating
implicit agreement with respect to the location and timing of the exchange.
From this perspective, we structure our review into the two main phases
delimited by the aforementioned events: 1) a pre-handover phase, and 2) the
physical exchange. We focus our analysis on the two actors (giver and receiver)
and report the state of the art of robotic givers (robot-to-human handovers)
and the robotic receivers (human-to-robot handovers). We report a comprehensive
list of qualitative and quantitative metrics commonly used to assess the
interaction. While focusing our review on the cognitive level (e.g.,
prediction, perception, motion planning, learning) and the physical level
(e.g., motion, grasping, grip release) of the handover, we also briefly discuss
the concepts of safety, social context, and ergonomics. We compare the
behaviours displayed during human-to-human handovers to the state of the art of
robotic assistants, and identify the major areas of improvement for robotic
assistants to reach performance comparable to human interactions. Finally, we
propose a minimal set of metrics that should be used in order to enable a fair
comparison among the approaches.
Comment: Review paper, 19 pages
Geometric Affordances from a Single Example via the Interaction Tensor
This paper develops and evaluates a new tensor field representation to
express the geometric affordance of one object over another. We expand the well
known bisector surface representation to one that is weight-driven and that
retains the provenance of surface points with directional vectors. We also
incorporate the notion of affordance keypoints which allow for faster decisions
at a point of query and with a compact and straightforward descriptor. Using a
single interaction example, we are able to generalize to previously-unseen
scenarios; both synthetic and also real scenes captured with RGBD sensors. We
show how our interaction tensor allows for significantly better performance
over alternative formulations. Evaluations also include crowdsourcing
comparisons that confirm the validity of our affordance proposals, which agree
with human judgments 84% of the time on average, 20-40% better than the
baseline methods.
Comment: 10 pages, 12 figures
YouTube-8M Video Understanding Challenge Approach and Applications
This paper introduces the YouTube-8M Video Understanding Challenge hosted as
a Kaggle competition and also describes my approach to experimenting with
various models. For each of my experiments, I provide the score result as well
as possible improvements to be made. Towards the end of the paper, I discuss
the various ensemble learning techniques that I applied on the dataset which
significantly boosted my overall competition score. Finally, I discuss the
exciting future of video understanding research and also the many applications
that such research could significantly improve.
Comment: YouTube-8M Workshop submission, 8 pages