POV-Surgery: A Dataset for Egocentric Hand and Tool Pose Estimation During Surgical Activities
The surgical usage of Mixed Reality (MR) has received growing attention in
areas such as surgical navigation systems, skill assessment, and robot-assisted
surgeries. For such applications, estimating the pose of hands and surgical
instruments from an egocentric perspective is a fundamental task that has been
studied extensively in computer vision in recent years. However, progress in
this area has been impeded by a lack of datasets, especially in the surgical
domain, where blood-stained gloves and reflective metallic tools make it
difficult to obtain 3D pose annotations for hands and objects with conventional
methods. To address this issue, we propose POV-Surgery, a large-scale,
synthetic, egocentric dataset focusing on pose estimation for hands with
different surgical gloves and three orthopedic surgical instruments, namely
the scalpel, friem, and diskplacer. Our dataset consists of 53 sequences and 88,329
frames, featuring high-resolution RGB-D video streams with activity
annotations, accurate 3D and 2D annotations for hand-object pose, and 2D
hand-object segmentation masks. We fine-tune current state-of-the-art (SOTA)
methods on POV-Surgery and, through extensive evaluations, demonstrate their
generalizability to real-life cases with surgical gloves and tools. The code and the
dataset are publicly available at batfacewayne.github.io/POV_Surgery_io/.
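
To illustrate how the per-frame annotations described above (RGB-D streams,
3D/2D hand-object pose, and segmentation masks) might be consumed, the sketch
below loads a single frame. Note that the directory layout, file names,
annotation keys (rgb/, depth/, mask/, anno/, hand_joints_3d, object_pose), and
the sequence name are hypothetical placeholders, not the dataset's documented
structure; consult the linked repository for the actual format.

    import json
    from pathlib import Path

    import cv2
    import numpy as np

    # NOTE: layout and key names below are assumptions for illustration only.
    DATASET_ROOT = Path("POV_Surgery")   # assumed dataset root
    SEQUENCE = "d_diskplacer_1"          # assumed sequence name

    def load_frame(seq_dir: Path, frame_id: int):
        """Load one RGB-D frame with its (assumed) pose and mask annotations."""
        stem = f"{frame_id:05d}"
        # High-resolution color image (HxWx3) and 16-bit depth map.
        rgb = cv2.imread(str(seq_dir / "rgb" / f"{stem}.jpg"))
        depth = cv2.imread(str(seq_dir / "depth" / f"{stem}.png"),
                           cv2.IMREAD_UNCHANGED)
        # 2D hand-object segmentation mask.
        mask = cv2.imread(str(seq_dir / "mask" / f"{stem}.png"),
                          cv2.IMREAD_GRAYSCALE)
        # Assumed JSON annotation holding 3D/2D hand joints and the tool pose.
        with open(seq_dir / "anno" / f"{stem}.json") as f:
            anno = json.load(f)
        hand_joints_3d = np.asarray(anno["hand_joints_3d"])  # e.g. (21, 3)
        hand_joints_2d = np.asarray(anno["hand_joints_2d"])  # e.g. (21, 2)
        tool_pose = np.asarray(anno["object_pose"])          # e.g. (4, 4) rigid transform
        return rgb, depth, mask, hand_joints_3d, hand_joints_2d, tool_pose

    if __name__ == "__main__":
        rgb, depth, mask, j3d, j2d, pose = load_frame(
            DATASET_ROOT / SEQUENCE, frame_id=0)
        print(rgb.shape, depth.dtype, j3d.shape, pose.shape)

Such a loader would typically feed a fine-tuning pipeline for a hand-object
pose estimator, pairing each frame with its 3D joint and tool-pose targets.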