Multi-aspect Repetition Suppression and Content Moderation of Large Language Models
Natural language generation is one of the most impactful fields in NLP, and
recent years have witnessed its evolution brought about by large language
models (LLMs). As the key instrument for writing assistance applications, they
are generally prone to replicating or extending offensive content provided in
the input. In low-resource data regimes, they can also produce repetitive
outputs (Holtzman et al., 2019) [1]. Usually, offensive content and repetitions
are mitigated with post-hoc methods, including n-gram level blocklists, top-k
and nucleus sampling. In this paper, we introduce a combination of exact and
non-exact repetition suppression using token and sequence level unlikelihood
loss, repetition penalty during training, inference, and post-processing
respectively. We further explore multi-level unlikelihood loss to the extent
that it endows the model with abilities to avoid generating offensive words and
phrases from the beginning. Finally, with comprehensive experiments, we
demonstrate that our proposed methods work exceptionally well in controlling
the repetition and content quality of LLM outputs.
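The abstract mentions a repetition penalty applied at inference time. As a generic illustration (a CTRL-style logit penalty, not the paper's exact formulation), the idea can be sketched as follows; the function name and penalty value are illustrative assumptions:

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    """Scale down logits of tokens already generated, discouraging repeats.

    A minimal sketch of an inference-time repetition penalty (penalty > 1):
    positive logits are divided by the penalty, negative logits multiplied,
    so previously emitted tokens become less likely either way.
    """
    out = list(logits)
    for tok in set(generated_ids):
        if out[tok] > 0:
            out[tok] /= penalty   # shrink positive logits toward zero
        else:
            out[tok] *= penalty   # push negative logits further down
    return out
```

Post-hoc methods such as n-gram blocklists operate on the decoded text instead; the penalty above intervenes one step earlier, on the next-token distribution.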
Depth-Driven Geometric Prompt Learning for Laparoscopic Liver Landmark Detection
Laparoscopic liver surgery poses a complex intraoperative dynamic environment
for surgeons, where it remains a significant challenge to distinguish critical or
even hidden structures inside the liver. Liver anatomical landmarks, e.g.,
ridge and ligament, serve as important markers for 2D-3D alignment, which can
significantly enhance the spatial perception of surgeons for precise surgery.
To facilitate the detection of laparoscopic liver landmarks, we collect a novel
dataset called L3D, which comprises 1,152 frames with elaborated landmark
annotations from surgical videos of 39 patients across two medical sites. For
benchmarking purposes, 12 mainstream detection methods are selected and
comprehensively evaluated on L3D. Further, we propose a depth-driven geometric
prompt learning network, namely D2GPLand. Specifically, we design a Depth-aware
Prompt Embedding (DPE) module that is guided by self-supervised prompts and
generates semantically relevant geometric information with the benefit of
global depth cues extracted from SAM-based features. Additionally, a
Semantic-specific Geometric Augmentation (SGA) scheme is introduced to
efficiently merge RGB-D spatial and geometric information through reverse
anatomic perception. The experimental results indicate that D2GPLand obtains
state-of-the-art performance on L3D, with 63.52% DICE and 48.68% IoU scores.
Together with 2D-3D fusion technology, our method can directly provide the
surgeon with intuitive guidance information in laparoscopic scenarios.
Comment: This paper has been accepted by MICCAI 202
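The reported 63.52% DICE and 48.68% IoU follow the standard overlap-metric definitions for segmentation/detection masks. A minimal sketch of those two metrics (generic definitions, not code from the D2GPLand paper), with masks represented as sets of pixel indices:

```python
def dice_and_iou(pred, gt):
    """Dice and IoU for binary masks given as iterables of pixel indices.

    Dice = 2|P ∩ G| / (|P| + |G|);  IoU = |P ∩ G| / |P ∪ G|.
    Both empty masks are treated as a perfect match.
    """
    pred, gt = set(pred), set(gt)
    inter = len(pred & gt)
    union = len(pred | gt)
    dice = 2 * inter / (len(pred) + len(gt)) if (pred or gt) else 1.0
    iou = inter / union if union else 1.0
    return dice, iou
```

Note that Dice is always at least as large as IoU on the same masks, which is consistent with the paper's Dice score exceeding its IoU score.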
Constraint-Based Optimized Human Skeleton Extraction from Single-Depth Camera
As a cutting-edge research topic in computer vision and graphics for decades, human skeleton extraction from a single-depth camera remains challenging due to possible occlusions of different body parts, large appearance variations, and sensor noise. In this paper, we propose to incorporate human skeleton length conservation and symmetry priors as well as temporal constraints to enhance the consistency and continuity of the estimated skeleton of a moving human body. Given an initial estimation of the skeleton joint positions provided per frame by the Kinect SDK or Nuitrack SDK, which do not follow the aforementioned priors and can be prone to errors, our framework improves the accuracy of these pose estimates based on the length and symmetry constraints. In addition, our method is device-independent and can be integrated into skeleton extraction SDKs for refinement, allowing the detection of outliers within the initial joint location estimates and the prediction of new joint location estimates from temporal observations. The experimental results demonstrate the effectiveness and robustness of our approach in several cases.
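The bone-length conservation prior can be illustrated with a minimal sketch: given a noisy SDK joint estimate, the child joint is projected along the bone direction so the bone recovers its prior length. This is an illustrative simplification assuming known target lengths; the paper's symmetry and temporal terms are omitted:

```python
import math

def enforce_bone_length(parent, child, target_len):
    """Move `child` along the parent->child direction to `target_len` away.

    A minimal sketch of the length-conservation constraint: the noisy joint
    estimate is rescaled along the bone so its length matches the prior.
    """
    direction = [c - p for c, p in zip(child, parent)]
    norm = math.sqrt(sum(d * d for d in direction)) or 1e-9
    return [p + d * target_len / norm for p, d in zip(parent, direction)]
```

In a full pipeline this correction would be applied per bone each frame, with symmetric limbs sharing one length prior and temporal smoothing flagging outlier estimates.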
A Method for Space Tracking and Positioning of Surgical Instruments in Virtual Surgery Simulation
- …
