DiffRef3D: A Diffusion-based Proposal Refinement Framework for 3D Object Detection
Denoising diffusion models show remarkable performance in generative tasks,
and their potential applications to perception tasks are gaining interest. In
this paper, we introduce a novel framework named DiffRef3D which, for the first
time, applies the diffusion process to 3D object detection on point clouds.
Specifically, we formulate the proposal refinement stage of two-stage 3D object
detectors as a conditional diffusion process. During training, DiffRef3D
gradually adds noise to the residuals between proposals and target objects,
then applies the noisy residuals to proposals to generate hypotheses. The
refinement module utilizes these hypotheses to denoise the noisy residuals and
generate accurate box predictions. In the inference phase, DiffRef3D generates
initial hypotheses by sampling noise from a Gaussian distribution as residuals
and refines the hypotheses through iterative steps. DiffRef3D is a versatile
proposal refinement framework that consistently improves the performance of
existing 3D object detection models. We demonstrate the effectiveness of
DiffRef3D through extensive experiments on the KITTI benchmark. The code will
be made available.
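As a rough, minimal PyTorch sketch of the training and inference procedure described above (the noise schedule, box parameterization, and names such as q_sample and refinement_module are illustrative assumptions, not taken from the DiffRef3D release):

    import torch

    T = 1000
    betas = torch.linspace(1e-4, 0.02, T)            # assumed linear noise schedule
    alphas_bar = torch.cumprod(1.0 - betas, dim=0)   # cumulative alpha-bar per step

    def q_sample(residual, t, noise):
        # Forward process: perturb the proposal-to-target residuals at step t.
        a = alphas_bar[t].sqrt().view(-1, 1)
        s = (1.0 - alphas_bar[t]).sqrt().view(-1, 1)
        return a * residual + s * noise

    # Training (conceptual): noise the true residuals, apply them to proposals
    # to form hypotheses, and train the refinement module to denoise them.
    proposals = torch.randn(8, 7)                    # (x, y, z, l, w, h, yaw)
    targets = torch.randn(8, 7)
    t = torch.randint(0, T, (8,))
    noise = torch.randn_like(proposals)
    hypotheses = proposals + q_sample(targets - proposals, t, noise)

    # Inference (conceptual): residuals start as pure Gaussian noise and the
    # hypotheses are refined over a few iterative steps.
    def refinement_module(hyp, step):                # stand-in for the trained head
        return torch.zeros_like(hyp)                 # would predict denoised residuals

    boxes = proposals + torch.randn_like(proposals)  # initial hypotheses
    for step in reversed(range(0, T, T // 4)):
        boxes = proposals + refinement_module(boxes, step)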
PG-RCNN: Semantic Surface Point Generation for 3D Object Detection
One of the main challenges in LiDAR-based 3D object detection is that the
sensors often fail to capture the complete spatial information about the
objects due to long distance and occlusion. Two-stage detectors with point
cloud completion approaches tackle this problem by adding more points to the
regions of interest (RoIs) with a pre-trained network. However, these methods
generate dense point clouds of objects for all region proposals, assuming that
objects always exist in the RoIs. This leads to indiscriminate point
generation even for incorrect proposals. Motivated by this, we propose Point
Generation R-CNN (PG-RCNN), a novel end-to-end detector that generates semantic
surface points of foreground objects for accurate detection. Our method uses a
jointly trained RoI point generation module to process the contextual
information of RoIs and estimate the complete shape and displacement of
foreground objects. For every generated point, PG-RCNN assigns a semantic
feature that indicates the estimated foreground probability. Extensive
experiments show that the point clouds generated by our method provide
geometrically and semantically rich information for refining false positive and
misaligned proposals. PG-RCNN achieves competitive performance on the KITTI
benchmark, with significantly fewer parameters than state-of-the-art models.
The code is available at https://github.com/quotation2520/PG-RCNN.
Comment: Accepted by ICCV 2023.
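To make the point-generation idea concrete, here is a minimal sketch of an RoI head that outputs a fixed set of 3D point offsets plus a per-point foreground probability, as the abstract describes; the layer sizes and names are illustrative assumptions, not PG-RCNN's actual architecture:

    import torch
    import torch.nn as nn

    class RoIPointGenHead(nn.Module):
        # Maps each RoI feature to num_points offsets and foreground logits.
        def __init__(self, feat_dim=256, num_points=32):
            super().__init__()
            self.num_points = num_points
            self.mlp = nn.Sequential(
                nn.Linear(feat_dim, 256), nn.ReLU(),
                nn.Linear(256, num_points * 4),   # (dx, dy, dz, fg_logit) per point
            )

        def forward(self, roi_feats, roi_centers):
            out = self.mlp(roi_feats).view(-1, self.num_points, 4)
            offsets, fg_logits = out[..., :3], out[..., 3]
            points = roi_centers.unsqueeze(1) + offsets  # generated surface points
            return points, fg_logits.sigmoid()           # points + semantic scores

    head = RoIPointGenHead()
    pts, fg_prob = head(torch.randn(16, 256), torch.randn(16, 3))
    # pts: (16, 32, 3) surface points; fg_prob: (16, 32) foreground probabilities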
Explore and Match: End-to-End Video Grounding with Transformer
We present a new paradigm named explore-and-match for video grounding, which
aims to seamlessly unify two streams of video grounding methods: proposal-based
and proposal-free. To achieve this goal, we formulate video grounding as a set
prediction problem and design an end-to-end trainable Video Grounding
Transformer (VidGTR) that can utilize the architectural strengths of rich
contextualization and parallel decoding for set prediction. The overall
training is balanced by two key losses that play different roles: a span
localization loss and a set guidance loss. These losses respectively force each
proposal to regress to the target timespan and to identify the target query.
Throughout the
training, VidGTR first explores the search space to diversify the initial
proposals and then matches the proposals to the corresponding targets to fit
them in a fine-grained manner. The explore-and-match scheme successfully
combines the strengths of two complementary methods, without encoding prior
knowledge into the pipeline. As a result, VidGTR sets new state-of-the-art
results on two video grounding benchmarks while doubling the inference speed.
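As a minimal sketch of how such a pair of losses might look in PyTorch (with the DETR-style set matching omitted; all names and weightings here are assumptions, not the paper's exact recipe):

    import torch
    import torch.nn.functional as F

    def span_localization_loss(pred_spans, gt_spans):
        # Regress each matched proposal onto its target (start, end) timespan.
        return F.l1_loss(pred_spans, gt_spans)

    def set_guidance_loss(pred_logits, gt_query_ids):
        # Push each matched proposal to identify its target query.
        return F.cross_entropy(pred_logits, gt_query_ids)

    pred_spans = torch.rand(4, 2)          # normalized (start, end) per proposal
    gt_spans = torch.rand(4, 2)
    pred_logits = torch.randn(4, 10)       # proposal scores over 10 queries
    gt_ids = torch.randint(0, 10, (4,))
    total = span_localization_loss(pred_spans, gt_spans) \
            + set_guidance_loss(pred_logits, gt_ids)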
Viola rostrata Muhl. var. japonica Ohwi
Original Japanese name: ナガハシスミレ (Nagahashi-sumire)
Family: スミレ科 = Violaceae
Collection locality: Washima Village, Santo District, Niigata Prefecture (Echigo: Washima Village)
Collection date: 1966/4/6
Collector: 萩庭丈壽 (Jōju Haginiwa)
Accession number: JH046880
National Museum of Nature and Science accession number: TNS-VS-99688
Pre-Treatment Objective Diagnosis and Post-Treatment Outcome Evaluation in Patients with Vascular Pulsatile Tinnitus Using Transcanal Recording and Spectro-Temporal Analysis - Fig 1
Pre-treatment (A) and post-treatment (B) ear canal signals of seven vascular
pulsatile tinnitus subjects, and ear canal signals of five control subjects
(C), measured with an upright, neutral head position.
Demographic characteristics of the included subjects with vascular pulsatile tinnitus.
Sound pressure level differences of the five pulse-synchronized spectral bands
(~50 ms) between pre- and post-treatment ear canal signals of the patient group
and ear canal signals of the control group, analyzed in the frequency domain.
Asterisks designate statistical significance; n.s., non-significant.