61 research outputs found
Cosine Measures of Neutrosophic Cubic Sets for Multiple Attribute Decision-Making
The neutrosophic cubic set can carry more information by expressing interval neutrosophic numbers and single-valued neutrosophic numbers simultaneously in indeterminate environments. Hence, it is a useful tool for expressing such information in complex decision-making problems.
CrowdCLIP: Unsupervised Crowd Counting via Vision-Language Model
Supervised crowd counting relies heavily on costly manual labeling, which is
difficult and expensive, especially in dense scenes. To alleviate the problem,
we propose a novel unsupervised framework for crowd counting, named CrowdCLIP.
The core idea is built on two observations: 1) the recent contrastive
pre-trained vision-language model (CLIP) has presented impressive performance
on various downstream tasks; 2) there is a natural mapping between crowd
patches and count text. To the best of our knowledge, CrowdCLIP is the first to
investigate the vision language knowledge to solve the counting problem.
Specifically, in the training stage, we exploit the multi-modal ranking loss by
constructing ranking text prompts to match the size-sorted crowd patches to
guide the image encoder learning. In the testing stage, to deal with the
diversity of image patches, we propose a simple yet effective progressive
filtering strategy to first select the highly potential crowd patches and then
map them into the language space with various counting intervals. Extensive
experiments on five challenging datasets demonstrate that the proposed
CrowdCLIP achieves superior performance compared to previous unsupervised
state-of-the-art counting methods. Notably, CrowdCLIP even surpasses some
popular fully-supervised methods under the cross-dataset setting. The source
code will be available at https://github.com/dk-liang/CrowdCLIP.
Comment: Accepted by CVPR 202
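The ranking idea in the training stage can be illustrated with a generic pairwise margin ranking loss. This is only a minimal sketch, not CrowdCLIP's actual multi-modal loss: it assumes each size-sorted patch has already been reduced to one scalar score (for example, a similarity to a count prompt), and the function name, the `margin` parameter, and the scalar-score setup are all hypothetical.

```python
def pairwise_ranking_loss(scores, margin=0.0):
    """Generic pairwise hinge ranking loss (illustrative, not the paper's loss).

    `scores` is assumed to be ordered by patch size, smallest patch first.
    A larger patch is expected to score at least `margin` higher than any
    smaller patch; each violated pair adds a penalty.
    """
    loss = 0.0
    for i in range(len(scores)):
        for j in range(i + 1, len(scores)):
            # Penalize when the smaller patch (i) outranks the larger one (j).
            loss += max(0.0, scores[i] - scores[j] + margin)
    return loss


# Correctly ordered scores incur no penalty; a swapped pair does.
ordered = pairwise_ranking_loss([0.1, 0.4, 0.9])
swapped = pairwise_ranking_loss([0.9, 0.1])
```

The hinge form leaves already well-ordered pairs untouched, so only ordering violations contribute to the loss.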
SAM3D: Zero-Shot 3D Object Detection via Segment Anything Model
With the development of large language models, many remarkable linguistic
systems like ChatGPT have thrived and achieved astonishing success on many
tasks, showing the incredible power of foundation models. In the spirit of
unleashing the capability of foundation models on vision tasks, the Segment
Anything Model (SAM), a vision foundation model for image segmentation, has
been proposed recently and presents strong zero-shot ability on many downstream
2D tasks. However, whether SAM can be adapted to 3D vision tasks has yet to be
explored, especially 3D object detection. With this inspiration, we explore
adapting the zero-shot ability of SAM to 3D object detection in this paper. We
propose a SAM-powered BEV processing pipeline to detect objects and get
promising results on the large-scale Waymo open dataset. As an early attempt,
our method takes a step toward 3D object detection with vision foundation
models and presents the opportunity to unleash their power on 3D vision tasks.
The code is released at https://github.com/DYZhang09/SAM3D.
Comment: Technical Report.
Paint and Distill: Boosting 3D Object Detection with Semantic Passing Network
3D object detection from lidar or camera sensors is essential for
autonomous driving. Pioneering attempts at multi-modality fusion complement the
sparse lidar point clouds with rich semantic texture information from images at
the cost of extra network designs and overhead. In this work, we propose a
novel semantic passing framework, named SPNet, to boost the performance of
existing lidar-based 3D detection models with the guidance of rich context
painting, with no extra computation cost during inference. Our key design is to
first exploit the potential instructive semantic knowledge within the
ground-truth labels by training a semantic-painted teacher model and then guide
the pure-lidar network to learn the semantic-painted representation via
knowledge passing modules at different granularities: class-wise passing,
pixel-wise passing and instance-wise passing. Experimental results show that
the proposed SPNet can seamlessly cooperate with most existing 3D detection
frameworks with 1~5% AP gain and even achieve new state-of-the-art 3D detection
performance on the KITTI test benchmark. Code is available at:
https://github.com/jb892/SPNet.
Comment: Accepted by ACMMM202
SOOD: Towards Semi-Supervised Oriented Object Detection
Semi-Supervised Object Detection (SSOD), aiming to explore unlabeled data for
boosting object detectors, has become an active task in recent years. However,
existing SSOD approaches mainly focus on horizontal objects, leaving
multi-oriented objects that are common in aerial images unexplored. This paper
proposes a novel Semi-supervised Oriented Object Detection model, termed SOOD,
built upon the mainstream pseudo-labeling framework. Towards oriented objects
in aerial scenes, we design two loss functions to provide better supervision.
Focusing on the orientations of objects, the first loss regularizes the
consistency between each pseudo-label-prediction pair (a prediction
and its corresponding pseudo label) with adaptive weights based on their
orientation gap. Focusing on the layout of an image, the second loss
regularizes the similarity and explicitly builds the many-to-many relation
between the sets of pseudo-labels and predictions. Such a global consistency
constraint can further boost semi-supervised learning. Our experiments show
that when trained with the two proposed losses, SOOD surpasses the
state-of-the-art SSOD methods under various settings on the DOTA-v1.5
benchmark. The code will be available at https://github.com/HamPerdredes/SOOD.
Comment: Accepted to CVPR 2023.
SGM3D: Stereo Guided Monocular 3D Object Detection
Monocular 3D object detection aims to predict the object location, dimension
and orientation in 3D space alongside the object category given only a
monocular image. The task is challenging because it is ill-posed: the 2D image
plane critically lacks depth information. While there
exist approaches leveraging off-the-shelf depth estimation or relying on LiDAR
sensors to mitigate this problem, the dependence on the additional depth model
or expensive equipment severely limits their scalability to generic 3D
perception. In this paper, we propose a stereo-guided monocular 3D object
detection framework, dubbed SGM3D, adapting the robust 3D features learned from
stereo inputs to enhance the feature for monocular detection. We innovatively
present a multi-granularity domain adaptation (MG-DA) mechanism to exploit the
network's ability to generate stereo-mimicking features given only monocular
cues. Both coarse BEV feature-level and fine anchor-level domain
adaptation are leveraged for guidance in the monocular domain. In
addition, we introduce an IoU matching-based alignment (IoU-MA) method for
object-level domain adaptation between the stereo and monocular predictions to
alleviate the mismatches while adopting the MG-DA. Extensive experiments
demonstrate state-of-the-art results on KITTI and Lyft datasets.
Comment: 8 pages, 5 figures
Diffusion-based 3D Object Detection with Random Boxes
3D object detection is an essential task for achieving autonomous driving.
Existing anchor-based detection methods rely on empirical, heuristic anchor
settings, which makes the algorithms inelegant. In recent years, we have
witnessed the rise of several generative models, among which diffusion models
show great potential for learning the transformation between two distributions. Our
proposed Diff3Det migrates the diffusion model to proposal generation for 3D
object detection by considering the detection boxes as generative targets.
During training, the object boxes diffuse from the ground truth boxes to the
Gaussian distribution, and the decoder learns to reverse this noise process. In
the inference stage, the model progressively refines a set of random boxes to
the prediction results. We provide detailed experiments on the KITTI benchmark
and achieve promising performance compared to classical anchor-based 3D
detection methods.
Comment: Accepted by PRCV 202
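The training-time step of diffusing ground-truth boxes toward a Gaussian can be sketched with the standard DDPM forward process, x_t = sqrt(ā_t)·x_0 + sqrt(1 − ā_t)·ε. This is a minimal sketch under assumptions: the abstract does not give the noise schedule or box encoding, so the linear beta schedule, the 7-value box layout (x, y, z, l, w, h, yaw), and the function names below are hypothetical rather than taken from Diff3Det.

```python
import math
import random


def alpha_bar(t, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_s) for s = 0..t.

    Assumes a linear beta schedule; the paper's schedule may differ.
    """
    prod = 1.0
    for s in range(t + 1):
        beta = beta_start + (beta_end - beta_start) * s / (num_steps - 1)
        prod *= 1.0 - beta
    return prod


def diffuse_box(box, t, rng, num_steps=1000):
    """Sample a noised box x_t from a (normalized) ground-truth box x_0."""
    a = alpha_bar(t, num_steps)
    signal, noise = math.sqrt(a), math.sqrt(1.0 - a)
    return [signal * v + noise * rng.gauss(0.0, 1.0) for v in box]


rng = random.Random(0)
gt_box = [0.2, -0.1, 0.0, 0.4, 0.2, 0.15, 0.3]  # hypothetical normalized box
slightly_noised = diffuse_box(gt_box, t=10, rng=rng)
nearly_gaussian = diffuse_box(gt_box, t=999, rng=rng)
```

At small `t` the sample stays close to the ground-truth box, while at the final step it is almost pure Gaussian noise, which is the kind of random box that inference starts from.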
Development and clinical application of a new testicular prosthesis
A new type of testicular prosthesis made of silastic with an elliptical shape to mimic a normal testis was developed by our team and submitted for patenting in China. The prosthesis was produced in different sizes to imitate the normal testis of the patient. To investigate the effects and safety of the testicular prosthesis, 20 patients receiving testicular prosthesis implantation were recruited for this study. Follow-up after 6 months revealed no complications in the patients. All the patients answered that they were satisfied with their body image and the position of the implants; 19 patients were satisfied with the size and 16 patients were satisfied with the weight. These results show that the testicular prosthesis used in this study can meet patients' expectations. Patients undergoing orchiectomy should be offered the option to receive a testicular prosthesis implantation. The dimensions and weight of the available prosthetic implants should be further addressed to improve patient satisfaction.
Cost-effectiveness of endovascular thrombectomy with alteplase versus endovascular thrombectomy alone for acute ischemic stroke secondary to large vessel occlusion
Background: Recent randomized trials have suggested that endovascular thrombectomy (EVT) alone may provide similar functional outcomes as the current standard of care, EVT combined with intravenous alteplase treatment, for acute ischemic stroke secondary to large vessel occlusion. We conducted an economic evaluation of these 2 therapeutic options.
Methods: We constructed a decision analytic model with a hypothetical cohort of 1000 patients to assess the cost-effectiveness of EVT with intravenous alteplase treatment versus EVT alone for acute ischemic stroke secondary to large vessel occlusion from both the societal and public health care payer perspectives. We used studies and data published in 2009–2021 for model inputs, and acquired cost data for Canada and China, representing high- and middle-income countries, respectively. We calculated incremental cost-effectiveness ratios (ICERs) using a lifetime horizon and accounted for uncertainty using 1-way and probabilistic sensitivity analyses. All costs are reported in 2021 Canadian dollars.
Results: In Canada, the difference in quality-adjusted life-years (QALYs) gained between EVT with alteplase and EVT alone was 0.10 from both the societal and health care payer perspectives. The difference in cost was $2847 from a societal perspective and $2767 from the payer perspective. In China, the difference in QALYs gained was 0.07 from both perspectives, and the difference in cost was $1550 from the societal perspective and $1607 from the payer perspective. One-way sensitivity analyses showed that the distributions of modified Rankin Scale scores at 90 days after stroke were the most influential factor on ICERs. For Canada, compared to EVT alone, the probability that EVT with alteplase would be cost-effective at a willingness-to-pay threshold of $50 000 per QALY gained was 58.7% from a societal perspective and 58.4% from a payer perspective. The corresponding values for China at a willingness-to-pay threshold of $47 185 (3 times the Chinese gross domestic product per capita in 2021) were 65.2% and 67.4%.
Interpretation: For patients with acute ischemic stroke due to large vessel occlusion eligible for immediate treatment with both EVT alone and EVT with intravenous alteplase treatment, it is uncertain whether EVT with alteplase is cost-effective compared to EVT alone in Canada and China
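An incremental cost-effectiveness ratio is the difference in cost divided by the difference in QALYs. As a rough illustration only, the sketch below applies that definition to the rounded cost and QALY differences quoted in the Results; because those inputs are rounded, the outputs are approximations and need not match the ICERs reported in the full paper. The function name and dictionary keys are hypothetical.

```python
def icer(delta_cost, delta_qaly):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY gained."""
    if delta_qaly == 0:
        raise ValueError("QALY difference is zero; ICER is undefined")
    return delta_cost / delta_qaly


# (cost difference, QALY difference) as quoted in the abstract's Results.
reported = {
    "Canada, societal": (2847, 0.10),
    "Canada, payer": (2767, 0.10),
    "China, societal": (1550, 0.07),
    "China, payer": (1607, 0.07),
}

# Approximate cost per QALY gained for EVT with alteplase versus EVT alone.
approx_icers = {name: icer(dc, dq) for name, (dc, dq) in reported.items()}

for name, value in approx_icers.items():
    print(f"{name}: about {value:,.0f} per QALY gained")
```

Each approximate ratio falls below the willingness-to-pay threshold quoted for its country, yet the probabilistic analysis puts the chance of cost-effectiveness at only about 58% to 67%, which is why the authors describe the result as uncertain.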
- …