61 research outputs found

    Cosine Measures of Neutrosophic Cubic Sets for Multiple Attribute Decision-Making

    A neutrosophic cubic set can express interval neutrosophic numbers and single-valued neutrosophic numbers simultaneously in indeterminate environments, so it carries more information than either representation alone. Hence, it is a useful tool for expressing richer information in complex decision-making problems.

    CrowdCLIP: Unsupervised Crowd Counting via Vision-Language Model

    Supervised crowd counting relies heavily on manual labeling, which is difficult and expensive, especially in dense scenes. To alleviate the problem, we propose a novel unsupervised framework for crowd counting, named CrowdCLIP. The core idea is built on two observations: 1) the recent contrastive pre-trained vision-language model (CLIP) has shown impressive performance on various downstream tasks; 2) there is a natural mapping between crowd patches and count text. To the best of our knowledge, CrowdCLIP is the first to investigate vision-language knowledge to solve the counting problem. Specifically, in the training stage, we exploit a multi-modal ranking loss by constructing ranking text prompts to match the size-sorted crowd patches, which guides the learning of the image encoder. In the testing stage, to deal with the diversity of image patches, we propose a simple yet effective progressive filtering strategy that first selects the high-potential crowd patches and then maps them into the language space with various counting intervals. Extensive experiments on five challenging datasets demonstrate that the proposed CrowdCLIP achieves superior performance compared to previous unsupervised state-of-the-art counting methods. Notably, CrowdCLIP even surpasses some popular fully-supervised methods under the cross-dataset setting. The source code will be available at https://github.com/dk-liang/CrowdCLIP. Comment: Accepted by CVPR 202

    SAM3D: Zero-Shot 3D Object Detection via Segment Anything Model

    With the development of large language models, many remarkable linguistic systems like ChatGPT have thrived and achieved astonishing success on many tasks, showing the incredible power of foundation models. In the spirit of unleashing the capability of foundation models on vision tasks, the Segment Anything Model (SAM), a vision foundation model for image segmentation, has been proposed recently and presents strong zero-shot ability on many downstream 2D tasks. However, whether SAM can be adapted to 3D vision tasks has yet to be explored, especially 3D object detection. With this inspiration, we explore adapting the zero-shot ability of SAM to 3D object detection in this paper. We propose a SAM-powered BEV processing pipeline to detect objects and get promising results on the large-scale Waymo open dataset. As an early attempt, our method takes a step toward 3D object detection with vision foundation models and presents the opportunity to unleash their power on 3D vision tasks. The code is released at https://github.com/DYZhang09/SAM3D. Comment: Technical Report.

    Paint and Distill: Boosting 3D Object Detection with Semantic Passing Network

    The task of 3D object detection from lidar or camera sensors is essential for autonomous driving. Pioneering attempts at multi-modality fusion complement the sparse lidar point clouds with rich semantic texture information from images, at the cost of extra network designs and overhead. In this work, we propose a novel semantic passing framework, named SPNet, to boost the performance of existing lidar-based 3D detection models with the guidance of rich context painting, with no extra computation cost during inference. Our key design is to first exploit the potential instructive semantic knowledge within the ground-truth labels by training a semantic-painted teacher model, and then guide the pure-lidar network to learn the semantic-painted representation via knowledge passing modules at different granularities: class-wise passing, pixel-wise passing and instance-wise passing. Experimental results show that the proposed SPNet can seamlessly cooperate with most existing 3D detection frameworks, with a 1-5% AP gain, and even achieves new state-of-the-art 3D detection performance on the KITTI test benchmark. Code is available at: https://github.com/jb892/SPNet. Comment: Accepted by ACMMM202

    SOOD: Towards Semi-Supervised Oriented Object Detection

    Semi-Supervised Object Detection (SSOD), which aims to exploit unlabeled data to boost object detectors, has become an active task in recent years. However, existing SSOD approaches mainly focus on horizontal objects, leaving the multi-oriented objects that are common in aerial images unexplored. This paper proposes a novel Semi-supervised Oriented Object Detection model, termed SOOD, built upon the mainstream pseudo-labeling framework. For oriented objects in aerial scenes, we design two loss functions to provide better supervision. Focusing on the orientations of objects, the first loss regularizes the consistency between each pseudo-label-prediction pair (a prediction and its corresponding pseudo-label) with adaptive weights based on their orientation gap. Focusing on the layout of an image, the second loss regularizes the similarity and explicitly builds the many-to-many relation between the sets of pseudo-labels and predictions. Such a global consistency constraint can further boost semi-supervised learning. Our experiments show that when trained with the two proposed losses, SOOD surpasses the state-of-the-art SSOD methods under various settings on the DOTA-v1.5 benchmark. The code will be available at https://github.com/HamPerdredes/SOOD. Comment: Accepted to CVPR 2023.

    SGM3D: Stereo Guided Monocular 3D Object Detection

    Monocular 3D object detection aims to predict the object location, dimension and orientation in 3D space alongside the object category given only a monocular image. It poses a great challenge due to its ill-posed nature, namely the critical lack of depth information in the 2D image plane. While there exist approaches that leverage off-the-shelf depth estimation or rely on LiDAR sensors to mitigate this problem, the dependence on an additional depth model or expensive equipment severely limits their scalability to generic 3D perception. In this paper, we propose a stereo-guided monocular 3D object detection framework, dubbed SGM3D, which adapts the robust 3D features learned from stereo inputs to enhance the features for monocular detection. We present a multi-granularity domain adaptation (MG-DA) mechanism to exploit the network's ability to generate stereo-mimicking features given only monocular cues. Both coarse BEV feature-level and fine anchor-level domain adaptation are leveraged for guidance in the monocular domain. In addition, we introduce an IoU matching-based alignment (IoU-MA) method for object-level domain adaptation between the stereo and monocular predictions to alleviate the mismatches while adopting the MG-DA. Extensive experiments demonstrate state-of-the-art results on KITTI and Lyft datasets. Comment: 8 pages, 5 figures

    Diffusion-based 3D Object Detection with Random Boxes

    3D object detection is an essential task for achieving autonomous driving. Existing anchor-based detection methods rely on empirical, heuristic anchor settings, which makes the algorithms lack elegance. In recent years, we have witnessed the rise of several generative models, among which diffusion models show great potential for learning the transformation between two distributions. Our proposed Diff3Det migrates the diffusion model to proposal generation for 3D object detection by considering the detection boxes as generative targets. During training, the object boxes diffuse from the ground-truth boxes to the Gaussian distribution, and the decoder learns to reverse this noise process. In the inference stage, the model progressively refines a set of random boxes into the prediction results. We provide detailed experiments on the KITTI benchmark and achieve promising performance compared to classical anchor-based 3D detection methods. Comment: Accepted by PRCV 202
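    The forward "boxes diffuse to Gaussian noise" step described in the abstract above can be illustrated with the standard DDPM-style closed form x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps. This is only a minimal sketch: the cosine schedule, the 7-value box encoding, and the names `cosine_alpha_bar` and `noise_box` are assumptions made for illustration, not details confirmed by the abstract or taken from the Diff3Det code.

```python
import math
import random


def cosine_alpha_bar(t, num_steps, s=0.008):
    """Cumulative signal fraction a_bar_t under a cosine schedule (assumed choice)."""
    def f(u):
        return math.cos((u / num_steps + s) / (1 + s) * math.pi / 2) ** 2
    return f(t) / f(0)


def noise_box(box, t, num_steps, rng):
    """Sample x_t from q(x_t | x_0): blend a ground-truth box with Gaussian noise."""
    a_bar = cosine_alpha_bar(t, num_steps)
    signal = math.sqrt(a_bar)
    sigma = math.sqrt(1.0 - a_bar)
    return [signal * v + sigma * rng.gauss(0.0, 1.0) for v in box]


# Hypothetical normalized box: (x, y, z, length, width, height, yaw).
gt_box = [0.2, -0.1, 0.0, 0.4, 0.2, 0.15, 0.3]
rng = random.Random(0)
num_steps = 1000

for t in (0, 250, 500, 1000):
    noisy = noise_box(gt_box, t, num_steps, rng)
    print(t, round(cosine_alpha_bar(t, num_steps), 4), [round(v, 3) for v in noisy])
```

    At t = 0 the box is returned unchanged, and at t = num_steps it is essentially pure Gaussian noise, which corresponds to the "random boxes" that inference starts from. The learned decoder that reverses this process is not sketched here.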

    Development and clinical application of a new testicular prosthesis

    A new type of testicular prosthesis, made of silastic with an elliptical shape to mimic a normal testis, was developed by our team and submitted for patenting in China. The prosthesis was produced in different sizes to match the normal testis of each patient. To investigate the efficacy and safety of the testicular prosthesis, 20 patients receiving testicular prosthesis implantation were recruited for this study. Follow-up after 6 months revealed no complications in the patients. All the patients answered that they were satisfied with their body image and the position of the implants; 19 patients were satisfied with the size and 16 patients were satisfied with the weight. These results show that the testicular prosthesis used in this study can meet patients' expectations. Patients undergoing orchiectomy should be offered the option to receive a testicular prosthesis implantation. The dimensions and weight of the available prosthetic implants should be further addressed to improve patient satisfaction.

    Cost-effectiveness of endovascular thrombectomy with alteplase versus endovascular thrombectomy alone for acute ischemic stroke secondary to large vessel occlusion

    Background: Recent randomized trials have suggested that endovascular thrombectomy (EVT) alone may provide functional outcomes similar to those of the current standard of care, EVT combined with intravenous alteplase treatment, for acute ischemic stroke secondary to large vessel occlusion. We conducted an economic evaluation of these 2 therapeutic options. Methods: We constructed a decision analytic model with a hypothetical cohort of 1000 patients to assess the cost-effectiveness of EVT with intravenous alteplase treatment versus EVT alone for acute ischemic stroke secondary to large vessel occlusion from both the societal and public health care payer perspectives. We used studies and data published in 2009–2021 for model inputs, and acquired cost data for Canada and China, representing high- and middle-income countries, respectively. We calculated incremental cost-effectiveness ratios (ICERs) using a lifetime horizon and accounted for uncertainty using 1-way and probabilistic sensitivity analyses. All costs are reported in 2021 Canadian dollars. Results: In Canada, the difference in quality-adjusted life-years (QALYs) gained between EVT with alteplase and EVT alone was 0.10 from both the societal and health care payer perspectives. The difference in cost was $2847 from a societal perspective and $2767 from the payer perspective. In China, the difference in QALYs gained was 0.07 from both perspectives, and the difference in cost was $1550 from the societal perspective and $1607 from the payer perspective. One-way sensitivity analyses showed that the distributions of modified Rankin Scale scores at 90 days after stroke were the most influential factor on ICERs. For Canada, compared to EVT alone, the probability that EVT with alteplase would be cost-effective at a willingness-to-pay threshold of $50 000 per QALY gained was 58.7% from a societal perspective and 58.4% from a payer perspective. The corresponding values for China at a willingness-to-pay threshold of $47 185 (3 times the Chinese gross domestic product per capita in 2021) were 65.2% and 67.4%. Interpretation: For patients with acute ischemic stroke due to large vessel occlusion eligible for immediate treatment with both EVT alone and EVT with intravenous alteplase treatment, it is uncertain whether EVT with alteplase is cost-effective compared to EVT alone in Canada and China.
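    As a worked illustration of the ICER named in the Methods, the ratio is the difference in cost divided by the difference in QALYs gained. The sketch below applies that definition to the rounded differences quoted in the Results; because those inputs are rounded, the outputs are approximations and may not match the ICERs reported in the full paper. The function name `icer` and the dictionary layout are illustrative assumptions.

```python
def icer(delta_cost, delta_qaly):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY gained."""
    if delta_qaly == 0:
        raise ValueError("ICER is undefined when the QALY difference is zero")
    return delta_cost / delta_qaly


# (cost difference, QALY difference) for EVT with alteplase versus EVT alone,
# as quoted in the abstract, in the currency units reported there.
reported = {
    ("Canada", "societal"): (2847, 0.10),
    ("Canada", "payer"): (2767, 0.10),
    ("China", "societal"): (1550, 0.07),
    ("China", "payer"): (1607, 0.07),
}

ratios = {key: icer(cost, qaly) for key, (cost, qaly) in reported.items()}

for (country, perspective), value in ratios.items():
    print(f"{country:6s} {perspective:8s} approx. {value:,.0f} per QALY gained")
```

    Each approximate ratio can then be compared with the willingness-to-pay threshold quoted for that country; the probabilities in the abstract come from probabilistic sensitivity analysis, which this sketch does not reproduce.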