65 research outputs found

    Improving Trajectory Prediction in Dynamic Multi-Agent Environment by Dropping Waypoints

    Full text link
    The inherently diverse and uncertain nature of trajectories presents a formidable challenge in accurately modeling them. Motion prediction systems must effectively learn spatial and temporal information from the past to forecast the future trajectories of the agent. Many existing methods learn temporal motion via separate components within stacked models to capture temporal features. Furthermore, prediction methods often operate under the assumption that observed trajectory waypoint sequences are complete, disregarding scenarios where missing values may occur, which can influence their performance. Moreover, these models may be biased toward particular waypoint sequences when making predictions. We propose a novel approach called Temporal Waypoint Dropping (TWD) that explicitly incorporates temporal dependencies during the training of a trajectory prediction model. By stochastically dropping waypoints from past observed trajectories, the model is forced to learn the underlying temporal representation from the remaining waypoints, resulting in an improved model. Incorporating stochastic temporal waypoint dropping into the model learning process significantly enhances its performance in scenarios with missing values. Experimental results demonstrate our approach's substantial improvement in trajectory prediction capabilities. Our approach can complement existing trajectory prediction methods to improve their prediction accuracy. We evaluate our proposed approach on three datasets: NBA Sports VU, ETH-UCY, and TrajNet++.Comment: Under Revie

    Accuracy Booster: Performance Boosting using Feature Map Re-calibration

    Get PDF
    Convolution Neural Networks (CNN) have been extremely successful in solving intensive computer vision tasks. The convolutional filters used in CNNs have played a major role in this success, by extracting useful features from the inputs. Recently researchers have tried to boost the performance of CNNs by re-calibrating the feature maps produced by these filters, e.g., Squeeze-and-Excitation Networks (SENets). These approaches have achieved better performance by Exciting up the important channels or feature maps while diminishing the rest. However, in the process, architectural complexity has increased. We propose an architectural block that introduces much lower complexity than the existing methods of CNN performance boosting while performing significantly better than them. We carry out experiments on the CIFAR, ImageNet and MS-COCO datasets, and show that the proposed block can challenge the state-of-the-art results. Our method boosts the ResNet-50 architecture to perform comparably to the ResNet-152 architecture, which is a three times deeper network, on classification. We also show experimentally that our method is not limited to classification but also generalizes well to other tasks such as object detection.Comment: IEEE Winter Conference on Applications of Computer Vision (WACV), 202

    Deep learning for unsupervised domain adaptation in medical imaging: Recent advancements and future perspectives

    Full text link
    Deep learning has demonstrated remarkable performance across various tasks in medical imaging. However, these approaches primarily focus on supervised learning, assuming that the training and testing data are drawn from the same distribution. Unfortunately, this assumption may not always hold true in practice. To address these issues, unsupervised domain adaptation (UDA) techniques have been developed to transfer knowledge from a labeled domain to a related but unlabeled domain. In recent years, significant advancements have been made in UDA, resulting in a wide range of methodologies, including feature alignment, image translation, self-supervision, and disentangled representation methods, among others. In this paper, we provide a comprehensive literature review of recent deep UDA approaches in medical imaging from a technical perspective. Specifically, we categorize current UDA research in medical imaging into six groups and further divide them into finer subcategories based on the different tasks they perform. We also discuss the respective datasets used in the studies to assess the divergence between the different domains. Finally, we discuss emerging areas and provide insights and discussions on future research directions to conclude this survey.Comment: Under Revie

    Minimizing Supervision in Multi-label Categorization

    Full text link
    Multiple categories of objects are present in most images. Treating this as a multi-class classification is not justified. We treat this as a multi-label classification problem. In this paper, we further aim to minimize the supervision required for providing supervision in multi-label classification. Specifically, we investigate an effective class of approaches that associate a weak localization with each category either in terms of the bounding box or segmentation mask. Doing so improves the accuracy of multi-label categorization. The approach we adopt is one of active learning, i.e., incrementally selecting a set of samples that need supervision based on the current model, obtaining supervision for these samples, retraining the model with the additional set of supervised samples and proceeding again to select the next set of samples. A crucial concern is the choice of the set of samples. In doing so, we provide a novel insight, and no specific measure succeeds in obtaining a consistently improved selection criterion. We, therefore, provide a selection criterion that consistently improves the overall baseline criterion by choosing the top k set of samples for a varied set of criteria. Using this criterion, we are able to show that we can retain more than 98% of the fully supervised performance with just 20% of samples (and more than 96% using 10%) of the dataset on PASCAL VOC 2007 and 2012. Also, our proposed approach consistently outperforms all other baseline metrics for all benchmark datasets and model combinations.Comment: Accepted in CVPR-W 202

    Small and Dim Target Detection in IR Imagery: A Review

    Full text link
    While there has been significant progress in object detection using conventional image processing and machine learning algorithms, exploring small and dim target detection in the IR domain is a relatively new area of study. The majority of small and dim target detection methods are derived from conventional object detection algorithms, albeit with some alterations. The task of detecting small and dim targets in IR imagery is complex. This is because these targets often need distinct features, the background is cluttered with unclear details, and the IR signatures of the scene can change over time due to fluctuations in thermodynamics. The primary objective of this review is to highlight the progress made in this field. This is the first review in the field of small and dim target detection in infrared imagery, encompassing various methodologies ranging from conventional image processing to cutting-edge deep learning-based approaches. The authors have also introduced a taxonomy of such approaches. There are two main types of approaches: methodologies using several frames for detection, and single-frame-based detection techniques. Single frame-based detection techniques encompass a diverse range of methods, spanning from traditional image processing-based approaches to more advanced deep learning methodologies. Our findings indicate that deep learning approaches perform better than traditional image processing-based approaches. In addition, a comprehensive compilation of various available datasets has also been provided. Furthermore, this review identifies the gaps and limitations in existing techniques, paving the way for future research and development in this area.Comment: Under Revie

    Data efficient deep learning for medical image analysis: A survey

    Full text link
    The rapid evolution of deep learning has significantly advanced the field of medical image analysis. However, despite these achievements, the further enhancement of deep learning models for medical image analysis faces a significant challenge due to the scarcity of large, well-annotated datasets. To address this issue, recent years have witnessed a growing emphasis on the development of data-efficient deep learning methods. This paper conducts a thorough review of data-efficient deep learning methods for medical image analysis. To this end, we categorize these methods based on the level of supervision they rely on, encompassing categories such as no supervision, inexact supervision, incomplete supervision, inaccurate supervision, and only limited supervision. We further divide these categories into finer subcategories. For example, we categorize inexact supervision into multiple instance learning and learning with weak annotations. Similarly, we categorize incomplete supervision into semi-supervised learning, active learning, and domain-adaptive learning and so on. Furthermore, we systematically summarize commonly used datasets for data efficient deep learning in medical image analysis and investigate future research directions to conclude this survey.Comment: Under Revie

    CPWC:Contextual Point Wise Convolution for Object Recognition

    Get PDF
    Convolutional layers are a major driving force behind the successes of deep learning. Pointwise convolution (PWC) is a 1x1 convolutional filter that is primarily used for parameter reduction. However, the PWC ignores the spatial information around the points it is processing. This design is by choice, in order to reduce the overall parameters and computations. However, we hypothesize that this shortcoming of PWC has a significant impact on the network performance. We propose an alternative design for pointwise convolution, which uses spatial information from the input efficiently. Our design significantly improves the performance of the networks without substantially increasing the number of parameters and computations. We experimentally show that our design results in significant improvement in the performance of the network for classification as well as detection.Comment: Accepted in ICASSP 202
    • …
    corecore