
    Tracking-by-Trackers with a Distilled and Reinforced Model

    Visual object tracking has generally been tackled by reasoning independently about fast processing algorithms, accurate online adaptation methods, and the fusion of trackers. In this paper, we unify these goals by proposing a novel tracking methodology that takes advantage of other visual trackers, both offline and online. A compact student model is trained by combining knowledge distillation and reinforcement learning. The first transfers and compresses the tracking knowledge of other trackers; the second enables the learning of evaluation measures that are then exploited online. After learning, the student can be used to build (i) a very fast single-shot tracker, (ii) a tracker with a simple and effective online adaptation mechanism, and (iii) a tracker that performs fusion of other trackers. Extensive validation shows that the proposed algorithms compete with real-time state-of-the-art trackers.
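The two learning signals described above can be sketched in a few lines. This is an illustrative sketch, not the paper's actual implementation: a distillation-style regression loss pulling the student's boxes toward a teacher tracker's, and an IoU-based reward of the kind a reinforcement-learning stage could use as an evaluation measure. All function and variable names are hypothetical.

```python
import numpy as np

def distillation_loss(student_boxes, teacher_boxes):
    """Mean squared error between student and teacher [x, y, w, h] boxes."""
    s = np.asarray(student_boxes, dtype=float)
    t = np.asarray(teacher_boxes, dtype=float)
    return float(np.mean((s - t) ** 2))

def iou_reward(box_a, box_b):
    """Intersection-over-union of two [x, y, w, h] boxes, usable as a reward."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))  # overlap width
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))  # overlap height
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0
```

A perfect student incurs zero loss, and the reward approaches 1 as the predicted box converges on the ground truth.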

    Improving MRI-based Knee Disorder Diagnosis with Pyramidal Feature Details

    This paper presents MRPyrNet, a new convolutional neural network (CNN) architecture that improves the capabilities of CNN-based pipelines for knee injury detection via magnetic resonance imaging (MRI). Prior work showed that anomalies are localized in small knee regions that appear in particular areas of MRI scans. Building on this observation, MRPyrNet exploits a Feature Pyramid Network to enhance small-scale features and a Pyramidal Detail Pooling module to capture such relevant information robustly. Experimental results on two publicly available datasets demonstrate that MRPyrNet improves the ACL-tear and meniscal-tear diagnosis capabilities of two state-of-the-art methodologies. Code is available at https://git.io/JtMPH.
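The pooling idea above — keeping the strongest, most localized responses from each pyramid level rather than averaging them away — can be sketched minimally. This is a hedged illustration of the general principle, not the paper's actual module; the function name and top-k choice are assumptions.

```python
import numpy as np

def pyramidal_detail_pooling(feature_maps, k=2):
    """For each pyramid level, keep the top-k activations (the small,
    localized details) and concatenate them into one descriptor."""
    pooled = []
    for fm in feature_maps:
        flat = np.sort(np.asarray(fm, dtype=float).ravel())[::-1]  # descending
        pooled.append(flat[:k])
    return np.concatenate(pooled)
```

Max-style pooling preserves small, high-activation details that global average pooling would dilute — the motivation the abstract gives for targeting small-sized anomaly regions.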

    Visualizing Skiers' Trajectories in Monocular Videos

    Trajectories are fundamental to winning in alpine skiing. Tools enabling the analysis of such curves can enhance training activity and enrich broadcast content. In this paper, we propose SkiTraVis, an algorithm to visualize the sequence of points traversed by a skier during a run. SkiTraVis works on monocular videos and combines a visual tracker, which models the skier's motion, with a frame-correspondence module, which estimates the camera's motion. Separating the two motions enables the visualization of the trajectory from the moving camera's perspective. We performed experiments on videos of real-world professional competitions to quantify the visualization error, the computational efficiency, and the applicability. Overall, the results demonstrate the potential of our solution for broadcast media enhancement and coach assistance. Comment: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), CVsports workshop
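The camera-motion compensation step can be sketched as chaining per-frame homographies: each past trajectory point is re-warped into the current frame's coordinates before the newly tracked point is appended. This is a minimal sketch of the general technique, assuming a 3×3 homography per frame; the function names are illustrative, not from the paper.

```python
import numpy as np

def warp_point(H, pt):
    """Map a 2-D point through a 3x3 homography (camera-motion model)."""
    x, y = pt
    v = H @ np.array([x, y, 1.0])
    return (v[0] / v[2], v[1] / v[2])  # back from homogeneous coordinates

def accumulate_trajectory(per_frame_H, tracked_points):
    """Chain per-frame homographies so every past point is expressed in
    the current frame's coordinates, then append the new tracked point."""
    trajectory = []
    for H, pt in zip(per_frame_H, tracked_points):
        trajectory = [warp_point(H, p) for p in trajectory]  # re-warp history
        trajectory.append(pt)
    return trajectory
```

With identity homographies the trajectory is just the raw tracked points; a translation homography shifts the history exactly as the camera pans.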

    Is First Person Vision Challenging for Object Tracking?

    Understanding human-object interactions is fundamental in First Person Vision (FPV). Tracking algorithms that follow the objects manipulated by the camera wearer can provide useful cues for modeling such interactions effectively. Visual tracking solutions in the computer vision literature have significantly improved their performance in recent years across a large variety of target objects and tracking scenarios. However, despite a few previous attempts to exploit trackers in FPV applications, a methodical analysis of the performance of state-of-the-art trackers in this domain is still missing. In this paper, we fill the gap by presenting the first systematic study of object tracking in FPV. Our study extensively analyzes the performance of recent visual trackers and baseline FPV trackers with respect to different aspects and a new performance measure. This is achieved through TREK-150, a novel benchmark dataset composed of 150 densely annotated video sequences. Our results show that object tracking in FPV is challenging, suggesting that more research effort should be devoted to this problem so that tracking can benefit FPV tasks. Comment: IEEE/CVF International Conference on Computer Vision (ICCV) 2021, Visual Object Tracking Challenge VOT2021 workshop. arXiv admin note: text overlap with arXiv:2011.1226
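Tracker evaluations of this kind typically score per-frame overlap between predicted and ground-truth boxes. As a hedged sketch of the standard success measure (not necessarily the paper's new measure), success rate at a threshold and its average over thresholds look like this:

```python
def success_rate(overlaps, threshold=0.5):
    """Fraction of frames whose predicted/ground-truth overlap (IoU)
    meets the threshold -- the standard tracking success measure."""
    return sum(o >= threshold for o in overlaps) / len(overlaps)

def success_auc(overlaps, steps=21):
    """Average success rate over evenly spaced thresholds in [0, 1],
    the usual area-under-curve summary score."""
    thresholds = [i / (steps - 1) for i in range(steps)]
    return sum(success_rate(overlaps, t) for t in thresholds) / steps
```

A tracker that keeps high overlap on every frame scores near 1.0; frequent drift or loss in challenging FPV sequences pulls both numbers down.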

    One-class Gaussian process regressor for quality assessment of transperineal ultrasound images

    The use of ultrasound guidance in prostate cancer radiotherapy workflows is not widespread. This can be partially attributed to the need for image interpretation by a trained operator during ultrasound image acquisition. In this work, a one-class regressor, based on DenseNet and Gaussian processes, was implemented to automatically assess the quality of transperineal ultrasound images of the male pelvic region. The implemented deep learning approach achieved a scoring accuracy of 94%, a specificity of 95% and a sensitivity of 93% with respect to the majority vote of three experts, which was comparable with the results of these experts. This is a first step towards a fully automatic workflow, which could potentially remove the need for image interpretation and thereby make the use of ultrasound imaging, which allows real-time volumetric organ tracking in the RT environment, more appealing for hospitals.
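The reported accuracy, sensitivity and specificity against the experts' majority vote are standard confusion-matrix quantities. As a small sketch (labels and function name are illustrative, with 1 meaning an acceptable-quality image):

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity of binary quality labels
    against a reference labeling such as an expert majority vote."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # true-positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0  # true-negative rate
    return accuracy, sensitivity, specificity
```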

    The Ninth Visual Object Tracking VOT2021 Challenge Results

    Accepted version. Peer reviewed.

    Visual Object Tracking with Deep Learning

    In its simplest definition, the problem of visual object tracking consists in making a computer recognize and persistently localize a target object in a video. This is a core problem in computer vision that aims to replicate the human ability to keep a particular object in focus by sight. In the past, several different algorithmic principles have been proposed to reach such a capability. Thanks to tremendous improvements in accuracy, recent algorithms based on deep learning have emerged as promising methodologies to achieve this goal. The fundamental idea behind these techniques is to exploit the ability of deep neural networks to learn complex functions, and thereby learn how to track objects from visual examples. The potential of this kind of tool has attracted so much interest from the research community that deep learning is nowadays the go-to approach for implementing effective visual tracking algorithms. Despite this popularity, the study of deep neural networks for visual tracking is still at a relatively early stage, and many open issues must be addressed to fully comprehend the capabilities and potential of such learning models. In this Thesis, we try to answer some of these questions.

    Surrogate Safety Measures Prediction at Multiple Timescales in V2P Conflicts Based on Gated Recurrent Unit

    Improving pedestrian safety at urban intersections requires intelligent systems that not only understand the current vehicle–pedestrian (V2P) interaction state but also proactively anticipate the event's future severity pattern. This paper presents a Gated Recurrent Unit-based system that aims to predict, up to 3 s ahead in time, the severity level of V2P encounters from the current scene representation drawn from on-board radars' data. A car-driving simulator experiment was designed to collect sequential mobility features from a cohort of 65 licensed university students who faced different V2P conflicts on a planned urban route. To accurately describe the pedestrian's safety condition during the encounter process, a combination of surrogate safety indicators, namely TAdv (Time Advantage) and T2 (Nearness of the Encroachment), is considered for modeling. Due to the nature of these indicators, multiple recurrent neural networks are trained to separately predict T2 continuous values and TAdv categories. Afterwards, their predictions are exploited to label serious conflict interactions. As a comparison, an additional Gated Recurrent Unit (GRU) neural network is developed to directly predict the severity level of inner-city encounters. The latter model reaches the best performance on the test set, scoring a recall value of 0.899. Based on selected threshold values, the presented models can be used to label pedestrian near-accident events and to enhance existing intelligent driving systems.
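The recurrent unit all of these models are built on can be written out in a few lines. This is a minimal, illustrative GRU cell in NumPy showing the standard gate equations — a sketch of the building block, not the paper's actual networks; shapes and names are assumptions.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_cell(x, h, params):
    """One GRU step: gates decide how much of the past state to keep."""
    Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh = params
    z = sigmoid(Wz @ x + Uz @ h + bz)              # update gate
    r = sigmoid(Wr @ x + Ur @ h + br)              # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h) + bh)  # candidate state
    return (1.0 - z) * h + z * h_tilde

def gru_forward(sequence, h0, params):
    """Run the cell over a sequence of feature vectors (e.g. per-timestep
    mobility features) and return the final hidden state."""
    h = h0
    for x in sequence:
        h = gru_cell(x, h, params)
    return h
```

A linear read-out on the final hidden state would then produce either the continuous T2 value (regression) or the TAdv/severity category (classification), matching the two model variants the abstract describes.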

    Combining complementary trackers for enhanced long-term visual object tracking

    Several algorithms have been studied to combine the capabilities of baseline trackers in the context of short-term visual object tracking. Despite this extended interest, the long-term setting has not been considered by previous studies. In this paper, we explicitly consider long-term tracking scenarios and provide a framework to fuse the characteristics of complementary state-of-the-art trackers to achieve enhanced tracking performance. Our strategy determines, through an online-learned deep verification model, whether each of the two trackers is following the target object. This target-recognition step activates a decision strategy that selects the best-performing tracker and corrects their behavior upon failure. The proposed solution is studied extensively, and comparison with several other approaches shows that it beats the state of the art on the long-term visual tracking benchmarks LTB-35, LTB-50, TLP, and LaSOT.
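The decision step can be sketched as follows: take the candidate box whose verification score is higher, and flag the result for correction when even the best score is unconvincing. This is a hedged sketch of the selection logic only; the names and threshold are illustrative, not the paper's actual mechanism.

```python
def fuse_trackers(box_a, score_a, box_b, score_b, threshold=0.5):
    """Pick the candidate box with the higher verification score; if even
    the best score falls below the threshold, flag that a correction
    (e.g. re-detection of the target) is needed."""
    if score_a >= score_b:
        best_box, best_score = box_a, score_a
    else:
        best_box, best_score = box_b, score_b
    return best_box, best_score < threshold
```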