10 research outputs found

    Visual object tracking in dynamic scenes

    Get PDF
    Visual object tracking is a fundamental task in the field of computer vision, and it is widely used in numerous applications including, but not limited to, video surveillance, image understanding, robotics, and human-computer interaction. In essence, visual object tracking is the problem of estimating the state/trajectory of the object of interest over time. Unlike other tasks such as object detection, where the classes/categories are defined beforehand, the only information available about the object of interest is given in the first frame. Even though Deep Learning (DL) has revolutionised most computer vision tasks, visual object tracking still poses several challenges: the task is stochastic in nature, with no prior knowledge about the object of interest available during training or testing/inference, and it is class-agnostic, as opposed to object detection and segmentation tasks. The main objective of this thesis is to develop and advance visual object trackers using novel designs of deep learning frameworks and mathematical formulations.

    To take advantage of different trackers, a novel framework is developed to track moving objects based on a composite framework and a reporter mechanism. The composite framework uses built-in trackers and user-defined trackers to track the object of interest, and contains a module that calculates a robustness score for each tracker; the reporter mechanism serves as a recovery mechanism when the trackers fail to locate the object of interest.

    Since different trackers may still fail to track the object of interest, a more robust framework based on a Siamese network architecture, namely DensSiam, is proposed. DensSiam uses the concept of dense layers and connects each dense layer in the network to all layers in a feed-forward fashion with a similarity-learning function. It also includes a self-attention mechanism that forces the network to pay more attention to non-local features during offline training.

    Generally, Siamese trackers do not fully utilize the semantic and objectness information of pre-trained networks that were trained on an image classification task. To solve this problem, a novel architecture, dubbed DomainSiam, is proposed to learn a Domain-Aware network that fully utilizes semantic and objectness information while producing a class-agnostic tracker using a ridge regression network. Moreover, to reduce the sparsity problem, the ridge regression problem is solved with a differentiable weighted-dynamic loss function.

    Siamese trackers are fast and run in real-time, but they lack high accuracy. To overcome this challenge, a novel dynamic policy gradient agent-environment architecture with a Siamese network (DP-Siam) is proposed to train the tracker to increase the accuracy and the expected average overlap while running in real-time. DP-Siam is trained offline with reinforcement learning to produce a continuous action that predicts the optimal object location.

    A common design block in most object trackers in the literature is the backbone network, which is trained in the feature space. A novel framework, called NullSpaceNet, is proposed to design a backbone network that maps from the feature space to another space (the joint-nullspace) that is more suitable for object tracking and classification. NullSpaceNet has a clear interpretation of the feature representation, and the features in this space are more separable.
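    The abstract does not spell out the joint-nullspace objective. Assuming it follows the classical null-space discriminant idea (an assumption, not a quotation from the thesis), the backbone would seek a projection W lying in the null space of the within-class scatter while preserving the total scatter, which is what collapses all samples of one class to a single point:

```latex
% Hedged reading of the joint-nullspace objective; S_w is the
% within-class scatter and S_t the total scatter of the features.
\[
W^{\ast} \in \left\{ W \;:\; W^{\top} S_w W = 0, \quad W^{\top} S_t W = I \right\}
\]
% W^T S_w W = 0 maps every sample of a class onto its class mean
% (one point per class); W^T S_t W = I keeps distinct classes apart.
```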
    NullSpaceNet is utilized in object tracking by regularizing the discriminative joint-nullspace backbone network, yielding a tracker dubbed NullSpaceRDAR that is specifically designed for object tracking. NullSpaceRDAR encourages the network to represent the target-specific information about the object of interest in the joint-nullspace. This contrasts with the feature space, where objects from a given class are grouped into one category but the representation is insensitive to intra-class variations. In the regularized discriminative joint-nullspace, features from the same target are collapsed into one point and features from different targets are collapsed into different points; consequently, the joint-nullspace forces the network to be sensitive to variations among objects of the same class (intra-class variations). Moreover, a dynamic adaptive loss function is proposed that selects a suitable loss function from a super-set family of losses based on the training data, making NullSpaceRDAR more robust to different challenges.
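    The dynamic adaptive loss is described above only at a high level. The sketch below shows one plausible reading in PyTorch, where the loss applied to a batch is chosen from a small candidate family according to how outlier-heavy the current residuals are; the family members, the threshold, and the selection rule are illustrative assumptions, not the thesis's formulation.

```python
import torch
import torch.nn.functional as F

def dynamic_adaptive_loss(pred, target, outlier_thresh=1.0):
    """Choose a loss from a candidate family based on the training data.

    Hypothetical stand-in for NullSpaceRDAR's dynamic adaptive loss: if
    many residuals are large (an outlier-heavy batch), fall back to the
    robust Huber loss; otherwise use plain L2. The thesis's actual
    super-set family and selection criterion are not reproduced here.
    """
    resid = (pred - target).abs()
    outlier_frac = (resid > outlier_thresh).float().mean()
    loss_fn = F.smooth_l1_loss if outlier_frac > 0.25 else F.mse_loss
    return loss_fn(pred, target)

# Toy usage on random data.
pred = torch.randn(8, 4, requires_grad=True)
target = torch.randn(8, 4)
dynamic_adaptive_loss(pred, target).backward()
```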

    Adaptive Framework for Robust Visual Tracking

    Get PDF
    Visual tracking is a difficult and challenging problem for numerous reasons, such as small object size, pose angle variations, occlusion, and camera motion. Object tracking has many real-world applications, such as surveillance systems, moving organs in medical imaging, and robotics. Traditional tracking methods lack a recovery mechanism that can be used when the tracked objects drift away from the ground truth. In this paper, we propose a novel framework for tracking moving objects based on a composite framework and a reporter mechanism. The composite framework tracks moving objects using different trackers and produces pairs of forward/backward tracklets. A robustness score is then calculated for each tracker from its forward/backward tracklet pair to find the most reliable moving object trajectory. The reporter serves as the recovery mechanism, correcting the moving object trajectory when the robustness score is very low, mainly using a combination of a particle filter and template matching. The proposed framework can handle partial and heavy occlusions; moreover, its structure enables the integration of other user-specific trackers. Extensive experiments on recent benchmarks show that the proposed framework outperforms other current state-of-the-art trackers thanks to its powerful trajectory analysis and recovery mechanism; the framework improves the area under the curve from 68% to 70.8% on the OTB-100 benchmark.
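    As a concrete illustration of the robustness score, the sketch below computes a forward/backward consistency score in the spirit described above: each tracker is run forward, re-run backward from the last frame, and the disagreement between the two tracklets is mapped into (0, 1]. The Gaussian scoring function and its bandwidth are illustrative assumptions; the paper's exact formula is not reproduced here.

```python
import numpy as np

def robustness_score(forward, backward, sigma=10.0):
    """Score one tracker from its forward/backward tracklet pair.

    forward, backward: (T, 2) arrays of object centers per frame, where
    backward was produced by re-tracking from the last forward frame
    back to the first. Small forward/backward disagreement suggests a
    reliable trajectory; the reporter would take over when this score
    is very low. Illustrative stand-in, not the paper's exact formula.
    """
    err = np.linalg.norm(forward - backward[::-1], axis=1).mean()
    return float(np.exp(-err**2 / (2.0 * sigma**2)))

# Toy usage: a nearly consistent pair scores close to 1.
fwd = np.cumsum(np.ones((30, 2)), axis=0)
bwd = fwd[::-1] + np.random.default_rng(0).normal(0, 0.5, (30, 2))
print(robustness_score(fwd, bwd))
```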

    The Ninth Visual Object Tracking VOT2021 Challenge Results

    Get PDF

    DP-Siam: Dynamic Policy Siamese Network for Robust Object Tracking

    No full text

    CORONA-Net: Diagnosing COVID-19 from X-ray Images Using Re-Initialization and Classification Networks

    No full text
    The COVID-19 pandemic has been declared a global health emergency. Early detection of COVID-19 is key to combating its outbreak and could help bring this pandemic to an end. One of the biggest challenges in combating COVID-19 is accurate testing for the disease. Utilizing the power of Convolutional Neural Networks (CNNs) to detect COVID-19 from chest X-ray images can help radiologists compare and validate their results against an automated system. In this paper, we propose a carefully designed network, dubbed CORONA-Net, that can accurately detect COVID-19 from chest X-ray images. CORONA-Net is divided into two phases: (1) the re-initialization phase and (2) the classification phase. In the re-initialization phase, the network consists of encoder and decoder networks; the objective of this phase is to train and initialize the encoder and decoder with a distribution derived from medical images. In the classification phase, the decoder is removed from CORONA-Net, and the encoder acts as a backbone network that is fine-tuned for classification starting from the weights learned in the re-initialization phase. Extensive experiments were performed on a publicly available dataset, COVIDx, and the results show that CORONA-Net significantly outperforms current state-of-the-art networks with an overall accuracy of 95.84%.
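    To make the two-phase scheme concrete, here is a minimal PyTorch sketch. The layer sizes, tensor shapes, and the 3-class head are illustrative assumptions, not CORONA-Net's actual architecture; only the phase structure (reconstruction-based re-initialization, then encoder-as-backbone fine-tuning) follows the description above.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),
        )

    def forward(self, z):
        return self.net(z)

enc, dec = Encoder(), Decoder()
x = torch.randn(8, 1, 64, 64)  # stand-in batch of chest X-rays

# Phase 1: re-initialization -- train encoder+decoder to reconstruct the
# images so the encoder weights start from a medical-image distribution.
opt1 = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
opt1.zero_grad()
nn.functional.mse_loss(dec(enc(x)), x).backward()
opt1.step()

# Phase 2: classification -- drop the decoder and fine-tune the encoder
# as a backbone with a classification head (3 classes assumed here).
clf = nn.Sequential(enc, nn.Flatten(), nn.Linear(32 * 16 * 16, 3))
labels = torch.randint(0, 3, (8,))
opt2 = torch.optim.Adam(clf.parameters(), lr=1e-4)
opt2.zero_grad()
nn.functional.cross_entropy(clf(x), labels).backward()
opt2.step()
```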

    Deep Learning-Based Crowd Scene Analysis Survey

    No full text
    Recently, our world witnessed major events that attracted a lot of attention to the importance of automatic crowd scene analysis. For example, the COVID-19 outbreak and public events require an automatic system to manage, count, secure, and track a crowd sharing the same area. However, analyzing crowd scenes is very challenging due to heavy occlusion, complex behaviors, and posture changes. This paper surveys deep learning-based methods for analyzing crowded scenes. The reviewed methods are categorized as (1) crowd counting and (2) crowd action recognition. Moreover, crowd scene datasets are surveyed. In addition to the above surveys, this paper proposes an evaluation metric for crowd scene analysis methods. This metric estimates the difference between the calculated crowd count and the actual count in crowd scene videos.
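    The proposed metric is described above only as the difference between the calculated and actual counts. The snippet below shows the conventional way such a difference is scored, per-frame mean absolute error over a video, as an assumed stand-in rather than the survey's exact definition.

```python
import numpy as np

def count_error(predicted_counts, actual_counts):
    """Mean absolute error between predicted and ground-truth crowd counts.

    Assumed stand-in for the survey's evaluation metric: both arguments
    are per-frame counts over one crowd scene video.
    """
    predicted = np.asarray(predicted_counts, dtype=float)
    actual = np.asarray(actual_counts, dtype=float)
    return float(np.abs(predicted - actual).mean())

print(count_error([105, 98, 112], [100, 100, 110]))  # -> 3.0
```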

    The Sixth Visual Object Tracking VOT2018 Challenge Results

    Get PDF
    The Visual Object Tracking challenge VOT2018 is the sixth annual tracker benchmarking activity organized by the VOT initiative. Results of over eighty trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or in journals in recent years. The evaluation included the standard VOT and other popular methodologies for short-term tracking analysis, as well as a “real-time” experiment simulating a situation where a tracker processes images as if provided by a continuously running sensor. A long-term tracking sub-challenge has been introduced to the set of standard VOT sub-challenges. The new sub-challenge focuses on long-term tracking properties, namely coping with target disappearance and reappearance. A new dataset has been compiled, and a performance evaluation methodology that focuses on long-term tracking capabilities has been adopted. The VOT toolkit has been updated to support both the standard short-term and the new long-term tracking sub-challenges. The performance of the tested trackers typically far exceeds standard baselines. The source code for most of the trackers is publicly available from the VOT page. The dataset, the evaluation kit, and the results are publicly available at the challenge website (http://votchallenge.net).