62 research outputs found

    HCT: Hybrid Convnet-Transformer for Parkinson's disease detection and severity prediction from gait

    Full text link
    In this paper, we propose a novel deep learning method based on a new Hybrid ConvNet-Transformer architecture to detect and stage Parkinson's disease (PD) from gait data. We adopt a two-step approach by dividing the problem into two sub-problems. Our Hybrid ConvNet-Transformer model first distinguishes healthy versus parkinsonian patients. If the patient is parkinsonian, a multi-class Hybrid ConvNet-Transformer model determines the Hoehn and Yahr (H&Y) score to assess the PD severity stage. Our hybrid architecture exploits the strengths of both Convolutional Neural Networks (ConvNets) and Transformers to accurately detect PD and determine the severity stage. In particular, we take advantage of ConvNets to capture local patterns and correlations in the data, while we exploit Transformers for handling long-term dependencies in the input signal. We show that our hybrid method achieves superior performance when compared to other state-of-the-art methods, with a PD detection accuracy of 97% and a severity staging accuracy of 87%. Our source code is available at: https://github.com/SafwenNaimiComment: 6 pages, 6 figures, 3 tables, Accepted for publication in IEEE International Conference on Machine Learning and Applications (ICMLA), copyright IEE

    STF: Spatio-Temporal Fusion Module for Improving Video Object Detection

    Full text link
    Consecutive frames in a video contain redundancy, but they may also contain relevant complementary information for the detection task. The objective of our work is to leverage this complementary information to improve detection. Therefore, we propose a spatio-temporal fusion framework (STF). We first introduce multi-frame and single-frame attention modules that allow a neural network to share feature maps between nearby frames to obtain more robust object representations. Second, we introduce a dual-frame fusion module that merges feature maps in a learnable manner to improve them. Our evaluation is conducted on three different benchmarks including video sequences of moving road users. The performed experiments demonstrate that the proposed spatio-temporal fusion module leads to improved detection performance compared to baseline object detectors. Code is available at https://github.com/noreenanwar/STF-moduleComment: 8 pages,3 figure

    Multi-Branch Siamese Networks with Online Selection for Object Tracking

    Get PDF
    In this paper, we propose a robust object tracking algorithm based on a branch selection mechanism to choose the most efficient object representations from multi-branch siamese networks. While most deep learning trackers use a single CNN for target representation, the proposed Multi-Branch Siamese Tracker (MBST) employs multiple branches of CNNs pre-trained for different tasks, and used for various target representations in our tracking method. With our branch selection mechanism, the appropriate CNN branch is selected depending on the target characteristics in an online manner. By using the most adequate target representation with respect to the tracked object, our method achieves real-time tracking, while obtaining improved performance compared to standard Siamese network trackers on object tracking benchmarks.Comment: ISVC2018, oral presentatio

    Reproducible Evaluation of Pan-Tilt-Zoom Tracking

    Get PDF
    Tracking with a Pan-Tilt-Zoom (PTZ) camera has been a research topic in computer vision for many years. However, it is very difficult to assess the progress that has been made on this topic because there is no standard evaluation methodology. The difficulty in evaluating PTZ tracking algorithms arises from their dynamic nature. In contrast to other forms of tracking, PTZ tracking involves both locating the target in the image and controlling the motors of the camera to aim it so that the target stays in its field of view. This type of tracking can only be performed online. In this paper, we propose a new evaluation framework based on a virtual PTZ camera. With this framework, tracking scenarios do not change for each experiment and we are able to replicate online PTZ camera control and behavior including camera positioning delays, tracker processing delays, and numerical zoom. We tested our evaluation framework with the Camshift tracker to show its viability and to establish baseline results.Comment: This is an extended version of the 2015 ICIP paper "Reproducible Evaluation of Pan-Tilt-Zoom Tracking

    Automatic counting of planting microsites via local visual detection and global count estimation

    Get PDF
    In forest industry, mechanical site preparation by mounding is widely used prior to planting operations. One of the main problems when planning planting operations is the difficulty in estimating the number of mounds present on a planting block, as their number may greatly vary depending on site characteristics. This estimation is often carried out through field surveys by several forestry workers. However, this procedure is prone to error and slowness. Motivated by recent advances in UAV imagery and artificial intelligence, we propose a fully automated framework to estimate the number of mounds on a planting block. Using computer vision and machine learning, we formulate the counting task as a supervised learning problem using two prediction models. A local detection model is firstly used to detect visible mounds based on deep features, while a global prediction function is subsequently applied to provide a final estimation based on block-level features. To evaluate the proposed method, we constructed a challenging UAV dataset representing several plantation blocks with different characteristics. The performed experiments demonstrated the robustness of the proposed method, which outperforms manual methods in precision, while significantly reducing time and cost

    HCT: Hybrid Convnet-Transformer for Parkinson’s disease detection and severity prediction from gait

    Get PDF

    Transformers for 1D Signals in Parkinson's Disease Detection from Gait

    Full text link
    This paper focuses on the detection of Parkinson's disease based on the analysis of a patient's gait. The growing popularity and success of Transformer networks in natural language processing and image recognition motivated us to develop a novel method for this problem based on an automatic features extraction via Transformers. The use of Transformers in 1D signal is not really widespread yet, but we show in this paper that they are effective in extracting relevant features from 1D signals. As Transformers require a lot of memory, we decoupled temporal and spatial information to make the model smaller. Our architecture used temporal Transformers, dimension reduction layers to reduce the dimension of the data, a spatial Transformer, two fully connected layers and an output layer for the final prediction. Our model outperforms the current state-of-the-art algorithm with 95.2\% accuracy in distinguishing a Parkinsonian patient from a healthy one on the Physionet dataset. A key learning from this work is that Transformers allow for greater stability in results. The source code and pre-trained models are released in https://github.com/DucMinhDimitriNguyen/Transformers-for-1D-signals-in-Parkinson-s-disease-detection-from-gait.gitComment: International Conference on Pattern Recognition (ICPR 2022
    • …
    corecore