HCT: Hybrid Convnet-Transformer for Parkinson's disease detection and severity prediction from gait
In this paper, we propose a novel deep learning method based on a new Hybrid
ConvNet-Transformer architecture to detect and stage Parkinson's disease (PD)
from gait data. We adopt a two-step approach by dividing the problem into two
sub-problems. Our Hybrid ConvNet-Transformer model first distinguishes healthy
versus parkinsonian patients. If the patient is parkinsonian, a multi-class
Hybrid ConvNet-Transformer model determines the Hoehn and Yahr (H&Y) score to
assess the PD severity stage. Our hybrid architecture exploits the strengths of
both Convolutional Neural Networks (ConvNets) and Transformers to accurately
detect PD and determine the severity stage. In particular, we take advantage of
ConvNets to capture local patterns and correlations in the data, while we
exploit Transformers for handling long-term dependencies in the input signal.
We show that our hybrid method achieves superior performance when compared to
other state-of-the-art methods, with a PD detection accuracy of 97% and a
severity staging accuracy of 87%. Our source code is available at:
https://github.com/SafwenNaimi
Comment: 6 pages, 6 figures, 3 tables. Accepted for publication in IEEE International Conference on Machine Learning and Applications (ICMLA), copyright IEEE.
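As a rough illustration of the two-step pipeline described above, the following PyTorch sketch combines a Conv1d front-end (local patterns) with a Transformer encoder (long-range temporal dependencies), and uses one binary model for detection followed by a multi-class model for H&Y staging. All shapes, hyperparameters, and the number of severity classes are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class HybridConvNetTransformer(nn.Module):
    """Conv1d front-end for local patterns, Transformer encoder for long-range dependencies."""
    def __init__(self, in_channels=18, d_model=64, n_classes=2):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(in_channels, d_model, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.MaxPool1d(2),
        )
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):                     # x: (batch, channels, time)
        z = self.conv(x).transpose(1, 2)      # (batch, time/2, d_model)
        z = self.transformer(z).mean(dim=1)   # average-pool over time steps
        return self.head(z)

# Step 1: binary detection; step 2: multi-class H&Y staging (4 stages assumed here).
detector = HybridConvNetTransformer(n_classes=2)
stager = HybridConvNetTransformer(n_classes=4)

gait = torch.randn(1, 18, 512)                 # dummy gait signal: 18 sensors, 512 samples
if detector(gait).argmax(dim=-1).item() == 1:  # class 1 assumed "parkinsonian"
    hy_stage = stager(gait).argmax(dim=-1).item()
```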
STF: Spatio-Temporal Fusion Module for Improving Video Object Detection
Consecutive frames in a video contain redundancy, but they may also contain
relevant complementary information for the detection task. The objective of our
work is to leverage this complementary information to improve detection.
Therefore, we propose a spatio-temporal fusion framework (STF). We first
introduce multi-frame and single-frame attention modules that allow a neural
network to share feature maps between nearby frames to obtain more robust
object representations. Second, we introduce a dual-frame fusion module that
merges feature maps in a learnable manner to improve them. Our evaluation is
conducted on three different benchmarks including video sequences of moving
road users. Our experiments demonstrate that the proposed
spatio-temporal fusion module leads to improved detection performance compared
to baseline object detectors. Code is available at
https://github.com/noreenanwar/STF-module
Comment: 8 pages, 3 figures.
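The dual-frame fusion idea, merging feature maps of nearby frames in a learnable manner, could look roughly like the PyTorch sketch below. The gating design, channel count, and shapes are assumptions for illustration; the paper's actual module may differ.

```python
import torch
import torch.nn as nn

class DualFrameFusion(nn.Module):
    """Merge feature maps of two nearby frames with a learned per-location weight."""
    def __init__(self, channels=256):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),                     # mixing weight in [0, 1] per location
        )

    def forward(self, feat_cur, feat_prev):   # both: (batch, C, H, W)
        w = self.gate(torch.cat([feat_cur, feat_prev], dim=1))
        return w * feat_cur + (1.0 - w) * feat_prev

fusion = DualFrameFusion(channels=256)
f_t = torch.randn(1, 256, 32, 32)             # current-frame backbone features
f_prev = torch.randn(1, 256, 32, 32)          # nearby-frame backbone features
fused = fusion(f_t, f_prev)                   # improved features for the detection head
```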
Multi-Branch Siamese Networks with Online Selection for Object Tracking
In this paper, we propose a robust object tracking algorithm based on a
branch selection mechanism that chooses the most effective object
representation from a multi-branch Siamese network. While most deep learning
trackers use a single CNN for target representation, the proposed Multi-Branch
Siamese Tracker (MBST) employs multiple CNN branches, each pre-trained for a
different task, to provide diverse target representations. Our branch
selection mechanism picks the appropriate CNN branch online, depending on the
target's characteristics. By using the target representation best suited to
the tracked object, our method achieves real-time tracking while improving on
standard Siamese network trackers on object tracking benchmarks.
Comment: ISVC 2018, oral presentation.
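A minimal sketch of online branch selection for a Siamese tracker follows. It scores each candidate branch by the peak of its cross-correlation response between the template and search features, then picks the strongest one; the specific backbones, crop sizes, and scoring rule are illustrative assumptions rather than MBST's exact mechanism.

```python
import torch
import torch.nn.functional as F
import torchvision.models as models
from torch import nn

# Branches: CNNs that would be pre-trained for different tasks in MBST; here two
# generic backbones stand in (weights=None keeps the sketch runnable offline).
branches = nn.ModuleList([
    nn.Sequential(*list(models.resnet18(weights=None).children())[:-2]),  # 512-ch trunk
    models.vgg16(weights=None).features,                                  # 512-ch trunk
])

def response_peak(template_feat, search_feat):
    """Peak of the Siamese cross-correlation response map (template as the kernel)."""
    return F.conv2d(search_feat, template_feat).max().item()

@torch.no_grad()
def select_branch(template, search):
    """Online selection: pick the branch whose representation responds most strongly."""
    scores = [response_peak(b(template), b(search)) for b in branches]
    return max(range(len(branches)), key=lambda i: scores[i])

template = torch.randn(1, 3, 127, 127)    # exemplar crop of the target
search = torch.randn(1, 3, 255, 255)      # search region around the last position
best = select_branch(template, search)    # index of the branch to track with
```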
Reproducible Evaluation of Pan-Tilt-Zoom Tracking
Tracking with a Pan-Tilt-Zoom (PTZ) camera has been a research topic in
computer vision for many years. However, it is very difficult to assess the
progress that has been made on this topic because there is no standard
evaluation methodology. The difficulty in evaluating PTZ tracking algorithms
arises from their dynamic nature. In contrast to other forms of tracking, PTZ
tracking involves both locating the target in the image and controlling the
motors of the camera to aim it so that the target stays in its field of view.
This type of tracking can only be performed online. In this paper, we propose a
new evaluation framework based on a virtual PTZ camera. With this framework,
tracking scenarios do not change for each experiment and we are able to
replicate online PTZ camera control and behavior including camera positioning
delays, tracker processing delays, and numerical zoom. We tested our evaluation
framework with the Camshift tracker to show its viability and to establish
baseline results.
Comment: This is an extended version of the 2015 ICIP paper "Reproducible Evaluation of Pan-Tilt-Zoom Tracking".
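To make the online nature of the problem concrete, the toy sketch below simulates a virtual PTZ camera whose scenario clock advances by the tracker's processing time plus a motor positioning delay, so a slower tracker sees the target move further between frames. The class, delay values, and frame source are illustrative assumptions, not the paper's actual framework.

```python
import time
import numpy as np

class VirtualPTZCamera:
    """Toy virtual PTZ camera: the scenario is fixed, so every run is repeatable."""
    def __init__(self, frames, fps=30.0, positioning_delay=0.05):
        self.frames = frames                        # pre-recorded scenario frames
        self.fps = fps
        self.positioning_delay = positioning_delay  # simulated motor latency (s)
        self.t = 0.0                                # scenario clock (s)
        self.pan = self.tilt = 0.0
        self.zoom = 1.0

    def grab(self):
        idx = min(int(self.t * self.fps), len(self.frames) - 1)
        return self.frames[idx]         # a real framework re-renders by pan/tilt/zoom

    def command(self, dpan, dtilt, dzoom, tracker_time):
        # The scenario clock advances by tracker processing time plus motor delay:
        # a slow tracker therefore sees the target move further between frames.
        self.t += tracker_time + self.positioning_delay
        self.pan += dpan
        self.tilt += dtilt
        self.zoom = max(1.0, self.zoom + dzoom)

frames = [np.zeros((480, 640, 3), np.uint8) for _ in range(300)]  # dummy scenario
cam = VirtualPTZCamera(frames)
while cam.t < 5.0:
    start = time.perf_counter()
    frame = cam.grab()
    dpan, dtilt, dzoom = 0.01, 0.0, 0.0   # stand-in for the tracker's control output
    cam.command(dpan, dtilt, dzoom, tracker_time=time.perf_counter() - start)
```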
Automatic counting of planting microsites via local visual detection and global count estimation
In the forest industry, mechanical site preparation by mounding is widely used
prior to planting operations. One of the main problems when planning planting
operations is the difficulty in estimating the number of mounds present on a
planting block, as their number may greatly vary depending on site
characteristics. This estimation is often carried out through field surveys by
several forestry workers. However, this procedure is error-prone and slow.
Motivated by recent advances in UAV imagery and artificial
intelligence, we propose a fully automated framework to estimate the number of
mounds on a planting block. Using computer vision and machine learning, we
formulate the counting task as a supervised learning problem using two
prediction models. A local detection model is first used to detect visible
mounds based on deep features, while a global prediction function is
subsequently applied to provide a final estimation based on block-level
features. To evaluate the proposed method, we constructed a challenging UAV
dataset representing several plantation blocks with different characteristics.
Our experiments demonstrate the robustness of the proposed method,
which outperforms manual methods in precision while significantly reducing
time and cost.
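One plausible reading of the two-model design is sketched below: the raw count from the local detector becomes one of several block-level features fed to a global regressor that produces the final estimate. The feature set, regressor choice, and all numbers are invented for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Stage 1 (assumed interface): a local detector yields visible-mound detections per tile.
def local_count(detections_per_tile):
    """Raw count of mounds the detector could see across all UAV tiles of a block."""
    return sum(len(dets) for dets in detections_per_tile)

# Stage 2: a block-level regressor maps block features, including the raw detector
# count, to a final estimate. Feature names and all numbers are made up.
X_train = np.array([
    [12.0, 4200, 0.31],    # [block_area_ha, detector_count, vegetation_cover]
    [8.5, 2900, 0.12],
    [20.1, 7100, 0.45],
])
y_train = np.array([4650, 3050, 7900])   # field-verified mound counts (fictional)
global_model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X_train, y_train)

new_block = np.array([[15.0, 5000, 0.25]])
final_estimate = global_model.predict(new_block)[0]   # corrected block-level count
```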
Transformers for 1D Signals in Parkinson's Disease Detection from Gait
This paper focuses on the detection of Parkinson's disease based on the
analysis of a patient's gait. The growing popularity and success of Transformer
networks in natural language processing and image recognition motivated us to
develop a novel method for this problem based on automatic feature
extraction via Transformers. The use of Transformers on 1D signals is not yet
widespread, but we show in this paper that they are effective at extracting
relevant features from 1D signals. As Transformers require a lot of memory, we
decoupled temporal and spatial information to make the model smaller. Our
architecture uses temporal Transformers, dimension-reduction layers to reduce
the dimensionality of the data, a spatial Transformer, two fully connected
layers, and an output layer for the final prediction. Our model outperforms the
current state-of-the-art algorithm with 95.2% accuracy in distinguishing a
Parkinsonian patient from a healthy one on the Physionet dataset. A key
finding of this work is that Transformers allow for greater stability in the
results. The source code and pre-trained models are released at
https://github.com/DucMinhDimitriNguyen/Transformers-for-1D-signals-in-Parkinson-s-disease-detection-from-gait.git
Comment: International Conference on Pattern Recognition (ICPR 2022).
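A minimal sketch of the decoupled temporal/spatial design described above might look as follows in PyTorch: a temporal Transformer summarizes each sensor's signal, a linear layer reduces the token dimension, and a spatial Transformer mixes information across sensors before fully connected layers produce the prediction. All dimensions and layer counts are assumptions, not the released model's configuration.

```python
import torch
import torch.nn as nn

class GaitTransformer(nn.Module):
    """Temporal Transformer per sensor, dimension reduction, then a spatial Transformer."""
    def __init__(self, n_sensors=18, d_model=64, d_reduced=16, n_classes=2):
        super().__init__()
        self.embed = nn.Linear(1, d_model)       # scalar sample -> token
        self.temporal = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), num_layers=2)
        self.reduce = nn.Linear(d_model, d_reduced)   # dimension-reduction layer
        self.spatial = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_reduced, nhead=4, batch_first=True), num_layers=1)
        self.fc = nn.Sequential(nn.Linear(d_reduced, 32), nn.ReLU(), nn.Linear(32, n_classes))

    def forward(self, x):                        # x: (batch, sensors, time)
        b, s, t = x.shape
        tokens = self.embed(x.reshape(b * s, t, 1))   # (b*s, t, d_model)
        z = self.temporal(tokens).mean(dim=1)         # per-sensor temporal summary
        z = self.reduce(z).reshape(b, s, -1)          # (b, sensors, d_reduced)
        z = self.spatial(z).mean(dim=1)               # mix information across sensors
        return self.fc(z)

model = GaitTransformer()
logits = model(torch.randn(2, 18, 100))          # dummy VGRF gait batch
```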