Hand gesture spotting and recognition using HMMs and CRFs in color image sequences
Magdeburg, University, Faculty of Electrical Engineering and Information Technology, Dissertation, 2010, by Mahmoud Othman Selim Mahmoud Elmezai
Feature-supported Multi-hypothesis Framework for Multi-object Tracking using Kalman Filter
A Kalman filter is a recursive estimator that has been widely used for tracking objects. However, unsatisfactory tracking of moving objects is observed in complex situations (e.g., inter-object merge and split) that are challenging for the classical Kalman filter. This paper describes a multi-hypothesis framework based on multiple features for tracking moving objects in such situations with a Kalman tracker. In this framework, a hypothesis (i.e., merge, split, or new) is generated on the basis of a contextual association probability that identifies the status of the moving objects in the respective occurrences. The association among moving objects is computed by multi-feature similarity criteria that include spatial size, color, and trajectory. The color similarity probability is computed by correlation-weighted histogram intersection (CWHI). The similarity probabilities of size and trajectory are computed and combined with the fused color correlation. The accumulated association probability results in online hypothesis generation. This hypothesis assists the Kalman tracker when complex situations appear in real-time tracking (e.g., traffic surveillance, pedestrian tracking). Our algorithm achieves robust tracking with 97.3% accuracy and 0.07% covariance error in different real-time scenarios.
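The classical Kalman tracker that the framework builds on can be sketched as a constant-velocity filter over 2D object positions. The state layout, noise covariances, and measurements below are illustrative assumptions for the sketch, not the paper's actual parameters:

```python
import numpy as np

# Constant-velocity Kalman filter for 2D object tracking (illustrative sketch).
# State x = [px, py, vx, vy]; measurements are noisy 2D positions.
dt = 1.0
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)   # state transition
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)   # we observe position only
Q = 0.01 * np.eye(4)                        # process noise (assumed)
R = 0.25 * np.eye(2)                        # measurement noise (assumed)

def kalman_step(x, P, z):
    """One predict/update cycle of the Kalman filter."""
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update with measurement z
    y = z - H @ x                           # innovation
    S = H @ P @ H.T + R                     # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)          # Kalman gain
    x = x + K @ y
    P = (np.eye(4) - K @ H) @ P
    return x, P

x = np.zeros(4)
P = np.eye(4)
for t in range(1, 11):                      # object moving at unit speed along x
    z = np.array([float(t), 0.0])
    x, P = kalman_step(x, P, z)
```

The multi-hypothesis layer described in the abstract would sit on top of such a filter, deciding from the accumulated association probability whether a track has merged, split, or newly appeared before each update.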
Human Activity Recognition: Discriminative Models using Statistical Chord-length and Optical Flow Motion Features
Despite the vast amount of research on the analysis of human activity, significant challenges remain. In this paper, an innovative approach to human action recognition based on discriminative models such as CRFs, HCRFs, and LDCRFs is proposed. Window sizes ranging from 0 to 7 are applied using a compact, computationally efficient descriptor, the statistical chord-length features (SCLF), in addition to optical flow motion features derived from the 3D spatio-temporal action volume. Our experiments on the standard action benchmark KTH, as well as our IIKT dataset, show that the recognition rate and reliability initially improve as the window size increases, but degrade as it increases further. Furthermore, LDCRFs are more robust and efficient than CRFs and HCRFs, and cope better with problematic phenomena than previously reported models. They can also run without sacrificing real-time performance across a wide range of practical action applications.
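One way to read the statistical chord-length descriptor is as statistics over distances between boundary points of the action silhouette. The sketch below uses a hypothetical contour and simple mean/standard-deviation/histogram statistics, which may differ from the paper's exact SCLF formulation:

```python
import math

def chord_length_stats(contour, num_bins=8):
    """Compute simple statistics over chord lengths of a closed contour.

    contour: list of (x, y) boundary points.
    Returns (mean, std, histogram) of all pairwise chord lengths,
    with the histogram binned by fraction of the maximum chord length.
    """
    n = len(contour)
    chords = []
    for i in range(n):
        for j in range(i + 1, n):
            dx = contour[i][0] - contour[j][0]
            dy = contour[i][1] - contour[j][1]
            chords.append(math.hypot(dx, dy))
    mean = sum(chords) / len(chords)
    var = sum((c - mean) ** 2 for c in chords) / len(chords)
    max_len = max(chords)
    hist = [0] * num_bins
    for c in chords:
        b = min(int(num_bins * c / max_len), num_bins - 1)
        hist[b] += 1
    hist = [h / len(chords) for h in hist]
    return mean, math.sqrt(var), hist

# Unit square contour: four side chords of length 1 and two diagonals of sqrt(2)
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
mean, std, hist = chord_length_stats(square)
```

A descriptor like this is translation- and rotation-invariant by construction, which is what makes chord-length statistics attractive as a compact shape feature.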
Poisonous Plants Species Prediction Using a Convolutional Neural Network and Support Vector Machine Hybrid Model
The total number of discovered plant species increases yearly worldwide, and plant species differ from one region to another. Some of these discovered species are beneficial, while others may be poisonous. Computer vision techniques can be an effective way to classify plant species and predict their poisonous status. However, the lack of comprehensive datasets that include not only plant images but also each species' scientific name, description, poisonous status, and local name makes poisonous plant species prediction very challenging. In this paper, we propose a hybrid model that combines convolutional neural networks with a support vector machine for plant species classification and poisonous status prediction. First, six different Convolutional Neural Network (CNN) architectures are evaluated to determine which produces the best results. Second, features are extracted using the six CNNs, then optimized and fed to a Support Vector Machine (SVM) for testing. To prove the feasibility and benefits of our approach, we use a real case study: plant species found in the Arabian Peninsula. We gathered a dataset of 2,500 images of 50 different Arabic plant species that includes plant images, scientific names, descriptions, local names, and poisonous status. This study of Arabic plant species will help reduce the number of poisonous plant victims and their negative impact on individuals and society. The results of our experiments for the CNN approach in conjunction with SVM are favorable: the classifier scored 0.92, 0.94, and 0.95 in accuracy, precision, and F1-score, respectively.
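The feature-extraction-then-SVM stage can be sketched with scikit-learn. The "CNN features" here are stand-in random vectors (the paper extracts them from six CNN architectures), so only the SVM side of the pipeline is illustrated; the cluster locations, kernel, and split are assumptions for the sketch:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in "CNN features": two well-separated Gaussian clusters playing the
# role of feature vectors for poisonous vs. non-poisonous plant images.
n, dim = 100, 64
feats_poisonous = rng.normal(loc=+2.0, scale=0.5, size=(n, dim))
feats_safe = rng.normal(loc=-2.0, scale=0.5, size=(n, dim))
X = np.vstack([feats_poisonous, feats_safe])
y = np.array([1] * n + [0] * n)             # 1 = poisonous, 0 = safe

# Shuffle, split, and train an SVM on the extracted features
idx = rng.permutation(2 * n)
train, test = idx[:150], idx[150:]
clf = SVC(kernel="rbf", C=1.0).fit(X[train], y[train])
acc = clf.score(X[test], y[test])
```

In the real pipeline the feature vectors would come from the penultimate layer of each CNN; decoupling feature extraction from classification is what lets the six architectures be compared under a single downstream classifier.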
Real-time capable system for hand gesture recognition Using hidden Markov models in stereo color image sequences
This paper proposes a system to recognize alphabets and numbers in real time from color image sequences using the motion trajectory of a single hand and Hidden Markov Models (HMMs). Our system is based on three main stages: automatic segmentation and preprocessing of the hand regions, feature extraction, and classification. In the automatic segmentation and preprocessing stage, the YCbCr color space and depth information are used to detect the hands and face in connection with morphological operations, where a Gaussian Mixture Model (GMM) is used to compute the skin probability. After the hand is detected and the centroid of the hand region is determined, tracking takes place in subsequent frames to determine the hand motion trajectory using a search area around the hand region. In the feature extraction stage, the orientation between two consecutive points of the hand motion trajectory is determined and then quantized to give a discrete vector that is used as input to the HMM. In the final stage, classification, the Baum-Welch (BW) algorithm is used to fully train the HMM parameters. Gestures of alphabets and numbers are recognized using the Left-Right Banded (LRB) model in conjunction with the Forward algorithm. In our experiments, 720 gestures are used for training and 360 for testing. Our system recognizes the alphabets from A to Z and numbers from 0 to 9, achieving an average recognition rate of 94.72%.
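The orientation feature between consecutive trajectory points can be sketched as follows; the 18-bin quantization (20° per codeword) is an assumption for illustration, not necessarily the paper's bin count:

```python
import math

def orientation_codewords(trajectory, num_bins=18):
    """Quantize the orientation between consecutive hand-centroid points
    into discrete codewords (0 .. num_bins-1) for use as HMM observations."""
    codes = []
    for (x1, y1), (x2, y2) in zip(trajectory, trajectory[1:]):
        theta = math.atan2(y2 - y1, x2 - x1)      # angle in [-pi, pi]
        if theta < 0:
            theta += 2 * math.pi                  # shift to [0, 2*pi)
        codes.append(int(theta / (2 * math.pi) * num_bins) % num_bins)
    return codes

# A rightward then upward hand movement
path = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
codes = orientation_codewords(path)
```

The resulting codeword sequence is exactly the kind of discrete observation stream the Baum-Welch algorithm trains on and the Forward algorithm scores at recognition time.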
Brain Tumor Segmentation Using Deep Capsule Network and Latent-Dynamic Conditional Random Fields
Because of the large variability in brain tumors, automated segmentation remains a difficult task. We propose an automated method to segment brain tumors by integrating a deep capsule network (CapsNet) and a latent-dynamic conditional random field (LDCRF). The method consists of three main processes: pre-processing, segmentation, and post-processing. In pre-processing, the N4ITK process corrects each MR image's bias field before the intensity is normalized. After that, image patches are used to train the CapsNet during the segmentation process. Then, with the CapsNet parameters fixed, we use image slices from an axial view to learn the LDCRF-CapsNet. Finally, we apply a simple thresholding method to correct the labels of some pixels and remove small 3D-connected regions from the segmentation results. We trained and evaluated our method on the BRATS 2015 and BRATS 2021 datasets and found that it is competitive with, and under comparable conditions outperforms, state-of-the-art methods.
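The intensity-normalization half of the pre-processing step can be sketched as a z-score over foreground voxels; this particular normalization and the zero-valued background convention are assumptions for the sketch (N4ITK bias-field correction itself requires a dedicated library such as SimpleITK and is not reproduced here):

```python
import numpy as np

def normalize_intensity(volume, background=0.0):
    """Z-score normalize an MR volume over its foreground voxels.

    Voxels equal to `background` are treated as outside the brain
    and left at zero after normalization.
    """
    mask = volume != background
    fg = volume[mask]
    mu, sigma = fg.mean(), fg.std()
    out = np.zeros_like(volume, dtype=float)
    out[mask] = (fg - mu) / sigma
    return out

# Toy 4x4x4 "volume" with a 2x2x2 foreground cube of intensities 1..8
vol = np.zeros((4, 4, 4))
vol[1:3, 1:3, 1:3] = np.arange(8, dtype=float).reshape(2, 2, 2) + 1.0
norm = normalize_intensity(vol)
```

Normalizing per volume in this way keeps patch intensities on a common scale across scanners, which matters when CapsNet patches from different BRATS subjects are pooled for training.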
A novel system for automatic hand gesture spotting and recognition in stereo color image sequences
Automatic gesture spotting and recognition is a challenging task of locating the start and end points that correspond to a gesture of interest in Human-Computer Interaction. This paper proposes a novel gesture spotting system that is suitable for real-time implementation. The system executes gesture segmentation and recognition simultaneously, without any time delay, based on Hidden Markov Models. In the segmentation module, the user's hand is tracked using the mean-shift algorithm, a non-parametric density estimator that optimizes a smooth similarity function to find the direction of the hand gesture path. To spot key gestures accurately, a sophisticated method for designing a non-gesture model is proposed, constructed by collecting the states of all gesture models in the system. The non-gesture model is a weak model compared to all trained gesture models, and therefore provides a good basis for rejecting non-gesture patterns. To reduce the number of states in the non-gesture model, states with similar probability distributions are merged based on a relative entropy measure. Experimental results show that the proposed system automatically recognizes isolated gestures with 97.78% reliability and key gestures with 93.31% reliability for Arabic numbers from 0 to 9.
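The relative-entropy criterion for merging similar non-gesture states can be sketched as a symmetrized Kullback-Leibler divergence between the states' discrete observation distributions; the symmetrization and the merge threshold below are hypothetical choices for illustration:

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def should_merge(p, q, threshold=0.1):
    """Merge two HMM states if their symmetrized relative entropy is small."""
    return 0.5 * (kl(p, q) + kl(q, p)) < threshold

a = [0.5, 0.3, 0.2]
b = [0.48, 0.32, 0.20]   # nearly identical observation distribution
c = [0.1, 0.1, 0.8]      # clearly different distribution

merge_ab = should_merge(a, b)
merge_ac = should_merge(a, c)
```

Collapsing near-duplicate states in this way shrinks the non-gesture model, which otherwise grows with every gesture model whose states it absorbs.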