
    Fast and Accurate Algorithm for Eye Localization for Gaze Tracking in Low Resolution Images

    Iris centre localization in low-resolution visible images is a challenging problem in the computer vision community due to noise, shadows, occlusions, pose variations, eye blinks, etc. This paper proposes an efficient method for determining the iris centre in low-resolution images in the visible spectrum. Even low-cost consumer-grade webcams can be used for gaze tracking without any additional hardware. A two-stage algorithm is proposed for iris centre localization. The proposed method uses geometrical characteristics of the eye. In the first stage, a fast convolution-based approach is used to obtain the coarse location of the iris centre (IC). The IC location is further refined in the second stage using boundary tracing and ellipse fitting. The algorithm has been evaluated on public databases such as BioID and Gi4E and is found to outperform state-of-the-art methods. Comment: 12 pages, 10 figures, IET Computer Vision, 201
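    The first stage described above can be sketched as a template cross-correlation: since the iris is darker than the surrounding sclera, correlating the image with a negated dark-disc kernel peaks near the iris centre. The kernel shape, radius, and the exhaustive sliding-window search below are illustrative assumptions, not the paper's exact formulation.

    ```python
    import numpy as np

    def coarse_iris_centre(gray, radius=5):
        """Stage-1 sketch: correlate the image with a negated disc kernel.

        `gray` is a 2-D float array in [0, 1]; `radius` is an assumed
        iris radius in pixels.  Dark circular regions give the highest
        response, so the argmax approximates the iris centre (IC).
        """
        size = 2 * radius + 1
        yy, xx = np.mgrid[-radius:radius + 1, -radius:radius + 1]
        disc = (xx**2 + yy**2 <= radius**2).astype(float)
        kernel = -disc / disc.sum()          # dark pixels -> high score

        h, w = gray.shape
        best, best_rc = -np.inf, (0, 0)
        # explicit 'valid' cross-correlation via sliding windows
        for r in range(h - size + 1):
            for c in range(w - size + 1):
                score = np.sum(gray[r:r + size, c:c + size] * kernel)
                if score > best:
                    best, best_rc = score, (r + radius, c + radius)
        return best_rc                       # (row, col) of coarse IC

    # Synthetic 40x40 "eye": bright background, dark disc centred at (20, 25)
    img = np.ones((40, 40))
    yy, xx = np.mgrid[:40, :40]
    img[(yy - 20)**2 + (xx - 25)**2 <= 25] = 0.1

    print(coarse_iris_centre(img, radius=5))   # → (20, 25)
    ```

    The paper's second stage would then trace the iris boundary around this coarse location and fit an ellipse to refine it; that step is omitted here.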

    Human Intention Inference using Fusion of Gaze and Motion Information

    Enabling robots with the ability to quickly and accurately determine the intention of their human counterparts is a very important problem in Human-Robot Collaboration (HRC). The focus of this work is to provide a framework wherein multiple modalities of information, available to the robot through different sensors, are fused to estimate a human's action intent. In this thesis, two human intention estimation schemes are presented. In both cases, human intention is defined as a motion profile associated with a single goal location. The first scheme presents the first human intention estimator to fuse information from pupil tracking data as well as skeletal tracking data during each iteration of an Interacting Multiple Model (IMM) filter in order to predict the goal location of a reaching motion. In the second, two variable structure IMM (VS-IMM) filters, which track gaze and skeletal motion, respectively, are run in parallel and their associated model probabilities fused. This method is advantageous over the first as it can be easily scaled to include more models and provides greater disparity between the most likely model and the other models. For each VS-IMM filter, a model selection algorithm is proposed which chooses the most likely models in each iteration based on physical constraints of the human body. Experimental results are provided to validate the proposed human intention estimation schemes.
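    The second scheme's fusion step can be sketched as follows: each VS-IMM filter reports a probability per candidate goal, and combining the two modalities sharpens the disparity between the most likely goal and the rest. The elementwise-product-and-renormalise rule below is an illustrative assumption, not the thesis's exact equations.

    ```python
    import numpy as np

    def fuse_model_probabilities(mu_gaze, mu_motion):
        """Fuse per-goal model probabilities from two parallel filters.

        `mu_gaze` and `mu_motion` are the model-probability vectors of
        the gaze and skeletal-motion VS-IMM filters.  Treating them as
        independent evidence, the fused posterior is their elementwise
        product, renormalised to sum to one.
        """
        fused = np.asarray(mu_gaze) * np.asarray(mu_motion)
        return fused / fused.sum()

    # Each modality weakly prefers goal 1; the fusion prefers it strongly.
    mu_gaze   = [0.2, 0.5, 0.3]
    mu_motion = [0.3, 0.5, 0.2]
    print(fuse_model_probabilities(mu_gaze, mu_motion))
    ```

    Note how the fused probability for goal 1 (about 0.68) exceeds either single-modality probability (0.5), illustrating the claimed increase in disparity between the most likely model and the others.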

    Eye-Tracking Signals Based Affective Classification Employing Deep Gradient Convolutional Neural Networks

    Utilizing biomedical signals as a basis to calculate human affective states is an essential issue in affective computing (AC). With the in-depth research on affective signals, the combination of multi-model cognition and physiological indicators, the establishment of dynamic and complete databases, and the addition of high-tech innovative products have become recent trends in AC. This research aims to develop a deep gradient convolutional neural network (DGCNN) for classifying affect using eye-tracking signals. General signal processing tools and pre-processing methods were applied first, such as the Kalman filter, windowing with a Hamming window, the short-time Fourier transform (STFT), and the fast Fourier transform (FFT). Secondly, the eye-movement and tracking signals were converted into images. A convolutional neural network-based training structure was subsequently applied; the experimental dataset was acquired with an eye-tracking device by assigning four affective stimuli (nervous, calm, happy, and sad) to 16 participants. Finally, the performance of the DGCNN was compared with a decision tree (DT), a Bayesian Gaussian model (BGM), and k-nearest neighbors (KNN) using the true positive rate (TPR) and false positive rate (FPR) as indices. Customized mini-batch size, loss, learning rate, and gradient definitions for the training structure of the deep neural network were also deployed. The predictive classification matrix showed the effectiveness of the proposed method for eye-movement and tracking signals, which achieves more than 87.2% accuracy. This research provides a feasible way to achieve more natural human-computer interaction through eye-movement and tracking signals and has potential application in the affective product design process.
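    The Hamming-windowed STFT preprocessing described above can be sketched in a few lines: slice the eye-tracking signal into overlapping frames, apply a Hamming window to each, and take the FFT of every frame. The window length, hop size, and test signal below are illustrative choices, not the paper's parameters.

    ```python
    import numpy as np

    def stft(signal, win_len=64, hop=32):
        """Short-time Fourier transform with a Hamming window (sketch).

        Returns a (frames x frequency-bins) complex array: each row is
        the real-input FFT of one Hamming-windowed frame of the signal.
        """
        window = np.hamming(win_len)
        frames = [signal[i:i + win_len] * window
                  for i in range(0, len(signal) - win_len + 1, hop)]
        return np.array([np.fft.rfft(f) for f in frames])

    # A 10 Hz tone sampled at 256 Hz for 2 seconds (512 samples).
    t = np.arange(512) / 256.0
    spec = np.abs(stft(np.sin(2 * np.pi * 10 * t)))
    print(spec.shape)   # → (15, 33): 15 frames, win_len // 2 + 1 bins
    ```

    In the paper's pipeline, magnitude spectrograms like `spec` would then be rendered as images and fed to the convolutional network.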

    Towards a Principled Integration of Multi-Camera Re-Identification and Tracking through Optimal Bayes Filters

    With the rise of end-to-end learning through deep learning, person detectors and re-identification (ReID) models have recently become very strong. Multi-camera multi-target (MCMT) tracking has not fully gone through this transformation yet. We intend to take another step in this direction by presenting a theoretically principled way of integrating ReID with tracking formulated as an optimal Bayes filter. This conveniently side-steps the need for data association and opens up a direct path from full images to the core of the tracker. While the results are still sub-par, we believe that this new, tight integration opens many interesting research opportunities and leads the way towards full end-to-end tracking from raw pixels. Comment: First two authors have equal contribution. This is initial work into a new direction, not a benchmark-beating method. v2 only adds acknowledgements and fixes a typo in e-mail
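    The core idea of folding ReID into an optimal Bayes filter can be sketched with a toy discrete example: the state enumerates which known identity occupies a track, a transition model captures identity persistence between frames, and the ReID model's appearance scores act as the measurement likelihood. Explicit data association never appears; the posterior over identities carries it implicitly. All matrices below are illustrative assumptions, not the paper's formulation.

    ```python
    import numpy as np

    def bayes_filter_step(belief, transition, likelihood):
        """One predict/update cycle of a discrete optimal Bayes filter.

        belief:     prior P(identity) over the track's identity
        transition: row-stochastic identity-persistence model
        likelihood: ReID appearance score per identity (measurement)
        """
        predicted = transition.T @ belief      # predict step
        posterior = predicted * likelihood     # update with ReID evidence
        return posterior / posterior.sum()     # renormalise

    # Three identities; a track rarely swaps identity between frames.
    belief     = np.array([1/3, 1/3, 1/3])
    transition = np.array([[0.90, 0.05, 0.05],
                           [0.05, 0.90, 0.05],
                           [0.05, 0.05, 0.90]])
    likelihood = np.array([0.7, 0.2, 0.1])     # ReID favours identity 0
    belief = bayes_filter_step(belief, transition, likelihood)
    print(belief)   # posterior mass concentrates on identity 0
    ```

    Repeating this step over frames accumulates appearance evidence across cameras, which is the sense in which the filter replaces a hard data-association stage.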

    Utilising Visual Attention Cues for Vehicle Detection and Tracking

    Advanced Driver-Assistance Systems (ADAS) have been attracting attention from many researchers. Vision-based sensors are the closest way to emulate human driver visual behavior while driving. In this paper, we explore possible ways to use visual attention (saliency) for object detection and tracking. We investigate: 1) how a visual attention map, such as a "subjectness" attention (saliency) map or an "objectness" attention map, can facilitate region proposal generation in a two-stage object detector; 2) how a visual attention map can be used for tracking multiple objects. We propose a neural network that can simultaneously detect objects and generate objectness and subjectness maps to save computational power. We further exploit the visual attention map during tracking using a sequential Monte Carlo probability hypothesis density (PHD) filter. The experiments are conducted on the KITTI and DETRAC datasets. The use of visual attention and hierarchical features has shown a considerable improvement of approximately 8% in object detection, which effectively increased tracking performance by approximately 4% on the KITTI dataset. Comment: Accepted in ICPR202
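    One way attention can enter a sequential Monte Carlo filter is through the particle weights: hypotheses lying in salient image regions are up-weighted, so they survive resampling. The snippet below is a simplified stand-in for the full SMC-PHD update, with an invented saliency map and particle set; it is not the paper's implementation.

    ```python
    import numpy as np

    def reweight_particles(weights, positions, saliency):
        """Scale particle weights by a visual attention (saliency) map.

        weights:   current particle weights (sums to 1)
        positions: integer (row, col) image locations, one per particle
        saliency:  2-D attention map in [0, 1]
        """
        rows, cols = positions[:, 0], positions[:, 1]
        weights = weights * saliency[rows, cols]
        return weights / weights.sum()

    # Low background saliency with one highly salient pixel at (1, 2).
    saliency = np.full((4, 4), 0.1)
    saliency[1, 2] = 1.0
    positions = np.array([[0, 0], [1, 2], [3, 3]])
    weights = reweight_particles(np.ones(3) / 3, positions, saliency)
    print(weights)   # the particle at the salient pixel dominates
    ```

    In a full PHD filter the weights additionally encode the expected number of targets; here they are kept normalised purely to show the attention-driven reweighting.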