A group sparsity-driven approach to 3-D action recognition
In this paper, a novel 3-D action recognition method based on sparse representation is presented. Silhouette images from multiple cameras are combined to obtain motion history volumes (MHVs). The cylindrical Fourier transform of the MHVs is used as the action descriptor. We assume that a test sample has a sparse representation in the space of training samples. We cast the action classification problem as an optimization problem and classify actions using group sparsity based on l1 regularization. We show experimental results using the IXMAS multi-view database and demonstrate the superiority of our method, especially when observations are low resolution, occluded, and noisy and when the feature dimension is reduced
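The classification step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a group-lasso objective, 0.5*||y - A x||^2 + lam * sum_g ||x_g||_2, solved by proximal gradient descent, with one coefficient group per action class. The function name `group_sparse_classify`, the parameter `lam`, and the toy feature vectors are all hypothetical.

```python
import math

def group_sparse_classify(train, labels, y, lam=0.01, steps=500):
    """Assumed formulation (not taken from the paper): group lasso via proximal gradient.

    train  : list of training feature vectors (the dictionary columns)
    labels : class label of each training vector (defines the groups)
    y      : test feature vector
    """
    n, d = len(train), len(y)
    groups = {}
    for j, c in enumerate(labels):
        groups.setdefault(c, []).append(j)
    # Step size 1/L, where L (squared Frobenius norm) upper-bounds the
    # Lipschitz constant of the gradient of the data-fit term.
    L = sum(v * v for col in train for v in col)
    x = [0.0] * n
    for _ in range(steps):
        # Residual r = A x - y
        r = [sum(train[j][i] * x[j] for j in range(n)) - y[i] for i in range(d)]
        # Gradient step on the data-fit term
        z = [x[j] - sum(train[j][i] * r[i] for i in range(d)) / L for j in range(n)]
        # Group soft-thresholding: shrink each class's coefficients jointly
        for idx in groups.values():
            norm = math.sqrt(sum(z[j] ** 2 for j in idx))
            scale = max(0.0, 1.0 - lam / (L * norm)) if norm > 0 else 0.0
            for j in idx:
                x[j] = scale * z[j]

    def residual(idx):
        # Reconstruction error using only one class's coefficients
        return math.sqrt(sum(
            (y[i] - sum(train[j][i] * x[j] for j in idx)) ** 2 for i in range(d)))

    best = min(groups, key=lambda c: residual(groups[c]))
    return best, x

# Toy data (hypothetical): two training samples per action class
train = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.9, 0.1]]
labels = ["wave", "wave", "kick", "kick"]
label, coeffs = group_sparse_classify(train, labels, [0.95, 0.05, 0.0])
print(label, coeffs)
```

The test sample is assigned to the class whose training samples alone reconstruct it with the smallest residual, which is the usual decision rule in sparse-representation classifiers.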
A graphical model based solution to the facial feature point tracking problem
In this paper a facial feature point tracker that is motivated by applications such as human-computer interfaces and facial expression analysis systems is proposed. The proposed tracker is based on a graphical model framework. The facial features are tracked through video streams by incorporating statistical relations in time as well as spatial relations between feature points. By exploiting the spatial relationships between feature points, the proposed method provides robustness in real-world conditions such as arbitrary head movements and occlusions. A Gabor feature-based occlusion detector is developed and used to handle occlusions. The performance of the proposed tracker has been evaluated on real video data under various conditions including occluded facial gestures and head movements. It is also compared to two popular methods, one based on Kalman filtering exploiting temporal relations, and the other based on active appearance models (AAM). Improvements provided by the proposed approach are demonstrated through both visual displays and quantitative analysis
A sparsity-driven approach to multi-camera tracking in visual sensor networks
In this paper, a sparsity-driven approach is presented for multi-camera tracking in visual sensor networks (VSNs). VSNs consist of image sensors, embedded processors and wireless transceivers which are powered by batteries. Since the energy and bandwidth resources are limited, setting up a tracking system in VSNs is a challenging problem. Motivated by the goal of tracking in a bandwidth-constrained environment, we present a sparsity-driven method to compress the features extracted by the camera nodes, which are then transmitted across the network for distributed inference. We have designed special overcomplete dictionaries that match the structure of the features, leading to very parsimonious yet accurate representations. We have tested our method in indoor and outdoor people tracking scenarios. Our experimental results demonstrate how our approach leads to communication savings without significant loss in tracking performance
Graphical model based facial feature point tracking in a vehicle environment
Facial feature point tracking is a research area with applications in human-computer interaction (HCI), facial expression analysis, fatigue detection, etc. In this paper, a statistical method for facial feature point tracking is proposed. Feature point tracking is challenging when the data are uncertain because of noise and/or occlusions. With this motivation, a graphical model is built that incorporates not only temporal information about feature point movements, but also information about the spatial relationships between such points. Based on this model, an algorithm that achieves feature point tracking through a video observation sequence is implemented. The proposed method is applied to 2D grayscale real video sequences taken in a vehicle environment and the superiority of this approach over existing techniques is demonstrated
Human Re-Identification with a Robot Thermal Camera using Entropy-based Sampling
Human re-identification is an important feature of domestic service robots, in particular for elderly monitoring and assistance, because it allows them to perform personalized tasks and human-robot interactions. However, vision-based re-identification systems are subject to limitations due to human pose and poor lighting conditions. This paper presents a new re-identification method for service robots using thermal images. In robotic applications, as the number and size of thermal datasets are limited, it is hard to use approaches that require a huge number of training samples. We propose a re-identification system that can work using only a small amount of data. During training, we perform entropy-based sampling to obtain a thermal dictionary for each person. Then, a symbolic representation is produced by converting each video into sequences of dictionary elements. Finally, we train a classifier using this symbolic representation and geometric distribution within the new representation domain. The experiments are performed on a new thermal dataset for human re-identification, which includes various situations of human motion, poses and occlusion, and which is made publicly available for research purposes. The proposed approach has been tested on this dataset and its improvements over standard approaches have been demonstrated
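The abstract above does not specify how the entropy-based sampling or the symbolic conversion works, so the following is only an illustrative guess at the general idea. It assumes that patches with the highest intensity-histogram entropy are kept as dictionary elements and that each patch is then replaced by the index of its nearest dictionary element. The function names, the patch representation (flat lists of integer intensities), and the distance measure are all hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(patch):
    """Entropy in bits of the intensity histogram of a flat list of pixel values."""
    counts = Counter(patch)
    total = len(patch)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def build_dictionary(patches, size):
    """Assumed sampling rule: keep the `size` patches with the highest entropy."""
    return sorted(patches, key=shannon_entropy, reverse=True)[:size]

def symbolize(patches, dictionary):
    """Replace each patch by the index of its nearest dictionary element
    (squared Euclidean distance), giving a sequence of symbols."""
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [min(range(len(dictionary)), key=lambda k: dist(p, dictionary[k]))
            for p in patches]

# Toy 4-pixel "thermal patches" (hypothetical values)
patches = [[5, 5, 5, 5], [1, 2, 3, 4], [0, 0, 9, 9], [5, 5, 5, 6]]
dictionary = build_dictionary(patches, size=2)
print(dictionary)
print(symbolize(patches, dictionary))
```

A classifier would then be trained on such symbol sequences; that stage is omitted here because the abstract gives no detail about it.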
Feature compression: a framework for multi-view multi-person tracking in visual sensor networks
Visual sensor networks (VSNs) consist of image sensors, embedded processors and wireless transceivers which are powered by batteries. Since the energy and bandwidth resources are limited, setting up a tracking system in VSNs is a challenging problem. In this paper, we present a framework for human tracking in VSNs. The traditional approach of sending compressed images to a central node has certain disadvantages, such as degrading the performance of further processing (i.e., tracking) because of low-quality images. Instead, we propose a feature compression-based decentralized tracking framework that is better matched with the further inference goal of tracking. In our method, each camera performs feature extraction and obtains likelihood functions. By transforming to an appropriate domain and taking only the significant coefficients, these likelihood functions are compressed and this new representation is sent to the fusion node. This allows us to reduce the communication in the network without significantly affecting the tracking performance. An appropriate domain is selected by performing a comparison between well-known transforms. We have applied our method to indoor people tracking and demonstrated the superiority of our system over the traditional approach and a decentralized approach that uses a Kalman filter
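The compression step described in the abstract above (transform the likelihood function, keep only the significant coefficients, send those to the fusion node) can be sketched as follows. The abstract does not say which transform was finally selected, so the orthonormal discrete cosine transform used here is an assumption, as are the function names, the 1-D likelihood, and the rule of keeping a fixed number of largest-magnitude coefficients.

```python
import math

def dct(signal):
    """Orthonormal DCT-II of a list of floats (direct O(n^2) evaluation)."""
    n = len(signal)
    out = []
    for k in range(n):
        s = sum(signal[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct(coeffs):
    """Inverse of the orthonormal DCT-II above."""
    n = len(coeffs)
    out = []
    for i in range(n):
        s = coeffs[0] * math.sqrt(1.0 / n)
        s += sum(coeffs[k] * math.sqrt(2.0 / n)
                 * math.cos(math.pi * (i + 0.5) * k / n) for k in range(1, n))
        out.append(s)
    return out

def compress(likelihood, keep):
    """Camera node: keep the `keep` largest-magnitude coefficients as {index: value}."""
    coeffs = dct(likelihood)
    top = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)[:keep]
    return {k: coeffs[k] for k in top}

def decompress(sparse, n):
    """Fusion node: zero-fill the missing coefficients and invert the transform."""
    coeffs = [0.0] * n
    for k, v in sparse.items():
        coeffs[k] = v
    return idct(coeffs)

# Toy likelihood (hypothetical): a smooth bump over 64 candidate positions
n = 64
likelihood = [math.exp(-((i - 20) ** 2) / (2 * 6.0 ** 2)) for i in range(n)]
message = compress(likelihood, keep=16)
recovered = decompress(message, n)
print(len(message), max(abs(a - b) for a, b in zip(likelihood, recovered)))
```

In this sketch the camera sends 16 index-value pairs instead of 64 samples. Because the transform is orthonormal, the squared reconstruction error equals the energy of the discarded coefficients, so keeping the largest ones minimizes the error for a given message size.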
Eye feature point tracking by using graphical models
In this paper, a statistical method for eye feature point tracking is proposed. The aim is to track feature points even when the observed data are uncertain because of noise and/or occlusion. With this motivation, a graphical model that uses the spatial information as well as the temporal information between points is built. The proposed method is applied to real 2D grayscale video sequences
Facial feature point tracking based on a graphical model framework
In this thesis a facial feature point tracker that can be used in applications such as human-computer interfaces, facial expression analysis systems, driver fatigue detection systems, etc. is proposed. The proposed tracker is based on a graphical model framework. The positions of the facial features are tracked through video streams by incorporating statistical relations in time and the spatial relations between feature points. In many application areas, including those mentioned above, tracking is a key intermediate step that has a significant effect on the overall system performance. For this reason, a good practical tracking algorithm should take into account real-world phenomena such as arbitrary head movements and occlusions. Many existing algorithms track each feature point independently and do not properly handle occlusions. This causes drift in the case of arbitrary head movements and occlusions. By exploiting the spatial relationships between feature points, the proposed method provides robustness in a number of scenarios, including various head movements. To prevent drift caused by occlusions, a Gabor feature-based occlusion detector is developed and used in the proposed method. The performance of the proposed tracker has been evaluated on real video data under various conditions. These conditions include occluded facial gestures, low video resolution, illumination changes in the scene, in-plane head motion, and out-of-plane head motion. The proposed method has also been tested on videos recorded in a vehicle environment, in order to evaluate its performance in a practical setting. Given these results, it can be concluded that the proposed method provides a promising general framework for facial feature tracking. It is a robust tracker for facial expression sequences in which there are occlusions and arbitrary head movements. The results in the vehicle environment suggest that the proposed method has the potential to be useful for tasks such as driver behavior analysis or driver fatigue detection
ENRICHME integration of ambient intelligence and robotics for AAL
Technological advances and affordability of recent smart sensors, as well as the consolidation of common software platforms for the integration of the latter and robotic sensors, are enabling the creation of complex active and assisted living environments for improving the quality of life of the elderly and the less able people. One such example is the integrated system developed by the European project ENRICHME, the aim of which is to monitor and prolong the independent living of old people affected by mild cognitive impairments with a combination of smart-home, robotics and web technologies. This paper presents in particular the design and technological solutions adopted to integrate, process and store the information provided by a set of fixed smart sensors and mobile robot sensors in a domestic scenario, including presence and contact detectors, environmental sensors, and RFID-tagged objects, for long-term user monitoring an