Remote Heart Rate Estimation Using Consumer-Grade Cameras

Author: Ruben Nathan E.
Publication venue: DigitalCommons@USU
Publication date: 01/05/2015

There are many ways in which the remote, non-contact detection of the human heart rate might be useful. This is especially true if it can be done using inexpensive equipment such as consumer-grade cameras. Many studies and experiments have been performed in recent years to help reliably determine the heart rate from video footage of a person. These methods have generally relied on an analysis approach that involves temporal filtering and examination of the frequency spectrum. This study attempts to answer questions about the noise sources that prevent these methods from estimating the heart rate. Other statistical processes are examined for their use in reducing the noise in the system. Methods for locating the skin of a moving individual are explored and used for the purpose of acquiring the heart rate. Alternative methods borrowed from other fields are also introduced to determine whether they have merit in remote heart rate detection.
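
The frequency-spectrum step mentioned in the abstract above can be illustrated with a small sketch. This is not the thesis's method: it assumes a one-dimensional intensity trace has already been extracted from the video, uses a synthetic signal in place of real footage, and stands in for temporal filtering with a simple band restriction. The function name `estimate_heart_rate_bpm` and the 0.7 to 4.0 Hz band are illustrative assumptions.

```python
import math


def estimate_heart_rate_bpm(signal, fps, low_hz=0.7, high_hz=4.0):
    """Return the strongest in-band frequency of `signal`, in beats per minute.

    Hypothetical helper: the band limits are assumed, not taken from the source.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the constant (DC) component
    best_freq, best_power = 0.0, -1.0
    # Naive discrete Fourier transform, evaluated only for in-band bins.
    for k in range(1, n // 2 + 1):
        freq = k * fps / n  # frequency of bin k in Hz
        if freq < low_hz or freq > high_hz:
            continue  # skip slow drift and high-frequency noise
        re = sum(c * math.cos(2 * math.pi * k * t / n) for t, c in enumerate(centered))
        im = sum(c * math.sin(2 * math.pi * k * t / n) for t, c in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_freq, best_power = freq, power
    return best_freq * 60.0


# Synthetic 10 s trace at 30 frames per second:
# a weak 1.2 Hz pulse (72 bpm) on top of a stronger 0.1 Hz drift.
fps = 30
trace = [
    0.5 * math.sin(2 * math.pi * 1.2 * t / fps)
    + 2.0 * math.sin(2 * math.pi * 0.1 * t / fps)
    for t in range(300)
]
bpm = estimate_heart_rate_bpm(trace, fps)
print(round(bpm, 1))
```

Restricting the search to a plausible heart-rate band is what lets the weaker pulse component win over the larger drift term; with real footage, motion and lighting noise can fall inside the band, which is the kind of problem the abstract describes.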

Extrinsic Methods for Coding and Dictionary Learning on Grassmann Manifolds

Authors: Harandi Mehrtash, Hartley Richard, Lovell Brian, Sanderson Conrad, Shen Chunhua
Publication date: 01/01/2015

Sparsity-based representations have recently led to notable results in various visual recognition tasks. In a separate line of research, Riemannian manifolds have been shown to be useful for dealing with features and models that do not lie in Euclidean spaces. With the aim of building a bridge between the two realms, we address the problem of sparse coding and dictionary learning over the space of linear subspaces, which form Riemannian structures known as Grassmann manifolds. To this end, we propose to embed Grassmann manifolds into the space of symmetric matrices by an isometric mapping. This in turn enables us to extend two sparse coding schemes to Grassmann manifolds. Furthermore, we propose closed-form solutions for learning a Grassmann dictionary, atom by atom. Lastly, to handle non-linearity in data, we extend the proposed Grassmann sparse coding and dictionary learning algorithms through embedding into Hilbert spaces. Experiments on several classification tasks (gender recognition, gesture classification, scene analysis, face recognition, action recognition and dynamic texture classification) show that the proposed approaches achieve considerable improvements in discrimination accuracy, in comparison to state-of-the-art methods such as kernelized Affine Hull Method and graph-embedding Grassmann discriminant analysis. Comment: Appearing in International Journal of Computer Vision

arXiv.org e-Print Archive

University of Queensland eSpace
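
The idea of embedding subspaces into symmetric matrices, mentioned in the Grassmann abstract above, can be illustrated with a small sketch. It assumes the commonly used projection embedding, which maps an orthonormal basis X to the matrix X Xᵀ; the abstract does not name the mapping, so this choice is an assumption, and the helper names are hypothetical.

```python
import math


def projection_embedding(basis):
    """Map an orthonormal basis to the symmetric matrix X X^T.

    `basis` is a list of d-dimensional basis vectors (the columns of X).
    Hypothetical helper using the projection embedding as an assumed mapping.
    """
    d = len(basis[0])
    return [[sum(v[i] * v[j] for v in basis) for j in range(d)] for i in range(d)]


def frobenius_distance(a, b):
    """Frobenius norm of the difference between two equally sized matrices."""
    return math.sqrt(
        sum((x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    )


# Three bases for one-dimensional subspaces (lines) of the plane.
line_x = [[1.0, 0.0]]
line_x_flipped = [[-1.0, 0.0]]  # same line, different choice of basis
line_y = [[0.0, 1.0]]           # orthogonal line

p_x = projection_embedding(line_x)
p_x_flipped = projection_embedding(line_x_flipped)
p_y = projection_embedding(line_y)

same = frobenius_distance(p_x, p_x_flipped)  # identical subspaces
apart = frobenius_distance(p_x, p_y)         # orthogonal subspaces
print(same, apart)
```

The embedded matrix depends only on the subspace and not on the basis chosen to represent it, which is why the two bases for the same line land on the same point. That property is what allows Euclidean tools such as sparse coding to be applied to the embedded points.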

Fast Dynamic Texture Detection

Author: Xianghua Xie
Publication venue: Springer Science and Business Media LLC
Publication date: 05/09/2010

Cronfa at Swansea University

Augmenting Reinforcement Learning with Transformer-based Scene Representation Learning for Decision-making of Autonomous Driving

Authors: Huang Zhiyu, Liu Haochen, Lv Chen, Mo Xiaoyu
Publication date: 25/08/2023

Decision-making for urban autonomous driving is challenging due to the stochastic nature of interactive traffic participants and the complexity of road structures. Although reinforcement learning (RL)-based decision-making schemes are promising for handling urban driving scenarios, they suffer from low sample efficiency and poor adaptability. In this paper, we propose Scene-Rep Transformer to improve the RL decision-making capabilities with better scene representation encoding and sequential predictive latent distillation. Specifically, a multi-stage Transformer (MST) encoder is constructed to model not only the interaction awareness between the ego vehicle and its neighbors but also intention awareness between the agents and their candidate routes. A sequential latent Transformer (SLT) with self-supervised learning objectives is employed to distill the future predictive information into the latent scene representation, in order to reduce the exploration space and speed up training. The final decision-making module, based on soft actor-critic (SAC), takes as input the refined latent scene representation from the Scene-Rep Transformer and outputs driving actions. The framework is validated in five challenging simulated urban scenarios with dense traffic, and its performance is demonstrated quantitatively by substantial improvements in data efficiency and in success rate, safety, and efficiency. The qualitative results reveal that our framework is able to extract the intentions of neighbor agents to help make decisions and deliver more diversified driving behaviors.

arXiv.org e-Print Archive