
    Fast human activity recognition based on structure and motion

    This is the post-print version of the final paper published in Pattern Recognition Letters; the published article is available from the link below. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document, and changes may have been made to this work since it was submitted for publication. Copyright © 2011 Elsevier B.V.
    We present a method for the recognition of human activities. The proposed approach is based on the construction of a set of templates for each activity, as well as on the measurement of the motion in each activity. Templates are designed to capture the structural and motion information that is most discriminative among activities, while the direct motion measurements capture the amount of translational motion in each activity. The two features are fused at the recognition stage. Recognition is achieved in two steps by calculating the similarity between the templates and motion features of the test and reference activities. The proposed methodology is experimentally assessed and shown to yield excellent performance. (Funded by the European Commission.)
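
    The abstract does not spell out the fusion rule; the sketch below illustrates one plausible reading, where the fusion weight, the correlation-based template similarity, and all names are assumptions rather than the authors' method.

```python
import numpy as np

def recognise(test_template, test_motion, references, alpha=0.5):
    """Two-step recognition sketch: a template-similarity cue is fused with
    a translational-motion cue. Fusion weight `alpha`, the normalised
    correlation, and all names are illustrative assumptions."""
    scores = {}
    for label, (ref_template, ref_motion) in references.items():
        t, r = test_template.ravel(), ref_template.ravel()
        # Structural cue: normalised correlation between activity templates.
        template_sim = t @ r / (np.linalg.norm(t) * np.linalg.norm(r) + 1e-9)
        # Motion cue: agreement between scalar translational-motion measurements.
        motion_sim = 1.0 / (1.0 + abs(test_motion - ref_motion))
        # Late fusion of the two cues at the recognition stage.
        scores[label] = alpha * template_sim + (1 - alpha) * motion_sim
    return max(scores, key=scores.get)
```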

    Combining Users' Activity Survey and Simulators to Evaluate Human Activity Recognition Systems

    Open Access article. Evaluating human activity recognition systems usually implies following expensive and time-consuming methodologies, where experiments with humans are run with the consequent ethical and legal issues. We propose a novel evaluation methodology to overcome these problems, based on surveys for users and a synthetic dataset generator tool. Surveys capture how different users perform activities of daily living, while the synthetic dataset generator creates properly labelled activity datasets modelled with the information extracted from the surveys. Important aspects, such as sensor noise, varying time lapses and erratic user behaviour, can also be simulated with the tool. The proposed methodology is shown to have important advantages that allow researchers to carry out their work more efficiently. To evaluate the approach, a synthetic dataset generated following the proposed methodology is compared to a real dataset by computing the similarity between sensor occurrence frequencies. It is concluded that the similarity between the two datasets is significant.
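
    As a rough illustration of the dataset comparison step, the sketch below computes a cosine similarity between sensor occurrence frequency vectors; the metric choice and all names are assumptions, since the abstract does not specify the exact computation.

```python
from collections import Counter
import math

def occurrence_similarity(real_events, synthetic_events):
    """Cosine similarity between sensor occurrence frequency vectors
    built from two event streams (lists of sensor identifiers)."""
    real, synth = Counter(real_events), Counter(synthetic_events)
    sensors = set(real) | set(synth)
    dot = sum(real[s] * synth[s] for s in sensors)
    norm = math.sqrt(sum(v * v for v in real.values())) * \
           math.sqrt(sum(v * v for v in synth.values()))
    return dot / norm if norm else 0.0

# e.g. occurrence_similarity(["door", "fridge", "door"], ["door", "fridge"])
```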

    A Realistic Radar Ray Tracing Simulator for Hand Pose Imaging

    With the increasing popularity of human-computer interaction applications, there is also growing interest in generating sufficiently large and diverse data sets for automatic radar-based recognition of hand poses and gestures. Radar simulations are a vital approach to generating training data (e.g., for machine learning). Therefore, this work applies a ray tracing method to radar imaging of the hand. The performance of the proposed simulation approach is verified by comparing simulation and measurement data from an imaging radar with high lateral resolution. In addition, the surface material model incorporated into the ray tracer is described in more detail and parameterized for radar hand imaging. Measurements and simulations show a very high similarity between synthetic and real radar image captures. The presented results demonstrate that it is possible to generate very realistic simulations of radar measurement data even for complex radar hand pose imaging systems.
    Comment: 4 pages, 5 figures; accepted at European Microwave Week (EuMW 2023) under the topic "R28 Human Activity Monitoring, including Gesture Recognition".
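
    The abstract does not state how simulated and measured radar images are compared; a minimal stand-in is zero-mean normalised cross-correlation, sketched below with illustrative names.

```python
import numpy as np

def image_similarity(simulated, measured):
    """Zero-mean normalised cross-correlation between a simulated and a
    measured radar image (2-D arrays of the same shape). The paper's
    actual comparison metric is not given in the abstract."""
    a = simulated.astype(float) - simulated.mean()
    b = measured.astype(float) - measured.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom else 0.0
```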

    Contextual Information for Applications in Video Surveillance

    With a growing network of cameras being used for security applications, video-based monitoring relying on human operators is ineffective and lacks reliability and scalability. In this thesis, I present automatic solutions that enable monitoring of humans in videos, such as identifying the same individuals across different cameras (human re-identification) and recognizing human activities. Analyzing videos using only individual-based features can be very challenging because of the significant appearance and motion variance due to changing viewpoints, different lighting conditions, and occlusions. Motivated by the fact that people often form groups, it is feasible to model the interaction among group members to disambiguate individual features in video analysis tasks. This thesis introduces features that leverage the human group as contextual information and demonstrates their performance on the tasks of human re-identification and activity recognition.
    Two descriptors are introduced for human re-identification. The Subject Centric Group (SCG) feature captures a person's group appearance and shape information using an estimate of the persons' positions in 3D space; the metric is designed to consider both human appearance and group similarity. The Spatial Appearance Group (SAG) feature extracts group appearance and shape information directly from video frames, and a random-forest model is trained to predict the group's similarity score. For human activity recognition, I propose context features along with a deep model to recognize an individual subject's activity in videos of real-world scenes. Besides the motion features of the person, I also utilize group context and scene context information to improve recognition performance. Our experiments show that the proposed features reach state-of-the-art accuracy on challenging re-identification datasets that represent real-world scenarios, and also outperform state-of-the-art human activity recognition methods on the 5-activity and 6-activity versions of the Collective Activities dataset.
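
    A minimal sketch of the SAG-style final step, where a random forest predicts a group similarity score from paired group descriptors; the difference-vector input, the synthetic placeholder data, and all parameters are illustrative assumptions, not the thesis implementation.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical setup: each row is the element-wise absolute difference between
# the descriptors of two group observations; the label marks whether both
# observations show the same group. SAG feature extraction is not reproduced.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))       # placeholder |desc_a - desc_b| vectors
y = rng.integers(0, 2, size=200)     # 1 = same group, 0 = different group

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
# The positive-class probability serves as the group similarity score.
score = forest.predict_proba(X[:1])[0, 1]
```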

    Face Centered Image Analysis Using Saliency and Deep Learning Based Techniques

    Image analysis starts with the purpose of building vision machines that can perceive like humans: intelligently inferring general principles and sensing the surrounding situation from imagery. This dissertation studies face-centered image analysis as a core problem in high-level computer vision research and addresses it by tackling three challenging questions: Is there anything interesting in the image, and if so, what is it? If a person is present, who are they, what expression are they showing, and can we estimate their age? Answering these questions leads to saliency-based object detection, deep-learning-based object categorization and recognition, human facial landmark detection, and multi-task biometrics.
    For object detection, a three-level saliency detection method based on the self-similarity technique (SMAP) is first proposed. The first level of SMAP uses statistical methods to generate proto-background patches, followed by a second level that computes local contrast based on image self-similarity characteristics. Finally, a spatial color distribution constraint is applied to complete the saliency detection. The output of the algorithm is a full-resolution image with highlighted salient objects and well-defined edges.
    For object recognition, an Adaptive Deconvolution Network (ADN) is implemented to categorize the objects extracted by saliency detection. To improve system performance, an L1/2-norm-regularized ADN is proposed and tested in different applications; the results demonstrate the efficiency and significance of the new structure. To understand the facial biometrics contained in the image, low-rank matrix decomposition is introduced to help locate landmark points on face images; the natural extension of this work benefits human facial expression recognition and facial feature parsing research. To facilitate understanding of the detected facial image, automatic facial image analysis becomes essential. We present a novel deeply learnt tree-structured face representation to uniformly model the human face with different semantic meanings, and we show that the proposed feature yields a unified representation for multi-task facial biometrics and that the multi-task learning framework is applicable to many other computer vision tasks.
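
    As a loose illustration of the second, local-contrast level of SMAP, the sketch below scores image patches against a crude background estimate; the patch size, the scoring rule, and all names are assumptions rather than the dissertation's algorithm.

```python
import numpy as np

def local_contrast_saliency(image, patch=8):
    """Toy local-contrast step: score each patch of a 2-D grayscale image
    by its contrast against the mean of all patches, a crude stand-in for
    the proto-background patches of SMAP."""
    h, w = image.shape
    ph, pw = h // patch, w // patch
    blocks = image[:ph * patch, :pw * patch].reshape(ph, patch, pw, patch)
    means = blocks.mean(axis=(1, 3))            # per-patch mean intensity
    background = means.mean()                   # proto-background estimate
    saliency = np.abs(means - background)       # contrast against background
    return saliency / (saliency.max() + 1e-9)   # normalise to [0, 1]
```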

    Study of similarity metrics for matching network-based personalised human activity recognition.

    Personalised Human Activity Recognition (HAR) models trained using data from the target user (subject-dependent) have been shown to be superior to non-personalised models trained on data from a general population (subject-independent). From a practical perspective, however, collecting sufficient training data from end users to create subject-dependent models is not feasible. We have previously introduced an approach based on Matching Networks which has proved effective for training personalised HAR models while requiring very little data from the end user. Matching networks perform nearest-neighbour classification by reusing the class label of the most similar instances in a provided support set, which makes them very relevant to case-based reasoning. A key advantage of matching networks is that they use metric learning to produce feature embeddings, or representations, that maximise classification accuracy given a chosen similarity metric. However, to the best of our knowledge, no study has examined the performance of different similarity metrics for matching networks. In this paper, we present a study of five similarity metrics for personalised HAR: Euclidean, Manhattan, Dot Product, Cosine and Jaccard. Our evaluation shows that substantial differences in performance arise from the choice of metric, with Cosine and Jaccard producing the best performance.
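
    The five metrics map directly to code; the sketch below pairs them with the nearest-neighbour step of a matching network, assuming embeddings have already been produced by the learnt encoder. The weighted form of Jaccard for real-valued features is an assumption; the paper may define it differently.

```python
import numpy as np

def euclidean(a, b):   return -np.linalg.norm(a - b)      # negated: higher = more similar
def manhattan(a, b):   return -np.abs(a - b).sum()
def dot_product(a, b): return float(a @ b)
def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))
def jaccard(a, b):
    # Weighted Jaccard for non-negative real-valued embeddings.
    return float(np.minimum(a, b).sum() / (np.maximum(a, b).sum() + 1e-9))

def nearest_neighbour(query, support_set, metric=cosine):
    """Matching-network-style classification: reuse the label of the most
    similar support instance under the chosen metric.
    support_set is a list of (embedding, label) pairs."""
    return max(support_set, key=lambda item: metric(query, item[0]))[1]
```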

    Human and Group Activity Recognition from Video Sequences

    A good solution to human activity recognition enables a wide variety of useful applications, such as visual surveillance, vision-based Human-Computer Interaction (HCI) and gesture recognition. In this thesis, a graph-based approach to human activity recognition is proposed which models spatio-temporal features as contextual space-time graphs. In this method, spatio-temporal gradient cuboids are extracted at significant regions of activity, and feature graphs (gradient, space-time, local neighbours, immediate neighbours) are constructed using the similarity matrix. The Laplacian representation of the graph is utilised to reduce computational complexity and to allow the use of traditional statistical classifiers.
    A second methodology is proposed to detect and localise abnormal activities in crowded scenes. This approach has two stages: training and testing. During the training stage, specific human activities are identified and characterised by modelling medium-term movement flow through streaklines; each streakline is formed by multiple optical flow vectors that represent and locally track movement in the scene, and a dictionary of activities is recorded for a given scene. During the testing stage, the consistency of each observed activity with those in the dictionary is verified using the Kullback-Leibler (KL) divergence. The anomaly detection of the proposed methodology is compared against the state of the art, producing state-of-the-art results for localising anomalous activities.
    Finally, we propose an automatic group activity recognition approach by modelling the interdependencies of group activity features over time. We model the group interdependencies in both motion and location spaces; these spaces are extended to time-space and time-movement spaces and modelled using Kernel Density Estimation (KDE). The recognition performance of the proposed methodology improves on state-of-the-art results on group activity datasets.
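
    Two of the building blocks named here are standard and easy to sketch: the graph Laplacian of a similarity matrix and the KL divergence used to test an observed activity against the dictionary. The symmetric normalisation and smoothing constants below are assumptions; the thesis may use other variants.

```python
import numpy as np

def normalised_laplacian(similarity):
    """Symmetric normalised Laplacian L = I - D^{-1/2} W D^{-1/2} of a
    feature graph with similarity (adjacency) matrix W."""
    d_inv_sqrt = np.diag(1.0 / np.sqrt(similarity.sum(axis=1) + 1e-12))
    return np.eye(len(similarity)) - d_inv_sqrt @ similarity @ d_inv_sqrt

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between an observed activity distribution p and a
    dictionary entry q; a large divergence flags the activity as anomalous."""
    p = np.asarray(p, float) + eps
    q = np.asarray(q, float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float((p * np.log(p / q)).sum())
```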

    Exploring Natural Language Processing Methods for Interactive Behaviour Modelling

    Analysing and modelling interactive behaviour is an important topic in human-computer interaction (HCI) and a key requirement for the development of intelligent interactive systems. Interactive behaviour has a sequential structure (actions happen one after another) and a hierarchical structure (a sequence of actions forms an activity driven by interaction goals), which may be similar to the structure of natural language. Designed around such structure, natural language processing (NLP) methods have achieved groundbreaking success in various downstream tasks. However, few works have linked interactive behaviour with natural language. In this paper, we explore the similarity between interactive behaviour and natural language by applying an NLP method, byte pair encoding (BPE), to encode mouse and keyboard behaviour. We then analyse the vocabulary, i.e., the set of action sequences, learnt by BPE, and use this vocabulary to encode input behaviour for interactive task recognition. An existing dataset collected in constrained lab settings and our novel out-of-the-lab dataset were used for evaluation. Results show that this natural-language-inspired approach not only learns action sequences that reflect specific interaction goals, but also achieves higher F1 scores on task recognition than other methods. Our work reveals the similarity between interactive behaviour and natural language, and demonstrates the potential of applying this family of NLP-inspired methods to model interactive behaviour in HCI.
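
    A minimal BPE sketch over action sequences, assuming actions are plain string tokens: repeatedly merging the most frequent adjacent pair yields compound tokens that can stand for recurring interaction routines. The token naming and merge loop are illustrative; the paper's tokenisation details may differ.

```python
from collections import Counter

def bpe_merges(sequences, num_merges):
    """Learn a vocabulary of compound action tokens by byte pair encoding:
    each round merges the most frequent adjacent pair into one token."""
    seqs = [list(s) for s in sequences]
    vocab = []
    for _ in range(num_merges):
        pairs = Counter(
            (seq[i], seq[i + 1]) for seq in seqs for i in range(len(seq) - 1)
        )
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        vocab.append(best)
        merged = "+".join(best)
        for seq in seqs:
            i = 0
            while i < len(seq) - 1:
                if (seq[i], seq[i + 1]) == best:
                    seq[i:i + 2] = [merged]   # replace the pair in place
                else:
                    i += 1
    return vocab, seqs

# e.g. bpe_merges([["click", "drag", "click", "drag"], ["click", "drag"]], 2)
```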