24 research outputs found

    A discussion on the validation tests employed to compare human action recognition methods using the MSR Action3D dataset

    This paper aims to determine the best human action recognition method based on features extracted from RGB-D devices, such as the Microsoft Kinect. A review has been performed of all the papers that reference MSR Action3D, the most widely used dataset that includes depth information acquired from an RGB-D device. We found that the validation method differs from work to work, so a direct comparison among works cannot be made. Nevertheless, almost all of the works compare their results without taking this issue into account. Therefore, we present different rankings according to the validation methodology used, in order to clarify the existing confusion. Comment: 16 pages and 7 tables
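    To illustrate why the validation protocol matters, the sketch below contrasts a cross-subject split of MSR Action3D (for example, training on subjects 1, 3, 5, 7, 9, one commonly used protocol) with a random split that ignores performer identity. The sample layout and parameter names are assumptions for illustration, not taken from the paper.

```python
# Minimal sketch contrasting two ways MSR Action3D results get validated.
# Assumes `samples` is a list of (features, action_label, subject_id) tuples;
# the half-subject split (train on subjects 1, 3, 5, 7, 9) is one commonly
# used protocol, but, as the paper notes, published works differ on this.
from sklearn.model_selection import train_test_split


def cross_subject_split(samples, train_subjects=(1, 3, 5, 7, 9)):
    """Split by performer identity: no subject appears in both sets."""
    train = [(x, y) for x, y, s in samples if s in train_subjects]
    test = [(x, y) for x, y, s in samples if s not in train_subjects]
    return train, test


def random_split(samples, test_ratio=0.5, seed=0):
    """Split ignoring subject identity: clips of the same performer can
    end up in both sets, which usually inflates reported accuracy."""
    xy = [(x, y) for x, y, _ in samples]
    return train_test_split(xy, test_size=test_ratio, random_state=seed)
```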

    Two-Stream RNN/CNN for Action Recognition in 3D Videos

    The recognition of actions from video sequences has many applications in health monitoring, assisted living, surveillance, and smart homes. Despite advances in sensing, in particular related to 3D video, the methodologies to process the data are still subject to research. We demonstrate superior results with a system that combines recurrent neural networks and convolutional neural networks in a voting approach. The gated-recurrent-unit-based neural networks are particularly well-suited to distinguishing actions based on long-term information from optical tracking data; the 3D-CNNs focus more on detailed, recent information from video data. The resulting features are merged in an SVM, which then classifies the movement. With this architecture, our method improves on the recognition rates of state-of-the-art methods by 14% on standard data sets. Comment: Published in 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
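    A hedged sketch of the two-stream idea described above: a GRU summarises long-term tracking sequences, a small 3D CNN summarises recent video frames, and the concatenated features are classified by an SVM. Layer sizes, pooling choices and input shapes are illustrative assumptions, not the paper's configuration.

```python
# Two feature streams fused late and fed to an SVM classifier.
import torch
import torch.nn as nn


class GRUStream(nn.Module):
    """Summarises a long optical-tracking sequence into one feature vector."""
    def __init__(self, in_dim=60, hidden=128):
        super().__init__()
        self.gru = nn.GRU(in_dim, hidden, batch_first=True)

    def forward(self, seq):                 # seq: (B, T, in_dim)
        _, h = self.gru(seq)
        return h[-1]                        # (B, hidden)


class CNN3DStream(nn.Module):
    """Summarises a short video clip into one feature vector."""
    def __init__(self, out_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(16, out_dim))

    def forward(self, clip):                # clip: (B, 3, T, H, W)
        return self.net(clip)


def fused_features(gru_stream, cnn_stream, seq, clip):
    """Concatenate both streams; an sklearn.svm.SVC fitted on these
    fused training features then votes for the action class."""
    with torch.no_grad():
        return torch.cat([gru_stream(seq), cnn_stream(clip)], dim=1).numpy()
```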

    Multi-View Region Adaptive Multi-temporal DMM and RGB Action Recognition

    Human action recognition remains an important yet challenging task. This work proposes a novel action recognition system. It uses a novel Multiple View Region Adaptive Multi-resolution-in-time Depth Motion Map (MV-RAMDMM) formulation combined with appearance information. Multiple-stream 3D Convolutional Neural Networks (CNNs) are trained on the different views and time resolutions of the region-adaptive Depth Motion Maps. Multiple views are synthesised to enhance view invariance. The region-adaptive weights, based on localised motion, accentuate and differentiate the parts of actions with faster motion. Dedicated 3D CNN streams for multi-temporal-resolution appearance (RGB) information are also included; these help to identify and differentiate small object interactions. A pre-trained 3D CNN is used, fine-tuned for each stream, along with multi-class Support Vector Machines (SVMs). Average score fusion is applied to the outputs. The developed approach is capable of recognising both human actions and human-object interactions. Three public-domain datasets, MSR 3D Action, Northwestern UCLA multi-view actions, and MSR 3D daily activity, are used to evaluate the proposed solution. The experimental results demonstrate the robustness of this approach compared with state-of-the-art algorithms. Comment: 14 pages, 6 figures, 13 tables. Submitted
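    As a rough illustration of the depth-motion-map building block, the sketch below accumulates absolute differences between consecutive projected depth frames over a chosen temporal window. The paper's full method adds synthesised views, several temporal resolutions, region-adaptive weights and RGB streams; the parameter names here are made up.

```python
# Basic depth motion map (DMM): accumulate frame-to-frame motion energy
# for one projection view of a depth sequence.
import numpy as np


def depth_motion_map(depth_frames, start=0, length=None, eps=0.0):
    """depth_frames: array of shape (T, H, W) for one projection view."""
    frames = np.asarray(depth_frames, dtype=np.float32)
    end = frames.shape[0] if length is None else min(start + length, frames.shape[0])
    dmm = np.zeros(frames.shape[1:], dtype=np.float32)
    for t in range(start + 1, end):
        diff = np.abs(frames[t] - frames[t - 1])
        dmm += np.where(diff > eps, diff, 0.0)   # keep only changed pixels
    return dmm


# Multi-temporal variant: DMMs over progressively shorter windows, e.g.
# maps = [depth_motion_map(frames, length=L)
#         for L in (len(frames), len(frames) // 2, len(frames) // 4)]
```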

    Robust object representation by boosting-like deep learning architecture

    This paper presents a new deep learning architecture for robust object representation, aiming to efficiently combine the proposed synchronized multi-stage feature (SMF) with a boosting-like algorithm. The SMF structure can capture a variety of characteristics from the input object by fusing handcrafted features with deep-learned features. With the proposed boosting-like algorithm, we obtain greater convergence stability when training the multi-layer network by using boosted samples. We show the generality of our object representation architecture by applying it to various tasks, i.e. pedestrian detection and action recognition. Our approach achieves 15.89% and 3.85% reductions in the average miss rate compared with ACF and JointDeep on the largest Caltech dataset, and obtains competitive results on the MSRAction3D dataset.
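    A minimal sketch of what a "boosting-like" training loop could look like: samples the current stage misclassifies receive larger sampling weights, so later stages see them more often. The reweighting scheme and the train_one_stage / predict callables are assumptions for illustration, not the paper's exact algorithm.

```python
# Boosting-like reweighting of training samples across network stages.
import numpy as np


def boosting_like_training(train_one_stage, predict, X, y, stages=3):
    """X, y: NumPy arrays of inputs and labels; train_one_stage(X, y)
    returns a trained model; predict(model, X) returns predicted labels."""
    n = len(y)
    weights = np.full(n, 1.0 / n)
    models = []
    for _ in range(stages):
        # Draw a boosted sample: hard examples are drawn more often.
        idx = np.random.choice(n, size=n, p=weights)
        models.append(train_one_stage(X[idx], y[idx]))
        # Re-weight: increase the weight of misclassified samples.
        wrong = (predict(models[-1], X) != y).astype(np.float32)
        weights = weights * np.exp(wrong)
        weights /= weights.sum()
    return models
```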

    Artificial Intelligence Enabled Methods for Human Action Recognition using Surveillance Videos

    Computer vision applications have been attracting researchers and academia, all the more so as cloud computing resources enable such applications. Analysing surveillance video has become an important research area due to its widespread use. For instance, CCTV cameras are used in public places to monitor situations and to identify instances of theft or crime. With thousands of such surveillance videos streaming simultaneously, manual analysis is a very tedious and time-consuming task. There is a need for an automated approach that analyses the footage and notifies the officers concerned of its findings. Such analysis is very useful to police and investigation agencies for ascertaining facts, recovering evidence, and even supporting digital forensics. In this context, this paper surveys different methods of human action recognition (HAR) using machine learning (ML) and deep learning (DL), which come under Artificial Intelligence (AI). It also reviews methods for privacy-preserving action recognition and Generative Adversarial Networks (GANs). The paper further describes the different datasets used for human action recognition research and gives an account of research gaps that can guide further work in the area.

    Vision based dynamic thermal comfort control using fuzzy logic and deep learning

    A wide range of techniques exist to help control the thermal comfort of an occupant in indoor environments. A novel technique is presented here that adaptively estimates the occupant's metabolic rate. This is done by using a computer vision system to identify the occupant's activity; recognised actions are then translated into metabolic rates. The widely used Predicted Mean Vote (PMV) thermal comfort index is computed using the adaptively estimated metabolic rate, and the PMV is then used as an input to a fuzzy control system. The performance of the proposed system is evaluated using simulations of various activities. The integration of the PMV thermal comfort index and the action recognition system makes it possible to adaptively control an occupant's thermal comfort without the need to attach a sensor to the occupant at all times. The obtained results are compared with those obtained using one or two fixed metabolic rates, and appear to show improved performance even in the presence of errors in the action recognition system.
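    A minimal sketch of the control idea, not the paper's implementation: recognised actions are mapped to assumed metabolic rates, and a tiny Sugeno-style fuzzy rule base turns the resulting PMV value (computed separately, e.g. per ISO 7730) into a temperature-setpoint adjustment. The MET values, membership functions and rule consequents are illustrative assumptions.

```python
# Action -> metabolic rate lookup and a small fuzzy rule base over PMV.

ACTION_MET = {"lying": 0.8, "sitting": 1.0, "standing": 1.2, "walking": 2.0}


def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)


def fuzzy_setpoint_change(pmv):
    """Map a PMV value in [-3, 3] to a setpoint change in degrees Celsius."""
    rules = [  # (membership degree, consequent change in degC)
        (tri(pmv, -3.5, -2.0, -0.5), +2.0),   # cold        -> heat more
        (tri(pmv, -1.5, 0.0, 1.5), 0.0),      # comfortable -> hold
        (tri(pmv, 0.5, 2.0, 3.5), -2.0),      # warm        -> cool down
    ]
    num = sum(w * c for w, c in rules)
    den = sum(w for w, _ in rules)
    return num / den if den else 0.0


# Example: an occupant recognised as "walking" raises the metabolic rate,
# which typically pushes the computed PMV positive, so the setpoint drops:
# met = ACTION_MET["walking"]; change = fuzzy_setpoint_change(pmv=1.4)
```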

    Action Recognition

    The topic of this bachelor's thesis is action recognition. The thesis focuses on recognising and classifying actions using two methods, namely DMM and SSM. Both methods are described and implemented. The implementation uses the C++ programming language and the OpenCV computer vision library.
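    The thesis itself implements these methods in C++ with OpenCV; purely as an illustration of the self-similarity-matrix (SSM) idea, the NumPy sketch below computes pairwise distances between per-frame descriptors. The choice of descriptor (e.g. joint positions) is an assumption.

```python
# Self-similarity matrix: pairwise distances between per-frame features.
import numpy as np


def self_similarity_matrix(frame_features):
    """frame_features: array of shape (T, D), one descriptor per frame.
    Returns the (T, T) matrix of Euclidean distances between frames."""
    f = np.asarray(frame_features, dtype=np.float32)
    diff = f[:, None, :] - f[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))
```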