
    Human from Blur: Human Pose Tracking from Blurry Images

    We propose a method to estimate 3D human poses from substantially blurred images. The key idea is to tackle the inverse problem of image deblurring by modeling the forward problem with a 3D human model, a texture map, and a sequence of poses that describes the human motion. The blurring process is then modeled by a temporal image aggregation step. Using a differentiable renderer, we can solve the inverse problem by backpropagating the pixel-wise reprojection error to recover the human motion representation that best explains one or multiple input images. Since the image reconstruction loss alone is insufficient, we present additional regularization terms. To the best of our knowledge, we present the first method to tackle this problem. Our method consistently outperforms other methods on significantly blurry inputs, since they lack one or more key capabilities that our method unifies, i.e. image deblurring with sub-frame accuracy and explicit 3D modeling of non-rigid human motion.

    Comment: typos and minor error fixes
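    The forward model at the core of this approach, aggregating sub-frame renders into one blurry image, can be sketched in a few lines. The snippet below is a minimal illustration assuming a plain temporal average over a pre-rendered image stack; `temporal_aggregate` and `reprojection_loss` are hypothetical names, and the actual method differentiates through a renderer rather than operating on fixed frames.

    ```python
    import numpy as np

    def temporal_aggregate(frames):
        """Model a motion-blurred image as the mean of sharp sub-frame renders.

        `frames` is a (T, H, W) stack of images rendered at sub-frame poses;
        a plain temporal average stands in for the aggregation step described
        in the abstract.
        """
        frames = np.asarray(frames, dtype=np.float64)
        return frames.mean(axis=0)

    def reprojection_loss(blurred_obs, frames):
        """Pixel-wise squared error between the observed blurry image and
        the aggregate of the candidate sub-frame renders."""
        return float(np.mean((temporal_aggregate(frames) - blurred_obs) ** 2))
    ```

    A bright point moving one pixel per sub-frame, averaged over four sub-frames, leaves a streak with intensity 1/4 at each visited pixel, which is exactly the kind of observation the optimization must explain.
    
    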

    Bio-inspired Dynamic 3D Discriminative Skeletal Features for Human Action Recognition

    Over the last few years, with the immense popularity of the Kinect, there has been renewed interest in developing methods for human gesture and action recognition from 3D data. A number of approaches have been proposed that extract representative features from 3D depth data, a reconstructed 3D surface mesh or, more commonly, from the recovered estimate of the human skeleton. Recent advances in neuroscience have discovered a neural encoding of static 3D shapes in primate infero-temporal cortex that can be represented as a hierarchy of medial axis and surface features. We hypothesize that a similar neural encoding might also exist for 3D shapes in motion and propose a hierarchy of dynamic medial axis structures at several spatio-temporal scales that can be modeled using a set of Linear Dynamical Systems (LDSs). We then propose novel discriminative metrics for comparing these sets of LDSs for the task of human activity recognition. Combined with simple classification frameworks, our proposed features and corresponding hierarchical dynamical models provide the highest human activity recognition rates compared to state-of-the-art methods on several skeletal datasets.
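    Fitting one of the LDSs above to a skeletal feature trajectory reduces to a least-squares estimate of the transition matrix in x_{t+1} ≈ A x_t. The sketch below makes that concrete; the Frobenius distance between transition matrices is a deliberately crude stand-in for the paper's discriminative metrics, and both function names are hypothetical.

    ```python
    import numpy as np

    def fit_lds(X):
        """Least-squares fit of the transition matrix A in x_{t+1} ≈ A x_t.

        X has shape (T, d): T time steps of a d-dimensional skeletal feature.
        Returns the (d, d) matrix minimizing ||X[1:] - X[:-1] @ A.T||_F.
        """
        X0, X1 = X[:-1], X[1:]
        M, *_ = np.linalg.lstsq(X0, X1, rcond=None)  # solves X0 @ M ≈ X1
        return M.T

    def lds_distance(A1, A2):
        """Crude model-space distance: Frobenius norm between transition
        matrices, standing in for subspace-angle-style metrics on LDSs."""
        return float(np.linalg.norm(A1 - A2, "fro"))
    ```

    On a trajectory generated by a planar rotation, the fit recovers the rotation matrix exactly (up to numerical precision), so identical motions have zero model distance.
    
    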

    Eigenvector-based Dimensionality Reduction for Human Activity Recognition and Data Classification

    In the context of appearance-based human motion compression, representation, and recognition, we have proposed a robust framework based on the eigenspace technique. First, a new appearance-based template-matching approach, which we named Motion Intensity Image, compresses a human motion video into a simple and concise, yet very expressive representation. Second, a learning strategy based on the eigenspace technique is employed for dimensionality reduction, using each of PCA and FDA to provide maximum data variance and maximum class separability, respectively. Third, a new compound eigenspace is introduced for multiple directed motion recognition that also accounts for possible changes in scale. This method extracts two additional features that are used to control the recognition process. A similarity measure based on Euclidean distance is employed for matching dimensionally-reduced testing templates against a projected set of known motion templates. In the stream of nonlinear classification, we have introduced a new eigenvector-based recognition model built upon the idea of the kernel technique. A practical study on the use of the kernel technique with 18 different functions has been carried out, showing how crucial the choice of the right kernel function is for the success of the subsequent linear discrimination in the feature space for a particular problem. Building upon the theory of reproducing kernels, we have also proposed a new robust nonparametric discriminant analysis (NDA) approach with kernels. Our proposed technique can efficiently find a nonparametric kernel representation in which linear discriminants can perform better. Data classification is achieved by integrating the linear version of the NDA with the kernel mapping. Based on the kernel trick, we have provided a new formulation for Fisher's criterion, defined in terms of the Gram matrix only.
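    The linear part of the pipeline, projecting templates into a low-dimensional eigenspace and matching with Euclidean distance, can be sketched with plain-NumPy PCA. `pca_project` and `nearest_motion` are hypothetical helper names, and this omits the FDA and kernel extensions discussed above.

    ```python
    import numpy as np

    def pca_project(templates, k):
        """Eigenspace projection: center the template rows, take the top-k
        principal directions via SVD, and return a projection function."""
        mean = templates.mean(axis=0)
        centered = templates - mean
        # Right singular vectors of the centered data are the eigenvectors
        # of the sample covariance matrix.
        _, _, Vt = np.linalg.svd(centered, full_matrices=False)
        basis = Vt[:k]
        return lambda x: (x - mean) @ basis.T

    def nearest_motion(test_vec, projected_gallery):
        """Euclidean nearest-neighbour match in the reduced eigenspace."""
        d = np.linalg.norm(projected_gallery - test_vec, axis=1)
        return int(np.argmin(d))
    ```

    A gallery template projected through the same eigenspace matches itself at distance zero, which is the degenerate sanity case of the matching step.
    
    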

    People detection based on appearance and motion models

    A. Garcia-Martin, A. Hauptmann, and J. M. Martínez, "People detection based on appearance and motion models", in 8th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2011, pp. 256-260. The main contribution of this paper is a new people detection algorithm based on motion information. The algorithm builds a people motion model based on the Implicit Shape Model (ISM) framework and the MoSIFT descriptor. We also propose a detection system that integrates appearance, motion and tracking information. Experimental results over sequences extracted from the TRECVID dataset show that our new people motion detector produces results comparable to the state of the art, and that the proposed multimodal fusion system improves the obtained results by combining the three information sources. This work has been partially supported by the Cátedra UAM-Infoglobal ("Nuevas tecnologías de vídeo aplicadas a sistemas de video-seguridad") and by the Universidad Autónoma de Madrid ("FPI-UAM: Programa propio de ayudas para la Formación de Personal Investigador").
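    The final fusion step, combining appearance, motion and tracking evidence, can be illustrated with a simple late fusion of per-detection confidence scores. The weighted sum and the weights below are illustrative assumptions, not the paper's actual fusion rule, and `fuse_scores` is a hypothetical name.

    ```python
    import numpy as np

    def fuse_scores(appearance, motion, tracking, weights=(0.4, 0.4, 0.2)):
        """Late fusion of per-detection confidence scores from the three
        sources the paper combines (appearance, motion, tracking), as a
        normalized weighted sum."""
        scores = np.stack([np.asarray(appearance, dtype=float),
                           np.asarray(motion, dtype=float),
                           np.asarray(tracking, dtype=float)])
        w = np.asarray(weights, dtype=float)
        w = w / w.sum()  # normalize so fused scores stay in the input range
        return w @ scores
    ```

    A detection confirmed by all three sources keeps its full score, while one supported by only some sources is attenuated in proportion to the weights.
    
    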

    3D human motion sequences synchronization using dense matching algorithm

    Annual Symposium of the German Association for Pattern Recognition (DAGM), 2006, Berlin (Germany). This work solves the problem of synchronizing pre-recorded human motion sequences, which show different speeds and accelerations, by using a novel dense matching algorithm. The approach is based on the dynamic programming principle, which allows an optimal solution to be found very quickly. Additionally, an optimal sequence is automatically selected from the input data set to serve as a time-scale pattern for all other sequences. The synchronized motion sequences are used to learn a model of human motion for action recognition and full-body tracking purposes. This work was supported by the project 'Integration of robust perception, learning, and navigation systems in mobile robotics' (J-0929). Peer Reviewed
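    The dynamic-programming principle behind the dense matching can be illustrated with the classic DTW-style recurrence on 1-D sequences; the paper's algorithm additionally selects a reference sequence and returns dense correspondences, which this minimal sketch omits.

    ```python
    import numpy as np

    def dtw_align(a, b):
        """Dynamic-programming alignment cost of two 1-D motion sequences
        that may differ in speed and acceleration.

        D[i, j] is the minimal cumulative cost of matching the first i
        samples of `a` with the first j samples of `b`, allowing repeats
        on either side (the three-way min below).
        """
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = abs(a[i - 1] - b[j - 1])
                D[i, j] = cost + min(D[i - 1, j],      # a waits
                                     D[i, j - 1],      # b waits
                                     D[i - 1, j - 1])  # both advance
        return float(D[n, m])
    ```

    Two recordings of the same motion at different speeds align at (near) zero cost, which is exactly the property the synchronization step relies on.
    
    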

    MotionBERT: A Unified Perspective on Learning Human Motion Representations

    We present a unified perspective on tackling various human-centric video tasks by learning human motion representations from large-scale and heterogeneous data resources. Specifically, we propose a pretraining stage in which a motion encoder is trained to recover the underlying 3D motion from noisy partial 2D observations. The motion representations acquired in this way incorporate geometric, kinematic, and physical knowledge about human motion, which can be easily transferred to multiple downstream tasks. We implement the motion encoder with a Dual-stream Spatio-temporal Transformer (DSTformer) neural network, which captures long-range spatio-temporal relationships among the skeletal joints comprehensively and adaptively, as exemplified by the lowest 3D pose estimation error so far when trained from scratch. Furthermore, our proposed framework achieves state-of-the-art performance on all three downstream tasks by simply finetuning the pretrained motion encoder with a simple regression head (1-2 layers), which demonstrates the versatility of the learned motion representations. Code and models are available at https://motionbert.github.io/

    Comment: ICCV 2023 Camera Ready
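    The pretraining setup, corrupting 2D observations so the encoder must recover clean 3D motion, can be sketched by the data construction alone. The snippet assumes orthographic projection, Gaussian noise, and random joint-level masking, all simplifications; `make_pretraining_pair` is a hypothetical name and no DSTformer is involved.

    ```python
    import numpy as np

    def make_pretraining_pair(joints_3d, mask_ratio=0.25, noise_std=0.01, seed=0):
        """Build a (noisy partial 2D, clean 3D) training pair: the encoder
        would see the corrupted 2D keypoints and be supervised to recover
        the underlying 3D motion.

        joints_3d has shape (T, J, 3): T frames of J joints.
        Returns (x2d, keep, target): corrupted (T, J, 2) observations, a
        (T, J) boolean visibility mask, and the untouched 3D target.
        """
        rng = np.random.default_rng(seed)
        x2d = joints_3d[..., :2].copy()                  # orthographic: drop depth
        x2d += rng.normal(0.0, noise_std, x2d.shape)     # observation noise
        keep = rng.random(x2d.shape[:-1]) >= mask_ratio  # random joint mask
        x2d[~keep] = 0.0                                 # zero out masked joints
        return x2d, keep, joints_3d
    ```

    During pretraining the reconstruction loss would only be meaningful because the masked and noisy 2D input underdetermines the 3D output, forcing the encoder to internalize motion priors.
    
    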

    Fast Adaptive Reparametrization (FAR) with Application to Human Action Recognition

    In this paper, a fast approach for curve reparametrization, called Fast Adaptive Reparametrization (FAR), is introduced. Instead of computing an optimal matching between two curves, as in Dynamic Time Warping (DTW) and elastic distance-based approaches, our method is applied to each curve independently, leading to linear computational complexity. It is based on a simple replacement of the curve parameter by a variable that is invariant under specific variations of reparametrization. The choice of this variable is made heuristically according to the application of interest. In addition to being fast, the proposed reparametrization can be applied not only to curves observed in Euclidean spaces but also to feature curves living in Riemannian spaces. To validate our approach, we apply it to the scenario of human action recognition using curves living in the Riemannian product Special Euclidean space SE(3)^n. The obtained results on three benchmarks for human action recognition (MSRAction3D, Florence3D, and UTKinect) show that our approach competes with state-of-the-art methods in terms of accuracy and computational cost.
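    One concrete instance of a reparametrization-invariant variable is normalized arc length. The sketch below resamples a Euclidean polyline at uniform arc length, independently of any other curve and in linear time; it is an illustrative choice, not necessarily the variable the paper adopts, and it does not handle Riemannian feature curves.

    ```python
    import numpy as np

    def arclength_reparam(curve, n_samples):
        """Reparametrize a polyline by normalized arc length, independently
        of any other curve.

        `curve` has shape (N, d). Arc length is invariant under how fast
        the curve was traversed, so two traversals of the same path at
        different speeds resample to the same points.
        """
        curve = np.asarray(curve, dtype=float)
        seg = np.linalg.norm(np.diff(curve, axis=0), axis=1)
        s = np.concatenate([[0.0], np.cumsum(seg)])
        s /= s[-1]                           # normalized arc length in [0, 1]
        u = np.linspace(0.0, 1.0, n_samples)
        # Linearly interpolate each coordinate at uniform arc length.
        return np.stack([np.interp(u, s, curve[:, d])
                         for d in range(curve.shape[1])], axis=1)
    ```

    A straight segment sampled non-uniformly comes back with evenly spaced points, so a curve and its time-warped copy become directly comparable without any pairwise matching.
    
    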

    Enhancing Robot Collaborative Skills by Predicting Human Motion on Small Datasets through Deep Transfer Learning

    The objective of this thesis is to investigate the application of Deep Transfer Learning to Human Motion Prediction models within the context of Collaborative Robotics. Currently, motion anticipation algorithms primarily focus on datasets based on generic human actions in open spaces, neglecting the crucial aspects of interaction with the surrounding environment. Given the limited availability of such datasets, the objective is to explore and evaluate the transferability of knowledge from Deep Learning models for Human Motion Prediction to previously unseen, small datasets that are more specific to collaboration settings.