193 research outputs found

    Muscle synergies in neuroscience and robotics: from input-space to task-space perspectives

    In this paper we review the work on muscle synergies that has been carried out in neuroscience and control engineering. In particular, we refer to the hypothesis that the central nervous system (CNS) generates desired muscle contractions by combining a small number of predefined modules, called muscle synergies. We provide an overview of the methods that have been employed to test the validity of this scheme, and we show how the concept of muscle synergy has been generalized for the control of artificial agents. The comparison between these two lines of research, in particular their different goals and approaches, is instrumental in explaining the computational implications of the hypothesized modular organization. Moreover, it clarifies the importance of assessing the functional role of muscle synergies: although these basic modules are defined at the level of muscle activations (input-space), they should result in the effective accomplishment of the desired task. This requirement is not always explicitly considered in experimental neuroscience, as muscle synergies are often estimated solely by analyzing recorded muscle activities. We suggest that synergy extraction methods should explicitly take into account task execution variables, thus moving from a perspective based purely on input-space to one grounded in task-space as well.

    SAVASA project @ TRECVid 2013: semantic indexing and interactive surveillance event detection

    In this paper we describe our participation in the semantic indexing (SIN) and interactive surveillance event detection (SED) tasks at TRECVid 2013 [11]. Our work was motivated by the goals of the EU SAVASA project (Standards-based Approach to Video Archive Search and Analysis), which supports search over multiple video archives. Our aims were to assess a standard object detection methodology (SIN), to evaluate contrasting runs in automatic event detection (SED), and to deploy a distributed, cloud-based search interface for the interactive component of the SED task. We present results from the SIN task, the underlying retrospective classifiers for the surveillance event detection, and a discussion of the contrasting aims of the SAVASA user interface compared with the TRECVid task requirements.

    Human Motion Analysis for Efficient Action Recognition

    Automatic understanding of human actions is at the core of several application domains, such as content-based indexing, human-computer interaction, surveillance, and sports video analysis. Recent advances in digital platforms and the exponential growth of video and image data have created an urgent need for intelligent frameworks that automatically analyze human motion and predict the corresponding action from visual data and sensor signals. This thesis presents a collection of methods that target human action recognition using different action modalities. The first method uses the appearance modality and classifies human actions based on heterogeneous global and local features of scene and human-body appearance. The second method harnesses 2D and 3D articulated human poses and analyzes the body motion using a discriminative combination of the parts’ velocities, locations, and correlation histograms for action recognition. The third method presents an optimal scheme for combining the probabilistic predictions from different action modalities by solving a constrained quadratic optimization problem. In addition to the action classification task, we present a study that compares the utility of different pose variants in motion analysis for human action recognition. In particular, we compare the recognition performance when 2D and 3D poses are used. Finally, we demonstrate the efficiency of our pose-based method for action recognition in spotting and segmenting motion gestures in real time from a continuous input video stream for the recognition of the Italian sign gesture language.

    Efficient Human Activity Recognition in Large Image and Video Databases

    Vision-based human action recognition has attracted considerable interest in recent research for its applications to video surveillance, content-based search, healthcare, and interactive games. Most existing research deals with building informative feature descriptors, designing efficient and robust algorithms, proposing versatile and challenging datasets, and fusing multiple modalities. Often, these approaches build on certain conventions such as the use of motion cues to determine video descriptors, application of off-the-shelf classifiers, and single-factor classification of videos. In this thesis, we deal with important but overlooked issues such as efficiency, simplicity, and scalability of human activity recognition in different application scenarios: controlled video environments (e.g., indoor surveillance), unconstrained videos (e.g., YouTube), depth or skeletal data (e.g., captured by Kinect), and person images (e.g., Flickr). In particular, we are interested in answering questions like: (a) is it possible to efficiently recognize human actions in controlled videos without temporal cues? (b) given that large-scale unconstrained video data are often of a high-dimension, low-sample-size (HDLSS) nature, how can human actions be recognized efficiently in such data? (c) considering the rich 3D motion information available from depth or motion capture sensors, is it possible to recognize both the actions and the actors using only the motion dynamics of the underlying activities? and (d) can motion information from monocular videos be used to automatically determine salient regions for recognizing actions in still images?

    Detection and Simulation of Dangerous Human Crowd Behavior

    Tragically, gatherings of large human crowds quite often end in crowd disasters such as the recent catastrophe at the Loveparade 2010. In the past, research on pedestrian and crowd dynamics focused on simulation of pedestrian motion. As yet, however, no automatic system exists that can detect hazardous situations in crowds and thus help to prevent these tragic incidents. In this thesis, we analyze pedestrian behavior in large crowds and observe characteristic motion patterns. Based on our findings, we present a computer vision system that detects unusual events and critical situations from video streams and alerts security personnel so that they can take necessary actions. We evaluate the system’s performance on synthetic, experimental, and real-world data. In particular, we show its effectiveness on the surveillance videos recorded at the Loveparade crowd stampede. Since our method is based on optical flow computations, it meets two crucial prerequisites in video surveillance: first, it works in real time and, second, the privacy of the people being monitored is preserved. In addition, we integrate the observed motion patterns into models for simulating pedestrian motion and show that the proposed simulation model produces realistic trajectories. We employ this model to simulate large human crowds and use techniques from computer graphics to render synthetic videos for further evaluation of our automatic video surveillance system.

    Representations for Cognitive Vision: a Review of Appearance-Based, Spatio-Temporal, and Graph-Based Approaches

    The emerging discipline of cognitive vision requires a proper representation of visual information including spatial and temporal relationships, scenes, events, semantics and context. This review article summarizes existing representational schemes in computer vision which might be useful for cognitive vision, and discusses promising future research directions. The various approaches are categorized according to appearance-based, spatio-temporal, and graph-based representations for cognitive vision. While the representation of objects has been covered extensively in computer vision research, both from a reconstruction and from a recognition point of view, cognitive vision will also require new ideas on how to represent scenes. We introduce new concepts for scene representations and discuss how these might be efficiently implemented in future cognitive vision systems.

    Supervised dictionary learning for action recognition and localization

    PhD. Image sequences with humans and human activities are everywhere. With the amount of produced and distributed data increasing at an unprecedented rate, there has been a lot of interest in building systems that can understand and interpret the visual data, and in particular detect and recognise human actions. Dictionary-based approaches learn a dictionary from descriptors extracted from the videos in the first stage and a classifier or a detector in the second stage. The major drawback of such an approach is that the dictionary is learned in an unsupervised manner without considering the task (classification or detection) that follows it. In this work we develop task-dependent (supervised) dictionaries for action recognition and localization, i.e., dictionaries that are best suited for the subsequent task. In the first part of the work, we propose a supervised max-margin framework for linear and non-linear Non-Negative Matrix Factorization (NMF). To achieve this, we impose max-margin constraints within the formulation of NMF and simultaneously solve for the classifier and the dictionary. The dictionary (basis matrix) thus obtained maximizes the margin of the classifier in the low-dimensional space (in the linear case) or in the high-dimensional feature space (in the non-linear case). In the second part of the work, we develop methodologies for action localization. We first propose a dictionary weighting approach where we learn local and global weights for the dictionary by considering the localization information of the training sequences. We next extend this approach to learn a task-dependent dictionary for action localization that incorporates the localization information of the training sequences into dictionary learning. The results on publicly available datasets show that the performance of the system is improved by using the supervised information while learning the dictionary. QMUL; EPSRC PhD scholarship program (EP/G033935/1)

    Entwicklung von Methoden zur Unterscheidung und Interpretation von Bewegungsmustern in dynamischen Szenen [Development of Methods for Distinguishing and Interpreting Motion Patterns in Dynamic Scenes]

    In the research field of mobile assistance robots, motion sequences play an increasingly important role. The movements of the person interacting with the assistance robot, in particular, contain a wealth of information that can be used to improve the interaction. An important question in the analysis of movements is how to represent the motion trajectories. In addition, it must be clarified which similarity measures can be used in the more complex methods and which special requirements they must meet. The core of the work consists of three methods that can essentially predict the further course of an observed movement over a longer period of time: echo state networks, local models, and spatio-temporal non-negative matrix factorization (NMF). The work as a whole is intended as one of the first steps toward a systematic investigation of motion sequences. With this work, a developer should be able to choose, from a broad range of tools, the right one for their particular use case.