15 research outputs found

    Crowd Motion Capture

    In this paper, a new and original technique for animating a crowd of human beings is presented. Following the success of data-driven animation models (such as motion capture) in the control of articulated figures, we propose to derive a similar type of approach for crowd motions. In our framework, the motion of the crowd is represented as a time series of velocity fields estimated from a video of a real crowd. This time series is used as input to a simple animation model that "advects" people along this time-varying flow. We demonstrate the power of our technique on both synthetic and real examples of crowd videos. We also introduce the notion of crowd motion editing and present possible extensions to our work.
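
The advection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each velocity field is available as a plain Python function from a position to a velocity, and it uses explicit Euler integration with a fixed time step. The names `advect_crowd`, `VelocityField`, and `uniform` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
# Hypothetical stand-in for one estimated velocity field: maps (x, y) to (u, v).
VelocityField = Callable[[float, float], Point]


def advect_crowd(
    start: Sequence[Point],
    fields: Sequence[VelocityField],
    dt: float,
) -> List[List[Point]]:
    """Move every person through a time series of velocity fields.

    Uses one explicit Euler step per field (an assumption of this sketch).
    Returns the positions at every time step, including the start.
    """
    current = list(start)
    trajectory = [current]
    for field in fields:
        moved = []
        for x, y in current:
            u, v = field(x, y)  # velocity sampled at the person's position
            moved.append((x + dt * u, y + dt * v))
        current = moved
        trajectory.append(current)
    return trajectory


def uniform(x: float, y: float) -> Point:
    """Toy field: constant flow to the right."""
    return (1.0, 0.0)


# Two people carried by the same rightward flow for three frames.
frames = advect_crowd([(0.0, 0.0), (0.0, 2.0)], [uniform] * 3, dt=0.5)
print(frames[-1])
```

In practice the fields would come from a flow estimator run on video and would be sampled on a grid with interpolation; the callable interface here only keeps the example short.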

    AGORASET: a dataset for crowd video analysis

    The availability of efficient computer vision tools (detection of pedestrians, tracking, ...) as well as advanced rendering techniques has enabled both the analysis of crowd phenomena and the simulation of realistic scenarios. A recurrent problem lies in the evaluation of those methods, since few common benchmarks are available to compare and evaluate the techniques. This paper proposes a dataset of crowd sequences with associated ground truths (individual trajectories of each pedestrian inside the crowd, and related continuous quantities of the scene such as the density and the velocity field). We chose to rely on realistic image synthesis to achieve our goal. As contributions of this paper, a typology of sequences relevant to the computer vision analysis problem is proposed, along with images of sequences available in the database.

    Optimal crowd editing

    Simulating realistic crowd behaviors is a challenging problem in computer graphics. Yet several satisfactory simulation models exhibiting natural pedestrian or emergent group behaviors exist. Choosing among these models generally depends on the crowd density considered or the topology of the environment. Conversely, achieving a user-desired kinematic or dynamic pattern at a given instant of the simulation proves to be much more tedious. In this paper, a novel generic control methodology is proposed to solve this crowd editing issue. Our method relies on an adjoint formulation of the underlying optimization procedure. It is, to a certain extent, independent of the choice of simulation model, and is designed to handle several forms of constraints. A variety of examples attesting to the benefits of our approach are presented, along with quantitative performance measures.

    Wildlife Communication

    This is a progress report for the Ph.D. project titled "Wildlife Communication". The project investigates how signal processing and pattern recognition can be used to improve wildlife management in agriculture. Wild animals habituate to the wildlife management systems used today, which makes those systems ineffective. An intelligent wildlife management system could monitor its own effectiveness and alter its scaring strategy accordingly.

    Detection and Simulation of Dangerous Human Crowd Behavior

    Tragically, gatherings of large human crowds quite often end in crowd disasters such as the recent catastrophe at the Loveparade 2010. In the past, research on pedestrian and crowd dynamics focused on simulation of pedestrian motion. As yet, however, no automatic system exists that can detect hazardous situations in crowds and thus help to prevent these tragic incidents. In this thesis, we analyze pedestrian behavior in large crowds and observe characteristic motion patterns. Based on our findings, we present a computer vision system that detects unusual events and critical situations from video streams and alerts security personnel so that they can take the necessary actions. We evaluate the system's performance on synthetic, experimental, and real-world data. In particular, we show its effectiveness on the surveillance videos recorded at the Loveparade crowd stampede. Since our method is based on optical flow computations, it meets two crucial prerequisites in video surveillance: first, it works in real time and, second, the privacy of the people being monitored is preserved. In addition, we integrate the observed motion patterns into models for simulating pedestrian motion and show that the proposed simulation model produces realistic trajectories. We employ this model to simulate large human crowds and use techniques from computer graphics to render synthetic videos for further evaluation of our automatic video surveillance system.

    Analysis Of Behaviors In Crowd Videos

    In this dissertation, we address the problem of discovery and representation of group activity of humans and objects in a variety of scenarios commonly encountered in vision applications. The overarching goal is to devise a discriminative representation of human motion in social settings that captures a wide variety of human activities observable in video sequences. Such motion emerges from the collective behavior of individuals and their interactions and is a significant source of information typically employed for applications such as event detection, behavior recognition, and activity recognition. We present new representations of human group motion for static cameras and propose algorithms for their application to a variety of problems. We first propose a method to model and learn the scene activity of a crowd using the Social Force Model, for the first time in the computer vision community. We present a method to densely estimate the interaction forces between people in a crowd observed by a static camera. Latent Dirichlet Allocation (LDA) is used to learn the model of the normal activities over extended periods of time. Randomly selected spatio-temporal volumes of interaction forces are used to learn the model of normal behavior of the scene. The model encodes the latent topics of social interaction forces in the scene for normal behaviors. We classify a short video sequence of n frames as normal or abnormal by using the learnt model. Once a sequence of frames is classified as abnormal, the regions of anomalies in the abnormal frames are localized using the magnitude of interaction forces. The representation and estimation framework proposed above, however, has a few limitations. The algorithm uses a global estimation of the interaction forces within the crowd. It is therefore incapable of identifying different groups of objects based on motion or behavior in the scene. 
Although the algorithm can learn the normal behavior and detect abnormality, it cannot capture the dynamics of different behaviors. To overcome these limitations, we then propose a method based on the Lagrangian framework for fluid dynamics, by introducing a streakline representation of flow. Streaklines are traced in a fluid flow by injecting colored material, such as smoke or dye, which is transported with the flow and used for visualization. In the context of computer vision, streaklines may be used in a similar way to transport information about a scene, and they are obtained by repeatedly initializing a fixed grid of particles at each frame, then moving both current and past particles using optical flow. Streaklines are the locus of points that connect particles which originated from the same initial position. This approach has two advantages over the previous representations: first, its rich representation captures the dynamics of the crowd and changes in space and time in the scene where the optical flow representation is not enough, and second, the model is capable of discovering groups of similar behavior within a crowd scene by performing motion segmentation. We propose a method to distinguish different group behaviors such as divergent/convergent motion and lanes using this framework. Finally, we introduce flow potentials as a discriminative feature to recognize crowd behaviors in a scene. Results of extensive experiments are presented for multiple real-life crowd sequences involving pedestrian and vehicular traffic. The proposed method exploits optical flow as the low-level feature and performs integration and clustering to obtain coherent group motion patterns. However, we observe that in crowd video sequences, as well as a variety of other vision applications, the co-occurrence and inter-relation of motion patterns are the main characteristics of group behaviors. 
In other words, the group behavior of objects is a mixture of individual actions or behaviors in a specific geometrical layout and temporal order. We therefore propose a new representation for group behaviors of humans using the interrelation of motion patterns in a scene. The representation is based on a bag of visual phrases of spatio-temporal visual words. We present a method to match the high-order spatial layout of visual words that preserves the geometry of the visual words under similarity transformations. To perform the experiments, we collected a dataset of group choreography performances from the YouTube website. The dataset currently contains four categories of group dances.
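
The streakline construction described in this abstract (release a particle at every fixed grid point on each frame, then move all current and past particles with that frame's flow) can be illustrated with a minimal sketch. It is not the dissertation's code: it assumes each frame's optical flow is given as a Python function from a position to a displacement rate, and the names `streaklines`, `Flow`, and `rightward` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
# Hypothetical stand-in for the optical flow of one frame: (x, y) -> (u, v).
Flow = Callable[[float, float], Point]


def streaklines(
    seeds: Sequence[Point],
    flows: Sequence[Flow],
    dt: float = 1.0,
) -> List[List[Point]]:
    """Build one streakline per seed point.

    On every frame a new particle is released at each seed, and then all
    particles released so far are moved by the flow of that frame. The
    streakline of a seed is the list of particles that started there,
    ordered from the oldest release to the newest.
    """
    lines: List[List[Point]] = [[] for _ in seeds]
    for flow in flows:
        # Release a fresh particle at every fixed grid position.
        for line, seed in zip(lines, seeds):
            line.append(seed)
        # Move both the new and the previously released particles.
        for line in lines:
            for i, (x, y) in enumerate(line):
                u, v = flow(x, y)
                line[i] = (x + dt * u, y + dt * v)
    return lines


def rightward(x: float, y: float) -> Point:
    """Toy flow: one unit to the right per frame."""
    return (1.0, 0.0)


# One seed at the origin, three frames of the same flow.
result = streaklines([(0.0, 0.0)], [rightward] * 3)
print(result[0])
```

In a steady flow such as this toy example the streakline coincides with the path of a single particle; the two differ only when the flow changes between frames, which is the case the abstract is concerned with.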

    Reinforcement learning in a multi-agent framework for pedestrian simulation

    The aim of this thesis is to use reinforcement learning to generate plausible simulations of pedestrians in different environments. Methodology: A multi-agent framework has been developed in which each virtual agent learns a navigation behavior by interacting with the virtual world it shares with the other agents. The virtual world is simulated with a physics engine (ODE) calibrated with human pedestrian parameters taken from the literature. The framework is flexible and supports different learning algorithms (specifically Q-Learning and Sarsa(lambda)) in combination with different state-space generalization techniques (specifically vector quantization and tile coding). The learned behaviors are analyzed using fundamental diagrams (speed/density relationship), density maps, chronograms, and performance measures (the percentage of agents that reach the goal). Conclusions: After a battery of experiments in six different scenarios and the corresponding analysis of results, the conclusions are as follows: plausible pedestrian behaviors were obtained; the behaviors are robust to scaling and show abstraction capabilities (behaviors at the tactical and planning levels); the learned behaviors are able to produce emergent collective behaviors; and the comparison with another standard pedestrian model (Helbing's model), together with the fundamental-diagram analyses, indicates that the learned dynamics are coherent and similar to pedestrian dynamics.

    Planning Plausible Human Motions for Navigation and Collision Avoidance

    This thesis investigates the plausibility of computer-generated human motions for navigation and collision avoidance. To navigate a human character through obstacles in a virtual environment, the problem is often tackled by finding the shortest possible path to the destination with the smoothest motions available. This is because such a solution is regarded as cost-effective and free-flowing, in that it implicitly minimises biomechanical effort and potentially precludes anomalies such as frequent and sudden changes of behaviour, and is hence more plausible to human eyes. Previous research addresses this problem in two stages: finding the shortest collision-free path (motion planning) and then fitting motions onto this path accordingly (motion synthesis). This conventional approach is not optimal because the decoupling of these two stages introduces two problems. First, it forces the motion-planning stage to deliberately simplify the collision model to avoid obstacles. Second, it over-constrains the motion-synthesis stage to approximate motions to a sub-optimal trajectory. This often results in implausible animations that travel along erratic, long paths while making frequent and sudden behaviour changes. In this research, I argue that close-proximity interaction with obstacles is crucial for more plausible navigation and collision avoidance animation. To address this, I propose to combine motion planning and motion synthesis to search for shorter and smoother solutions. The intuition is that by incorporating precise collision detection and avoidance with motion capture database queries, we will be able to plan fine-scale interactions between obstacles and moving crowds. The results demonstrate that my approach can discover shorter paths with steadier behaviour transitions in scene navigation and crowd avoidance. 
In addition, this thesis proposes a set of metrics that can be used to evaluate the plausibility of computer-generated navigation animations.

    Scene Reconstruction Beyond Structure-from-Motion and Multi-View Stereo

    Image-based 3D reconstruction has become a robust technology for recovering accurate and realistic models of real-world objects and scenes. A common pipeline for 3D reconstruction is to first apply Structure-from-Motion (SfM), which recovers relative poses for the input images and sparse geometry for the scene, and then apply Multi-view Stereo (MVS), which estimates a dense depthmap for each image. While this two-stage process is quite effective in many 3D modeling scenarios, there are limits to what can be reconstructed. This dissertation focuses on three particular scenarios where the SfM+MVS pipeline fails and introduces new approaches to accomplish each reconstruction task. First, I introduce a novel method to recover dense surface reconstructions of endoscopic video. In this setting, SfM can generally provide sparse surface structure, but the lack of surface texture as well as complex, changing illumination often causes MVS to fail. To overcome these difficulties, I introduce a method that utilizes SfM both to guide surface reflectance estimation and to regularize shading-based depth reconstruction. I also introduce models of reflectance and illumination that improve the final result. Second, I introduce an approach for augmenting 3D reconstructions from large-scale Internet photo-collections by recovering the 3D position of transient objects (specifically, people) in the input imagery. Since no two images can be assumed to capture the same person in the same location, the typical triangulation constraints enjoyed by SfM and MVS cannot be directly applied. I introduce an alternative method to approximately triangulate people who stood in similar locations, aided by a height distribution prior and visibility constraints provided by SfM. The scale of the scene, gravity direction, and per-person ground-surface normals are also recovered. 
Finally, I introduce the concept of using crowd-sourced imagery to create living 3D reconstructions: visualizations of real places that include dynamic representations of transient objects. A key difficulty here is that SfM+MVS pipelines often poorly reconstruct ground surfaces given Internet images. To address this, I introduce a volumetric reconstruction approach that leverages scene scale and person placements. Crowd simulation is then employed to add virtual pedestrians to the space and bring the reconstruction "to life."