541 research outputs found

    Human Motion Trajectory Prediction: A Survey

    With growing numbers of intelligent autonomous systems in human environments, the ability of such systems to perceive, understand and anticipate human behavior becomes increasingly important. Specifically, predicting future positions of dynamic agents and planning considering such predictions are key tasks for self-driving vehicles, service robots and advanced surveillance systems. This paper provides a survey of human motion trajectory prediction. We review, analyze and structure a large selection of work from different communities and propose a taxonomy that categorizes existing methods based on the motion modeling approach and level of contextual information used. We provide an overview of the existing datasets and performance metrics. We discuss limitations of the state of the art and outline directions for further research. Comment: Submitted to the International Journal of Robotics Research (IJRR), 37 pages

    Multi-scale Crowd Feature Detection using Vision Sensing and Statistical Mechanics Principles

    Crowd behaviour analysis using vision has been subject to many different approaches. Multi-purpose crowd descriptors are one of the more recent approaches. These descriptors provide an opportunity to compare and categorise various types of crowds as well as classify their respective behaviours. Nevertheless, automatically calculating descriptors that are expressed as measurements with an accurate interpretation is a challenging problem. In this paper, analogies between human crowds and molecular thermodynamics systems are drawn for the measurement of crowd behaviour. Specifically, a novel descriptor is defined and measured for crowd behaviour at multiple scales. This descriptor uses the concept of entropy to evaluate the state of crowd disorder. The results indicate that the entropy descriptor does capture the intended measure of crowd disorder while relying only on easily detectable image features. Our new approach for machine understanding of crowd behaviour is promising, and it offers new capabilities complementary to the existing crowd descriptors, for example, as will be demonstrated, in the case of spectator crowds. The scope and performance of this descriptor are discussed in further detail in this paper.
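
The abstract above does not say how its entropy descriptor is computed. As a rough illustration only, the sketch below measures crowd disorder as the Shannon entropy of a histogram of motion directions. The choice of feature (per-point flow vectors), the function name `direction_entropy` and the `num_bins` parameter are assumptions made for this example, not the paper's definition.

```python
import math
from collections import Counter


def direction_entropy(flow_vectors, num_bins=8):
    """Shannon entropy (in bits) of the motion-direction histogram.

    flow_vectors: iterable of (dx, dy) displacements, e.g. from optical flow.
    This is an illustrative stand-in, not the descriptor defined in the paper.
    """
    counts = Counter()
    for dx, dy in flow_vectors:
        if dx == 0 and dy == 0:
            continue  # stationary points carry no direction information
        angle = math.atan2(dy, dx) % (2 * math.pi)  # map to [0, 2*pi)
        bin_index = int(angle / (2 * math.pi) * num_bins) % num_bins
        counts[bin_index] += 1

    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Everyone moving the same way: a single occupied bin, minimal disorder.
ordered = [(1.0, 0.0)] * 8

# One vector in the middle of each of the 8 direction bins: maximal disorder.
disordered = [
    (math.cos((k + 0.5) * math.pi / 4), math.sin((k + 0.5) * math.pi / 4))
    for k in range(8)
]

ordered_entropy = direction_entropy(ordered)
disordered_entropy = direction_entropy(disordered)
```

With 8 bins the value ranges from 0 bits (all motion in one direction) to 3 bits (motion spread evenly over all directions), so a higher value indicates a less ordered crowd under this simplified measure.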

    Advanced Cyber and Physical Situation Awareness in Urban Smart Spaces

    The ever-growing adoption of big data technologies, smart sensing, data science and artificial intelligence is enabling the development of new intelligent urban spaces with real-time monitoring and advanced cyber-physical situational awareness capabilities. The S4AllCities international research project experiments with advances in cyber-physical situational awareness to achieve safer smart city spaces in Europe and beyond. The deployment of digital twins is intended to provide real-time situation awareness and an understanding of the risks of potential physical and/or cyber-attacks, specifically on urban critical infrastructure. In S4AllCities, digital twins ingest, process and fuse observation data and information to extract knowledge before machine reasoning is performed. In this paper, a cyber behaviour detection module, which identifies unusual behaviour in cyber network traffic, is described. A physical behaviour detection module is also introduced. The two modules function within the so-called Malicious Attacks Information Detection System (MAIDS) digital twin.

    Human aware robot navigation

    Human aware robot navigation refers to the navigation of a robot in an environment shared with humans in such a way that the humans feel comfortable and natural in the presence of the robot. In addition, the robot's navigation should comply with the social norms of the environment. The robot can interact with humans in the environment by, for example, avoiding them, approaching them, or following them. In this thesis, we focus specifically on the approach behavior of the robot while keeping the other use cases in mind. Studying and analyzing how humans move around other humans gives us an idea of the kind of navigation behaviors that we expect robots to exhibit. Most previous research does not focus on understanding such behavioral aspects of approaching people. Moreover, straightforward mathematical modeling of complex human behaviors is very difficult. In this thesis, we therefore propose an Inverse Reinforcement Learning (IRL) framework based on Guided Cost Learning (GCL) to learn these behaviors from demonstration. After analyzing the CongreG8 dataset, we found that the incoming human tends to form an O-space (circle) with the rest of the group. Also, the approaching human slows down when getting closer to the group. We use these findings in our framework, which learns the optimal reward and policy from the example demonstrations and imitates similar human motion.

    Taming Crowded Visual Scenes

    Computer vision algorithms have played a pivotal role in commercial video surveillance systems for a number of years. However, a common weakness among these systems is their inability to handle crowded scenes. In this thesis, we have developed algorithms that overcome some of the challenges encountered in videos of crowded environments such as sporting events, religious festivals, parades, concerts, train stations, airports, and malls. We adopt a top-down approach by first performing a global-level analysis that locates dynamically distinct crowd regions within the video. This knowledge is then employed in the detection of abnormal behaviors and tracking of individual targets within crowds. In addition, the thesis explores the utility of contextual information necessary for persistent tracking and re-acquisition of objects in crowded scenes. For the global-level analysis, a framework based on Lagrangian Particle Dynamics is proposed to segment the scene into dynamically distinct crowd regions or groupings. For this purpose, the spatial extent of the video is treated as a phase space of a time-dependent dynamical system in which transport from one region of the phase space to another is controlled by the optical flow. Next, a grid of particles is advected forward in time through the phase space using a numerical integration to generate a flow map. The flow map relates the initial positions of particles to their final positions. The spatial gradients of the flow map are used to compute a Cauchy-Green deformation tensor that quantifies the amount by which the neighboring particles diverge over the length of the integration. The maximum eigenvalue of the tensor is used to construct a forward Finite Time Lyapunov Exponent (FTLE) field that reveals the attracting Lagrangian Coherent Structures (LCS). The same process is repeated by advecting the particles backward in time to obtain a backward FTLE field that reveals the repelling LCS.
The attracting and repelling LCS are the time dependent invariant manifolds of the phase space and correspond to the boundaries between dynamically distinct crowd flows. The forward and backward FTLE fields are combined to obtain one scalar field that is segmented using a watershed segmentation algorithm to obtain the labeling of distinct crowd-flow segments. Next, abnormal behaviors within the crowd are localized by detecting changes in the number of crowd-flow segments over time. Next, the global-level knowledge of the scene generated by the crowd-flow segmentation is used as an auxiliary source of information for tracking an individual target within a crowd. This is achieved by developing a scene structure-based force model. This force model captures the notion that an individual, when moving in a particular scene, is subjected to global and local forces that are functions of the layout of that scene and the locomotive behavior of other individuals in his or her vicinity. The key ingredients of the force model are three floor fields that are inspired by research in the field of evacuation dynamics; namely, Static Floor Field (SFF), Dynamic Floor Field (DFF), and Boundary Floor Field (BFF). These fields determine the probability of moving from one location to the next by converting the long-range forces into local forces. The SFF specifies regions of the scene that are attractive in nature, such as an exit location. The DFF, which is based on the idea of active walker models, corresponds to the virtual traces created by the movements of nearby individuals in the scene. The BFF specifies influences exhibited by the barriers within the scene, such as walls and no-entry areas. By combining influence from all three fields with the available appearance information, we are able to track individuals in high-density crowds. The results are reported on real-world sequences of marathons and railway stations that contain thousands of people. 
A comparative analysis with respect to an appearance-based mean shift tracker is also conducted by generating the ground truth. The result of this analysis demonstrates the benefit of using floor fields in crowded scenes. The occurrence of occlusion is very frequent in crowded scenes due to a high number of interacting objects. To overcome this challenge, we propose an algorithm that has been developed to augment a generic tracking algorithm to perform persistent tracking in crowded environments. The algorithm exploits the contextual knowledge, which is divided into two categories consisting of motion context (MC) and appearance context (AC). The MC is a collection of trajectories that are representative of the motion of the occluded or unobserved object. These trajectories belong to other moving individuals in a given environment. The MC is constructed using a clustering scheme based on the Lyapunov Characteristic Exponent (LCE), which measures the mean exponential rate of convergence or divergence of the nearby trajectories in a given state space. Next, the MC is used to predict the location of the occluded or unobserved object in a regression framework. It is important to note that the LCE is used for measuring divergence between a pair of particles while the FTLE field is obtained by computing the LCE for a grid of particles. The appearance context (AC) of a target object consists of its own appearance history and appearance information of the other objects that are occluded. The intent is to make the appearance descriptor of the target object more discriminative with respect to other unobserved objects, thereby reducing the possible confusion between the unobserved objects upon re-acquisition. This is achieved by learning the distribution of the intra-class variation of each occluded object using all of its previous observations. In addition, a distribution of inter-class variation for each target-unobservable object pair is constructed. 
Finally, the re-acquisition decision is made using both the MC and the AC
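
The FTLE computation described in the abstract above (advect particles, differentiate the flow map, form the Cauchy-Green tensor, take its maximum eigenvalue) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the analytic saddle flow in `velocity` stands in for optical flow, forward Euler stands in for whatever integrator the thesis uses, and the names `advect`, `ftle`, `duration`, `steps` and `h` are hypothetical.

```python
import math


def velocity(x, y):
    """Hypothetical steady saddle flow (u = x, v = -y) standing in for optical flow."""
    return x, -y


def advect(x, y, duration, steps):
    """Move one particle through the flow with forward Euler steps.

    A negative duration advects the particle backward in time.
    """
    dt = duration / steps
    for _ in range(steps):
        u, v = velocity(x, y)
        x, y = x + dt * u, y + dt * v
    return x, y


def ftle(x, y, duration=1.0, steps=100, h=1e-3):
    """Finite Time Lyapunov Exponent at the initial position (x, y)."""
    # Flow map of four neighbouring particles offset by h along each axis.
    xr, yr = advect(x + h, y, duration, steps)
    xl, yl = advect(x - h, y, duration, steps)
    xu, yu = advect(x, y + h, duration, steps)
    xd, yd = advect(x, y - h, duration, steps)

    # Central-difference spatial gradient F of the flow map.
    f11, f12 = (xr - xl) / (2 * h), (xu - xd) / (2 * h)
    f21, f22 = (yr - yl) / (2 * h), (yu - yd) / (2 * h)

    # Cauchy-Green deformation tensor C = F^T F (symmetric 2x2).
    c11 = f11 * f11 + f21 * f21
    c12 = f11 * f12 + f21 * f22
    c22 = f12 * f12 + f22 * f22

    # Maximum eigenvalue of C in closed form.
    trace = c11 + c22
    det = c11 * c22 - c12 * c12
    lam_max = 0.5 * (trace + math.sqrt(max(trace * trace - 4 * det, 0.0)))

    return math.log(math.sqrt(lam_max)) / abs(duration)


# Forward and backward FTLE fields on a small grid of initial positions.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [-1.0, -0.5, 0.0, 0.5, 1.0]
forward_field = [[ftle(x, y, duration=1.0) for x in xs] for y in ys]
backward_field = [[ftle(x, y, duration=-1.0) for x in xs] for y in ys]
```

For this linear saddle flow the stretching rate is the same everywhere, so both fields are uniform and close to 1. In a real crowd video the fields would vary across the image, and their ridges are what the thesis combines and segments with a watershed algorithm.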

    Dynamic Switching State Systems for Visual Tracking

    This work addresses the problem of how to capture the dynamics of maneuvering objects for visual tracking. Towards this end, the perspective of recursive Bayesian filters and the perspective of deep learning approaches for state estimation are considered and their functional viewpoints are brought together

    Artificial Intelligence and Ambient Intelligence

    This book includes a series of scientific papers published in the Special Issue on Artificial Intelligence and Ambient Intelligence of the MDPI journal Electronics. The book starts with an opinion paper on “Relations between Electronics, Artificial Intelligence and Information Society through Information Society Rules”, presenting relations between information society, electronics and artificial intelligence mainly through twenty-four IS laws. The book then continues with a series of technical papers that present applications of Artificial Intelligence and Ambient Intelligence in a variety of fields including affective computing, privacy and security in smart environments, and robotics. More specifically, the first part presents the use of Artificial Intelligence (AI) methods in combination with wearable devices (e.g., smartphones and wristbands) for recognizing human psychological states (e.g., emotions and cognitive load). The second part presents the use of AI methods in combination with laser sensors or Wi-Fi signals for improving security in smart buildings by identifying and counting visitors. The last part presents the use of AI methods in robotics for improving robots’ abilities in object gripping, manipulation and perception. The language of the book is rather technical, so the intended audience is scientists and researchers who have at least some basic knowledge of computer science.