
    LCrowdV: Generating Labeled Videos for Simulation-based Crowd Behavior Learning

    We present a novel procedural framework to generate an arbitrary number of labeled crowd videos (LCrowdV). The resulting crowd video datasets are used to design accurate algorithms or training models for crowded scene understanding. Our overall approach is composed of two components: a procedural simulation framework for generating crowd movements and behaviors, and a procedural rendering framework to generate different videos or images. Each video or image is automatically labeled based on the environment, number of pedestrians, density, behavior, flow, lighting conditions, viewpoint, noise, etc. Furthermore, we can increase the realism by combining synthetically-generated behaviors with real-world background videos. We demonstrate the benefits of LCrowdV over prior labeled crowd datasets by improving the accuracy of pedestrian detection and crowd behavior classification algorithms. LCrowdV would be released on the WWW.

    Automotive Interior Sensing - Temporal Consistent Human Body Pose Estimation

    With the emergence and development of autonomous vehicles, a need has also arisen to constantly monitor and identify objects and actions that occur in the environment surrounding the vehicle. This type of monitoring is particularly important in the case of shared vehicles, given the need to identify actions not only outside but also inside the vehicle, since there is no human driver who can detect, for instance, potentially violent actions between passengers and/or cases where assistance is required. In this context, Bosch has developed a human body pose estimation solution in order to extrapolate the pose of all vehicle occupants present in a given image, infer the behaviour of each passenger and, consequently, identify potentially malicious actions. However, in order to apply this algorithm not only to isolated images but also to videos, it is necessary to add temporal context between frames. In other words, the body pose estimation for a given person in a given frame must be associated with the body pose estimations for the same person in subsequent frames, so that the identification of that passenger (or any other passenger present in the same frame) is accurate and consistent throughout the entire video. The temporal association topic, also known as pose tracking, is addressed and developed during the present project, culminating in the proposal and implementation of a solution that considerably improves the temporal consistency of the human body pose estimation algorithm developed by Bosch. The implemented solution uses a mixture of classical and current data-association approaches, such as the Hungarian algorithm and Intersection over Union, together with data-logic approaches developed specifically for the present case. The performance of the implemented algorithm is evaluated using two of the most commonly used metrics for pose tracking methods.
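    The frame-to-frame association described in this abstract can be illustrated with a minimal sketch. It is not the Bosch implementation, whose details the abstract does not give. It assumes each person is summarised by a bounding box `(x1, y1, x2, y2)`, and the names `iou`, `associate`, and the threshold `min_iou` are hypothetical. To stay dependency-free it finds the optimal assignment by brute force over permutations, which gives the same result the Hungarian algorithm would for the handful of occupants in a vehicle; for larger inputs a Hungarian-style solver such as SciPy's `scipy.optimize.linear_sum_assignment` would be the usual substitute.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(prev_boxes, curr_boxes, min_iou=0.3):
    """Match track IDs from the previous frame to current detections.

    prev_boxes: dict mapping track_id -> box (previous frame).
    curr_boxes: list of boxes detected in the current frame.
    Returns a dict mapping detection index -> track_id. Detections whose
    best match has IoU below min_iou (a hypothetical threshold) are left
    out, so the caller can start new tracks for them.
    """
    ids = list(prev_boxes)
    n_prev, n_curr = len(ids), len(curr_boxes)
    size = max(n_prev, n_curr)  # pad to a square problem with dummy rows/columns

    best_total, best_perm = -1.0, ()
    # Brute-force optimal assignment: perm[row] is the column given to that row.
    for perm in permutations(range(size)):
        total = 0.0
        for row, col in enumerate(perm):
            if row < n_prev and col < n_curr:  # skip dummy pairings
                total += iou(prev_boxes[ids[row]], curr_boxes[col])
        if total > best_total:
            best_total, best_perm = total, perm

    matches = {}
    for row, col in enumerate(best_perm):
        if row < n_prev and col < n_curr:
            if iou(prev_boxes[ids[row]], curr_boxes[col]) >= min_iou:
                matches[col] = ids[row]
    return matches


if __name__ == "__main__":
    previous = {"A": (0, 0, 10, 10), "B": (20, 0, 30, 10)}
    current = [(21, 0, 31, 10), (1, 0, 11, 10), (50, 50, 60, 60)]
    print(associate(previous, current))
```

    In this usage example the first two detections inherit the identities of the passengers they overlap, while the third detection overlaps nobody and stays unmatched. Maximising total IoU jointly, rather than greedily per person, is what keeps identities from swapping when two occupants sit close together.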

    Edge Video Analytics: A Survey on Applications, Systems and Enabling Techniques

    Video, as a key driver in the global explosion of digital information, can create tremendous benefits for human society. Governments and enterprises are deploying innumerable cameras for a variety of applications, e.g., law enforcement, emergency management, traffic control, and security surveillance, all facilitated by video analytics (VA). This trend is spurred by the rapid advancement of deep learning (DL), which enables more precise models for object classification, detection, and tracking. Meanwhile, with the proliferation of Internet-connected devices, massive amounts of data are generated daily, overwhelming the cloud. Edge computing, an emerging paradigm that moves workloads and services from the network core to the network edge, has been widely recognized as a promising solution. The resulting new intersection, edge video analytics (EVA), is beginning to attract widespread attention. Nevertheless, only a few loosely-related surveys exist on this topic. The basic concepts of EVA (e.g., definition, architectures) have not been fully elucidated due to the rapid development of this domain. To fill these gaps, we provide a comprehensive survey of the recent efforts on EVA. In this paper, we first review the fundamentals of edge computing, followed by an overview of VA. The EVA system and its enabling techniques are discussed next. In addition, we introduce prevalent frameworks and datasets to aid future researchers in the development of EVA systems. Finally, we discuss existing challenges and foresee future research directions. We believe this survey will help readers comprehend the relationship between VA and edge computing, and spark new ideas on EVA. Comment: 31 pages, 13 figures

    Local and Global Explanations of Agent Behavior: Integrating Strategy Summaries with Saliency Maps

    With advances in reinforcement learning (RL), agents are now being developed in high-stakes application domains such as healthcare and transportation. Explaining the behavior of these agents is challenging, as the environments in which they act have large state spaces, and their decision-making can be affected by delayed rewards, making it difficult to analyze their behavior. To address this problem, several approaches have been developed. Some approaches attempt to convey the global behavior of the agent, describing the actions it takes in different states. Other approaches devise local explanations, which provide information regarding the agent's decision-making in a particular state. In this paper, we combine global and local explanation methods, and evaluate their joint and separate contributions, providing (to the best of our knowledge) the first user study of combined local and global explanations for RL agents. Specifically, we augment strategy summaries that extract important trajectories of states from simulations of the agent with saliency maps which show what information the agent attends to. Our results show that the choice of what states to include in the summary (global information) strongly affects people's understanding of agents: participants shown summaries that included important states significantly outperformed participants who were presented with agent behavior in a randomly chosen set of world-states. We find mixed results with respect to augmenting demonstrations with saliency maps (local information), as the addition of saliency maps did not significantly improve performance in most cases. However, we do find some evidence that saliency maps can help users better understand what information the agent relies on in its decision making, suggesting avenues for future work that can further improve explanations of RL agents.