LCrowdV: Generating Labeled Videos for Simulation-based Crowd Behavior Learning
We present a novel procedural framework to generate an arbitrary number of
labeled crowd videos (LCrowdV). The resulting crowd video datasets are used to
design accurate algorithms or training models for crowded scene understanding.
Our overall approach is composed of two components: a procedural simulation
framework for generating crowd movements and behaviors, and a procedural
rendering framework to generate different videos or images. Each video or image
is automatically labeled based on the environment, number of pedestrians,
density, behavior, flow, lighting conditions, viewpoint, noise, etc.
Furthermore, we can increase the realism by combining synthetically-generated
behaviors with real-world background videos. We demonstrate the benefits of
LCrowdV over prior labeled crowd datasets by improving the accuracy of
pedestrian detection and crowd behavior classification algorithms. LCrowdV
would be released on the WWW.
Automotive Interior Sensing - Temporal Consistent Human Body Pose Estimation
With the emergence and development of autonomous vehicles, a need has also arisen to constantly monitor and identify objects and actions that occur in the environment surrounding the vehicle. This type of monitoring is particularly important in the case of shared vehicles, where actions must be identified not only outside but also inside the vehicle, because there is no human driver to detect, for instance, potentially violent actions between passengers and/or cases where assistance is required. In this context, Bosch has developed a human body pose estimation solution in order to extrapolate the pose of all vehicle occupants present in a given image, infer the behaviour of each passenger and, consequently, identify potentially malicious actions. However, in order to apply this algorithm not only to isolated images but also to videos, it is necessary to add temporal context between frames. In other words, an association is required between the body pose estimation for a given person in a given frame and the body pose estimations for the same person in subsequent frames, to ensure that the identification of that passenger (or any other passenger present in the same frame) is accurate and consistent throughout the entire video. This temporal association topic, also known as pose tracking, is addressed and developed in the present project, culminating in the proposal and implementation of a solution that considerably improves the temporal consistency of the human body pose estimation algorithm developed by Bosch.
The implemented solution uses a mixture of classical and current approaches to data association, such as the Hungarian algorithm and Intersection over Union, and data-logic approaches developed specifically for the present case. The performance of the implemented algorithm is evaluated using two of the most commonly used metrics for pose tracking methods.
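The abstract above names Intersection over Union and the Hungarian algorithm as building blocks for associating detections across frames. The following is a minimal sketch of that idea, not the project's implementation: the function names `iou` and `associate`, the bounding-box format `(x1, y1, x2, y2)`, and the `min_iou` threshold are all assumptions made for illustration. For brevity it finds the assignment with the highest total IoU by exhaustive search over permutations, which is practical only for a handful of occupants; the Hungarian algorithm solves the same assignment problem in polynomial time.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def associate(prev_boxes, curr_boxes, min_iou=0.3):
    """Map indices of prev_boxes to indices of curr_boxes.

    Picks the one-to-one assignment with the highest total IoU by
    exhaustive search (a stand-in for the Hungarian algorithm), then
    drops pairs whose IoU is below min_iou (an assumed threshold).
    """
    n_prev, n_curr = len(prev_boxes), len(curr_boxes)
    if n_prev <= n_curr:
        # Every previous box gets a distinct current box.
        candidates = (list(zip(range(n_prev), p))
                      for p in permutations(range(n_curr), n_prev))
    else:
        # Every current box gets a distinct previous box.
        candidates = (list(zip(p, range(n_curr)))
                      for p in permutations(range(n_prev), n_curr))

    best_score, best_pairs = -1.0, []
    for pairs in candidates:
        score = sum(iou(prev_boxes[i], curr_boxes[j]) for i, j in pairs)
        if score > best_score:
            best_score, best_pairs = score, pairs

    return {i: j for i, j in best_pairs
            if iou(prev_boxes[i], curr_boxes[j]) >= min_iou}


# Two occupants in the previous frame; the current frame lists them in a
# different order and contains one extra, unrelated detection.
prev = [(0, 0, 10, 10), (20, 0, 30, 10)]
curr = [(21, 0, 31, 10), (1, 0, 11, 10), (50, 50, 60, 60)]
print(associate(prev, curr))  # {0: 1, 1: 0}
```

Detections in the current frame that end up unmatched (index 2 in the example) would be candidates for new track identities; how the project handles that case is not stated in the abstract.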
Edge Video Analytics: A Survey on Applications, Systems and Enabling Techniques
Video, as a key driver in the global explosion of digital information, can
create tremendous benefits for human society. Governments and enterprises are
deploying innumerable cameras for a variety of applications, e.g., law
enforcement, emergency management, traffic control, and security surveillance,
all facilitated by video analytics (VA). This trend is spurred by the rapid
advancement of deep learning (DL), which enables more precise models for object
classification, detection, and tracking. Meanwhile, with the proliferation of
Internet-connected devices, massive amounts of data are generated daily,
overwhelming the cloud. Edge computing, an emerging paradigm that moves
workloads and services from the network core to the network edge, has been
widely recognized as a promising solution. The resulting new intersection, edge
video analytics (EVA), begins to attract widespread attention. Nevertheless,
only a few loosely-related surveys exist on this topic. The basic concepts of
EVA (e.g., definition, architectures) were not fully elucidated due to the
rapid development of this domain. To fill these gaps, we provide a
comprehensive survey of the recent efforts on EVA. In this paper, we first
review the fundamentals of edge computing, followed by an overview of VA. The
EVA system and its enabling techniques are discussed next. In addition, we
introduce prevalent frameworks and datasets to aid future researchers in the
development of EVA systems. Finally, we discuss existing challenges and foresee
future research directions. We believe this survey will help readers comprehend
the relationship between VA and edge computing, and spark new ideas on EVA. Comment: 31 pages, 13 figures
Local and Global Explanations of Agent Behavior: Integrating Strategy Summaries with Saliency Maps
With advances in reinforcement learning (RL), agents are now being developed
in high-stakes application domains such as healthcare and transportation.
Explaining the behavior of these agents is challenging, as the environments in
which they act have large state spaces, and their decision-making can be
affected by delayed rewards, making it difficult to analyze their behavior. To
address this problem, several approaches have been developed. Some approaches
attempt to convey the global behavior of the agent, describing the
actions it takes in different states. Other approaches devised local
explanations which provide information regarding the agent's decision-making in
a particular state. In this paper, we combine global and local explanation
methods, and evaluate their joint and separate contributions, providing (to the
best of our knowledge) the first user study of combined local and global
explanations for RL agents. Specifically, we augment strategy summaries that
extract important trajectories of states from simulations of the agent with
saliency maps which show what information the agent attends to. Our results
show that the choice of what states to include in the summary (global
information) strongly affects people's understanding of agents: participants
shown summaries that included important states significantly outperformed
participants who were presented with agent behavior in a randomly chosen set of
world-states. We find mixed results with respect to augmenting demonstrations
with saliency maps (local information), as the addition of saliency maps did
not significantly improve performance in most cases. However, we do find some
evidence that saliency maps can help users better understand what information
the agent relies on in its decision making, suggesting avenues for future work
that can further improve explanations of RL agents.