1,418 research outputs found

Informative scene decomposition for crowd analysis, comparison and simulation guidance

Author: Bian Jiang
Bishop Christopher
Charalambous Panayiotis
Curtis Sean
Ennis Cathy
Feixiang He
Golas Abhinav
He Wang
Helbing Dirk
Jordao Kévin
Karamouzas Ioannis
Kauffman Leonard
Lee Kang Hoon
Liu Ning
Rasmussen Carl Edward
Ren Jiaping
Ren Zhiguo
Sabokrou Mohammad
Wang He
Wang Xiaogang
Xi Zhao
Xu Yanyu
Yuanhang Xiang
Yurochkin Mikhail
Publication venue: Association for Computing Machinery (ACM)
Publication date: 29/04/2020

Crowd simulation is a central topic in several fields including graphics. To achieve high-fidelity simulations, data has been increasingly relied upon for analysis and simulation guidance. However, the information in real-world data is often noisy, mixed and unstructured, which makes effective analysis difficult, so such data has not been fully utilized. With the fast-growing volume of crowd data, this bottleneck needs to be addressed. In this paper, we propose a new framework which comprehensively tackles this problem. It centers on an unsupervised method for analysis. The method takes as input raw and noisy data with highly mixed multi-dimensional (space, time and dynamics) information, and automatically structures it by learning the correlations among these dimensions. The dimensions together with their correlations fully describe the scene semantics, which consist of recurring activity patterns in a scene, manifested as space flows with temporal and dynamics profiles. The effectiveness and robustness of the analysis have been tested on datasets with great variations in volume, duration, environment and crowd dynamics. Based on the analysis, new methods for data visualization, simulation evaluation and simulation guidance are also proposed. Together, our framework establishes a highly automated pipeline from raw data to crowd analysis, comparison and simulation guidance. Extensive experiments and evaluations have been conducted to show the flexibility, versatility and intuitiveness of our framework.

arXiv.org e-Print Archive

Crossref

White Rose Research Online

Human Trajectory Prediction via Neural Social Physics

Author: Manocha Dinesh
Wang He
Yue Jiangbei
Publication date: 31/03/2023

Trajectory prediction has been widely pursued in many fields, and many model-based and model-free methods have been explored. The former include rule-based, geometric or optimization-based models, and the latter mainly comprise deep learning approaches. In this paper, we propose a new method combining both methodologies based on a new Neural Differential Equation model. Our new model (Neural Social Physics or NSP) is a deep neural network within which we use an explicit physics model with learnable parameters. The explicit physics model serves as a strong inductive bias in modeling pedestrian behaviors, while the rest of the network provides a strong data-fitting capability in terms of system parameter estimation and dynamics stochasticity modeling. We compare NSP with 15 recent deep learning methods on 6 datasets and improve the state-of-the-art performance by 5.56%-70%. Besides, we show that NSP has better generalizability in predicting plausible trajectories in drastically different scenarios where the density is 2-5 times as high as in the testing data. Finally, we show that the physics model in NSP can provide plausible explanations for pedestrian behaviors, as opposed to black-box deep learning. Code is available at https://github.com/realcrane/Human-Trajectory-Prediction-via-Neural-Social-Physics. Comment: ECCV 202

arXiv.org e-Print Archive

Load Estimation, Structural Identification and Human Comfort Assessment of Flexible Structures

Author: Celik Ozan
Publication venue: University of Central Florida
Publication date: 01/01/2017

Stadiums, pedestrian bridges, dance floors, and concert halls are distinct from other civil engineering structures due to several challenges in their design and dynamic behavior. These challenges originate from the inherently flexible nature of these structures coupled with human interactions in the form of loading. Past literature on this topic clearly states that the design of flexible structures can be improved with better load modeling strategies based on reliable load quantification, a deeper understanding of structural response, simple and efficient human-structure interaction models, and new measurement and assessment criteria for acceptable vibration levels. To contribute to these improvements, this dissertation addresses three specific areas: the load quantification of lively individuals or crowds, structural identification under non-stationary and narrowband disturbances, and the measurement of excessive vibration levels for human comfort. For load quantification, a computer-vision-based approach capable of tracking both individual and crowd motion is used. For structural identification, a noise-assisted Multivariate Empirical Mode Decomposition (MEMD) algorithm is incorporated into the operational modal analysis. The measurement of excessive vibration levels and the assessment of human comfort are accomplished through computer-vision-based human and object tracking, which provides a more convenient means for measurement and computation. All the proposed methods are tested in the laboratory using a grandstand simulator and in the field on a pedestrian bridge and at a football stadium. Findings and interpretations from the experimental results are presented. The dissertation concludes by highlighting the critical findings and possible future work.

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Core Challenges in Embodied Vision-Language Planning

Author: Francis Jonathan
Kitamura Nariaki
Labelle Felix
Lu Xiaopeng
Navarro Ingrid
Oh Jean
Publication date: 27/07/2021

Recent advances in the areas of multimodal machine learning and artificial intelligence (AI) have led to the development of challenging tasks at the intersection of Computer Vision, Natural Language Processing, and Embodied AI. Whereas many approaches and previous survey pursuits have characterised one or two of these dimensions, there has not been a holistic analysis at the center of all three. Moreover, even when combinations of these topics are considered, more focus is placed on describing, e.g., current architectural methods, as opposed to also illustrating high-level challenges and opportunities for the field. In this survey paper, we discuss Embodied Vision-Language Planning (EVLP) tasks, a family of prominent embodied navigation and manipulation problems that jointly use computer vision and natural language. We propose a taxonomy to unify these tasks and provide an in-depth analysis and comparison of the new and current algorithmic approaches, metrics, simulated environments, as well as the datasets used for EVLP tasks. Finally, we present the core challenges that we believe new EVLP works should seek to address, and we advocate for task construction that enables model generalizability and furthers real-world deployment. Comment: 35 pages

arXiv.org e-Print Archive

Occlusion reasoning for multiple object visual tracking

Author: Wu Zheng
Publication venue: Boston University
Publication date: 01/01/2013

Thesis (Ph.D.)--Boston University. Occlusion reasoning for visual object tracking in uncontrolled environments is a challenging problem. It becomes significantly more difficult when dense groups of indistinguishable objects are present in the scene that cause frequent inter-object interactions and occlusions. We present several practical solutions that tackle the inter-object occlusions for video surveillance applications. In particular, this thesis proposes three methods. First, we propose "reconstruction-tracking," an online multi-camera spatial-temporal data association method for tracking large groups of objects imaged with low resolution. As a variant of the well-known Multiple-Hypothesis-Tracker, our approach localizes the positions of objects in 3D space with possibly occluded observations from multiple camera views and performs temporal data association in 3D. Second, we develop "track linking," a class of offline batch processing algorithms for long-term occlusions, where the decision has to be made based on the observations from the entire tracking sequence. We construct a graph representation to characterize occlusion events and propose an efficient graph-based/combinatorial algorithm to resolve occlusions. Third, we propose a novel Bayesian framework where detection and data association are combined into a single module and solved jointly. Almost all traditional tracking systems address the detection and data association tasks separately in sequential order. Such a design implies that the output of the detector has to be reliable in order to make the data association work. Our framework takes advantage of the often complementary nature of the two subproblems, which not only avoids the error propagation issue from which traditional "detection-tracking" approaches suffer but also eschews common heuristics such as "nonmaximum suppression" of hypotheses by modeling the likelihood of the entire image.
The thesis describes a substantial number of experiments involving challenging, notably distinct simulated and real data, including infrared and visible-light data sets that we recorded ourselves or took from publicly available data sets. In these videos, the number of objects ranges from a dozen to a hundred per frame in both monocular and multiple views. The experiments demonstrate that our approaches achieve results comparable to those of state-of-the-art approaches.

Boston University Institutional Repository (OpenBU)

View recommendation for multi-camera demonstration-based training

Author: Biswas Saugata
Kruijff Ernst
Veas Eduardo
Publication venue: Hochschule Bonn-Rhein-Sieg
Publication date: 03/08/2023

While humans can effortlessly pick a view from multiple streams, automatically choosing the best view is a challenge. Choosing the best view from multi-camera streams raises the question of which objective metrics should be considered. Existing works on view selection lack consensus on this question. The literature on view selection describes diverse possible metrics, and strategies motivated by information theory, instructional design, or aesthetics fail to incorporate all approaches. In this work, we postulate a strategy incorporating information-theoretic and instructional design-based objective metrics to select the best view from a set of views. Traditionally, information-theoretic measures have been used to assess the goodness of a view, such as in 3D rendering. We adapted a similar measure known as the viewpoint entropy for real-world 2D images. Additionally, we incorporated similarity penalization to get a more accurate measure of the entropy of a view, which is one of the metrics for the best view selection. Since the choice of the best view is domain-dependent, we chose demonstration-based training scenarios as our use case. The limitation of our chosen scenarios is that they do not include collaborative training and solely feature a single trainer. To incorporate instructional design considerations, we included the trainer’s body pose, face, face when instructing, and hand visibility as metrics. To incorporate domain knowledge, we included the visibility of predetermined regions as another metric. All of those metrics are taken into account to produce a parameterized view recommendation approach for demonstration-based training. An online study using recorded multi-camera video streams from a simulation environment was used to validate those metrics.
Furthermore, the responses from the online study were used to optimize the view recommendation performance, yielding a normalized discounted cumulative gain (NDCG) value of 0.912, which indicates good agreement with user choices.
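
For readers unfamiliar with the NDCG value reported above, the following is a minimal sketch of how the metric is commonly computed. It assumes the standard linear-gain, log2-discount formulation; the abstract does not state which variant the authors used, and the function names and relevance scores here are hypothetical, chosen only for illustration.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of relevance scores given in ranked order."""
    # The item at 0-based position i is discounted by log2(i + 2),
    # so the top-ranked item is divided by log2(2) = 1.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical user ratings of four camera views, listed in the order
# in which a recommender ranked them (not data from the study).
ranked_relevance = [3, 2, 3, 0]
score = ndcg(ranked_relevance)
print(f"NDCG = {score:.3f}")
```

A value of 1.0 means the recommended ordering matches the ideal ordering of user preferences exactly; lower values indicate that highly rated views were ranked further down.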

pub H-BRS - Publikationsserver der Hochschule Bonn-Rhein-Sieg

Entropy in Image Analysis II

Publication venue: MDPI AG
Publication date: 01/05/2021

Image analysis is a fundamental task for any application where extracting information from images is required. The analysis requires highly sophisticated numerical and analytical methods, particularly for those applications in medicine, security, and other fields where the results of the processing consist of data of vital importance. This fact is evident from all the articles composing the Special Issue "Entropy in Image Analysis II", in which the authors used widely tested methods to verify their results. In the process of reading the present volume, the reader will appreciate the richness of their methods and applications, in particular for medical imaging and image security, and a remarkable cross-fertilization among the proposed research areas.

Directory of Open Access Books (DOAB)

Visual modeling and simulation of multiscale phenomena

Author: Narain Rahul
Publication venue: University of North Carolina at Chapel Hill
Publication date: 01/12/2011

Many large-scale systems seen in real life, such as human crowds, fluids, and granular materials, exhibit complicated motion at many different scales, from a characteristic global behavior to important small-scale detail. Such multiscale systems are computationally expensive for traditional simulation techniques to capture over the full range of scales. In this dissertation, I present novel techniques for scalable and efficient simulation of these large, complex phenomena for visual computing applications. These techniques are based on a new approach of representing a complex system by coupling together separate models for its large-scale and fine-scale dynamics. In fluid simulation, it remains a challenge to efficiently simulate fine local detail such as foam, ripples, and turbulence without compromising the accuracy of the large-scale flow. I present two techniques for this problem that combine physically-based numerical simulation for the global flow with efficient local models for detail. For surface features, I propose the use of texture synthesis, guided by the physical characteristics of the macroscopic flow. For turbulence in the fluid motion itself, I present a technique that tracks the transfer of energy from the mean flow to the turbulent fluctuations and synthesizes these fluctuations procedurally, allowing extremely efficient visual simulation of turbulent fluids. Another large class of problems which are not easily handled by traditional approaches is the simulation of very large aggregates of discrete entities, such as dense pedestrian crowds and granular materials. I present a technique for crowd simulation that couples a discrete per-agent model of individual navigation with a novel continuum formulation for the collective motion of pedestrians. This approach allows simulation of dense crowds of a hundred thousand agents at near-real-time rates on desktop computers. 
I also present a technique for simulating granular materials, which generalizes this model and introduces a novel computational scheme for friction. This method efficiently reproduces a wide range of granular behavior and allows two-way interaction with simulated solid bodies. In all of these cases, the proposed techniques are typically an order of magnitude faster than comparable existing methods. Through these applications to a diverse set of challenging simulation problems, I demonstrate the benefits of the proposed approach, showing that it is a powerful and versatile technique for the simulation of a broad range of large and complex systems.

Carolina Digital Repository