Towards predicting Pedestrian Evacuation Time and Density from Floorplans using a Vision Transformer
Conventional pedestrian simulators are indispensable tools in the design
process of a building, as they enable project engineers to prevent overcrowding
situations and plan escape routes for evacuation. However, simulation runtime
and the multiple cumbersome steps in generating simulation results are
potential bottlenecks during the building design process. Data-driven
approaches have demonstrated their capability to outperform conventional
methods in speed while delivering similar or even better results across many
disciplines. In this work, we present a deep learning-based approach based on a
Vision Transformer to predict density heatmaps over time and total evacuation
time from a given floorplan. Specifically, due to limited availability of
public datasets, we implement a parametric data generation pipeline including a
conventional simulator. This enables us to build a large synthetic dataset that
we use to train our architecture. Furthermore, we seamlessly integrate our
model into a BIM-authoring tool to generate simulation results instantly and
automatically.
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline
Existing audio-visual event localization (AVE) methods handle manually
trimmed videos, each containing only a single event instance. However, this setting is
unrealistic as natural videos often contain numerous audio-visual events with
different categories. To better adapt to real-life applications, in this paper
we focus on the task of dense-localizing audio-visual events, which aims to
jointly localize and recognize all audio-visual events occurring in an
untrimmed video. The problem is challenging as it requires fine-grained
audio-visual scene and context understanding. To tackle this problem, we
introduce the first Untrimmed Audio-Visual (UnAV-100) dataset, which contains
10K untrimmed videos with over 30K audio-visual events. Each video has 2.8
audio-visual events on average, and the events are usually related to each
other and might co-occur as in real-life scenes. Next, we formulate the task
using a new learning-based framework, which is capable of fully integrating
audio and visual modalities to localize audio-visual events with various
lengths and capture dependencies between them in a single pass. Extensive
experiments demonstrate the effectiveness of our method as well as the
significance of multi-scale cross-modal perception and dependency modeling for
this task.
Comment: Accepted by CVPR202
Working With Incremental Spatial Data During Parallel (GPU) Computation
Central to many complex systems, spatial actors require an awareness of their local environment to enable behaviours such as communication and navigation. Complex system simulations represent this behaviour with Fixed Radius Near Neighbours (FRNN) search. This algorithm allows actors to store data at spatial locations and then query the data structure to find all data stored within a fixed radius of the search origin.
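The FRNN search described above can be sketched compactly: points are binned into a uniform grid whose cell size equals the search radius, so every neighbour of a query point must lie in the 3x3 block of cells surrounding it (in 2D). The Python sketch below is illustrative only; the thesis targets GPU implementations, and the function names and point representation here are assumptions.

```python
import math
from collections import defaultdict

def build_grid(points, radius):
    """Bin point indices into a uniform grid with cell size == radius,
    so all neighbours of a query lie in the surrounding 3x3 cells."""
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(int(x // radius), int(y // radius))].append(i)
    return grid

def frnn_query(points, grid, radius, origin):
    """Return indices of all points within `radius` of `origin`."""
    ox, oy = origin
    cx, cy = int(ox // radius), int(oy // radius)
    result = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for i in grid.get((cx + dx, cy + dy), ()):
                px, py = points[i]
                if math.hypot(px - ox, py - oy) <= radius:
                    result.append(i)
    return result
```

On a GPU, the grid is typically built by sorting points by cell index rather than using per-cell lists, but the query pattern (scan the neighbouring cells, filter by distance) is the same.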
The work within this thesis answers the question: What techniques can be used for improving the performance of FRNN searches during complex system simulations on Graphics Processing Units (GPUs)?
It is generally agreed that Uniform Spatial Partitioning (USP) is the most suitable data structure for providing FRNN search on GPUs. However, due to the architectural complexities of GPUs, performance is constrained such that FRNN search remains one of the most expensive stages common to complex system models.
Existing innovations to USP highlight a need to take advantage of recent GPU advances, reducing the levels of divergence and limiting redundant memory accesses as viable routes to improve the performance of FRNN search. This thesis addresses these with three separate optimisations that can be used simultaneously.
Experiments have assessed the impact of the optimisations on the general case of FRNN search found within complex system simulations and demonstrated their impact in practice when applied to full complex system models. The results presented show that the performance of the construction and query stages of FRNN search can be improved by over 2x and 1.3x respectively. These improvements allow complex system simulations to be executed faster, enabling increases in scale and model complexity.
A Novel Detection Refinement Technique for Accurate Identification of Nephrops norvegicus Burrows in Underwater Imagery
With the evolution of the convolutional neural network (CNN), object detection in the
underwater environment has gained a lot of attention. However, due to the complex nature of the
underwater environment, generic CNN-based object detectors still face challenges in underwater
object detection. These challenges include image blurring, texture distortion, color shift, and scale
variation, which result in low precision and recall rates. To tackle these challenges, we propose a
detection refinement algorithm based on spatial–temporal analysis to improve the performance of
generic detectors by suppressing the false positives and recovering the missed detections in underwater
videos. In the proposed work, we use state-of-the-art deep neural networks such as Inception,
ResNet50, and ResNet101 to automatically classify and detect the Norway lobster Nephrops norvegicus
burrows from underwater videos. Nephrops is one of the most important commercial species in
Northeast Atlantic waters, and it lives in burrow systems that it builds itself on muddy bottoms.
To evaluate the performance of the proposed framework, we collected data from the Gulf of Cadiz.
Experimental results demonstrate that the proposed framework effectively suppresses false
positives and recovers missed detections produced by generic detectors. The mean average precision
(mAP) increased by 10% with the proposed refinement technique.
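For reference, mAP is the mean of per-class average precision; a minimal, interpolation-free sketch of average precision for a single class is below. The score/label arrays and `n_gt` (the number of ground-truth burrows) are hypothetical inputs for illustration, not the paper's exact evaluation code.

```python
def average_precision(scores, labels, n_gt):
    """AP for one class: sort detections by confidence (descending), then
    average the precision observed at each true-positive hit over the
    total number of ground-truth objects."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = 0
    precisions_at_hits = []
    for rank, i in enumerate(order, start=1):
        if labels[i]:  # detection i matched a ground-truth burrow
            tp += 1
            precisions_at_hits.append(tp / rank)
    return sum(precisions_at_hits) / n_gt if n_gt else 0.0
```

Averaging this quantity over all classes (here, a single burrow class) gives mAP.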
Homology modeling in the time of collective and artificial intelligence
Homology modeling is a method for building protein 3D structures using a protein's primary sequence and prior knowledge gained from structural similarities with other proteins. The homology modeling process proceeds in sequential steps: the sequence/structure alignment is optimized, then a backbone is built, and later, side-chains are added. Once the low-homology loops are modeled, the whole 3D structure is optimized and validated. In the past three decades, a few collective and collaborative initiatives have allowed for continuous progress in both homology and ab initio modeling. Critical Assessment of protein Structure Prediction (CASP) is a worldwide community experiment that has historically recorded the progress in this field. Folding@Home and Rosetta@Home are examples of crowd-sourcing initiatives where the community shares computational resources, whereas RosettaCommons is an example of an initiative where a community shares a codebase for the development of computational algorithms. Foldit is another initiative where participants compete with each other in a protein folding video game to predict 3D structures. In the past few years, deep learning on contact maps was introduced into the 3D structure prediction process, adding more information and significantly increasing the accuracy of models. In this review, we take the reader on a journey from the beginnings of the field to the most recent turning points that have revolutionized homology modeling. Moreover, we discuss the new trends emerging in this rapidly growing field.
Movement Analytics: Current Status, Application to Manufacturing, and Future Prospects from an AI Perspective
Data-driven decision making is becoming an integral part of manufacturing
companies. Data is collected and commonly used to improve efficiency and
produce high quality items for the customers. IoT-based and other forms of
object tracking are an emerging tool for collecting movement data of
objects/entities (e.g. human workers, moving vehicles, trolleys etc.) over
space and time. Movement data can provide valuable insights, such as process
bottlenecks, resource utilization and effective working time, that can be used
for decision making and improving efficiency.
Turning movement data into valuable information for industrial management and
decision making requires analysis methods. We refer to this process as movement
analytics. The purpose of this document is to review the current state of work
for movement analytics both in manufacturing and more broadly.
We survey relevant work from both a theoretical perspective and an
application perspective. From the theoretical perspective, we put an emphasis
on useful methods from two research areas: machine learning, and logic-based
knowledge representation. We also review their combinations in view of movement
analytics, and we discuss promising areas for future development and
application. Furthermore, we touch on constraint optimization.
From an application perspective, we review applications of these methods to
movement analytics in a general sense and across various industries. We also
describe currently available commercial off-the-shelf products for tracking in
manufacturing, and we give an overview of the main concepts of digital twins and their
applications.
Crowd detection and counting using a static and dynamic platform: state of the art
Automated object detection and crowd density estimation are popular and important areas in visual surveillance research. The last decades have witnessed much significant research in this field; however, it remains a challenging problem for automatic visual surveillance. The ever-increasing research in the field of crowd dynamics and crowd motion necessitates a detailed and updated survey of the different techniques and trends in this field. This paper presents a survey on crowd detection and crowd density estimation from moving platforms and reviews the different methods employed for this purpose. This review categorizes and delineates the several detection and counting estimation methods that have been applied to the examination of scenes from static and moving platforms.