14,914 research outputs found

    Proceedings of the 2020 Joint Workshop of Fraunhofer IOSB and Institute for Anthropomatics, Vision and Fusion Laboratory

    In 2020, the annual workshop of Fraunhofer IOSB and the Lehrstuhl für Interaktive Echtzeitsysteme (Chair for Interactive Real-Time Systems) took place. From 27 to 31 July, the doctoral students of both institutes presented the state of their research on topics such as AI, machine learning, computer vision, usage control, and metrology. The results of these talks are collected in this volume as technical reports.

    Sensing and perception technology to enable real time monitoring of passenger movement behaviours through congested rail stations

    © 2015 ATRF, Commonwealth of Australia. All rights reserved. Passenger behaviour can have a range of effects on rail operations, from negative to positive. While rail service providers strive to design and operate systems in a manner that promotes positive passenger behaviour, congestion is a confounding factor, which can create responses that may undermine these efforts. The real time monitoring of passenger movement and behaviour through public transport environments including precincts, concourses, platforms and train vestibules would enable operators to more effectively manage congestion at a whole-of-station level. While existing crowd monitoring technologies allow operators to monitor crowd densities at critical locations and react to overcrowding incidents, they do not necessarily provide an understanding of the cause of such issues. Congestion is a complex phenomenon involving the movements of many people through a set of spaces, and monitoring these spaces requires tracking large numbers of individuals. To do this, traditional surveillance technologies might be used, but at the expense of introducing privacy concerns. Scalability is also a problem, as complete sensor coverage of an entire rail station's precinct, concourse and platform areas potentially requires a high number of sensors, increasing costs. In light of this, there is a need for sensing technology that collects data from a set of ‘sparse sensors’, each with a limited field of view, but which is capable of forming a network that can track the movement and behaviour of high numbers of associated individuals in a privacy sensitive manner. This paper presents work towards the core crowd sensing and perception technology needed to enable such a capability. Building on previous research using three-dimensional (3D) depth camera data for person detection, a privacy friendly approach to tracking and recognising individuals is discussed. The use of a head-to-shoulder signature is proposed to enable association between sensors. Our efforts to improve the reliability of this measure for this task are outlined and validated using data captured at Brisbane Central rail station.

    High-Level Information Fusion in Visual Sensor Networks

    Information fusion techniques combine data from multiple sensors, along with additional information and knowledge, to obtain better estimates of the observed scenario than could be achieved by the use of single sensors or information sources alone. According to the JDL fusion process model, high-level information fusion is concerned with the computation of a scene representation in terms of abstract entities such as activities and threats, as well as estimating the relationships among these entities. Recent experiences confirm that context knowledge plays a key role in new-generation high-level fusion systems, especially in those involving complex scenarios that cause classical statistical techniques to fail, as happens in visual sensor networks. In this chapter, we study the architectural and functional issues of applying context information to improve high-level fusion procedures, with a particular focus on visual data applications. The use of formal knowledge representations (e.g. ontologies) is a promising advance in this direction, but there are still some unresolved questions that must be more extensively researched. The UC3M Team gratefully acknowledges that this research activity is supported in part by Projects CICYT TIN2008-06742-C02-02/TSI, CICYT TEC2008-06732-C02-02/TEC, CAM CONTEXTS (S2009/TIC-1485) and DPS2008-07029-C02-02.

    World Modeling for Intelligent Autonomous Systems

    The functioning of intelligent autonomous systems requires constant situation awareness and cognition analysis. Such a system therefore needs a memory structure that contains a description of the surrounding environment (world model) and serves as a central information hub. This book presents a series of theoretical and experimental results in the field of world modeling. These cover dynamic and prior knowledge modeling, information fusion, management and qualitative/quantitative information analysis.

    ConfLab: A Rich Multimodal Multisensor Dataset of Free-Standing Social Interactions in the Wild

    Recording the dynamics of unscripted human interactions in the wild is challenging due to the delicate trade-offs between several factors: participant privacy, ecological validity, data fidelity, and logistical overheads. To address these, following a 'datasets for the community by the community' ethos, we propose the Conference Living Lab (ConfLab): a new concept for multimodal multisensor data collection of in-the-wild free-standing social conversations. For the first instantiation of ConfLab described here, we organized a real-life professional networking event at a major international conference. Involving 48 conference attendees, the dataset captures a diverse mix of status, acquaintance, and networking motivations. Our capture setup improves upon the data fidelity of prior in-the-wild datasets while retaining privacy sensitivity: 8 videos (1920x1080, 60 fps) from a non-invasive overhead view, and custom wearable sensors with onboard recording of body motion (full 9-axis IMU), privacy-preserving low-frequency audio (1250 Hz), and Bluetooth-based proximity. Additionally, we developed custom solutions for distributed hardware synchronization at acquisition, and time-efficient continuous annotation of body keypoints and actions at high sampling rates. Our benchmarks showcase some of the open research tasks related to in-the-wild privacy-preserving social data analysis: keypoint detection from overhead camera views, skeleton-based no-audio speaker detection, and F-formation detection. Comment: v2 is the version submitted to the NeurIPS 2022 Datasets and Benchmarks Track.

    Deep learning of appearance affinity for multi-object tracking and re-identification: a comparative view

    Recognizing the identity of a query individual in a surveillance sequence is the core of Multi-Object Tracking (MOT) and Re-Identification (Re-Id) algorithms. Both tasks can be addressed by measuring the appearance affinity between observations of people with a deep neural model. Nevertheless, the differences in their specifications and, consequently, in the characteristics and constraints of the training data available for each task make it necessary to employ a different learning approach for each one. This article offers a comparative view of the Double-Margin-Contrastive and the Triplet loss functions, and analyzes the benefits and drawbacks of applying each of them to learn an Appearance Affinity model for Tracking and Re-Identification. A batch of experiments has been conducted, and the results support the hypothesis drawn from the presented study: the Triplet loss function is more effective than the Contrastive one when a Re-Id model is learnt and, conversely, in the MOT domain the Contrastive loss better discriminates between pairs of images that show the same person and pairs that do not. This research was funded by the Spanish Government through the CICYT projects (TRA2016-78886-C3-1-R and RTI2018-096036-B-C21), Universidad Carlos III of Madrid through (PEAVAUTO-CM-UC3M), the Comunidad de Madrid through SEGVAUTO-4.0-CM (P2018/EMT-4362), and the Ministerio de Educación, Cultura y Deporte para la Formación de Profesorado Universitario (FPU14/02143).
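
    The two loss functions compared in the abstract above can be illustrated with a minimal sketch. It uses common textbook formulations, which are an assumption here: the article's exact definitions, margins and distance measure may differ. The function names, the margin parameters (`m_pos`, `m_neg`, `margin`) and the use of Euclidean distance on plain embedding vectors are all hypothetical choices made for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def double_margin_contrastive_loss(x1, x2, same_identity, m_pos=0.5, m_neg=1.5):
    """Pairwise loss with two margins (one common formulation, assumed here).

    Pairs of the same identity are penalised only when farther apart than
    m_pos; pairs of different identities only when closer than m_neg.
    """
    d = euclidean(x1, x2)
    if same_identity:
        return max(0.0, d - m_pos) ** 2
    return max(0.0, m_neg - d) ** 2


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet loss: the positive should be closer to the anchor than the
    negative by at least `margin`."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

    The structural difference is visible in the signatures: the contrastive loss scores one labelled pair at a time against absolute distance thresholds, while the triplet loss scores a relative ordering among three samples.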