4 research outputs found

IndustReal: A Dataset for Procedure Step Recognition Handling Execution Errors in Egocentric Videos in an Industrial-Like Setting

Author: de With Peter H. N.
Houben Tim
Onvlee Hans
Schoonbeek Tim J.
van der Sommen Fons
Publication date: 26/10/2023

Although action recognition for procedural tasks has received notable attention, it has a fundamental flaw in that no measure of success for actions is provided. This limits the applicability of such systems especially within the industrial domain, since the outcome of procedural actions is often significantly more important than the mere execution. To address this limitation, we define the novel task of procedure step recognition (PSR), focusing on recognizing the correct completion and order of procedural steps. Alongside the new task, we also present the multi-modal IndustReal dataset. Unlike currently available datasets, IndustReal contains procedural errors (such as omissions) as well as execution errors. A significant part of these errors are exclusively present in the validation and test sets, making IndustReal suitable to evaluate robustness of algorithms to new, unseen mistakes. Additionally, to encourage reproducibility and allow for scalable approaches trained on synthetic data, the 3D models of all parts are publicly available. Annotations and benchmark performance are provided for action recognition and assembly state detection, as well as the new PSR task. IndustReal, along with the code and model weights, is available at: https://github.com/TimSchoonbeek/IndustReal

Comment: Accepted for WACV 2024. 15 pages, 9 figures, including supplementary material.

arXiv.org e-Print Archive

Learning to Predict Collision Risk from Simulated Video Data

Author: Abdolhay Hamid R.
Dubbelman Gijs
Piva Fabrizio
Schoonbeek Tim J.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 19/07/2022

We propose an image-based collision risk prediction model and a training strategy that allows training on simulated video data and successfully generalizes to real data. By doing so, we solve the data scarcity problem of collecting and labeling real (near) collisions, which are exceptionally rare events. Domain generalization from simulated to real data is taken into account by design by decoupling the learning strategy, and using task-specific, domain-resilient intermediate representations. Specifically, we use optical flow and vehicle bounding boxes, since they are instinctively related to the task of collision risk prediction and because their simulated-to-real domain gap is significantly lower than that of camera video data, i.e., they are more domain resilient. To demonstrate our approach, we present RiskNet, a novel neural network for image-based collision risk prediction, which classifies individual frames of a video sequence of a front-facing camera as safe or unsafe. Additionally, we present two novel datasets: the simulated Prescan dataset (which we intend to make publicly available) for training and the YouTube Driving Incidents Database (YDID) for real-world testing. The performance of RiskNet, trained solely on simulated data and tested on the real-world YDID, is comparable to that of a human driver, both in accuracy (91.8% vs. 93.6%) and F1-score (0.92 vs. 0.94).

Pure OAI Repository
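
As a rough illustration of the two metrics quoted in the RiskNet abstract above, the following sketch computes accuracy and F1-score for per-frame safe/unsafe labels. The function name `accuracy_and_f1` and the example labels are assumptions made up for illustration; they are not taken from the paper, its code, or the YDID data.

```python
def accuracy_and_f1(y_true, y_pred, positive="unsafe"):
    """Return (accuracy, F1) for binary per-frame labels.

    `positive` names the class treated as the positive one (here: "unsafe").
    Hypothetical helper, not part of RiskNet.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("label lists must not be empty")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)

    accuracy = correct / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Made-up labels for six frames of one video sequence.
y_true = ["safe", "safe", "unsafe", "unsafe", "unsafe", "safe"]
y_pred = ["safe", "unsafe", "unsafe", "unsafe", "safe", "safe"]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.3f}, F1={f1:.3f}")
```

Reporting both numbers is useful when one class is rare, since accuracy alone can look high even if few unsafe frames are detected.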

Beyond Action Recognition: Extracting Meaningful Information from Procedure Recordings

Author: de With Peter H.N.
Frisco Pierluigi
Onvlee Hans
Schoonbeek Tim J.
van der Sommen Fons
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/05/2023

Understanding procedural actions is important, as it can be used to automatically analyze the execution of a procedure and provide assistance to users by warning of potential mistakes or forgotten steps. However, current approaches require a rigid, step-by-step execution order and rely on laborious, impractical datasets. Furthermore, they are not robust to variations in viewpoint, or they measure the performance of actions rather than the actual completion of actions. To address these limitations and stimulate research in this field, this work proposes the novel task of procedure state recognition (PSR) together with a set of evaluation metrics.

Pure OAI Repository

Augmented Reality for Automatically Generating Robust Manufacturing and Maintenance Logs

Author: de With Peter H.N.
Frisco Pierluigi
Onvlee Hans
Schoonbeek Tim J.
van der Sommen Fons
Publication venue: 'Society for Imaging Science & Technology'
Publication date: 08/07/2022

Logs describing the execution of procedural steps during manufacturing and maintenance tasks are important for quality control and configuration management. Such logs are currently hand-written or typed during a procedure, which requires engineers to frequently step away from their work and makes the logs difficult to search and optimize. In this paper, we propose to automatically generate standardized, searchable logs by visually perceiving and monitoring the progress of the procedure in real time and comparing this to the expected procedure. Unlike related work, we propose an approach which does not restrict the engineers to a rigid, fixed sequence and instead allows them to execute procedures in a variety of different orders where possible. The proposed framework is experimentally validated on the task of (dis)assembling a Duplo block model and operates properly when occlusions are absent.

Pure OAI Repository
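
The last abstract above describes comparing observed steps against an expected procedure while still allowing more than one valid execution order. One way this could be sketched, purely as an assumption and not as the authors' method, is to store the expected procedure as a map from each step to its prerequisite steps and to log each observed step against that map. The function `build_log` and the step names are hypothetical.

```python
def build_log(prerequisites, observed):
    """Check observed steps against a partially ordered procedure.

    prerequisites: dict mapping each step to the set of steps that must
                   be completed before it (hypothetical representation).
    observed:      list of steps in the order they were seen.
    Returns (log, omitted): log is a list of (step, status) tuples and
    omitted is a sorted list of expected steps that were never observed.
    """
    done = set()
    log = []
    for step in observed:
        if step not in prerequisites:
            log.append((step, "unknown step"))
            continue
        if step in done:
            log.append((step, "repeated"))
            continue
        missing = sorted(prerequisites[step] - done)
        if missing:
            log.append((step, "out of order, missing: " + ", ".join(missing)))
        else:
            log.append((step, "ok"))
        # The step was physically performed, so it counts as done either way.
        done.add(step)
    omitted = sorted(set(prerequisites) - done)
    return log, omitted


# Hypothetical four-step assembly: the two walls may be placed in either order.
procedure = {
    "base": set(),
    "left_wall": {"base"},
    "right_wall": {"base"},
    "roof": {"left_wall", "right_wall"},
}

log_a, omitted_a = build_log(procedure, ["base", "right_wall", "left_wall", "roof"])
log_b, omitted_b = build_log(procedure, ["base", "left_wall", "roof"])
print(log_a, omitted_a)
print(log_b, omitted_b)
```

In this sketch the first run is accepted even though the walls are swapped, while the second run flags the roof as out of order and reports `right_wall` as omitted. Encoding only the prerequisites, rather than one fixed sequence, is what permits the flexible ordering.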