A Large-Scale Re-identification Analysis in Sporting Scenarios: the Betrayal of Reaching a Critical Point
Re-identifying participants in ultra-distance running competitions can be
daunting due to the extensive distances and constantly changing terrain. To
overcome these challenges, computer vision techniques have been developed to
analyze runners' faces, numbers on their bibs, and clothing. However, our study
presents a novel gait-based approach for runners' re-identification (re-ID) by
leveraging various pre-trained human action recognition (HAR) models and loss
functions. Our results show that this approach provides promising results for
re-identifying runners in ultra-distance competitions. Furthermore, we
investigate the significance of distinct human body movements when athletes are
approaching their endurance limits and their potential impact on re-ID
accuracy. Our study examines how the recognition of a runner's gait is affected by a competition's critical point (CP), defined as a moment of severe fatigue occurring where the finish line comes into view, just a few kilometers from the finish. We aim to determine how this CP can improve the accuracy of
athlete re-ID. Our experimental results demonstrate that gait recognition can
be significantly enhanced (up to a 9% increase in mAP) as athletes approach
this point. This highlights the potential of utilizing gait recognition in
real-world scenarios, such as ultra-distance competitions or long-duration
surveillance tasks.
Comment: Accepted at the 7th International Joint Conference on Biometrics (IJCB 2023).
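The abstract above reports re-ID quality as mAP, the standard ranking metric in re-identification. As a minimal sketch (not the paper's code), the following computes mAP for a gallery ranked by cosine similarity to each query embedding; all variable names and the toy data are illustrative assumptions.

```python
import numpy as np

def average_precision(ranked_ids, query_id):
    """AP for one query: precision averaged over the ranks of correct matches."""
    hits, precisions = 0, []
    for rank, gid in enumerate(ranked_ids, start=1):
        if gid == query_id:
            hits += 1
            precisions.append(hits / rank)
    return float(np.mean(precisions)) if precisions else 0.0

def mean_average_precision(query_emb, query_ids, gallery_emb, gallery_ids):
    """Rank the gallery by cosine similarity to each query, then average the APs."""
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    g = gallery_emb / np.linalg.norm(gallery_emb, axis=1, keepdims=True)
    sims = q @ g.T
    aps = []
    for i, qid in enumerate(query_ids):
        order = np.argsort(-sims[i])
        aps.append(average_precision([gallery_ids[j] for j in order], qid))
    return float(np.mean(aps))
```

In this formulation, the "up to a 9% increase in mAP" near the CP would correspond to the gait embeddings becoming more discriminative, so correct gallery identities climb in the ranking.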
An X3D Neural Network Analysis for Runner's Performance Assessment in a Wild Sporting Environment
We present a transfer learning analysis on a sporting environment of the
expanded 3D (X3D) neural networks. Inspired by action quality assessment
methods in the literature, our method uses an action recognition network to
estimate athletes' cumulative race time (CRT) during an ultra-distance
competition. We evaluate the performance considering the X3D, a family of
action recognition networks that expand a small 2D image classification
architecture along multiple network axes, including space, time, width, and
depth. We demonstrate that the resulting neural network can provide remarkable
performance for short input footage, with a mean absolute error of 12 minutes
and a half when estimating the CRT for runners who have been active from 8 to
20 hours. Our most significant finding is that X3D achieves state-of-the-art performance while requiring almost seven times less memory than previous work.
Comment: Accepted at the 18th International Conference on Machine Vision Applications (MVA 2023).
Personal Guides: Heterogeneous Robots Sharing Personal Tours in Multi-Floor Environments
GidaBot is an application designed to set up and run a heterogeneous team of robots acting as tour guides in multi-floor buildings. Although tours can span several floors, each robot can only service a single floor, so a guiding task may require collaboration among several robots. The system relies on a robust inter-robot communication strategy to share goals and paths during guiding tasks; such tours work as personal services carried out by one or more robots. In this paper, a face re-identification/verification module based on state-of-the-art techniques is developed, evaluated offline, and integrated into GidaBot's real daily activities, to prevent new visitors from interfering with visitors already being attended. The problem is complex because users are casual visitors: no long-term information is stored, and consequently faces are unknown at training time. Initially, re-identification and verification are evaluated offline considering different face detectors and computing distances in a face-embedding representation. To fulfil the goal online, several face detectors are fused in parallel to avoid the face-alignment bias that individual detectors produce under certain circumstances, and the decision is made based on a minimum-distance criterion. This fused approach outperforms any individual method and greatly improves the real system's reliability, as shown by tests carried out with real robots at the Faculty of Informatics in San Sebastian.
This work has been partially funded by the Basque Government, Spain, grant number IT900-16, and the Spanish Ministry of Economy and Competitiveness (MINECO), grant number RTI2018-093337-B-I00.
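The fused minimum-distance decision described above can be sketched as follows. This is an illustrative assumption of the rule, not GidaBot's code: each detector contributes one aligned crop and hence one embedding of the probe face, and the visitor is matched to an enrolled identity only if the smallest probe-to-enrolled distance falls below a threshold (the threshold value here is arbitrary).

```python
import numpy as np

def fused_verify(probe_embs, enrolled_embs, threshold=0.6):
    """Fused verification sketch: probe_embs holds one embedding per face
    detector (parallel fusion); the decision uses the minimum Euclidean
    distance between any probe embedding and any enrolled embedding."""
    probe_embs = np.asarray(probe_embs, dtype=float)
    enrolled_embs = np.asarray(enrolled_embs, dtype=float)
    dists = np.linalg.norm(probe_embs[:, None, :] - enrolled_embs[None, :, :], axis=-1)
    min_dist = float(dists.min())
    return min_dist, min_dist < threshold
```

Taking the minimum over detectors means a single well-aligned crop is enough to accept a returning visitor, which is why the fusion is robust to any one detector's alignment bias.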
Competition on Counter Measures to 2-D Facial Spoofing Attacks
Spoofing identities using photographs is one of the most common techniques to attack 2-D face recognition systems. There seem to exist no comparative studies of different techniques using the same protocols and data. The motivation behind this competition is to compare the performance of different state-of-the-art algorithms on the same database using a unique evaluation method. Six teams from universities around the world participated in the contest. The use of one or multiple techniques from motion, texture analysis, and liveness detection appears to be the common trend in this competition. Most of the algorithms are able to clearly separate spoof attempts from real accesses. The results suggest investigating more complex attacks.
Improving Automatic Pedestrian Detection by means of Human Perception
Automatic detection systems do not perform as well as human observers, even on simple detection tasks. A potential solution to this problem is training vision systems on appropriate regions of interest (ROIs), in contrast to training on predefined regions. Here we focus on detecting pedestrians in static scenes. Can automatic vision systems for pedestrian detection be improved by training them on perceptually defined ROIs?
[Figure residue: comparison of default detectors and p-ROI detectors.]
Decontextualized I3D ConvNet for ultra-distance runners performance analysis at a glance
In May 2021, the site runnersworld.com published that participation in
ultra-distance races has increased by 1,676% in the last 23 years. Moreover,
nearly 41% of those runners participate in more than one race per year. The
development of wearable devices has undoubtedly contributed to motivating
participants by providing performance measures in real-time. However, we
believe there is room for improvement, particularly from the organizers' point of view. This work aims to determine how a runner's performance can be
quantified and predicted by considering a non-invasive technique focusing on
the ultra-running scenario. In this sense, participants are captured as they pass through a set of recording points (RPs) placed along the race track. In our work, each piece of footage is fed to an I3D ConvNet to extract the participant's running gait. However, weather, illumination and capture conditions, or occlusions caused by race staff and other runners, may affect this footage. To address this challenging task, we have tracked and codified each participant's running gait at some RPs and removed the context to ensure a proper evaluation of the runner of interest. The evaluation suggests that the features extracted by an I3D ConvNet provide enough information to estimate the participant's performance along the different race tracks.
Comment: Accepted at the 21st International Conference on Image Analysis and Processing (ICIAP 2021).
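The decontextualization step described above (tracking the runner and removing everything else before the I3D ConvNet sees the clip) reduces, in its simplest form, to masking the frames with the runner's segmentation. A minimal sketch, assuming masks are already available from a tracker; the function name and array layout are illustrative:

```python
import numpy as np

def decontextualize(frames, masks):
    """Zero out everything outside the runner's segmentation mask so the
    clip-level network sees the gait, not the surrounding context.
    frames: (T, H, W, 3) video clip; masks: (T, H, W) binary runner masks."""
    return frames * masks[..., None]
```

Applied per clip, this keeps only the runner-of-interest pixels, which is what lets the gait features stay comparable across RPs with very different backgrounds.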
On the Use of Simple Geometric Descriptors Provided by RGB-D Sensors for Re-Identification
Deep learning for source camera identification on mobile devices
In the present paper, we propose a source camera identification method for
mobile devices based on deep learning. Recently, convolutional neural networks
(CNNs) have shown a remarkable performance on several tasks such as image
recognition, video analysis, and natural language processing. A CNN consists of a set of layers, where each layer is composed of a set of high-pass filters applied over the entire input image. This convolution process provides the unique ability to extract features automatically from data and to learn from those features. Our proposal describes a CNN architecture that is able to infer the noise pattern of mobile camera sensors (also known as the camera fingerprint) with the aim of detecting and identifying not only the mobile device used to capture an image (with 98% accuracy), but also which of the device's embedded cameras captured it. More specifically, we provide an extensive analysis of the proposed architecture considering different configurations. The experiment was carried out using images captured with the cameras of different mobile devices (the MICHE-I dataset was used), and the obtained results prove the robustness of the proposed method.
Comment: 15 pages, single column, 9 figures.
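Camera-fingerprint pipelines commonly suppress scene content before the network sees the image, so that the input is dominated by the sensor's noise residual. As a hedged sketch (the exact filter here is an illustrative Laplacian-style kernel, not necessarily the one learned or used in the paper's architecture):

```python
import numpy as np

def highpass_residual(img):
    """Suppress low-frequency scene content with a Laplacian-style high-pass
    filter, leaving a residual dominated by sensor noise. img: 2-D grayscale
    array; the 1-pixel border of the output is left at zero."""
    k = np.array([[0, -1, 0],
                  [-1, 4, -1],
                  [0, -1, 0]], dtype=float) / 4.0
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            out[i, j] = np.sum(k * img[i - 1:i + 2, j - 1:j + 2])
    return out
```

On a smooth region the residual is near zero, while sensor-level pixel variations survive, which is the property a fingerprint-inferring CNN exploits.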