Attention Allocation Aid for Visual Search
This paper outlines the development and testing of a novel, feedback-enabled
attention allocation aid (AAAD), which uses real-time physiological data to
improve human performance in a realistic sequential visual search task. By
optimizing over search duration, the aid improves efficiency while preserving
decision accuracy as the operator identifies and classifies targets within
simulated aerial imagery. Specifically, using experimental eye-tracking data
and measurements of target detectability across the human visual field,
we develop functional models of detection accuracy as a function of search
time, number of eye movements, scan path, and image clutter. These models are
then used by the AAAD in conjunction with real-time eye position data to make
probabilistic estimations of attained search accuracy and to recommend that the
observer either move on to the next image or continue exploring the present
image. An experimental evaluation in a scenario motivated by human supervisory
control in surveillance missions confirms the benefits of the AAAD.
Comment: To be presented at the ACM CHI conference in Denver, Colorado in May
201
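The core loop described above, probabilistic estimates of attained search accuracy driving a move-on/continue recommendation, can be sketched with a toy saturating-accuracy model. The functional form, parameter values, and function names below are illustrative assumptions, not the paper's fitted models:

```python
import math

# Assumed saturating model of detection accuracy vs. search time:
# accuracy(t) = A_MAX * (1 - exp(-t / TAU)). A_MAX and TAU are
# invented, illustrative parameters, not values fitted in the paper.
A_MAX = 0.95
TAU = 4.0  # seconds

def estimated_accuracy(t):
    """Estimated probability the target has been found by search time t."""
    return A_MAX * (1.0 - math.exp(-t / TAU))

def recommend(t, horizon=1.0, min_gain=0.02):
    """Recommend 'continue' while another `horizon` seconds of search is
    expected to raise accuracy by at least `min_gain`; else 'move on'."""
    gain = estimated_accuracy(t + horizon) - estimated_accuracy(t)
    return "continue" if gain >= min_gain else "move on"
```

Early in the search the marginal accuracy gain is large, so the aid lets the observer keep exploring; once the curve saturates, additional dwell time buys little accuracy and the aid recommends moving to the next image.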
ViZDoom Competitions: Playing Doom from Pixels
This paper presents the first two editions of the Visual Doom AI Competition,
held in 2016 and 2017. The challenge was to create bots that compete in a
multi-player deathmatch in a first-person shooter (FPS) game, Doom. The bots
had to make their decisions based solely on visual information, i.e., a raw
screen buffer. To play well, the bots needed to understand their surroundings,
navigate, explore, and handle the opponents at the same time. These aspects,
together with the competitive multi-agent aspect of the game, make the
competition a unique platform for evaluating state-of-the-art reinforcement
learning algorithms. The paper discusses the rules, solutions, results, and
statistics that give insight into the agents' behaviors. Best-performing agents
are described in more detail. The results of the competition lead to the
conclusion that, although reinforcement learning can produce capable Doom bots,
they are not yet able to compete successfully against humans in this game. The
paper also revisits the ViZDoom environment, a flexible, easy-to-use, and
efficient 3D platform for vision-based reinforcement learning research, based
on Doom, a well-recognized first-person perspective game.
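The pixel-only decision setting can be sketched as follows. Competition bots read the raw screen buffer supplied by the ViZDoom environment; here a stubbed RGB frame and a toy brightness heuristic stand in for both the environment and a learned policy, and all names, sizes, and thresholds are invented for illustration:

```python
import random

# The bot sees only a raw screen buffer each tick and must emit one action.
ACTIONS = ["MOVE_FORWARD", "TURN_LEFT", "TURN_RIGHT", "ATTACK"]

def make_frame(h=120, w=160):
    """Stand-in for a screen buffer: an H x W grid of RGB triples.
    In the real competition this would come from ViZDoom."""
    return [[[0, 0, 0] for _ in range(w)] for _ in range(h)]

def policy(frame, rng=random.Random(0)):
    """Toy pixel-only policy: attack if enough 'bright' pixels are present
    (a crude stand-in for seeing an opponent), otherwise explore randomly."""
    bright = sum(1 for row in frame for px in row if sum(px) > 600)
    if bright > 50:
        return "ATTACK"
    return rng.choice(ACTIONS[:3])

frame = make_frame()
action = policy(frame)
```

A real competitor replaces the heuristic with a learned mapping from pixels to actions (e.g., a convolutional policy trained by reinforcement learning), but the interface, frame in, action out, is the same.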
Handgun detection using combined human pose and weapon appearance
Closed-circuit television (CCTV) systems are essential nowadays to prevent
security threats or dangerous situations, in which early detection is crucial.
Novel deep learning-based methods have enabled the development of automatic
weapon detectors with promising results. However, these approaches rely mainly
on visual weapon appearance. For handguns, body pose may be a useful cue,
especially in cases where the gun is barely visible. In this work, a novel
method is proposed to combine, in a single architecture, both weapon appearance
and human pose information. First, pose keypoints are estimated to extract hand
regions and generate binary pose images, which are the model inputs. Then, each
input is processed in different subnetworks and combined to produce the handgun
bounding box. The results show that the combined model improves the state of
the art in handgun detection, achieving between 4.23 and 18.9 AP points more
than the best previous approach.
Comment: 17 pages, 18 figures
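One minimal way to picture the two-branch combination is late score fusion over shared candidate boxes. The fusion weights, helper names, and example scores below are assumptions for illustration; the paper combines the subnetworks inside a single architecture rather than by this simple weighted sum:

```python
# Hedged sketch: an appearance branch scores candidate boxes from pixels,
# a pose branch scores hand regions derived from body keypoints, and a
# weighted late fusion combines them. Weights and scores are invented.

def fuse(appearance, pose, w_app=0.6, w_pose=0.4):
    """Combine per-box scores from both branches; boxes are (x, y, w, h)."""
    fused = []
    for (box, s_app), (_, s_pose) in zip(appearance, pose):
        fused.append((box, w_app * s_app + w_pose * s_pose))
    return max(fused, key=lambda t: t[1])  # best handgun candidate

# A barely visible gun: weak appearance score, strong pose evidence.
appearance = [((10, 20, 30, 30), 0.35), ((50, 60, 25, 25), 0.55)]
pose       = [((10, 20, 30, 30), 0.90), ((50, 60, 25, 25), 0.20)]
best_box, score = fuse(appearance, pose)
```

The example shows the motivation from the abstract: the first box has weak appearance evidence but strong pose support, and fusion ranks it above the appearance-only favorite.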
Weak Supervision for Label Efficient Visual Bug Detection
As video games evolve into expansive, detailed worlds, visual quality becomes
essential, yet increasingly challenging. Traditional testing methods, limited
by resources, face difficulties in addressing the plethora of potential bugs.
Machine learning offers scalable solutions; however, heavy reliance on large
labeled datasets remains a constraint. To address this challenge, we propose a
novel method that uses unlabeled gameplay and domain-specific augmentations to
generate datasets and self-supervised objectives for pre-training or multi-task
settings in downstream visual bug detection. Our methodology uses weak
supervision to scale datasets for the crafted objectives and supports both
autonomous and interactive weak supervision, incorporating unsupervised
clustering and/or an interactive approach based on text and geometric prompts.
We demonstrate on first-person player clipping/collision bugs (FPPC) within the
expansive Giantmap game world that our approach is very effective, improving
over a strong supervised baseline in a practical, very low-prevalence, low-data
regime (0.336 to 0.550 F1 score). With just 5 labeled "good" exemplars (i.e., 0
bugs), our self-supervised objective alone captures enough signal to outperform
the low-labeled supervised settings. Building on
large-pretrained vision models, our approach is adaptable across various visual
bugs. Our results suggest applicability in curating datasets for broader image
and video tasks within video games beyond visual bugs.
Comment: Accepted to BMVC 2023: Workshop on Computer Vision for Games and
Games for Computer Vision (CVG). 9 pages
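The autonomous weak-supervision path, unsupervised clustering producing pseudo-labels for the crafted objectives, can be sketched with a nearest-centroid assignment over toy feature vectors. The features, centroids, and cluster meanings below are invented for illustration and are not the paper's pipeline:

```python
# Sketch: cluster unlabeled gameplay samples (here, toy 2-D feature
# vectors) and use cluster assignments as weak labels for pre-training.

def nearest_centroid(x, centroids):
    """Index of the centroid closest to feature vector x."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(range(len(centroids)), key=lambda k: dist2(x, centroids[k]))

def weak_labels(features, centroids):
    """Assign each unlabeled sample the id of its nearest cluster."""
    return [nearest_centroid(x, centroids) for x in features]

centroids = [(0.0, 0.0), (5.0, 5.0)]        # hypothetical cluster centers
features  = [(0.2, 0.1), (4.8, 5.3), (0.4, 0.2)]
labels = weak_labels(features, centroids)   # pseudo-labels for pre-training
```

The resulting pseudo-labels are noisy by construction; their value is scale, since they let a self-supervised or multi-task objective train on far more data than the handful of human-labeled exemplars.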
Real-time gun detection in CCTV: An open problem
Object detectors have improved in recent years, obtaining better results and faster inference times. However, small object detection still lacks a definitive solution. Automatic weapon detection in closed-circuit television (CCTV) footage has been studied recently and is extremely useful in the fields of security, counter-terrorism, and risk mitigation. This article presents a new dataset obtained from a real CCTV system installed at a university, together with a set of generated synthetic images. Applying Faster R-CNN with a Feature Pyramid Network and a ResNet-50 backbone, trained in two stages, yields a weapon detection model usable in quasi real-time CCTV (90 ms of inference time on an NVIDIA GeForce GTX-1080Ti card) that improves the state of the art in weapon detection. An exhaustive experimental study of the detector with these datasets was performed, showing the impact of synthetic data on the training of weapon detection systems, as well as the main limitations these systems present nowadays. The generated synthetic dataset and the real CCTV dataset are available to the whole research community.
Ministerio de Economía y Competitividad TIN2017-82113-C2-1-
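The "quasi real-time" figure can be sanity-checked with simple arithmetic: 90 ms per frame corresponds to roughly 11 frames per second. The minimum-FPS threshold below is an assumption for illustration, not a value from the article:

```python
# Throughput implied by a per-frame inference time; the 10 FPS cutoff for
# "quasi real-time" CCTV monitoring is an assumed, illustrative threshold.

def fps(ms_per_frame):
    """Frames per second sustained at a given per-frame latency."""
    return 1000.0 / ms_per_frame

def quasi_real_time(ms_per_frame, min_fps=10.0):
    """Whether a detector keeps up with a minimum monitoring frame rate."""
    return fps(ms_per_frame) >= min_fps
```

At 90 ms per frame the reported detector sustains about 11 FPS on a single GTX-1080Ti, enough to monitor one stream near real time, while a hypothetical 200 ms model would fall below the assumed threshold.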