The Right (Angled) Perspective: Improving the Understanding of Road Scenes Using Boosted Inverse Perspective Mapping
Many tasks performed by autonomous vehicles such as road marking detection,
object tracking, and path planning are simpler in bird's-eye view. Hence,
Inverse Perspective Mapping (IPM) is often applied to remove the perspective
effect from a vehicle's front-facing camera and to remap its images into a 2D
domain, resulting in a top-down view. Unfortunately, this leads to
unnatural blurring and stretching of objects at greater distances, due to the
resolution of the camera, limiting applicability. In this paper, we present an
adversarial learning approach for generating a significantly improved IPM from
a single camera image in real time. The generated bird's-eye-view images
contain sharper features (e.g. road markings) and a more homogeneous
illumination, while (dynamic) objects are automatically removed from the scene,
thus revealing the underlying road layout in an improved fashion. We
demonstrate our framework using real-world data from the Oxford RobotCar
Dataset and show that scene understanding tasks directly benefit from our
boosted IPM approach.
Comment: equal contribution of first two authors, 8 full pages, 6 figures,
accepted at IV 201
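The stretching that the abstract above attributes to classical IPM can be illustrated with a minimal sketch. It is not the paper's adversarial method; it assumes an idealized pinhole camera that looks horizontally over a perfectly flat road, and the names `pixel_to_ground`, `row_footprint`, `f`, `cx`, `cy` and `cam_height` are hypothetical, chosen for illustration only.

```python
def pixel_to_ground(u, v, f, cx, cy, cam_height):
    """Project image pixel (u, v) onto a flat ground plane.

    Assumes a pinhole camera with focal length f (in pixels), principal
    point (cx, cy), zero pitch and roll, mounted cam_height above the road.
    Returns (lateral, forward) in the units of cam_height, or None when
    the pixel lies on or above the horizon row.
    """
    dv = v - cy
    if dv <= 0:
        return None  # the viewing ray never intersects the ground
    forward = cam_height * f / dv
    lateral = cam_height * (u - cx) / dv
    return lateral, forward


def row_footprint(v, f, cy, cam_height):
    """Forward ground distance covered by the single pixel row v (v > cy)."""
    far = pixel_to_ground(0, v, f, 0, cy, cam_height)
    near = pixel_to_ground(0, v + 1, f, 0, cy, cam_height)
    return far[1] - near[1]


if __name__ == "__main__":
    # Hypothetical camera: 1000 px focal length, horizon at row 360, 1.5 m high.
    for row in (460, 370):
        print(row, row_footprint(row, 1000.0, 360, 1.5))
```

Under these assumptions, a row close to the horizon covers far more road than a row near the bottom of the image, so a top-down remap has to spread very few pixels over a large area. That is the resolution-driven stretching the abstract describes.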
Intelligent Generation of Graphical Game Assets: A Conceptual Framework and Systematic Review of the State of the Art
Procedural content generation (PCG) can be applied to a wide variety of tasks
in games, from narratives, levels and sounds, to trees and weapons. A large
amount of game content is comprised of graphical assets, such as clouds,
buildings or vegetation, that do not require gameplay function considerations.
There is also a breadth of literature examining the procedural generation of
such elements for purposes outside of games. The body of research, focused on
specific methods for generating specific assets, provides a narrow view of the
available possibilities. Hence, it is difficult to have a clear picture of all
approaches and possibilities, with no guide for interested parties to discover
possible methods and approaches for their needs, and no facility to guide them
through each technique or approach to map out the process of using them.
Therefore, a systematic literature review has been conducted, yielding 200
accepted papers. This paper explores state-of-the-art approaches to graphical
asset generation, examining research from a wide range of applications, inside
and outside of games. Informed by the literature, a conceptual framework has
been derived to address the aforementioned gaps.
YoloCurvSeg: You Only Label One Noisy Skeleton for Vessel-style Curvilinear Structure Segmentation
Weakly-supervised learning (WSL) has been proposed to alleviate the conflict
between data annotation cost and model performance through employing
sparsely-grained (i.e., point-, box-, scribble-wise) supervision and has shown
promising performance, particularly in the image segmentation field. However,
it is still a very challenging problem due to the limited supervision,
especially when only a small number of labeled samples are available.
Additionally, almost all existing WSL segmentation methods are designed for
star-convex structures which are very different from curvilinear structures
such as vessels and nerves. In this paper, we propose a novel sparsely
annotated segmentation framework for curvilinear structures, named YoloCurvSeg,
based on image synthesis. A background generator delivers image backgrounds
that closely match real distributions through inpainting dilated skeletons. The
extracted backgrounds are then combined with randomly emulated curves generated
by a Space Colonization Algorithm-based foreground generator and through a
multilayer patch-wise contrastive learning synthesizer. In this way, a
synthetic dataset with both images and curve segmentation labels is obtained,
at the cost of only one or a few noisy skeleton annotations. Finally, a
segmenter is trained with the generated dataset and possibly an unlabeled
dataset. The proposed YoloCurvSeg is evaluated on four publicly available
datasets (OCTA500, CORN, DRIVE and CHASEDB1) and the results show that
YoloCurvSeg outperforms state-of-the-art WSL segmentation methods by large
margins. With only one noisy skeleton annotation (respectively 0.14%, 0.03%,
1.40%, and 0.65% of the full annotation), YoloCurvSeg achieves more than 97% of
the fully-supervised performance on each dataset. Code and datasets will be
released at https://github.com/llmir/YoloCurvSeg.
Comment: 11 pages, 10 figures, submitted to IEEE Transactions on Medical
Imaging (TMI)
Practical License Plate Recognition in Unconstrained Surveillance Systems with Adversarial Super-Resolution
Although current license plate (LP) recognition applications have advanced
significantly, they are still limited to ideal environments where
training data are carefully annotated with constrained scenes. In this paper,
we propose a novel license plate recognition method to handle unconstrained
real world traffic scenes. To overcome these difficulties, we use adversarial
super-resolution (SR), and one-stage character segmentation and recognition.
Combined with a deep convolutional network based on VGG-net, our method
provides a simple but reasonable training procedure. Moreover, we introduce
GIST-LP, a challenging LP dataset where image samples are effectively collected
from unconstrained surveillance scenes. Experimental results on the AOLP and
GIST-LP datasets illustrate that our method, without any scene-specific
adaptation, outperforms current LP recognition approaches in accuracy and
provides visual enhancement in our SR results that are easier to understand
than the original data.
Comment: Accepted at VISAPP, 201
Generative Adversarial Networks Based Scene Generation on Indian Driving Dataset
The rate of advancement in the field of artificial intelligence (AI) has drastically increased over the past twenty years or so. From AI models that can classify every object in an image to realistic chatbots, the signs of progress can be found in all fields. This work focused on tackling a relatively new problem in the current scenario: the generative capabilities of AI. While classification and prediction models have matured and entered the mass market across the globe, generation through AI is still in its initial stages. Generative tasks consist of an AI model learning the features of a given input and using these learned values to generate completely new output values that were not originally part of the input dataset. The most common input type given to generative models is images. The most popular architectures for generative models are autoencoders and generative adversarial networks (GANs). Our study aimed to use GANs to generate realistic images from a purely semantic representation of a scene. While our model can be used on any kind of scene, we used the Indian Driving Dataset to train our model. Through this work, we could arrive at answers to the following questions: (1) the scope of GANs in interpreting and understanding textures and variables in complex scenes; (2) the application of such a model in the field of gaming and virtual reality; (3) the possible impact of generating realistic deep fakes on society.
From Model-Based to Data-Driven Simulation: Challenges and Trends in Autonomous Driving
Simulation is an integral part of the process of developing autonomous
vehicles and is advantageous for training, validation, and verification of driving
functions. Even though simulations come with a series of benefits compared to
real-world experiments, various challenges still prevent virtual testing from
entirely replacing physical test-drives. Our work provides an overview of these
challenges with regard to different aspects and types of simulation and
subsumes current trends to overcome them. We cover aspects around perception-,
behavior- and content-realism as well as general hurdles in the domain of
simulation. Among others, we observe a trend of data-driven, generative
approaches and high-fidelity data synthesis to increasingly replace model-based
simulation.
Comment: Ferdinand Mütsch, Helen Gremmelmaier, and Nicolas Becker
contributed equally. Accepted for publication at CVPR 2023 VCAD workshop