Towards End-to-End Lane Detection: an Instance Segmentation Approach
Modern cars are incorporating an increasing number of driver assist features,
among which is automatic lane keeping. The latter allows the car to properly
position itself within the road lanes, which is also crucial for any subsequent
lane departure or trajectory planning decision in fully autonomous cars.
Traditional lane detection methods rely on a combination of highly specialized,
hand-crafted features and heuristics, usually followed by post-processing
techniques, which are computationally expensive and do not scale well across
varying road scenes. More recent approaches leverage deep learning models
trained for pixel-wise lane segmentation, which, thanks to their large
receptive field, can detect lanes even when no markings are present in the
image. Despite their advantages, these methods are limited to detecting a
pre-defined, fixed number of lanes, e.g. the ego-lanes, and cannot cope with
lane changes. In this paper, we go beyond the
aforementioned limitations and propose to cast the lane detection problem as an
instance segmentation problem - in which each lane forms its own instance -
that can be trained end-to-end. To parametrize the segmented lane instances
before fitting the lane, we further propose to apply a learned perspective
transformation, conditioned on the image, in contrast to a fixed "bird's-eye
view" transformation. By doing so, we ensure a lane fitting which is robust
against road plane changes, unlike existing approaches that rely on a fixed,
pre-defined transformation. In summary, we propose a fast lane detection
algorithm, running at 50 fps, which can handle a variable number of lanes and
cope with lane changes. We verify our method on the tuSimple dataset and
achieve competitive results.
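The instance-segmentation idea above can be illustrated with a toy sketch: each lane pixel receives an embedding, and embeddings are clustered so that every cluster becomes one lane instance, which allows a variable number of lanes. The greedy threshold clustering below is a simplified, hypothetical stand-in for the paper's learned clustering (the function names and the `delta` threshold are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def cluster_lane_embeddings(embeddings, delta=1.0):
    """Greedy threshold clustering of per-pixel embeddings into lane instances.

    embeddings: (N, D) array, one D-dim embedding per detected lane pixel.
    A pixel joins an existing cluster if its embedding lies within `delta`
    of that cluster's seed; otherwise it starts a new lane instance.
    """
    labels = -np.ones(len(embeddings), dtype=int)
    seeds = []
    for i, e in enumerate(embeddings):
        if seeds:
            d = np.linalg.norm(np.array(seeds) - e, axis=1)
            j = int(np.argmin(d))
            if d[j] < delta:
                labels[i] = j
                continue
        seeds.append(e)
        labels[i] = len(seeds) - 1
    return labels

# Two well-separated synthetic lane clusters in a 2-D embedding space.
rng = np.random.default_rng(0)
emb = np.vstack([rng.normal([0.0, 0.0], 0.1, (50, 2)),
                 rng.normal([5.0, 5.0], 0.1, (50, 2))])
labels = cluster_lane_embeddings(emb)
```

In the real method the number of instances is not fixed in advance; here the same property follows because new clusters are opened on demand rather than declared up front.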
One-Shot Transfer of Affordance Regions? AffCorrs!
In this work, we tackle one-shot visual search of object parts. Given a
single reference image of an object with annotated affordance regions, we
segment semantically corresponding parts within a target scene. We propose
AffCorrs, an unsupervised model that combines the properties of pre-trained
DINO-ViT's image descriptors and cyclic correspondences. We use AffCorrs to
find corresponding affordances both for intra- and inter-class one-shot part
segmentation. This task is more difficult than supervised alternatives, but
enables future work such as learning affordances via imitation and assisted
teleoperation.

Comment: Published in Conference on Robot Learning, 2022. For code and dataset,
refer to https://sites.google.com/view/affcorr
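The cyclic-correspondence ingredient mentioned above can be sketched in isolation: a match between a reference descriptor and a target descriptor is kept only if each is the other's nearest neighbour, closing the cycle. This toy version (with L2-normalised features and plain nearest neighbours) is an illustrative assumption, not the AffCorrs model itself:

```python
import numpy as np

def cyclic_matches(ref_feats, tgt_feats):
    """Keep only cycle-consistent nearest-neighbour matches.

    ref_feats: (N, D) descriptors from the reference image.
    tgt_feats: (M, D) descriptors from the target scene.
    Reference i matches target j only if j is the nearest target to i
    AND i is the nearest reference to j (a closed cycle).
    """
    sim = ref_feats @ tgt_feats.T          # (N, M) similarities
    ref_to_tgt = sim.argmax(axis=1)        # best target for each reference
    tgt_to_ref = sim.argmax(axis=0)        # best reference for each target
    return [(i, int(j)) for i, j in enumerate(ref_to_tgt)
            if tgt_to_ref[j] == i]

# Toy check: the target descriptors are a permutation of the reference ones,
# so cycle-consistent matching should recover that permutation exactly.
ref = np.eye(3)
tgt = np.eye(3)[[2, 0, 1]]
cycles = cyclic_matches(ref, tgt)
```

Cycle consistency is what lets an unsupervised matcher reject one-directional, spurious nearest neighbours without any labels.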
Few-Shot Object Detection in Real Life: Case Study on Auto-Harvest
Confinement during COVID-19 has had serious effects on agriculture all over
the world. Mechanical/automatic harvesting based on object detection and
robotic harvesters has become an urgent need as one of the efficient solutions.
Within an auto-harvest system, a robust few-shot object detection model is one
of the bottlenecks, since the system must deal with new vegetable/fruit
categories and collecting large-scale annotated datasets for all the novel
categories is expensive. Many few-shot object detection models have been
developed by the community, yet whether they can be employed directly in
real-life agricultural applications remains questionable, as there is a
context-gap between the commonly used training datasets and the images
collected in real-life agricultural scenarios. To this
end, in this study, we present a novel cucumber dataset and propose two data
augmentation strategies that help to bridge the context-gap. Experimental
results show that 1) the state-of-the-art few-shot object detection model
performs poorly on the novel `cucumber' category; and 2) the proposed
augmentation strategies outperform the commonly used ones.

Comment: 6 pages
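The abstract does not spell out the two augmentation strategies, but a common way to bridge such a context gap is copy-paste augmentation: annotated object crops are composited onto in-domain backgrounds so the detector sees the target context during training. The sketch below is a generic, hypothetical illustration of that idea (function name, shapes, and the hard paste with no blending are all assumptions, not the paper's method):

```python
import numpy as np

def paste_crop(background, crop, rng):
    """Paste an annotated object crop onto an in-domain background image.

    background: (H, W, 3) uint8 image, e.g. a greenhouse scene.
    crop:       (h, w, 3) uint8 object crop, e.g. a cucumber.
    Returns the augmented image and the pasted bounding box (x, y, w, h).
    """
    H, W, _ = background.shape
    h, w, _ = crop.shape
    x = int(rng.integers(0, W - w + 1))
    y = int(rng.integers(0, H - h + 1))
    out = background.copy()
    out[y:y + h, x:x + w] = crop   # hard paste; real pipelines often blend
    return out, (x, y, w, h)

rng = np.random.default_rng(0)
bg = np.zeros((64, 64, 3), dtype=np.uint8)
crop = np.full((16, 16, 3), 255, dtype=np.uint8)
img, (x, y, w, h) = paste_crop(bg, crop, rng)
```

The returned box can be fed straight into a detector's training annotations, which is what makes this kind of augmentation cheap compared with collecting new labelled field images.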
Learning to count anything: reference-less class-agnostic counting with weak supervision
Current class-agnostic counting methods can generalise to unseen classes but usually require reference images to define the type of object to be counted, as well as instance annotations during training. Reference-less class-agnostic counting is an emerging field that identifies counting as, at its core, a repetition-recognition task. Such methods facilitate counting on a changing set composition. We show that a general feature space with global context can enumerate instances in an image without a prior on the object type present. Specifically, we demonstrate that regression from vision transformer features without point-level supervision or reference images is superior to other reference-less methods and is competitive with methods that use reference images. We show this on the current standard few-shot counting dataset FSC-147. We also propose an improved dataset, FSC-133, which removes errors, ambiguities, and repeated images from FSC-147, and demonstrate similar performance on it. To the best of our knowledge, ours is the first weakly-supervised reference-less class-agnostic counting method.
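The core claim, a count regressed directly from vision transformer features with no reference images and no point-level labels, can be sketched as a tiny pooling-plus-MLP head. This is a shape-level illustration under stated assumptions (random weights, invented dimensions, a softplus output to keep counts non-negative), not the trained model from the paper:

```python
import numpy as np

def count_head(tokens, W1, b1, w2, b2):
    """Minimal reference-less counting head.

    tokens: (T, D) patch features from a vision transformer backbone.
    Pools the tokens into one global-context vector, passes it through a
    one-hidden-layer MLP, and maps the result through a softplus so the
    predicted count is always non-negative.
    """
    pooled = tokens.mean(axis=0)                 # global context vector (D,)
    hidden = np.maximum(W1 @ pooled + b1, 0.0)   # ReLU hidden layer
    raw = float(w2 @ hidden + b2)                # scalar regression output
    return float(np.log1p(np.exp(raw)))          # softplus -> count >= 0

# Hypothetical dimensions: 196 patch tokens of width 8, hidden size 16.
rng = np.random.default_rng(1)
tokens = rng.normal(size=(196, 8))
W1, b1 = rng.normal(size=(16, 8)), np.zeros(16)
w2, b2 = rng.normal(size=16), 0.0
count = count_head(tokens, W1, b1, w2, b2)
```

The weak supervision in the paper comes from image-level counts alone; a head of this shape needs only a scalar target per image, which is exactly why no point annotations or reference crops are required.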