17 research outputs found

    Towards End-to-End Lane Detection: an Instance Segmentation Approach

    Full text link
    Modern cars are incorporating an increasing number of driver assist features, among which automatic lane keeping. The latter allows the car to properly position itself within the road lanes, which is also crucial for any subsequent lane departure or trajectory planning decision in fully autonomous cars. Traditional lane detection methods rely on a combination of highly-specialized, hand-crafted features and heuristics, usually followed by post-processing techniques, that are computationally expensive and prone to scalability due to road scene variations. More recent approaches leverage deep learning models, trained for pixel-wise lane segmentation, even when no markings are present in the image due to their big receptive field. Despite their advantages, these methods are limited to detecting a pre-defined, fixed number of lanes, e.g. ego-lanes, and can not cope with lane changes. In this paper, we go beyond the aforementioned limitations and propose to cast the lane detection problem as an instance segmentation problem - in which each lane forms its own instance - that can be trained end-to-end. To parametrize the segmented lane instances before fitting the lane, we further propose to apply a learned perspective transformation, conditioned on the image, in contrast to a fixed "bird's-eye view" transformation. By doing so, we ensure a lane fitting which is robust against road plane changes, unlike existing approaches that rely on a fixed, pre-defined transformation. In summary, we propose a fast lane detection algorithm, running at 50 fps, which can handle a variable number of lanes and cope with lane changes. We verify our method on the tuSimple dataset and achieve competitive results

    One-Shot Transfer of Affordance Regions? AffCorrs!

    Get PDF
    In this work, we tackle one-shot visual search of object parts. Given a single reference image of an object with annotated affordance regions, we segment semantically corresponding parts within a target scene. We propose AffCorrs, an unsupervised model that combines the properties of pre-trained DINO-ViT's image descriptors and cyclic correspondences. We use AffCorrs to find corresponding affordances both for intra- and inter-class one-shot part segmentation. This task is more difficult than supervised alternatives, but enables future work such as learning affordances via imitation and assisted teleoperation.Comment: Published in Conference on Robot Learning, 2022 For code and dataset, refer to https://sites.google.com/view/affcorr

    Few-Shot Object Detection in Real Life: Case Study on Auto-Harvest

    Full text link
    Confinement during COVID-19 has caused serious effects on agriculture all over the world. As one of the efficient solutions, mechanical harvest/auto-harvest that is based on object detection and robotic harvester becomes an urgent need. Within the auto-harvest system, robust few-shot object detection model is one of the bottlenecks, since the system is required to deal with new vegetable/fruit categories and the collection of large-scale annotated datasets for all the novel categories is expensive. There are many few-shot object detection models that were developed by the community. Yet whether they could be employed directly for real life agricultural applications is still questionable, as there is a context-gap between the commonly used training datasets and the images collected in real life agricultural scenarios. To this end, in this study, we present a novel cucumber dataset and propose two data augmentation strategies that help to bridge the context-gap. Experimental results show that 1) the state-of-the-art few-shot object detection model performs poorly on the novel `cucumber' category; and 2) the proposed augmentation strategies outperform the commonly used ones.Comment: 6 page

    Learning to count anything: reference-less class-agnostic counting with weak supervision

    Get PDF
    Current class-agnostic counting methods can generalise to unseen classes but usually require reference images to define the type of object to be counted, as well as instance annotations during training. Reference-less class-agnostic counting is an emerging field that identifies counting as, at its core, a repetition-recognition task. Such methods facilitate counting on a changing set composition. We show that a general feature space with global context can enumerate instances in an image without a prior on the object type present. Specifically, we demonstrate that regression from vision transformer features without point-level supervision or reference images is superior to other reference-less methods and is competitive with methods that use reference images. We show this on the current standard few-shot counting dataset FSC-147. We also propose an improved dataset, FSC-133, which removes errors, ambiguities, and repeated images from FSC-147 and demonstrate similar performance on it. To the best of our knowledge, we are the first weakly-supervised reference-less class-agnostic counting method

    Learning to count anything: reference-less class-agnostic counting with weak supervision

    Get PDF
    Current class-agnostic counting methods can generalise to unseen classes but usually require reference images to define the type of object to be counted, as well as instance annotations during training. Reference-less class-agnostic counting is an emerging field that identifies counting as, at its core, a repetition-recognition task. Such methods facilitate counting on a changing set composition. We show that a general feature space with global context can enumerate instances in an image without a prior on the object type present. Specifically, we demonstrate that regression from vision transformer features without pointlevel supervision or reference images is superior to other reference-less methods and is competitive with methods that use reference images. We show this on the current standard few-shot counting dataset FSC-147. We also propose an improved dataset, FSC-133, which removes errors, ambiguities, and repeated images from FSC-147 and demonstrate similar performance on it. To the best of our knowledge, we are the first weakly-supervised reference-less class-agnostic counting method
    corecore