Generate Your Own Scotland: Satellite Image Generation Conditioned on Maps
Despite recent advancements in image generation, diffusion models
remain largely underexplored in Earth Observation. In this paper we show that
state-of-the-art pretrained diffusion models can be conditioned on cartographic
data to generate realistic satellite images. We provide two large datasets of
paired OpenStreetMap images and satellite views over the region of Mainland
Scotland and the Central Belt. We train a ControlNet model and qualitatively
evaluate the results, demonstrating that both high image quality and map
fidelity are achievable. Finally, we provide some insights on the opportunities and
challenges of applying these models for remote sensing. Our model weights and
code for creating the dataset are publicly available at
https://github.com/miquel-espinosa/map-sat.
Comment: 13 pages, 6 figures. preprin
Plug and Play Active Learning for Object Detection
Annotating data for supervised learning is expensive and tedious, and we want
to do as little of it as possible. To make the most of a given "annotation
budget" we can turn to active learning (AL) which aims to identify the most
informative samples in a dataset for annotation. Active learning algorithms are
typically uncertainty-based or diversity-based. Both have seen success in image
classification, but fall short when it comes to object detection. We
hypothesise that this is because: (1) it is difficult to quantify uncertainty
for object detection as it consists of both localisation and classification,
where some classes are harder to localise, and others are harder to classify;
(2) it is difficult to measure similarities for diversity-based AL when images
contain different numbers of objects. We propose a two-stage active learning
algorithm Plug and Play Active Learning (PPAL) that overcomes these
difficulties. It consists of (1) Difficulty Calibrated Uncertainty Sampling, in
which we use a category-wise difficulty coefficient that takes both
classification and localisation into account to re-weight object uncertainties
for uncertainty-based sampling; (2) Category Conditioned Matching Similarity to
compute the similarities of multi-instance images as ensembles of their
instance similarities. PPAL is highly generalisable because it makes no change
to model architectures or detector training pipelines. We benchmark PPAL on the
MS-COCO and Pascal VOC datasets using different detector architectures and show
that our method outperforms the prior state-of-the-art. Code is available at
https://github.com/ChenhongyiYang/PPA
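The idea of re-weighting object uncertainties by a category-wise difficulty coefficient can be illustrated with a minimal sketch. This is not the PPAL implementation: the entropy-based uncertainty, the summation over objects, the difficulty values, and all function names below are assumptions made for illustration, since the abstract does not specify how the coefficients are computed or combined.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one detection's class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def image_uncertainty(detections, difficulty):
    """Sum of per-object entropies, each scaled by its class's difficulty.

    detections: list of (predicted_class, class_probabilities) pairs.
    difficulty: dict mapping class -> weight. The values are hypothetical;
    classes missing from the dict get a neutral weight of 1.0.
    """
    return sum(difficulty.get(cls, 1.0) * entropy(probs)
               for cls, probs in detections)

def select_for_annotation(pool, difficulty, budget):
    """Return the `budget` image ids with the highest weighted uncertainty."""
    ranked = sorted(pool,
                    key=lambda img: image_uncertainty(pool[img], difficulty),
                    reverse=True)
    return ranked[:budget]

# Toy unlabeled pool: image id -> detections produced by the current detector.
pool = {
    "img_a": [("cat", [0.9, 0.1])],
    "img_b": [("dog", [0.5, 0.5])],
    "img_c": [("cat", [0.6, 0.4]), ("dog", [0.7, 0.3])],
}
# Assumed difficulty: "dog" is treated as twice as hard as "cat".
difficulty = {"cat": 1.0, "dog": 2.0}

selected = select_for_annotation(pool, difficulty, budget=2)
print(selected)
```

In this toy pool the two images containing the harder "dog" class are ranked first. Note that the sketch covers only the uncertainty-based stage; the diversity-based stage described in the abstract is not shown.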
Contrastive Object-level Pre-training with Spatial Noise Curriculum Learning
The goal of contrastive learning based pre-training is to leverage large
quantities of unlabeled data to produce a model that can be readily adapted
downstream. Current approaches revolve around solving an image discrimination
task: given an anchor image, an augmented counterpart of that image, and some
other images, the model must produce representations such that the distance
between the anchor and its counterpart is small, and the distances between the
anchor and the other images are large. There are two significant problems with
this approach: (i) by contrasting representations at the image-level, it is
hard to generate detailed object-sensitive features that are beneficial to
downstream object-level tasks such as instance segmentation; (ii) the
augmentation strategy of producing an augmented counterpart is fixed, making
learning less effective at the later stages of pre-training. In this work, we
introduce Curricular Contrastive Object-level Pre-training (CCOP) to tackle
these problems: (i) we use selective search to find rough object regions and
use them to build an inter-image object-level contrastive loss and an
intra-image object-level discrimination loss into our pre-training objective;
(ii) we present a curriculum learning mechanism that adaptively augments the
generated regions, which allows the model to consistently acquire a useful
learning signal, even in the later stages of pre-training. Our experiments show
that our approach improves on the MoCo v2 baseline by a large margin on
multiple object-level tasks when pre-training on multi-object scene image
datasets. Code is available at https://github.com/ChenhongyiYang/CCOP
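The image discrimination task described at the start of this abstract is commonly trained with an InfoNCE-style loss. The following is a minimal sketch of that generic loss, not the CCOP objective: the cosine similarity, the temperature of 0.2, the toy two-dimensional vectors, and the function names are all assumptions chosen for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.2):
    """Negative log-softmax of the positive pair among all candidates.

    The loss is small when the anchor is close to its augmented
    counterpart (positive) and far from the other images (negatives).
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]

anchor = [1.0, 0.0]
counterpart = [0.9, 0.1]            # stands in for an augmented view of the anchor
others = [[0.0, 1.0], [-1.0, 0.0]]  # stand in for representations of other images

loss_good = info_nce(anchor, counterpart, others)
# Swap roles so that a dissimilar vector is treated as the positive.
loss_bad = info_nce(anchor, others[0], [counterpart, others[1]])
print(loss_good, loss_bad)
```

The well-aligned pair yields a much lower loss than the mismatched one, which is the signal the pre-training exploits. An object-level variant would apply the same loss to region features rather than whole-image features.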
The State of the Art: Object Retrieval in Paintings using Discriminative Regions
The objective of this work is to recognize object categories (such as animals and vehicles) in paintings, whilst learning these categories from natural images. This is a challenging problem given the substantial differences between paintings and natural images, and variations in depiction of objects in paintings. We first demonstrate that classifiers trained on natural images of an object category have some success in retrieving paintings containing that category. We then draw upon recent work in mid-level discriminative patches to develop a novel method for re-ranking paintings based on their spatial consistency with natural images of an object category. This method combines both class-based and instance-based retrieval in a single framework. We quantitatively evaluate the method over a number of classes from the PASCAL VOC dataset, and demonstrate significant improvements in rankings of the retrieved paintings over a variety of object categories.