GeoSay: A Geometric Saliency for Extracting Buildings in Remote Sensing Images
Automatic extraction of buildings in remote sensing images is an important
but challenging task with many applications in fields such as urban
planning and navigation. This paper addresses the problem of building
extraction in very high-spatial-resolution (VHSR) remote sensing (RS)
images, whose spatial resolution is often up to half a meter and which
provide rich information about buildings. Based on the observation that
buildings in VHSR-RS images are always more distinguishable in geometry
than in the texture or spectral domain, this paper proposes a geometric
building index (GBI) for accurate building extraction, computed from the
geometric saliency of VHSR-RS images. More precisely, given an image, the
geometric saliency is derived from a mid-level geometric representation
based on meaningful junctions that locally describe the geometrical
structures of images. The resulting GBI is finally measured by integrating
the derived geometric saliency of buildings. Experiments on three public
and commonly used datasets demonstrate that the proposed GBI achieves
state-of-the-art performance and shows impressive generalization
capability. Additionally, GBI preserves both the exact position and
accurate shape of single buildings compared to existing methods.
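The abstract does not give the authors' junction detector, but the idea of turning locally salient geometric structure (corners and junctions) into a saliency map can be sketched with a generic Harris-style corner response as a stand-in. Everything below (the function name, the window radius, the Harris constant `k`) is illustrative, not the GBI method itself:

```python
import numpy as np

def junction_saliency(img, k=0.05):
    """Toy geometric-saliency map: a Harris-style corner response used
    as a stand-in for junction detection. Corners of rectangular
    structures (e.g. building roofs) score high; flat regions and
    straight edges score ~0."""
    gy, gx = np.gradient(img.astype(float))  # vertical, horizontal gradients

    def smooth(a, r=2):
        # box-average over a (2r+1) x (2r+1) window
        pad = np.pad(a, r, mode="edge")
        out = np.zeros_like(a)
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                out += pad[r + dy: r + dy + a.shape[0],
                           r + dx: r + dx + a.shape[1]]
        return out / (2 * r + 1) ** 2

    # smoothed structure-tensor components
    ixx, iyy, ixy = smooth(gx * gx), smooth(gy * gy), smooth(gx * gy)
    # Harris response: det(M) - k * trace(M)^2; negative on pure edges
    resp = ixx * iyy - ixy ** 2 - k * (ixx + iyy) ** 2
    resp = np.clip(resp, 0.0, None)
    return resp / resp.max() if resp.max() > 0 else resp
```

On a synthetic image containing a bright square, the response peaks at the square's corners and is near zero on flat background and along straight edges, which is the behaviour the junction-based saliency relies on.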
Vision-based Real-Time Aerial Object Localization and Tracking for UAV Sensing System
The paper focuses on the problem of vision-based obstacle detection and
tracking for unmanned aerial vehicle navigation. A real-time object
localization and tracking strategy from monocular image sequences is developed
by effectively integrating the object detection and tracking into a dynamic
Kalman model. At the detection stage, the object of interest is automatically
detected and localized from a saliency map computed via the image background
connectivity cue at each frame; at the tracking stage, a Kalman filter is
employed to provide a coarse prediction of the object state, which is further
refined via a local detector incorporating the saliency map and the temporal
information between two consecutive frames. Compared to existing methods, the
proposed approach does not require any manual initialization for tracking, runs
much faster than the state-of-the-art trackers of its kind, and achieves
competitive tracking performance on a large number of image sequences.
Extensive experiments demonstrate the effectiveness and superior performance of
the proposed approach.
Comment: 8 pages, 7 figures
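The detect-then-track loop described above (Kalman prediction of the object state, refined by a per-frame detection) can be sketched with a minimal constant-velocity Kalman filter over image positions. This is a generic filter under assumed noise settings, not the paper's exact model; the class name, `dt`, `q`, and `r` defaults are all assumptions:

```python
import numpy as np

class ConstantVelocityKF:
    """Minimal constant-velocity Kalman filter over (x, y) image positions."""

    def __init__(self, x0, y0, dt=1.0, q=1e-2, r=1.0):
        self.x = np.array([x0, y0, 0.0, 0.0])  # state: x, y, vx, vy
        self.P = np.eye(4)                      # state covariance
        self.F = np.eye(4)                      # constant-velocity motion model
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.eye(2, 4)                   # we observe position only
        self.Q = q * np.eye(4)                  # process noise
        self.R = r * np.eye(2)                  # measurement noise

    def predict(self):
        """Coarse prediction of the object state for the next frame."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        """Refine with a detection z (e.g. from a local saliency-based detector)."""
        y = np.asarray(z, float) - self.H @ self.x        # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)          # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]
```

Fed per-frame detections of an object moving roughly linearly, the filter's state converges to the true trajectory, and the `predict` output gives the coarse search region for the next frame's local detector.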
On morphological hierarchical representations for image processing and spatial data clustering
Hierarchical data representations in the context of classification and data
clustering were put forward during the fifties. Recently, hierarchical image
representations have gained renewed interest for segmentation purposes. In this
paper, we briefly survey fundamental results on hierarchical clustering and
then detail recent paradigms developed for the hierarchical representation of
images in the framework of mathematical morphology: constrained connectivity
and ultrametric watersheds. Constrained connectivity can be viewed as a way to
constrain an initial hierarchy in such a way that a set of desired constraints
are satisfied. The framework of ultrametric watersheds provides a generic scheme
for computing any hierarchical connected clustering, in particular when such a
hierarchy is constrained. The suitability of this framework for solving
practical problems is illustrated with applications in remote sensing.
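The simplest instance of the connectivity-based hierarchies surveyed here is alpha-connectivity: two neighbouring elements belong to the same component when their value difference is at most alpha, and increasing alpha produces coarser partitions, i.e. a hierarchy. A toy version on a 1-D signal (chain graph, union-find), with the function name chosen for illustration:

```python
def alpha_components(values, alpha):
    """Alpha-connected components of a 1-D signal: neighbouring samples
    are merged when |v[i+1] - v[i]| <= alpha. Increasing alpha yields a
    nested (hierarchical) sequence of coarser partitions."""
    n = len(values)
    parent = list(range(n))

    def find(i):
        # find root with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n - 1):
        if abs(values[i + 1] - values[i]) <= alpha:
            parent[find(i + 1)] = find(i)

    # relabel components 0, 1, 2, ... from left to right
    labels, seen = [], {}
    for i in range(n):
        r = find(i)
        if r not in seen:
            seen[r] = len(seen)
        labels.append(seen[r])
    return labels
```

For the signal `[0, 1, 2, 10, 11, 12]`, alpha = 1 yields two flat zones, while alpha = 8 merges everything into one component; the partitions for growing alpha nest into each other, which is exactly the hierarchy that constrained connectivity then restricts with additional constraints (e.g. bounding the range within a component).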
Saliency-based cooperative landing of a multirotor aerial vehicle on an autonomous surface vehicle
This paper presents a method for vision-based landing of a multirotor unmanned aerial vehicle (UAV) on an autonomous surface vehicle (ASV) equipped with a helipad. The method includes a behavioural search mechanism for the helipad when it is outside the UAV’s field of view, a learning saliency-based mechanism for visually tracking the helipad, and a cooperative strategy for the final vision-based landing phase. Learning how to track the helipad from above occurs during takeoff, and cooperation results from having the ASV track the UAV to assist its landing. A set of experimental results with both simulated and physical robots shows the feasibility of the presented method.
Shape Representations Using Nested Descriptors
The problem of shape representation is a core problem in computer vision. It can be argued that shape representation is the most central representational problem for computer vision, since unlike texture or color, shape alone can be used for perceptual tasks such as image matching, object detection and object categorization.
This dissertation introduces a new shape representation called the nested descriptor. A nested descriptor represents shape both globally and locally by pooling salient scaled and oriented complex gradients in a large nested support set. We show that this nesting property introduces a nested correlation structure that enables a new local distance function called the nesting distance, which provides a provably robust similarity function for image matching. Furthermore, the nesting property suggests an elegant flower-like normalization strategy called a log-spiral difference. We show that this normalization enables a compact binary representation and is equivalent to a form of bottom-up saliency. This suggests that the nested descriptor's representational power is due to representing salient edges, which makes a fundamental connection between the saliency and local feature descriptor literature. In this dissertation, we introduce three examples of shape representation using nested descriptors: nested shape descriptors for imagery, nested motion descriptors for video and nested pooling for activities. We show evaluation results for these representations that demonstrate state-of-the-art performance for image matching, wide baseline stereo and activity recognition tasks.
Remote Sensing Image Scene Classification: Benchmark and State of the Art
Remote sensing image scene classification plays an important role in a wide
range of applications and hence has been receiving remarkable attention. During
the past years, significant efforts have been made to develop various datasets
or present a variety of approaches for scene classification from remote sensing
images. However, a systematic review of the literature concerning datasets and
methods for scene classification is still lacking. In addition, almost all
existing datasets have a number of limitations, including the small scale of
scene classes and the image numbers, the lack of image variations and
diversity, and the saturation of accuracy. These limitations severely hinder
the development of new approaches, especially deep learning-based methods. This
paper first provides a comprehensive review of the recent progress. Then, we
propose a large-scale dataset, termed "NWPU-RESISC45", which is a publicly
available benchmark for REmote Sensing Image Scene Classification (RESISC),
created by Northwestern Polytechnical University (NWPU). This dataset contains
31,500 images, covering 45 scene classes with 700 images in each class. The
proposed NWPU-RESISC45 (i) is large-scale in both the number of scene classes
and the total number of images, (ii) exhibits large variations in translation,
spatial resolution, viewpoint, object pose, illumination, background, and
occlusion, and (iii) has
high within-class diversity and between-class similarity. The creation of this
dataset will enable the community to develop and evaluate various data-driven
algorithms. Finally, several representative methods are evaluated using the
proposed dataset and the results are reported as a useful baseline for future
research.
Comment: This manuscript is the accepted version for Proceedings of the IEEE.
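With 45 classes of 700 images each, evaluation on such a benchmark typically uses a stratified per-class train/test split at a fixed training ratio. A minimal sketch of that bookkeeping, assuming a 20% training ratio (one of the commonly reported protocols; the function name and the class-count dictionary are illustrative, not part of the dataset release):

```python
import random

def stratified_split(samples_per_class, train_ratio=0.2, seed=0):
    """Per-class train/test index splits with a fixed training ratio,
    so every class contributes the same fraction of training images."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    splits = {}
    for cls, n in samples_per_class.items():
        idx = list(range(n))
        rng.shuffle(idx)
        cut = int(n * train_ratio)
        splits[cls] = {"train": sorted(idx[:cut]),
                       "test": sorted(idx[cut:])}
    return splits

# 45 scene classes with 700 images each, as described in the abstract
classes = {f"class_{i:02d}": 700 for i in range(45)}
splits = stratified_split(classes, train_ratio=0.2)
```

At a 20% ratio this gives 140 training and 560 test images per class; reporting results over several random seeds, as is common for this benchmark, amounts to repeating the split with different `seed` values.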