Quality assessment by region in SPOT images fused by means of the dual-tree complex wavelet transform
This work provides and evaluates a fusion algorithm for remotely sensed images, i.e. the fusion of a high-spatial-resolution panchromatic (PAN) image with a multi-spectral (MS) image (also known as pansharpening), using the dual-tree complex wavelet transform (DT-CWT), an effective approach for conducting an analytic and oversampled wavelet transform that reduces aliasing and, in turn, the shift dependence of the wavelet transform. The proposed scheme includes the definition of a model that establishes how information is extracted from the PAN band and how that information is injected into the low-spatial-resolution MS bands. The approach was applied to SPOT 5 images, where some bands fall outside the PAN band's spectrum. We propose an optional step in the quality evaluation protocol: studying the quality of the fusion by regions, where each region represents a specific feature of the image. The results show that the DT-CWT-based approach offers good spatial quality while retaining the spectral information of the original images in the SPOT 5 case. The additional step facilitates the identification of the regions most affected by the fusion process.
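The extract-and-inject model described in this abstract can be sketched in a minimal form. The sketch below is an illustrative assumption, not the paper's method: it stands in for the DT-CWT detail subbands with a simple box-filter high-pass, and uses a plain additive injection with a hypothetical `gain` parameter.

```python
import numpy as np

def extract_detail(pan, k=5):
    """High-frequency detail of the PAN band: PAN minus a box-filtered copy.
    (A stand-in for the DT-CWT detail subbands described in the abstract.)"""
    kernel = np.ones((k, k)) / (k * k)
    pad = k // 2
    padded = np.pad(pan, pad, mode="edge")
    low = np.zeros(pan.shape, dtype=float)
    for i in range(pan.shape[0]):
        for j in range(pan.shape[1]):
            low[i, j] = (padded[i:i + k, j:j + k] * kernel).sum()
    return pan - low

def inject(ms_band, pan, gain=1.0):
    """Additive injection model (hypothetical): fused = MS + gain * detail(PAN).
    ms_band is assumed already resampled to the PAN grid."""
    return ms_band + gain * extract_detail(pan)
```

Where the PAN band is locally flat the detail term vanishes and the MS band passes through unchanged, which is the spectral-preservation behaviour the abstract reports.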
Hybrid image representation methods for automatic image annotation: a survey
In most automatic image annotation systems, images are represented with low-level features using either global methods or local methods. In global methods, the entire image is used as a unit. Local methods divide images into blocks, adopting fixed-size sub-image blocks as sub-units, or into regions, using segmented regions as sub-units. In contrast to typical automatic image annotation methods that use either global or local features exclusively, several recent methods incorporate both kinds of information, on the premise that combining the two levels of features is beneficial when annotating images. In this paper, we provide a survey of automatic image annotation techniques from one perspective, feature extraction, and, to complement existing surveys in the literature, we focus on the emerging hybrid methods that combine both global and local features for image representation.
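The global/local/hybrid distinction the survey draws can be illustrated with a minimal sketch. The feature choice (a grey-level histogram), the grid size, and the bin count are all illustrative assumptions, not anything prescribed by the survey:

```python
import numpy as np

def global_features(img, bins=8):
    """Global method: one feature vector computed from the whole image
    (here, a normalised grey-level histogram)."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    return hist / hist.sum()

def local_block_features(img, grid=2, bins=8):
    """Local (block-based) method: split the image into fixed-size blocks
    and concatenate one histogram per block."""
    h, w = img.shape
    bh, bw = h // grid, w // grid
    feats = []
    for i in range(grid):
        for j in range(grid):
            block = img[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            feats.append(global_features(block, bins))
    return np.concatenate(feats)

def hybrid_features(img, grid=2, bins=8):
    """Hybrid method: concatenate the global and local descriptors,
    as in the combined approaches the survey focuses on."""
    return np.concatenate([global_features(img, bins),
                           local_block_features(img, grid, bins)])
```

A region-based local method would replace the fixed grid with segmented regions; the concatenation step for the hybrid representation stays the same.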
Audio-coupled video content understanding of unconstrained video sequences
Unconstrained video understanding is a difficult task. The main aim of this thesis is to
recognise the nature of objects, activities and environment in a given video clip using
both audio and video information. Traditionally, audio and video information has not
been applied together for solving such a complex task, and for the first time we propose,
develop, implement and test a new framework of multi-modal (audio and video) data
analysis for context understanding and labelling of unconstrained videos.
The framework relies on feature selection techniques and introduces a novel algorithm
(PCFS) that is faster than the well-established SFFS algorithm. We use the framework for
studying the benefits of combining audio and video information in a number of different
problems. We begin by developing two independent content recognition modules. The
first one is based on image sequence analysis alone, and uses a range of colour, shape,
texture and statistical features from image regions with a trained classifier to recognise
the identity of objects, activities and environment present. The second module uses audio
information only, and recognises activities and environment. Both of these approaches
are preceded by detailed pre-processing to ensure that correct video segments containing
both audio and video content are present, and that the developed system can be made
robust to changes in camera movement, illumination, random object behaviour etc. For
both audio and video analysis, we use a hierarchical approach of multi-stage
classification such that difficult classification tasks can be decomposed into simpler and
smaller tasks.
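The hierarchical multi-stage decomposition described above can be sketched as a coarse decision routing each sample to a specialised fine classifier. The routing function and the toy rules below are hypothetical stand-ins, not the thesis's actual classifiers:

```python
def hierarchical_classify(x, coarse, fine_by_branch):
    """Two-stage hierarchical classification: a coarse decision first,
    then a specialised fine classifier for the chosen branch."""
    branch = coarse(x)
    return branch, fine_by_branch[branch](x)

# Toy stand-ins (illustrative rules only):
coarse = lambda x: "outdoor" if x["brightness"] > 0.5 else "indoor"
fine = {
    "outdoor": lambda x: "beach" if x["blue"] > 0.6 else "street",
    "indoor": lambda x: "office" if x["edges"] > 0.4 else "home",
}
```

The benefit mirrors the abstract's point: each stage faces a smaller, easier decision than a single flat classifier over all labels would.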
When combining both modalities, we compare fusion techniques at different levels of
integration and propose a novel algorithm that combines advantages of both feature and
decision-level fusion. The analysis is evaluated on a large amount of test data comprising
unconstrained videos collected for this work. Finally, we propose a decision correction
algorithm which shows that further steps towards combining multi-modal classification
information effectively with semantic knowledge generate the best possible results.
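The two levels of integration the thesis compares can be sketched in generic form. These are textbook baselines, not the thesis's novel combined algorithm or its PCFS feature selector; the weighting scheme is an assumption:

```python
import numpy as np

def feature_fusion(f_audio, f_video):
    """Early (feature-level) fusion: concatenate the modality feature
    vectors and train a single classifier on the result."""
    return np.concatenate([np.asarray(f_audio), np.asarray(f_video)])

def decision_fusion(p_audio, p_video, w=0.5):
    """Late (decision-level) fusion: weighted sum of per-modality class
    posteriors, renormalised to a distribution."""
    fused = w * np.asarray(p_audio) + (1 - w) * np.asarray(p_video)
    return fused / fused.sum()
```

Feature-level fusion lets the classifier model cross-modal correlations; decision-level fusion keeps the modality classifiers independent and is robust when one stream is missing. An algorithm combining the advantages of both, as the abstract proposes, would operate between these two extremes.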
- …