Fusion of airborne LiDAR with multispectral SPOT 5 image for enhancement of feature extraction using Dempster–Shafer theory
This paper presents an application of the data-driven Dempster-Shafer theory (DST) of evidence to fuse multisensor data for land-cover feature extraction. Over the years, researchers have applied DST to a variety of applications; however, less attention has been given to generating and interpreting probability, certainty, and conflict maps, and quantitative assessment of DST performance is often overlooked. In this paper, two main types of data were used to implement DST: airborne Light Detection and Ranging (LiDAR) data and multispectral satellite imagery [Satellite Pour l'Observation de la Terre 5 (SPOT 5)]. The objectives are to classify land-cover types from fused multisensor data using DST, to quantitatively assess the accuracy of the classification, and to examine the potential of slope data derived from LiDAR for feature detection. First, we derived the normalized difference vegetation index (NDVI) from the SPOT 5 image and the normalized digital surface model (nDSM) from LiDAR by subtracting the digital terrain model from the digital surface model (DSM). The two products were fused using the DST algorithm, and the accuracy of the classification was assessed. Second, we generated a surface slope from LiDAR and fused it with the NDVI. The classification accuracy was then assessed using an IKONOS image of the study area as ground truth. From the two processing stages, the NDVI/nDSM fusion had an overall accuracy of 88.7%, while the NDVI/slope fusion had 75.3%, indicating that the NDVI/nDSM integration performed better than NDVI/slope. Although the overall accuracy of the former is better, the per-class results reveal that building extraction from fused slope and NDVI performed poorly. This study shows that DST is a time- and cost-effective method for accurate land-cover feature identification and extraction without the need for prior knowledge of the scene. Furthermore, the ability to generate other products such as certainty, conflict, and maximum-probability maps for better visual understanding of the decision process makes it more reliable for applications such as urban planning, forest management, 3-D feature extraction, and map updating.
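The core of the fusion step is Dempster's rule of combination, which merges per-pixel mass functions from the two evidence sources and yields the conflict measure behind the conflict maps. A minimal Python sketch of the rule, with hypothetical class labels and mass values not taken from the paper:

from itertools import product

def combine(m1, m2):
    """Dempster's rule: combine two mass functions over frozenset focal elements."""
    combined, conflict = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb  # mass falling on the empty set
    # Normalize by the non-conflicting mass (Dempster's normalization).
    return {k: v / (1.0 - conflict) for k, v in combined.items()}, conflict

# Hypothetical per-pixel masses: NDVI evidence favors vegetation,
# nDSM evidence favors building.
veg, bld, grd = frozenset({"veg"}), frozenset({"bldg"}), frozenset({"ground"})
theta = veg | bld | grd  # frame of discernment (full ignorance)
m_ndvi = {veg: 0.6, bld | grd: 0.3, theta: 0.1}
m_ndsm = {bld: 0.5, veg | grd: 0.3, theta: 0.2}
fused, k = combine(m_ndvi, m_ndsm)
print(fused, "conflict:", k)  # fused masses plus the conflict-map value

Applying this per pixel gives the maximum-probability, certainty, and conflict maps the abstract refers to.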
FedDiff: Diffusion Model Driven Federated Learning for Multi-Modal and Multi-Clients
With the rapid development of imaging sensor technology in the field of
remote sensing, multi-modal remote sensing data fusion has emerged as a crucial
research direction for land-cover classification tasks. While diffusion models
have made great progress in generative modeling and image classification tasks,
existing models primarily focus on single-modality, single-client control;
that is, the diffusion process is driven by a single modality on a single
computing node. To facilitate the secure fusion of heterogeneous data from
clients, it is necessary to enable distributed multi-modal control, such as
merging the hyperspectral data of organization A and the LiDAR data of
organization B privately on each base station client. In this study, we propose
a multi-modal collaborative diffusion federated learning framework called
FedDiff. Our framework establishes a dual-branch diffusion-model feature
extraction setup, where data from the two modalities are fed into separate
branches of the encoder. Our key insight is that diffusion models driven by different
modalities are inherently complementary in terms of potential denoising steps
on which bilateral connections can be built. Considering the challenge of
private and efficient communication between multiple clients, we embed the
diffusion model into the federated learning communication structure, and
introduce a lightweight communication module. Qualitative and quantitative
experiments validate the superiority of our framework in terms of image quality
and conditional consistency.
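The setup the abstract describes can be pictured as each client training a per-modality denoising branch locally while only a lightweight shared module crosses the network. A minimal PyTorch sketch of that idea (not the authors' code; channel counts and module sizes are hypothetical placeholders):

import copy
import torch
import torch.nn as nn

class Branch(nn.Module):
    """One per-modality denoising branch plus a small shared adapter."""
    def __init__(self, in_ch, shared):
        super().__init__()
        self.local = nn.Conv2d(in_ch, 16, 3, padding=1)  # stays on-device
        self.shared = shared                              # communicated
        self.head = nn.Conv2d(16, in_ch, 3, padding=1)    # predicts noise

    def forward(self, x_t):
        return self.head(self.shared(torch.relu(self.local(x_t))))

def fed_avg(modules):
    """Average the shared modules' parameters (FedAvg-style)."""
    avg = copy.deepcopy(modules[0].state_dict())
    for key in avg:
        avg[key] = torch.stack(
            [m.state_dict()[key].float() for m in modules]).mean(0)
    for m in modules:
        m.load_state_dict(avg)

# Client A holds hyperspectral patches (e.g., 32 bands), client B holds LiDAR.
shared_a, shared_b = (nn.Conv2d(16, 16, 1) for _ in range(2))
client_a, client_b = Branch(32, shared_a), Branch(1, shared_b)
# ... each client runs local diffusion-denoising steps on its own data ...
fed_avg([shared_a, shared_b])  # only the lightweight module is exchanged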
Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment
We present a deep neural network-based approach to image quality assessment
(IQA). The network is trained end-to-end and comprises ten convolutional layers
and five pooling layers for feature extraction, and two fully connected layers
for regression, which makes it significantly deeper than related IQA models.
Unique features of the proposed architecture are that: 1) with slight
adaptations it can be used in a no-reference (NR) as well as in a
full-reference (FR) IQA setting and 2) it allows for joint learning of local
quality and local weights, i.e., relative importance of local quality to the
global quality estimate, in an unified framework. Our approach is purely
data-driven and does not rely on hand-crafted features or other types of prior
domain knowledge about the human visual system or image statistics. We evaluate
the proposed approach on the LIVE, CSIQ, and TID2013 databases as well as the
LIVE In the Wild Image Quality Challenge database and show superior performance
to state-of-the-art NR and FR IQA methods. Finally, cross-database evaluation
shows a high ability to generalize between different databases, indicating a
high robustness of the learned features.
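The joint learning of local quality and local weights amounts to weighted-average pooling of patch scores, Q = Σᵢ wᵢqᵢ / Σᵢ wᵢ. A minimal PyTorch sketch of such a pooling head, with an illustrative feature dimension (the two linear heads stand in for the paper's fully connected regression layers):

import torch
import torch.nn as nn

class WeightedPooling(nn.Module):
    def __init__(self, feat_dim=512):
        super().__init__()
        self.quality = nn.Linear(feat_dim, 1)  # local quality q_i
        self.weight = nn.Linear(feat_dim, 1)   # local importance w_i

    def forward(self, patch_feats):            # (num_patches, feat_dim)
        q = self.quality(patch_feats).squeeze(-1)
        # Softplus keeps weights positive; epsilon avoids division by zero.
        w = nn.functional.softplus(self.weight(patch_feats)).squeeze(-1) + 1e-6
        return (w * q).sum() / w.sum()         # weighted global quality

feats = torch.randn(32, 512)                   # 32 patches from one image
print(WeightedPooling()(feats))                # scalar quality estimate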
An In-Depth Study on Open-Set Camera Model Identification
Camera model identification refers to the problem of linking a picture to the
camera model used to shoot it. As this might be an enabling factor in different
forensic applications to single out possible suspects (e.g., detecting the
author of child abuse or terrorist propaganda material), many accurate camera
model attribution methods have been developed in the literature. One of their
main drawbacks, however, is the typical closed-set assumption of the problem.
This means that an investigated photograph is always assigned to one camera
model within a set of known ones present during investigation, i.e., training
time, and the fact that the picture can come from a completely unrelated camera
model during actual testing is usually ignored. Under realistic conditions, it
is not possible to assume that every picture under analysis belongs to one of
the available camera models. To deal with this issue, in this paper, we present
the first in-depth study on the possibility of solving the camera model
identification problem in open-set scenarios. Given a photograph, we aim at
detecting whether it comes from one of the known camera models of interest or
from an unknown one. We compare different feature extraction algorithms and
classifiers specially targeting open-set recognition. We also evaluate possible
open-set training protocols that can be applied along with any open-set
classifier, observing that a simple one of those alternatives obtains the best results.
Thorough testing on independent datasets shows that it is possible to leverage
a recently proposed convolutional neural network as feature extractor paired
with a properly trained open-set classifier aiming at solving the open-set
camera model attribution problem even on small-scale image patches, improving
over state-of-the-art available solutions.
Comment: Published through the IEEE Access journal.
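One common way to realize the feature-extractor-plus-open-set-classifier pipeline described above is nearest-centroid classification with a distance threshold for rejecting unknown models. A minimal NumPy sketch under that assumption (the features, threshold rule, and class count are hypothetical, not the paper's specific choices):

import numpy as np

def fit_centroids(features, labels):
    """Mean feature vector per known camera model."""
    return {c: features[labels == c].mean(axis=0) for c in np.unique(labels)}

def predict_open_set(feat, centroids, threshold):
    """Nearest known model, or 'unknown' if every centroid is too far."""
    dists = {c: np.linalg.norm(feat - mu) for c, mu in centroids.items()}
    best = min(dists, key=dists.get)
    return best if dists[best] <= threshold else "unknown"

rng = np.random.default_rng(0)
train_feats = rng.normal(size=(200, 64))     # stand-in CNN patch features
train_labels = rng.integers(0, 4, size=200)  # 4 known camera models
centroids = fit_centroids(train_feats, train_labels)
# Threshold chosen, e.g., as the 95th percentile of in-class distances.
tau = np.percentile([np.linalg.norm(f - centroids[l])
                     for f, l in zip(train_feats, train_labels)], 95)
print(predict_open_set(rng.normal(size=64), centroids, tau))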