Multi-level fusion of hard and soft information
Proceedings of the 17th International Conference on Information Fusion (FUSION 2014), Salamanca, Spain, 7-10 July 2014. Driven by the underlying need for a yet-to-be-developed framework for fusing heterogeneous data and information at different semantic levels, coming from both sensory and human sources, we present some results of the research being conducted within the NATO Research Task Group IST-106/RTG-051 on "Information Filtering and Multi Source Information Fusion". As part of this ongoing effort, we discuss a first outcome of our investigation into multi-level fusion. It deals with removing the first hurdle between data/information sources and processes at different levels: representation. Our contention is that a common representation and description framework is the premise for enabling processing that spans different semantic levels. To this end, we discuss the use of the Battle Management Language (BML) as a "lingua franca" to encode sensory data as well as a priori and contextual knowledge, both as hard and soft data.
Deep learning for fusion of APEX hyperspectral and full-waveform LiDAR remote sensing data for tree species mapping
Deep learning has been widely used to fuse multi-sensor data for classification. However, current deep learning architectures for multi-sensor data fusion do not always perform better than a single data source, especially for the fusion of hyperspectral and light detection and ranging (LiDAR) remote sensing data for tree species mapping in complex, closed forest canopies. In this paper, we propose a new deep fusion framework to integrate the complementary information from hyperspectral and LiDAR data for tree species mapping. We also investigate the fusion of either "single-band" or multi-band (i.e., full-waveform) LiDAR with hyperspectral data for tree species mapping. Additionally, we provide a solution to estimate the crown size of tree species by the fusion of multi-sensor data. Experimental results on fusing real APEX hyperspectral and LiDAR data demonstrate the effectiveness of the proposed deep fusion framework. Compared to using only a single data source or a current deep fusion architecture, our proposed method improves overall classification accuracy from 82.21% to 87.10% and average classification accuracy from 76.71% to 83.45%, respectively.
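The two-branch, feature-level fusion this abstract describes can be sketched in a few lines. The following is a toy NumPy sketch, not the paper's actual architecture: the shapes (120 hyperspectral bands, 16 LiDAR features, 5 species classes), weights, and names are all hypothetical, and the "branches" are single random-weight layers standing in for trained per-modality networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 100 pixels, 120 hyperspectral bands, 16 LiDAR features.
hsi = rng.normal(size=(100, 120))   # hyperspectral spectra
lidar = rng.normal(size=(100, 16))  # full-waveform LiDAR features

def branch(x, w):
    # One per-modality "branch": a linear layer with ReLU, standing in
    # for a trained feature extractor in a two-stream deep network.
    return np.maximum(x @ w, 0.0)

w_hsi = rng.normal(size=(120, 32)) * 0.1
w_lid = rng.normal(size=(16, 32)) * 0.1

# Feature-level fusion: concatenate the per-modality representations,
# then feed the fused vector to a shared classifier head.
fused = np.concatenate([branch(hsi, w_hsi), branch(lidar, w_lid)], axis=1)
w_head = rng.normal(size=(64, 5)) * 0.1  # 5 hypothetical tree species
pred = (fused @ w_head).argmax(axis=1)   # predicted species per pixel
print(fused.shape, pred.shape)  # (100, 64) (100,)
```

In a real system each branch would be a trained CNN and the head would be optimized jointly, so that the shared classifier learns which modality to trust per feature.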
A Generalized Multi-Modal Fusion Detection Framework
LiDAR point clouds have become the most common data source in autonomous driving. However, due to the sparsity of point clouds, accurate and reliable detection cannot be achieved in specific scenarios. Because of their complementarity with point clouds, images are receiving increasing attention. Although existing fusion methods have achieved some success, they either perform hard fusion or do not fuse in a direct manner. In this paper, we propose a generic 3D detection framework called MMFusion, using multi-modal features. The framework aims to achieve accurate fusion between LiDAR and images to improve 3D detection in complex scenes. Our framework consists of two separate streams, the LiDAR stream and the camera stream, which are compatible with any single-modal feature extraction network. The Voxel Local Perception Module in the LiDAR stream enhances local feature representation, and the Multi-modal Feature Fusion Module then selectively combines the feature outputs from different streams to achieve better fusion. Extensive experiments show that our framework not only outperforms existing benchmarks but also improves their detection, especially for cyclists and pedestrians on the KITTI benchmark, with strong robustness and generalization capabilities. We hope our work will stimulate more research into multi-modal fusion for autonomous driving tasks.
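The "selectively combines" step of a fusion module like the one described here is often implemented as a learned gate over the two streams. Below is a minimal NumPy sketch of such channel-wise gated fusion; the shapes, the sigmoid gate, and all names are assumptions for illustration, not MMFusion's actual module.

```python
import numpy as np

rng = np.random.default_rng(1)
lidar_feat = rng.normal(size=(8, 64))  # voxel features from the LiDAR stream
img_feat = rng.normal(size=(8, 64))    # projected image features, same channels

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# A learned gate decides, per channel, how much each stream contributes.
# Here the gate weights are random stand-ins for trained parameters.
w_gate = rng.normal(size=(128, 64)) * 0.1
gate = sigmoid(np.concatenate([lidar_feat, img_feat], axis=1) @ w_gate)
fused = gate * lidar_feat + (1.0 - gate) * img_feat  # convex per-channel blend
```

Because the gate is conditioned on both streams, the blend can lean on the camera features where the point cloud is sparse and on LiDAR where the image is ambiguous.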
BRECCIA: A novel multi-source fusion framework for dynamic geospatial data analysis
Geospatial Intelligence analysis involves the combination of multi-source information expressed in logical form, computational form, and sensor data. Each of these forms has its own way to describe uncertainty or error: e.g., frequency models, algorithmic truncation, floating-point roundoff, Gaussian distributions, etc. We propose BRECCIA, a Geospatial Intelligence analysis system, which receives information from humans (as logical sentences), simulations (e.g., weather or environmental predictions), and sensors (e.g., cameras, weather stations, microphones), where each piece of information has an associated uncertainty. BRECCIA then provides responses to user queries based on a new probabilistic logic system which determines a coherent overall response to the query and the probability of that response; this new method avoids the exponential complexity of previous approaches. In addition, BRECCIA attempts to identify concrete mechanisms (proposed actions) to acquire new data dynamically in order to reduce the uncertainty of the query response. The basis for this is a novel approach to probabilistic argumentation analysis.
Online Distillation-enhanced Multi-modal Transformer for Sequential Recommendation
Multi-modal recommendation systems, which integrate diverse types of information, have gained widespread attention in recent years. However, compared to traditional collaborative filtering-based multi-modal recommendation systems, research on multi-modal sequential recommendation is still in its nascent stages. Unlike traditional sequential recommendation models that rely solely on item identifier (ID) information and focus on network structure design, multi-modal recommendation models need to emphasize item representation learning and the fusion of heterogeneous data sources. This paper investigates the impact of item representation learning on downstream recommendation tasks and examines the disparities in information fusion at different stages. Empirical experiments are conducted to demonstrate the need to design a framework suitable for collaborative learning and fusion of diverse information. Based on this, we propose a new model-agnostic framework for multi-modal sequential recommendation tasks, called Online Distillation-enhanced Multi-modal Transformer (ODMT), to enhance feature interaction and mutual learning among multi-source inputs (ID, text, and image), while avoiding conflicts among different features during training, thereby improving recommendation accuracy. Specifically, we first introduce an ID-aware Multi-modal Transformer module in the item representation learning stage to facilitate information interaction among different features. Secondly, we employ an online distillation training strategy in the prediction optimization stage to make multi-source data learn from each other and improve prediction robustness. Experimental results on a video content recommendation dataset and three e-commerce recommendation datasets demonstrate the effectiveness of the two proposed modules, yielding an improvement of approximately 10% in performance over baseline models.
Comment: 11 pages, 7 figures
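The online distillation idea mentioned here (modality-specific heads teaching each other) has a common generic form: each head is pulled toward the ensemble of its peers via a KL term. The NumPy sketch below shows that generic pattern only; the batch size, class count, and the choice of the ensemble mean as teacher are illustrative assumptions, not ODMT's exact loss.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def kl(p, q):
    # KL(p || q), averaged over the batch; epsilon guards against log(0).
    return np.sum(p * np.log((p + 1e-12) / (q + 1e-12)), axis=1).mean()

rng = np.random.default_rng(2)
# Logits from three modality-specific heads (ID, text, image) on a toy batch.
heads = [rng.normal(size=(4, 10)) for _ in range(3)]
probs = [softmax(h) for h in heads]
teacher = np.mean(probs, axis=0)  # ensemble of peers acts as the online teacher

# Each head is distilled toward the ensemble (mutual learning); during
# training these terms would be added to each head's task loss.
distill_losses = [kl(teacher, p) for p in probs]
```

The appeal of online (as opposed to offline) distillation is that no pre-trained teacher is needed: the ensemble is recomputed from the current heads at every step.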
Graph-based Data Modeling and Analysis for Data Fusion in Remote Sensing
Hyperspectral imaging provides increased sensitivity and discrimination over traditional imaging methods by combining standard digital imaging with spectroscopic methods. For each individual pixel in a hyperspectral image (HSI), a continuous spectrum is sampled as the spectral reflectance/radiance signature to facilitate identification of ground cover and surface material. This abundant spectral knowledge allows all available information in the data to be mined. These superior qualities give hyperspectral imaging wide applications such as mineral exploration, agriculture monitoring, and ecological surveillance. The processing of massive high-dimensional HSI datasets is a challenge, since many data processing techniques have a computational complexity that grows exponentially with the dimension. Moreover, an HSI dataset may contain a limited number of degrees of freedom due to the high correlations between data points and among the spectra. On the other hand, merely relying on the sampled spectrum of an individual HSI data point may produce inaccurate results due to the mixed nature of raw HSI data, such as mixed pixels and optical interferences.
Fusion strategies are widely adopted in data processing to achieve better performance, especially in the fields of classification and clustering. There are mainly three types of fusion strategies, namely low-level data fusion, intermediate-level feature fusion, and high-level decision fusion. Low-level data fusion combines multi-source data that are expected to be complementary or cooperative. Intermediate-level feature fusion aims at the selection and combination of features to remove redundant information. Decision-level fusion exploits a set of classifiers to provide more accurate results. These fusion strategies have wide applications, including HSI data processing. With the fast development of multiple remote sensing modalities, e.g., Very High Resolution (VHR) optical sensors, LiDAR, etc., fusion of multi-source data can in principle produce more detailed information than each single source. On the other hand, besides the abundant spectral information contained in HSI data, features such as texture and shape may be employed to represent data points from a spatial perspective. Furthermore, feature fusion also includes the strategy of removing redundant and noisy features from the dataset.
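Of the three strategies listed above, decision-level fusion is the simplest to illustrate: several classifiers each label the same samples, and their decisions are combined. A minimal sketch using majority voting follows; the class names and the three "classifiers" are made up for illustration.

```python
from collections import Counter

def majority_vote(predictions):
    """Decision-level fusion: combine per-classifier labels by majority vote."""
    fused = []
    for labels in zip(*predictions):  # one tuple of labels per sample
        fused.append(Counter(labels).most_common(1)[0][0])
    return fused

# Three hypothetical classifiers (e.g., spectral, texture, LiDAR-based)
# labeling the same five pixels.
clf_a = ["grass", "tree", "road", "tree", "water"]
clf_b = ["grass", "tree", "tree", "tree", "road"]
clf_c = ["road",  "tree", "road", "grass", "water"]
print(majority_vote([clf_a, clf_b, clf_c]))
# → ['grass', 'tree', 'road', 'tree', 'water']
```

Weighted voting or stacking a meta-classifier on the decisions are common refinements when the base classifiers differ in reliability.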
One of the major problems in machine learning and pattern recognition is to develop appropriate representations for complex nonlinear data. In HSI processing, a particular data point is usually described as a vector with coordinates corresponding to the intensities measured in the spectral bands. This vector representation permits the application of linear and nonlinear transformations with linear algebra to find an alternative representation of the data. More generally, HSI is multi-dimensional in nature and the vector representation may lose the contextual correlations. Tensor representation provides a more sophisticated modeling technique and a higher-order generalization to linear subspace analysis.
In graph theory, data points can be generalized as nodes, with connectivities measured from the proximity of a local neighborhood. The graph-based framework efficiently characterizes the relationships among the data and allows for convenient mathematical manipulation in many applications, such as data clustering, feature extraction, feature selection, and data alignment. In this thesis, graph-based approaches to multi-source feature and data fusion in remote sensing are explored. We mainly investigate the fusion of spatial, spectral, and LiDAR information with linear and multilinear algebra under a graph-based framework for data clustering and classification problems.
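The "nodes with connectivities measured from the proximity of a local neighborhood" construction above is typically a k-nearest-neighbour graph, from which a graph Laplacian is built for clustering or feature extraction. A small NumPy sketch, with toy data and an unnormalized Laplacian (one common choice among several):

```python
import numpy as np

def knn_graph(X, k):
    """Symmetric k-nearest-neighbour adjacency from pairwise distances."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)      # a point is not its own neighbour
    A = np.zeros_like(d)
    for i in range(len(X)):
        A[i, np.argsort(d[i])[:k]] = 1.0
    return np.maximum(A, A.T)        # symmetrize: edge if either endpoint chose it

rng = np.random.default_rng(3)
X = rng.normal(size=(10, 4))         # 10 data points, 4 features each
A = knn_graph(X, k=3)
L = np.diag(A.sum(axis=1)) - A       # unnormalized graph Laplacian L = D - A
print(A.shape, np.allclose(L, L.T))  # (10, 10) True
```

The eigenvectors of L associated with the smallest eigenvalues then provide the low-dimensional embedding used by spectral clustering; for multi-source fusion, adjacencies built from different modalities (spectral, spatial, LiDAR) can be combined before computing L.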
Unsupervised Image Fusion Using Deep Image Priors
A significant number of researchers have applied deep learning methods to image fusion. However, most works require a large amount of training data or depend on pre-trained models or frameworks to capture features from source images. This is inevitably hampered by a shortage of training data or a mismatch between the framework and the actual problem. Deep Image Prior (DIP) has been introduced to exploit convolutional neural networks' ability to synthesize the 'prior' in the input image. However, the original design of DIP is difficult to generalize to multi-image processing problems, particularly image fusion. Therefore, we propose a new image fusion technique that extends DIP to fusion tasks formulated as inverse problems. Additionally, we apply a multi-channel approach to further enhance DIP's effect. The evaluation is conducted with several commonly used image fusion assessment metrics, and the results are compared with state-of-the-art image fusion methods. Our method outperforms these techniques for a range of metrics. In particular, it is shown to provide the best objective results for most metrics when applied to medical images.
You Only Need Two Detectors to Achieve Multi-Modal 3D Multi-Object Tracking
Firstly, a new multi-object tracking framework based on multi-modal fusion is proposed in this paper. By integrating object detection and multi-object tracking into the same model, this framework avoids the complex data association process of the classical TBD paradigm and requires no additional training. Secondly, the confidence of historical trajectory regression is explored, the possible states of a trajectory in the current frame (weak object or strong object) are analyzed, and a confidence fusion module is designed to guide non-maximum suppression of trajectories and detections for ordered association. Finally, extensive experiments are conducted on the KITTI and Waymo datasets. The results show that the proposed method achieves robust tracking using only two single-modal detectors and is more accurate than many of the latest TBD paradigm-based multi-modal tracking methods. The source code of the proposed method is available at https://github.com/wangxiyang2022/YONTD-MOT
Comment: 10 pages, 9 figures
Multi-task learning of elevation and semantics from aerial images
Aerial or satellite imagery is a great source for land surface analysis, which might yield land use maps or elevation models. In this investigation, we present a neural network framework for learning semantics and local height together. We show how this joint multi-task learning benefits each task on the large dataset of the 2018 Data Fusion Contest. Moreover, our framework also yields an uncertainty map which allows assessing the predictions of the model. Code is available at https://github.com/marcelampc/mtl_aerial_images
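The joint setup described here, one shared encoder feeding two task heads, is the standard hard-parameter-sharing pattern for multi-task learning. A toy NumPy sketch of that pattern (random weights, made-up shapes and class count; not the paper's network):

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(size=(6, 3, 8, 8))       # toy "aerial" patches: batch, C, H, W

# Shared encoder (here just a flatten + linear layer standing in for a CNN).
w_enc = rng.normal(size=(192, 32)) * 0.1
feat = x.reshape(6, -1) @ w_enc

# Two task heads on the shared features:
sem_logits = feat @ (rng.normal(size=(32, 4)) * 0.1)  # semantics, 4 toy classes
height = feat @ (rng.normal(size=(32, 1)) * 0.1)      # local height regression
```

During training, a cross-entropy loss on `sem_logits` and a regression loss on `height` would be summed (often with per-task weights), so gradients from both tasks shape the shared encoder.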
PPF - A Parallel Particle Filtering Library
We present the parallel particle filtering (PPF) software library, which enables hybrid shared-memory/distributed-memory parallelization of particle filtering (PF) algorithms by combining the Message Passing Interface (MPI) with multithreading for multi-level parallelism. The library is implemented in Java and relies on OpenMPI's Java bindings for inter-process communication. It includes dynamic load balancing, multi-thread balancing, and several algorithmic improvements for PF, such as input-space domain decomposition. The PPF library hides the difficulties of efficient parallel programming of PF algorithms and provides application developers with the necessary tools for parallel implementation of PF methods. We demonstrate the capabilities of the PPF library using two distributed PF algorithms in two scenarios with different numbers of particles. The PPF library runs a 38 million particle problem, corresponding to more than 1.86 GB of particle data, on 192 cores with 67% parallel efficiency. To the best of our knowledge, the PPF library is the first open-source software that offers a parallel framework for PF applications.
Comment: 8 pages, 8 figures; will appear in the proceedings of the IET Data Fusion & Target Tracking Conference 201
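The resampling step is the part of a particle filter that libraries like this must parallelize carefully, since it couples all particles through the weight distribution. For context, here is a serial Python sketch of systematic resampling, one standard PF resampling scheme; it is a generic illustration, not code from the (Java) PPF library.

```python
import numpy as np

def systematic_resample(weights, rng):
    """Systematic resampling: draw N particle indices with one random offset.

    weights must be normalized (sum to 1). Returns an index per new particle;
    high-weight particles tend to be duplicated, low-weight ones dropped.
    """
    n = len(weights)
    positions = (rng.random() + np.arange(n)) / n  # evenly spaced in [0, 1)
    return np.searchsorted(np.cumsum(weights), positions)

rng = np.random.default_rng(4)
w = np.array([0.1, 0.2, 0.3, 0.4])
idx = systematic_resample(w / w.sum(), rng)
print(len(idx))  # 4
```

A distributed implementation additionally has to exchange particles between processes so that each rank ends up with a balanced share, which is where the library's dynamic load balancing comes in.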