4,260 research outputs found
G2NPAN: GAN-guided nuance perceptual attention network for multimodal medical fusion image quality assessment
Multimodal medical fusion images (MMFI) are formed by fusing medical images of two or more modalities with the aim of displaying as much valuable information as possible in a single image. However, because different fusion algorithms adopt different strategies, the quality of the generated fused images is uneven. Thus, an effective blind image quality assessment (BIQA) method is urgently required. The challenge of MMFI quality assessment is to enable the network to perceive the nuances between fused images of different qualities, and the key to the success of BIQA is the availability of valid reference information. To this end, this work proposes a generative adversarial network (GAN)-guided nuance perceptual attention network (G2NPAN) to implement BIQA for MMFI. Specifically, we achieve blind evaluation via the design of a GAN and develop a Unique Feature Warehouse module to learn the effective features of fused images at the pixel level. A redesigned loss function guides the network to perceive image quality. Finally, a class activation mapping supervised quality assessment network is employed to obtain the MMFI quality score. Extensive experiments and validation have been conducted on a database of medical fusion images, and the proposed method is superior to state-of-the-art BIQA methods.
Perceptual Quality Assessment of Omnidirectional Audio-visual Signals
Omnidirectional videos (ODVs) play an increasingly important role in the
application fields of medicine, education, advertising, tourism, etc. Assessing
the quality of ODVs is significant for service-providers to improve the user's
Quality of Experience (QoE). However, most existing quality assessment studies
for ODVs only focus on the visual distortions of videos, while ignoring that
the overall QoE also depends on the accompanying audio signals. In this paper,
we first establish a large-scale audio-visual quality assessment dataset for
omnidirectional videos, which includes 375 distorted omnidirectional
audio-visual (A/V) sequences generated from 15 high-quality pristine
omnidirectional A/V contents, and the corresponding perceptual audio-visual
quality scores. Then, we design three baseline methods for full-reference
omnidirectional audio-visual quality assessment (OAVQA), which combine existing
state-of-the-art single-mode audio and video QA models via multimodal fusion
strategies. We validate the effectiveness of the A/V multimodal fusion method
for OAVQA on our dataset, which provides a new benchmark for omnidirectional
QoE evaluation. Our dataset is available at https://github.com/iamazxl/OAVQA.
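As a rough illustration of the late-fusion idea behind such baselines, the sketch below combines per-sequence scores from a single-mode video QA model and a single-mode audio QA model with a support vector regressor trained against subjective scores. The function name, regressor choice, and synthetic data are assumptions for illustration only, not the paper's actual fusion strategies.

    # Minimal late-fusion sketch for audio-visual quality assessment
    # (hypothetical interface; the paper's baselines and regressor may differ).
    import numpy as np
    from sklearn.svm import SVR

    def late_fusion_regressor(video_scores, audio_scores, mos):
        """Fit a regressor mapping (video QA score, audio QA score) pairs
        to subjective quality (MOS)."""
        X = np.column_stack([video_scores, audio_scores])
        model = SVR(kernel="rbf", C=1.0)
        model.fit(X, mos)
        return model

    # Usage sketch with random stand-in scores for 375 distorted A/V sequences.
    rng = np.random.default_rng(0)
    v, a = rng.uniform(0, 1, 375), rng.uniform(0, 1, 375)
    mos = 0.7 * v + 0.3 * a + rng.normal(0, 0.05, 375)   # synthetic targets
    model = late_fusion_regressor(v, a, mos)
    print(model.predict(np.array([[0.8, 0.6]])))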
Psychophysiology-based QoE assessment: a survey
We present a survey of psychophysiology-based assessment for quality of experience (QoE) in advanced multimedia technologies. We provide a classification of methods relevant to QoE and describe related psychological processes, experimental design considerations, and signal analysis techniques. We summarize multimodal techniques and discuss several important aspects of psychophysiology-based QoE assessment, including the synergies with psychophysical assessment and the need for standardized experimental design. This survey is not intended to be exhaustive but serves as a guideline for those interested in further exploring this emerging field of research.
An Attention-based Multi-Scale Feature Learning Network for Multimodal Medical Image Fusion
Medical images play an important role in clinical applications. Multimodal medical images can provide rich information about patients for physicians to diagnose. The image fusion technique is able to synthesize complementary information from multimodal images into a single image. This technique prevents radiologists from switching back and forth between different images and saves a lot of time in the diagnostic process. In this paper, we introduce a novel Dilated Residual Attention Network for the medical image fusion task. Our network is capable of extracting multi-scale deep semantic features. Furthermore, we propose a novel fixed fusion strategy, termed the Softmax-based weighted strategy, based on Softmax weights and the matrix nuclear norm. Extensive experiments show that our proposed network and fusion strategy exceed state-of-the-art performance compared with reference image fusion methods on four commonly used fusion metrics.
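As a hedged sketch of how a softmax weighting driven by the matrix nuclear norm can combine two feature maps (the paper applies the strategy to deep features from the proposed network; the normalisation step and the random stand-in inputs below are illustrative assumptions):

    # Sketch of a softmax weighting scheme driven by the matrix nuclear norm,
    # in the spirit of the fixed fusion strategy described above.
    import numpy as np

    def softmax_nuclear_fusion(feature_maps):
        """Fuse 2-D feature maps with weights = softmax of their nuclear norms."""
        norms = np.array([np.linalg.norm(f, ord="nuc") for f in feature_maps])
        norms = norms / norms.max()                     # scale before softmax
        weights = np.exp(norms) / np.exp(norms).sum()
        fused = sum(w * f for w, f in zip(weights, feature_maps))
        return fused, weights

    a = np.random.rand(64, 64)   # stand-ins for per-modality feature maps
    b = np.random.rand(64, 64)
    fused, w = softmax_nuclear_fusion([a, b])
    print(w)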
A Digital Twin City Model for Age-Friendly Communities: Capturing Environmental Distress from Multimodal Sensory Data
As the worldwide population ages, the demands of aging-in-place are increasing and call for smarter, more connected cities that help older adults maintain their mobility independence. However, today's aging built environment often places great environmental demands on older adults' mobility and causes them distress. To better understand and help mitigate older adults' distress in their daily trips, this paper proposes constructing a digital twin city (DTC) model that integrates multimodal data (i.e., physiological sensing, visual sensing) on environmental demands in urban communities, so that such demands can be considered in mobility planning for older adults. Specifically, this paper examines how data acquired from various modalities (i.e., electrodermal activity, gait patterns, visual sensing) can portray the environmental demands associated with older adults' mobility. In addition, it discusses the challenges and opportunities of multimodal data fusion in capturing environmental distress in urban communities.
An approach for cross-modality guided quality enhancement of liver image
A novel approach for multimodal liver image contrast enhancement is put forward in this paper. The proposed approach uses a magnetic resonance imaging (MRI) scan of the liver as a guide to enhance the structures of a computed tomography (CT) liver image. The enhancement process consists of two phases. In the first phase, the MRI and CT modalities are transformed to the same range, and the histogram of the CT liver is adjusted to match the histogram of the MRI. In the second phase, an adaptive histogram equalization technique is presented by splitting the CT histogram into two sub-histograms and replacing their cumulative distribution functions with two smooth sigmoids. Subjective and objective assessments of the experimental results indicate that the proposed approach yields better results; in addition, the image contrast is effectively enhanced while the mean brightness and details are well preserved.
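A minimal sketch of the two-phase idea, assuming scikit-image is available: the CT image is first brought to a common range and histogram-matched to the MRI guide, then the matched histogram is split at its mean and each half is remapped through a smooth sigmoid. The split point, sigmoid gain, and helper names are illustrative assumptions rather than the authors' exact formulation.

    # Hedged sketch: CT-to-MRI histogram matching followed by a sigmoid-shaped
    # remapping applied separately to the two halves of the CT histogram.
    import numpy as np
    from skimage import exposure

    def sigmoid(x, center, gain=8.0):
        return 1.0 / (1.0 + np.exp(-gain * (x - center)))

    def cross_modality_enhance(ct, mri):
        ct = (ct - ct.min()) / (np.ptp(ct) + 1e-8)        # phase 1: common range
        mri = (mri - mri.min()) / (np.ptp(mri) + 1e-8)
        matched = exposure.match_histograms(ct, mri)      # CT histogram -> MRI
        split = matched.mean()                            # phase 2: two sub-histograms
        low, high = matched < split, matched >= split
        out = np.empty_like(matched)
        out[low] = split * sigmoid(matched[low], split / 2)
        out[high] = split + (1 - split) * sigmoid(matched[high], (1 + split) / 2)
        return out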
Signal processing algorithms for enhanced image fusion performance and assessment
The dissertation presents several signal processing algorithms for image fusion in noisy multimodal
conditions. It introduces a novel image fusion method which performs well for image
sets heavily corrupted by noise. As opposed to current image fusion schemes, the method has
no requirements for a priori knowledge of the noise component. The image is decomposed with
Chebyshev polynomials (CP) being used as basis functions to perform fusion at feature level. The
properties of CP, namely fast convergence and smooth approximation, render it ideal for heuristic
and indiscriminate denoising fusion tasks. Quantitative evaluation using objective fusion assessment
methods shows favourable performance of the proposed scheme compared to previous efforts
on image fusion, notably in heavily corrupted images.
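For intuition, a minimal sketch of fusing two noisy images in a Chebyshev-coefficient domain is given below: a low-order 2-D Chebyshev expansion provides the smooth, denoising approximation, and the coefficients are fused by simple averaging. The polynomial degrees and the averaging rule are illustrative assumptions; the dissertation's feature-level scheme is more sophisticated.

    # Illustrative Chebyshev-domain fusion of two noisy images.
    import numpy as np
    from numpy.polynomial import chebyshev as C

    def cheb_coeffs(img, deg=(8, 8)):
        h, w = img.shape
        y, x = np.meshgrid(np.linspace(-1, 1, h), np.linspace(-1, 1, w),
                           indexing="ij")
        V = C.chebvander2d(x.ravel(), y.ravel(), deg)      # 2-D CP basis
        coeffs, *_ = np.linalg.lstsq(V, img.ravel(), rcond=None)
        return coeffs, V

    def cp_fuse(img1, img2, deg=(8, 8)):
        c1, V = cheb_coeffs(img1, deg)
        c2, _ = cheb_coeffs(img2, deg)
        fused = (c1 + c2) / 2.0               # simple coefficient-domain fusion
        return (V @ fused).reshape(img1.shape)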
The approach is further improved by incorporating the advantages of CP with a state-of-the-art
fusion technique named independent component analysis (ICA), for joint-fusion processing
based on region saliency. Whilst CP fusion is robust under severe noise conditions, it is prone to
eliminating high-frequency information from the images involved, thereby limiting image sharpness.
Fusion using ICA, on the other hand, performs well in transferring edges and other salient features
of the input images into the composite output. The combination of both methods, coupled with
several mathematical morphological operations in an algorithm fusion framework, is considered a
viable solution. Again, according to the quantitative metrics the results of our proposed approach
are very encouraging as far as joint fusion and denoising are concerned.
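A rough sketch of such a region-based combination, assuming the CP-fused and ICA-fused results are already available: local variance serves as a stand-in saliency measure, and morphological opening and closing clean the selection mask. The threshold, window size, and selection rule are illustrative assumptions, not the dissertation's exact framework.

    # Region-saliency-driven combination of a CP-fused and an ICA-fused image.
    import numpy as np
    from scipy import ndimage

    def combine_cp_ica(cp_fused, ica_fused, win=9, thresh=0.002):
        # Local variance as a crude region-saliency measure.
        mean = ndimage.uniform_filter(ica_fused, win)
        var = ndimage.uniform_filter(ica_fused ** 2, win) - mean ** 2
        salient = var > thresh
        # Morphological clean-up of the saliency mask.
        salient = ndimage.binary_opening(salient, structure=np.ones((3, 3)))
        salient = ndimage.binary_closing(salient, structure=np.ones((3, 3)))
        # ICA result where salient detail dominates, CP result elsewhere.
        return np.where(salient, ica_fused, cp_fused)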
Another focus of this dissertation is on a novel metric for image fusion evaluation that is based
on texture. The conservation of background textural details is considered important in many fusion
applications as they help define the image depth and structure, which may prove crucial in
many surveillance and remote sensing applications. Our work aims to evaluate the performance of image fusion algorithms based on their ability to retain textural details from the fusion process.
This is done by utilising the gray-level co-occurrence matrix (GLCM) model to extract second-order
statistical features for the derivation of an image textural measure, which is then used to
replace the edge-based calculations in an objective-based fusion metric. Performance evaluation
on established fusion methods verifies that the proposed metric is viable, especially for multimodal
scenarios.
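The sketch below illustrates, under simplifying assumptions, how second-order GLCM statistics can stand in for an edge measure when scoring texture preservation: feature vectors of contrast, correlation, energy, and homogeneity are extracted from each source and from the fused image, and texture retention is scored by their average cosine similarity. The exact pooling and weighting in the proposed metric may differ.

    # Simplified GLCM-based texture-preservation score for fusion evaluation.
    import numpy as np
    from skimage.feature import graycomatrix, graycoprops

    PROPS = ("contrast", "correlation", "energy", "homogeneity")

    def glcm_features(img_u8):
        glcm = graycomatrix(img_u8, distances=[1], angles=[0, np.pi / 2],
                            levels=256, symmetric=True, normed=True)
        return np.array([graycoprops(glcm, p).mean() for p in PROPS])

    def texture_preservation(src_a, src_b, fused):
        fa, fb, ff = map(glcm_features, (src_a, src_b, fused))
        # Cosine similarity between fused features and each source's features.
        sim = lambda u, v: u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)
        return 0.5 * (sim(fa, ff) + sim(fb, ff))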
Colorization of Multispectral Image Fusion using Convolutional Neural Network approach
The proposed technique offers a significant advantage in enhancing multiband nighttime imagery for surveillance and navigation purposes. The multi-band image data set comprises visual and infrared motion sequences covering various military and civilian surveillance scenarios, including stationary, walking, or running people, vehicles, and buildings or other man-made structures. The colorization method provides superior discrimination, identification of objects (lesions), faster reaction times, and an increased scene understanding compared with a monochrome fused image. The guided filtering approach is used to decompose the source images into two parts: an approximation part and a detail content part. The weighted-averaging method is then used to fuse the approximation part, while multi-layer features are extracted from the detail content part using the VGG-19 network. Finally, the approximation part and the detail content part are combined to reconstruct the fused image. The proposed approach offers better outcomes compared to prevailing state-of-the-art techniques in terms of quantitative and qualitative parameters. In the future, the proposed technique will help in battlefield monitoring, defence situation awareness, surveillance, target tracking, and person authentication.
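A compact sketch of the described two-scale pipeline, assuming opencv-contrib (cv2.ximgproc) is installed: the guided filter splits each source into approximation and detail layers, the approximation layers are fused by weighted averaging, and the detail layers, which the paper fuses via VGG-19 multi-layer features, are fused here with a simple max-absolute rule as a stand-in.

    # Two-scale fusion sketch: guided-filter decomposition, weighted-average
    # base fusion, and a max-absolute detail rule standing in for the
    # VGG-19-based detail fusion described above.
    import cv2
    import numpy as np

    def two_scale_fuse(vis, ir, radius=8, eps=0.01, w_vis=0.5):
        vis, ir = vis.astype(np.float32) / 255, ir.astype(np.float32) / 255
        base_v = cv2.ximgproc.guidedFilter(vis, vis, radius, eps)
        base_i = cv2.ximgproc.guidedFilter(ir, ir, radius, eps)
        det_v, det_i = vis - base_v, ir - base_i
        base = w_vis * base_v + (1 - w_vis) * base_i       # weighted averaging
        detail = np.where(np.abs(det_v) >= np.abs(det_i), det_v, det_i)
        return np.clip(base + detail, 0, 1)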
- …