
    G2NPAN: GAN-guided nuance perceptual attention network for multimodal medical fusion image quality assessment

    Multimodal medical fusion images (MMFI) are formed by fusing medical images of two or more modalities with the aim of displaying as much valuable information as possible in a single image. However, because different fusion algorithms follow different strategies, the quality of the generated fused images is uneven, so an effective blind image quality assessment (BIQA) method is urgently required. The challenge of MMFI quality assessment is to enable the network to perceive the nuances between fused images of different qualities, and the key to the success of BIQA is the availability of valid reference information. To this end, this work proposes a generative adversarial network (GAN)-guided nuance perceptual attention network (G2NPAN) to implement BIQA for MMFI. Specifically, we achieve the blind evaluation style via the design of a GAN and develop a Unique Feature Warehouse module to learn the effective features of fused images at the pixel level. A redesigned loss function guides the network to perceive image quality. Finally, a quality assessment network supervised by class activation mapping is employed to obtain the MMFI quality score. Extensive experiments and validation have been conducted on a database of medical fusion images, and the proposed method is superior to state-of-the-art BIQA methods.
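
    Although only the abstract is reproduced here, the basic pattern it describes, a feature-learning module feeding a quality regressor, can be outlined in a few lines. The following is a minimal, hypothetical PyTorch sketch; the module names, layer sizes, and the `FeatureWarehouse` stand-in are illustrative, not the authors' implementation:

```python
import torch
import torch.nn as nn

class FeatureWarehouse(nn.Module):
    """Illustrative stand-in: learns pixel-level features of fused images."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.encoder(x)

class QualityRegressor(nn.Module):
    """Maps learned features to a scalar quality score."""
    def __init__(self):
        super().__init__()
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 1)
        )

    def forward(self, feats):
        return self.head(feats)

# In the full method, a GAN discriminator would additionally judge whether
# features look like those of a high-quality fusion, supplying the
# reference-like signal that blind assessment otherwise lacks.
warehouse, regressor = FeatureWarehouse(), QualityRegressor()
fused = torch.randn(4, 1, 128, 128)    # batch of fused images
score = regressor(warehouse(fused))    # predicted quality, shape (4, 1)
```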

    Perceptual Quality Assessment of Omnidirectional Audio-visual Signals

    Omnidirectional videos (ODVs) play an increasingly important role in fields such as medicine, education, advertising, and tourism. Assessing the quality of ODVs is important for service providers seeking to improve the user's Quality of Experience (QoE). However, most existing quality assessment studies for ODVs focus only on the visual distortions of videos, ignoring the fact that the overall QoE also depends on the accompanying audio signals. In this paper, we first establish a large-scale audio-visual quality assessment dataset for omnidirectional videos, which includes 375 distorted omnidirectional audio-visual (A/V) sequences generated from 15 high-quality pristine omnidirectional A/V contents, together with the corresponding perceptual audio-visual quality scores. Then, we design three baseline methods for full-reference omnidirectional audio-visual quality assessment (OAVQA), which combine existing state-of-the-art single-mode audio and video QA models via multimodal fusion strategies. We validate the effectiveness of the A/V multimodal fusion method for OAVQA on our dataset, which provides a new benchmark for omnidirectional QoE evaluation. Our dataset is available at https://github.com/iamazxl/OAVQA. (Comment: 12 pages, 5 figures, to be published in CICAI202)
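
    One simple instance of such a multimodal fusion strategy is late fusion: per-sequence scores from existing single-mode audio and video quality models are combined by a small regressor trained against subjective scores. The sketch below is generic and uses made-up numbers; it is not one of the paper's three baselines:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical per-sequence outputs of single-mode quality models, plus
# mean opinion scores (MOS) from subjective testing. All values are made up.
video_scores = np.array([[3.1], [4.0], [2.2], [4.6]])
audio_scores = np.array([[3.5], [3.9], [2.0], [4.4]])
mos = np.array([3.2, 4.1, 2.0, 4.5])

X = np.hstack([video_scores, audio_scores])  # late fusion: stack the scores
fusion = LinearRegression().fit(X, mos)      # learn per-modality weights
print(fusion.predict(X))                     # fused audio-visual quality
```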

    Psychophysiology-based QoE assessment: a survey

    We present a survey of psychophysiology-based assessment for quality of experience (QoE) in advanced multimedia technologies. We provide a classification of methods relevant to QoE and describe related psychological processes, experimental design considerations, and signal analysis techniques. We summarize multimodal techniques and discuss several important aspects of psychophysiology-based QoE assessment, including the synergies with psychophysical assessment and the need for standardized experimental design. This survey is not exhaustive but serves as a guideline for those interested in further exploring this emerging field of research.

    An Attention-based Multi-Scale Feature Learning Network for Multimodal Medical Image Fusion

    Medical images play an important role in clinical applications. Multimodal medical images can provide physicians with rich information for diagnosing patients. Image fusion is able to synthesize complementary information from multimodal images into a single image, preventing radiologists from switching back and forth between different images and saving considerable time in the diagnostic process. In this paper, we introduce a novel Dilated Residual Attention Network for the medical image fusion task. Our network is capable of extracting multi-scale deep semantic features. Furthermore, we propose a novel fixed fusion strategy, termed the Softmax-based weighted strategy, built on Softmax weights and the matrix nuclear norm. Extensive experiments show that our proposed network and fusion strategy exceed state-of-the-art performance compared with reference image fusion methods on four commonly used fusion metrics. (Comment: 8 pages, 8 figures, 3 tables)
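
    The nuclear-norm component of the fusion strategy can be illustrated concisely: each source's feature map receives a weight from a softmax over its nuclear norm (the sum of its singular values), and the maps are blended accordingly. This is a minimal NumPy sketch of that general idea, not the authors' code:

```python
import numpy as np

def nuclear_norm_softmax_fuse(feature_maps):
    """Fuse feature maps with weights from a softmax over each map's
    nuclear norm (sum of singular values)."""
    norms = np.array([np.linalg.norm(f, ord='nuc') for f in feature_maps])
    weights = np.exp(norms - norms.max())
    weights /= weights.sum()  # softmax over the nuclear norms
    return sum(w * f for w, f in zip(weights, feature_maps))

a = np.random.rand(64, 64)  # feature map from modality A (e.g. MRI)
b = np.random.rand(64, 64)  # feature map from modality B (e.g. CT)
fused = nuclear_norm_softmax_fuse([a, b])
```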

    A Digital Twin City Model for Age-Friendly Communities: Capturing Environmental Distress from Multimodal Sensory Data

    As the worldwide population ages, demand for aging-in-place is also increasing, requiring smarter and more connected cities to preserve the mobility independence of older adults. However, today’s aging built environment often places great environmental demands on older adults’ mobility and causes them distress. To better understand and help mitigate older adults’ distress in their daily trips, this paper proposes constructing a digital twin city (DTC) model that integrates multimodal data (i.e., physiological sensing, visual sensing) on environmental demands in urban communities, so that such environmental demands can be considered in the mobility planning of older adults. Specifically, this paper examines how data acquired from various modalities (i.e., electrodermal activity, gait patterns, visual sensing) can portray the environmental demands associated with older adults’ mobility. In addition, it discusses the challenges and opportunities of multimodal data fusion in capturing environmental distress in urban communities.

    An approach for cross-modality guided quality enhancement of liver image

    A novel approach for multimodal liver image contrast enhancement is put forward in this paper. The proposed approach utilizes a magnetic resonance imaging (MRI) scan of the liver as a guide to enhance the structures of a computed tomography (CT) liver image. The enhancement process consists of two phases. In the first phase, the MRI and CT modalities are transformed to the same intensity range, and the histogram of the CT liver is then adjusted to match the histogram of the MRI. In the second phase, an adaptive histogram equalization technique is presented that splits the CT histogram into two sub-histograms and replaces their cumulative distribution functions with two smooth sigmoids. Subjective and objective assessments of the experimental results indicate that the proposed approach yields better results: the image contrast is effectively enhanced while the mean brightness and details are well preserved.
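
    The first phase, matching the CT histogram to the MRI guide, is a standard operation. A minimal sketch using scikit-image, with placeholder arrays standing in for the two scans and the sigmoid-based second phase omitted:

```python
import numpy as np
from skimage.exposure import match_histograms, rescale_intensity

def mri_guided_ct_enhancement(ct, mri):
    """Phase one of the described pipeline: bring both modalities into
    the same intensity range, then adjust the CT histogram to match the
    MRI histogram."""
    ct_n = rescale_intensity(ct.astype(float), out_range=(0.0, 1.0))
    mri_n = rescale_intensity(mri.astype(float), out_range=(0.0, 1.0))
    return match_histograms(ct_n, mri_n)

ct = np.random.rand(256, 256)   # placeholder CT slice
mri = np.random.rand(256, 256)  # placeholder MRI slice
enhanced_ct = mri_guided_ct_enhancement(ct, mri)
```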

    Signal processing algorithms for enhanced image fusion performance and assessment

    The dissertation presents several signal processing algorithms for image fusion in noisy multimodal conditions. It introduces a novel image fusion method which performs well for image sets heavily corrupted by noise. As opposed to current image fusion schemes, the method has no requirement for a priori knowledge of the noise component. The image is decomposed with Chebyshev polynomials (CP) used as basis functions to perform fusion at the feature level. The properties of CP, namely fast convergence and smooth approximation, render it ideal for heuristic and indiscriminate denoising fusion tasks. Quantitative evaluation using objective fusion assessment methods shows favourable performance of the proposed scheme compared to previous efforts on image fusion, notably for heavily corrupted images.

    The approach is further improved by combining the advantages of CP with a state-of-the-art fusion technique, independent component analysis (ICA), for joint-fusion processing based on region saliency. Whilst CP fusion is robust under severe noise conditions, it is prone to eliminating high-frequency information from the images involved, thereby limiting image sharpness. Fusion using ICA, on the other hand, performs well in transferring edges and other salient features of the input images into the composite output. The combination of both methods, coupled with several mathematical morphological operations in an algorithm fusion framework, is considered a viable solution. Again, according to the quantitative metrics, the results of our proposed approach are very encouraging as far as joint fusion and denoising are concerned.

    Another focus of this dissertation is a novel metric for image fusion evaluation that is based on texture. The conservation of background textural details is considered important in many fusion applications, as they help define the image depth and structure, which may prove crucial in many surveillance and remote sensing applications. Our work aims to evaluate the performance of image fusion algorithms based on their ability to retain textural details through the fusion process. This is done by utilising the gray-level co-occurrence matrix (GLCM) model to extract second-order statistical features for the derivation of an image textural measure, which is then used to replace the edge-based calculations in an objective fusion metric. Performance evaluation on established fusion methods verifies that the proposed metric is viable, especially for multimodal scenarios.
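
    The GLCM texture measure underpinning the proposed metric can be approximated with scikit-image's second-order texture features. The sketch below computes two common GLCM statistics for a fused image; it is a simplified illustration, not the dissertation's exact metric, which substitutes such a textural measure for the edge-based terms of an objective fusion metric:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def texture_measure(img, levels=32):
    """Second-order GLCM statistics summarizing local texture."""
    # Quantize intensities to a small number of gray levels.
    q = (img.astype(float) / img.max() * (levels - 1)).astype(np.uint8)
    glcm = graycomatrix(q, distances=[1], angles=[0, np.pi / 2],
                        levels=levels, symmetric=True, normed=True)
    contrast = graycoprops(glcm, 'contrast').mean()
    homogeneity = graycoprops(glcm, 'homogeneity').mean()
    return contrast, homogeneity

fused = np.random.randint(0, 256, (128, 128))  # placeholder fused image
print(texture_measure(fused))
```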

    Colorization of Multispectral Image Fusion using Convolutional Neural Network approach

    The proposed technique offers a significant advantage in enhancing multiband nighttime imagery for surveillance and navigation purposes. The multi-band image data set comprises visual and infrared motion sequences covering various military and civilian surveillance scenarios, including stationary, walking, or running people, vehicles, and buildings or other man-made structures. The colorization method provides superior discrimination, object identification, faster reaction times, and increased scene understanding compared with a monochrome fused image. A guided filtering approach is used to decompose the source images into two parts, an approximation part and a detail content part; the weighted-averaging method is then used to fuse the approximation parts. Multi-layer features are extracted from the detail content part using the VGG-19 network. Finally, the approximation part and detail content part are combined to reconstruct the fused image. The proposed approach offers better outcomes than prevailing state-of-the-art techniques in terms of quantitative and qualitative parameters. In the future, the proposed technique could help in battlefield monitoring, defence situation awareness, surveillance, target tracking, and person authentication.
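
    The decomposition step described above can be sketched with OpenCV's guided filter (available via the opencv-contrib-python package). The arrays and the equal base weights are placeholders, and the VGG-19 detail-fusion step is only indicated in a comment:

```python
import numpy as np
import cv2  # cv2.ximgproc requires the opencv-contrib-python package

def decompose(img, radius=8, eps=0.01):
    """Split an image into an approximation (base) part and a detail
    part using edge-preserving guided filtering."""
    base = cv2.ximgproc.guidedFilter(img, img, radius, eps)
    return base, img - base

vis = np.random.rand(128, 128).astype(np.float32)  # visible-band frame
ir = np.random.rand(128, 128).astype(np.float32)   # infrared frame
base_v, detail_v = decompose(vis)
base_i, detail_i = decompose(ir)
fused_base = 0.5 * base_v + 0.5 * base_i  # weighted-average fusion
# The detail parts would be fused via multi-layer VGG-19 features and
# added back: fused = fused_base + fused_detail.
```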

    Perceptual Image Fusion Using Wavelets

