64 research outputs found

    Blind Multimodal Quality Assessment of Low-light Images

    Blind image quality assessment (BIQA) aims to predict objective quality scores for visual signals automatically and accurately, and it has been widely used to monitor product and service quality in low-light applications such as smartphone photography, video surveillance, and autonomous driving. Recent developments in this field are dominated by unimodal solutions, which are inconsistent with human subjective rating patterns, where visual perception is shaped by multiple sensory inputs simultaneously. In this article, we present a blind multimodal quality assessment (BMQA) of low-light images, from subjective evaluation to objective score. To investigate the multimodal mechanism, we first establish a multimodal low-light image quality (MLIQ) database with authentic low-light distortions, containing image-text modality pairs. Further, we specially design the key modules of BMQA, considering multimodal quality representation, latent feature alignment and fusion, and hybrid self-supervised and supervised learning. Extensive experiments show that our BMQA yields state-of-the-art accuracy on the proposed MLIQ benchmark database. In particular, we also build an independent single-image modality Dark-4K database, which is used to verify the applicability and generalization performance of BMQA in mainstream unimodal applications. Qualitative and quantitative results on Dark-4K show that BMQA outperforms existing BIQA approaches as long as a pre-trained model is provided to generate the text description. The proposed framework and two databases, as well as the collected BIQA methods and evaluation metrics, are made publicly available. Comment: 15 pages

    IBVC: Interpolation-driven B-frame Video Compression

    Learned B-frame video compression aims to adopt bi-directional motion estimation and motion compensation (MEMC) coding for middle frame reconstruction. However, previous learned approaches often directly extend neural P-frame codecs to B-frames, relying on bi-directional optical-flow estimation or video frame interpolation. They suffer from inaccurate quantized motions and inefficient motion compensation. To address these issues, we propose a simple yet effective structure called Interpolation-driven B-frame Video Compression (IBVC). Our approach involves only two major operations: video frame interpolation and artifact reduction compression. IBVC introduces a bit-rate-free MEMC based on interpolation, which avoids optical-flow quantization and additional compression distortions. Then, to reduce duplicate bit-rate consumption and focus on unaligned artifacts, a residual guided masking encoder is deployed to adaptively select the meaningful contexts with interpolated multi-scale dependencies. In addition, a conditional spatio-temporal decoder is proposed to eliminate location errors and artifacts, replacing the MEMC coding used in other methods. The experimental results on B-frame coding demonstrate that IBVC improves significantly on the relevant state-of-the-art methods. Moreover, our approach saves bit rate compared with the random access (RA) configuration of H.266 (VTM). The code will be available at https://github.com/ruhig6/IBVC. Comment: Submitted to IEEE TCSV

    Efficient Joint Optimization of Layer-Adaptive Weight Pruning in Deep Neural Networks

    In this paper, we propose a novel layer-adaptive weight-pruning approach for Deep Neural Networks (DNNs) that addresses the challenge of minimizing output distortion while adhering to a target pruning ratio constraint. Our approach takes into account the collective influence of all layers to design a layer-adaptive pruning scheme. We discover and utilize an important additivity property of the output distortion caused by pruning weights on multiple layers. This property enables us to formulate pruning as a combinatorial optimization problem and solve it efficiently through dynamic programming. By decomposing the problem into sub-problems, we achieve linear time complexity, making our optimization algorithm fast and feasible to run on CPUs. Our extensive experiments demonstrate the superiority of our approach over existing methods on the ImageNet and CIFAR-10 datasets. On CIFAR-10, our method outperforms others by up to 1.0% for ResNet-32, 0.5% for VGG-16, and 0.7% for DenseNet-121 in terms of top-1 accuracy. On ImageNet, we achieve up to 4.7% and 4.6% higher top-1 accuracy compared to other methods for VGG-16 and ResNet-50, respectively. These results highlight the effectiveness and practicality of our approach for enhancing DNN performance through layer-adaptive weight pruning. Code will be available at https://github.com/Akimoto-Cris/RD_VIT_PRUNE
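To illustrate how an additivity property lets pruning be posed as a dynamic program, here is a minimal sketch. It is a generic multiple-choice knapsack formulation and not the authors' algorithm; in particular it does not reproduce their linear-time decomposition. The function name `allocate_pruning` and the per-layer `(num_pruned, distortion)` option lists are hypothetical. The sketch assumes total distortion is the sum of per-layer distortions.

```python
from math import inf

def allocate_pruning(options, target):
    """Pick one (num_pruned, distortion) option per layer.

    Minimizes summed distortion subject to pruning exactly `target` weights.
    Assumes distortion is additive across layers (hypothetical setup).
    Returns (total_distortion, chosen_option_indices) or None if infeasible.
    """
    # best[t] = minimal distortion after the layers processed so far,
    # with exactly t weights pruned in total.
    best = [inf] * (target + 1)
    best[0] = 0.0
    choices = []  # choices[l][t] = option index used by layer l to reach t
    for layer_opts in options:
        new_best = [inf] * (target + 1)
        picked = [-1] * (target + 1)
        for t in range(target + 1):
            if best[t] == inf:
                continue
            for idx, (k, dist) in enumerate(layer_opts):
                nt = t + k
                if nt <= target and best[t] + dist < new_best[nt]:
                    new_best[nt] = best[t] + dist
                    picked[nt] = idx
        best = new_best
        choices.append(picked)
    if best[target] == inf:
        return None
    # Backtrack from the last layer to recover the per-layer choices.
    plan, t = [], target
    for layer in range(len(options) - 1, -1, -1):
        idx = choices[layer][t]
        plan.append(idx)
        t -= options[layer][idx][0]
    plan.reverse()
    return best[target], plan
```

For example, with two layers offering `[(0, 0.0), (1, 0.5), (2, 2.0)]` and `[(0, 0.0), (1, 0.1), (2, 0.3)]` and a target of 2 pruned weights, the sketch prunes both weights from the second layer, where the distortion cost is lowest.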

    Deep reinforcement learning for optimal hydropower reservoir operation

    Optimal operation of hydropower reservoir systems is a classical optimization problem of high dimensionality and stochastic nature. A key challenge lies in improving the interpretability of operation strategies, i.e., the cause–effect relationship between system outputs (or actions) and contributing variables such as states and inputs. This paper reports for the first time a new deep reinforcement learning (DRL) framework for optimal operation of reservoir systems based on deep Q-networks (DQNs), which provides a significant advance in understanding the performance of optimal operations. DQN combines Q-learning and two deep artificial neural networks (ANNs), and acts as the agent to interact with the reservoir system through learning its states and providing actions. Three knowledge forms of learning considering the states, actions, and rewards were constructed to improve the interpretability of operation strategies. The impacts of these knowledge forms and DRL learning parameters on operation performance were analyzed. The DRL framework was tested on the Huanren hydropower system in China, using 400-year synthetic flow data for training and 30-year observed flow data for verification. The discretization levels of reservoir water level and energy output yield contrasting effects: finer discretization of water level improved performance in terms of annual hydropower generated and hydropower production reliability; however, finer discretization of hydropower production can reduce search efficiency, and thus the resulting DRL performance. Compared with benchmark algorithms including dynamic programming, stochastic dynamic programming, and decision tree, the proposed DRL approach can effectively factor in future inflow uncertainties when determining optimal operations and can generate markedly higher hydropower. 
    This study provides new knowledge of the performance of DRL in the context of hydropower system characteristics and data input features, and shows promise for potentially being implemented in practice to derive operation policies that can be updated automatically by learning from new data.
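The Q-learning update at the core of a DQN can be sketched with two lookup tables standing in for the two networks: an online table that is updated at every step and a target table that supplies the bootstrap value and is refreshed only occasionally. This is a tabular simplification under assumed names (`td_update`, `sync_target`) and assumed default hyperparameters, not the paper's framework; states here would correspond to discretized reservoir levels and actions to discretized releases.

```python
def td_update(q_online, q_target, state, action, reward, next_state,
              alpha=0.5, gamma=0.9):
    """One temporal-difference step of Q-learning.

    q_online and q_target are lists of per-state action-value lists.
    The bootstrap value comes from q_target, mirroring the role of the
    target network in a DQN. Returns the updated online value.
    """
    bootstrap = max(q_target[next_state])
    target = reward + gamma * bootstrap
    q_online[state][action] += alpha * (target - q_online[state][action])
    return q_online[state][action]

def sync_target(q_online, q_target):
    """Copy the online values into the target table (periodic refresh)."""
    for s, row in enumerate(q_online):
        q_target[s] = list(row)
```

Keeping the bootstrap source fixed between refreshes is the usual reason for using two networks: the regression target does not shift with every update.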

    An integrated framework on autonomous-EV charging and autonomous valet parking (AVP) management system

    Autonomous vehicles (AVs) transform traditional commuting by decreasing congestion and improving road safety, and they integrate naturally with electric controls for flexible implementation of autonomous driving technologies. Indeed, autonomy and electric power benefit each other in many respects in autonomous electric vehicles (AEVs): autonomy brings great efficiency in driving and battery use, while EVs require less maintenance and drastically cut fuel costs. With AVs, a pivotal concern lies in long-range Autonomous Valet Parking (LAVP), for example diverse customer demands on parking (or drop-off / pick-up) for various journey plans. On the other hand, electric-powered AVs typically have a limited cruising range, and locating convenient charging services is also among the major impediments. Recent studies have so far investigated EV charging and LAVP in isolation, rarely considering a joint optimization of user trips and energy refueling. In this work, we instead target the integration of vehicle charging with autonomy through a systemic approach. Specifically, we propose an integrated AEV charging and LAVP management scheme to resolve critical decision-making on convenient charging and parking management according to customer requirements during their journeys. The proposed scheme jointly considers charging reservations and parking duration at car parks (CPs), aiming to enable accurate predictions of future charging (and parking) states at CPs. Results show the advantage of our proposal over benchmarks in terms of enhanced customer experience over the traveling period, as well as charging performance on both the AEV and CP sides. In particular, effective load balancing can be achieved across the network in the number of both charged and parked vehicles.

    Dynamic hypergraph convolutional network for no-reference point cloud quality assessment

    With the rapid advancement of three-dimensional (3D) sensing technology, the point cloud has emerged as one of the most important approaches for representing 3D data. However, quality degradation inevitably occurs during the acquisition, transmission, and processing of point clouds. Therefore, point cloud quality assessment (PCQA) with automatic visual quality perception is particularly critical. In the literature, graph convolutional networks (GCNs) have achieved reasonable performance in point cloud-related tasks. However, they cannot fully characterize the nonlinear high-order relationships of such complex data. In this paper, we propose a novel no-reference (NR) PCQA method with hypergraph learning. Specifically, we devise a dynamic hypergraph convolutional network (DHCN) composed of a projected image encoder, a point group encoder, a dynamic hypergraph generator, and a perceptual quality predictor. First, the projected image encoder and the point group encoder are used to extract feature representations from projected images and point groups, respectively. Then, using the feature representations obtained by the two encoders, dynamic hypergraphs are generated during each iteration, aiming to constantly update the interactive information between the vertices of the hypergraphs. Finally, we design the perceptual quality predictor to conduct quality reasoning on the generated hypergraphs. By leveraging the interactive information among hypergraph vertices, feature representations are well aggregated, resulting in a notable improvement in the accuracy of quality prediction. Experimental results on several point cloud quality assessment databases demonstrate that our proposed DHCN can achieve state-of-the-art performance. The code will be available at: https://github.com/chenwuwq/DHCN

    Molecular communication via subdiffusion with a spherical absorbing receiver

    In molecular communication (MC), the motion of information molecules in the medium is usually described by Brownian motion and governed by Fick's laws. However, there are some potential MC scenarios in which the kinetics of information molecules are non-Fickian. In this letter, we investigate one such scenario, in which information molecules move through the channel by subdiffusion. A three-dimensional MC system with a spherical absorbing receiver is considered, and the subdiffusion channel is analyzed. Closed-form expressions for the first hitting probability and its peak time are given. Furthermore, we investigate the performance of MC with timing and amplitude modulation schemes in a subdiffusion channel. The error probability for both modulation schemes is analyzed.
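For orientation, the classical Fickian (Brownian) baseline for this geometry is well known and can be written out. The expressions below are that baseline, not the subdiffusion result of the letter, which the abstract does not state. The symbols are assumptions introduced here: a point transmitter at distance $d$ from the centre of a spherical absorbing receiver of radius $r_r$, and diffusion coefficient $D$.

```latex
% First hitting rate (density) for normal diffusion
f_{\mathrm{hit}}(t) = \frac{r_r}{d}\,
  \frac{d - r_r}{\sqrt{4\pi D t^{3}}}\,
  \exp\!\left(-\frac{(d - r_r)^{2}}{4 D t}\right)

% Probability of having hit the receiver by time t
F_{\mathrm{hit}}(t) = \int_{0}^{t} f_{\mathrm{hit}}(\tau)\,d\tau
  = \frac{r_r}{d}\,
    \operatorname{erfc}\!\left(\frac{d - r_r}{\sqrt{4 D t}}\right)

% Peak time, from d f_hit / dt = 0
t_{\mathrm{peak}} = \frac{(d - r_r)^{2}}{6 D}
```

The peak time follows from maximizing $t^{-3/2}\exp(-a/t)$ with $a = (d - r_r)^2 / (4D)$, which gives $t = 2a/3$.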