39 research outputs found

    Online Video Super-Resolution with Convolutional Kernel Bypass Graft

    Deep learning-based models have achieved remarkable performance in video super-resolution (VSR) in recent years, but most of these models are poorly suited to online video applications. These methods consider only distortion quality and ignore crucial requirements for online applications, e.g., low latency and low model complexity. In this paper, we focus on online video transmission, in which VSR algorithms are required to generate high-resolution video sequences frame by frame in real time. To address such challenges, we propose an extremely low-latency VSR algorithm based on a novel kernel knowledge transfer method, named convolutional kernel bypass graft (CKBG). First, we design a lightweight network structure that does not require future frames as inputs and saves the extra time cost of caching these frames. Then, our proposed CKBG method enhances this lightweight base model by bypassing the original network with "kernel grafts", which are extra convolutional kernels containing the prior knowledge of external pretrained image SR models. In the testing phase, we further accelerate the grafted multi-branch network by converting it into a simple single-path structure. Experimental results show that our proposed method can process online video sequences at up to 110 FPS, with very low model complexity and competitive SR performance.
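
    The single-path conversion described in this abstract can be illustrated with the linearity of convolution: two parallel branches whose outputs are summed give the same result as one branch whose kernel is the sum of the two kernels. The following is a minimal sketch under stated assumptions, not the CKBG implementation. It assumes both branches share kernel size, stride and padding and are merged by addition; the abstract does not state the exact merge rule, and the names conv2d, base_kernel and graft_kernel are hypothetical.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation of a 2D list with a square kernel."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


def add_grids(a, b):
    """Element-wise sum of two equally sized 2D lists."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Toy 5x5 input frame (values are arbitrary).
image = [[float((3 * r + 5 * c) % 7) for c in range(5)] for r in range(5)]

# Hypothetical kernels: one for the base branch, one for the grafted branch.
base_kernel = [[0.0, 1.0, 0.0], [1.0, -4.0, 1.0], [0.0, 1.0, 0.0]]
graft_kernel = [[0.5, 0.0, -0.5], [1.0, 0.0, -1.0], [0.5, 0.0, -0.5]]

# Training-time view: two branches, outputs summed.
two_branch = add_grids(conv2d(image, base_kernel), conv2d(image, graft_kernel))

# Test-time view: one branch with the merged kernel.
merged_kernel = add_grids(base_kernel, graft_kernel)
single_path = conv2d(image, merged_kernel)

# Largest absolute difference between the two views (zero up to rounding).
max_diff = max(
    abs(a - b) for ra, rb in zip(two_branch, single_path) for a, b in zip(ra, rb)
)
```

    Because the merged kernel is computed once, the single-path network performs one convolution per layer at inference time instead of one per branch.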

    PIT: Optimization of Dynamic Sparse Deep Learning Models via Permutation Invariant Transformation

    Dynamic sparsity, where the sparsity patterns are unknown until runtime, poses a significant challenge to deep learning. The state-of-the-art sparsity-aware deep learning solutions are restricted to pre-defined, static sparsity patterns due to the significant overheads associated with preprocessing. Efficient execution of dynamic sparse computation often faces a misalignment between the GPU-friendly tile configuration for efficient execution and the sparsity-aware tile shape that minimizes coverage wastes (non-zero values in tensor). In this paper, we propose PIT, a deep-learning compiler for dynamic sparsity. PIT proposes a novel tiling mechanism that leverages Permutation Invariant Transformation (PIT), a mathematically proven property, to transform multiple sparsely located micro-tiles into a GPU-efficient dense tile without changing the computation results, thus achieving both high GPU utilization and low coverage waste. Given a model, PIT first finds feasible PIT rules for all its operators and generates efficient GPU kernels accordingly. At runtime, with the novel SRead and SWrite primitives, PIT rules can be executed extremely fast to support dynamic sparsity in an online manner. Extensive evaluation on diverse models shows that PIT can accelerate dynamic sparsity computation by up to 5.9x (average 2.43x) over state-of-the-art compilers.
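
    The idea of gathering sparsely located pieces into a dense tile without changing the result can be sketched on a matrix product whose left operand has many all-zero rows. This is a CPU-only illustration under stated assumptions, not PIT itself: the function names are hypothetical, the granularity is a whole row rather than a micro-tile, and the gather and scatter steps are only loose analogues of the SRead and SWrite primitives named in the abstract.

```python
def matmul(a, b):
    """Dense matrix product of two 2D lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def row_sparse_matmul(a, b):
    """Multiply a by b, computing only on the rows of a that are non-zero."""
    # Gather step: find the live rows at runtime and pack them densely.
    live = [i for i, row in enumerate(a) if any(v != 0.0 for v in row)]
    dense_tile = [a[i] for i in live]

    # Dense compute on the packed tile (row order does not affect each row's result).
    dense_out = matmul(dense_tile, b)

    # Scatter step: write each result row back to its original position.
    out = [[0.0] * len(b[0]) for _ in a]
    for i, row in zip(live, dense_out):
        out[i] = row
    return out


# Left operand with a sparsity pattern that is only known at runtime.
a = [
    [0.0, 0.0, 0.0],
    [1.0, 2.0, 3.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [4.0, 0.0, -1.0],
]
b = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]

reference = matmul(a, b)
packed = row_sparse_matmul(a, b)
```

    Here the dense product runs on 2 rows instead of 5, and the scattered result matches the full product.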

    Online Streaming Video Super-Resolution with Convolutional Look-Up Table

    Online video streaming has fundamental limitations on transmission bandwidth and computational capacity, and super-resolution is a promising potential solution. However, applying existing video super-resolution methods to online streaming is non-trivial. Existing video codecs and streaming protocols (e.g., WebRTC) dynamically change the video quality both spatially and temporally, which leads to diverse and dynamic degradations. Furthermore, online streaming has a strict latency requirement that makes most existing methods less applicable. As a result, this paper focuses on the rarely exploited problem setting of online streaming video super-resolution. To facilitate research on this problem, a new benchmark dataset named LDV-WebRTC is constructed based on a real-world online streaming system. Leveraging the new benchmark dataset, we propose a novel method specifically for online video streaming, which contains a convolution and Look-Up Table (LUT) hybrid model to achieve a better performance-latency trade-off. To tackle the changing degradations, we propose a mixture-of-expert-LUT module, where a set of LUTs specialized in different degradations is built and adaptively combined to handle different degradations. Experiments show that our method achieves 720P video SR at around 100 FPS, while significantly outperforming existing LUT-based methods and offering competitive performance compared to efficient CNN-based methods.
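
    The adaptive combination of expert LUTs can be sketched as a weighted blend of table lookups. This is a heavily simplified illustration under stated assumptions, not the paper's module: each table here maps a single quantized pixel value to a single output value, the two "experts" are arbitrary hand-written curves, and the blend weights are fixed by hand rather than predicted per input. All names are hypothetical.

```python
def build_lut(fn, levels=16):
    """Precompute fn at `levels` evenly spaced inputs in [0, 1]."""
    return [fn(i / (levels - 1)) for i in range(levels)]


def lookup(lut, pixel):
    """Nearest-level lookup for a pixel value in [0, 1]."""
    idx = round(pixel * (len(lut) - 1))
    idx = min(len(lut) - 1, max(0, idx))
    return lut[idx]


def blend_experts(luts, weights, pixel):
    """Weighted average of the expert lookups; weights need not sum to 1."""
    total = sum(weights)
    return sum(w * lookup(lut, pixel) for lut, w in zip(luts, weights)) / total


# Two hypothetical experts, standing in for tables tuned to different degradations.
expert_a = build_lut(lambda v: min(1.0, 1.2 * v))
expert_b = build_lut(lambda v: v ** 0.5)

# In a learned system the weights would come from a small predictor; fixed here.
blended = blend_experts([expert_a, expert_b], [0.75, 0.25], 0.4)
only_a = blend_experts([expert_a, expert_b], [1.0, 0.0], 0.4)
```

    Inference cost is one table read per expert plus a weighted sum, which is the property that makes LUT-based models attractive under a strict latency budget.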

    Evaluation of capacities of bucket foundations in soft-stiff-soft clays under combined loading

    Bucket foundations are widely constructed for offshore wind turbines, which are subjected to combined vertical-horizontal-moment (V-H-M) loading during operation. This technical note presents an extensive investigation into the response of bucket foundations in soft clay interbedded with a stiff clay layer (or soft-stiff-soft clays) under combined loading, which is a supplement to the existing design specification. The numerical method employed in this study is validated by comparing the bearing capacities of bucket foundations with previously published data. The numerical modeling results from this study show that the failure mechanism of bucket foundations in soft-stiff-soft clays under combined loading is significantly different from that in the single-layer soft clay condition. A series of numerical analyses is conducted to explore the effects of the geometry of the interbedded stiff clay, the soil material properties and the combined loading on the bearing capacities of bucket foundations. Based on the parametric studies, a new failure mechanism for bucket foundations in soft-stiff-soft clays under general loading is obtained. A corresponding design method is also established, which can be used to calculate the monotonic vertical, horizontal, and moment bearing capacities, as well as the capacity envelopes under combined loading.

    The Differential Influences of Parenting Styles on Children's Academic Achievement in Chinese, Mathematics and English: The Chain-Mediating Effects of Academic Motivation and Academic Engagement

    This study investigated the chain mediating roles of children’s academic motivation (intrinsic and extrinsic) and academic engagement in the relationships between parenting styles (positive and negative) and their academic achievements in Chinese, mathematics, and English. The participants were 433 elementary school children from grades 3 to 5. After controlling for child’s sex, age, nonverbal intelligence, and family socioeconomic status, the results showed that: (1) positive parenting styles exhibited a beneficial effect on academic achievement in Chinese, but not in mathematics and English. Negative parenting styles did not yield a significant effect on academic achievement in any of the subjects examined; (2) positive parenting styles could affect children’s Chinese achievement through their academic engagement and through the chain mediating roles of intrinsic academic motivation and academic engagement. However, these mediating and chain mediating effects were not observed in the case of mathematics and English. This study emphasizes the significance of positive parenting styles specifically in relation to the Chinese subject, as opposed to mathematics and English. These findings hold important theoretical and practical implications for promoting children’s academic achievement.

    Nitric oxide sensing revisited

    Nitric oxide (NO) sensing is an ancient trait enabled by hemoproteins harboring a highly conserved Heme-Nitric oxide/OXygen (H-NOX) domain that operates throughout the bacterial, fungal, and animal kingdoms, including in humans, but that has long been thought to be absent in plants. Recently, H-NOX-containing plant hemoproteins mediating crucial NO-dependent responses such as stomatal closure and pollen tube guidance have been reported. There are indications that the detection method that led to these discoveries will uncover many more heme-based NO sensors that operate as regulatory sites in complex proteins. Their characterization will in turn offer a much more complete picture of plant NO responses at both the molecular and systems level.

    EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention

    Vision transformers have shown great success due to their high model capabilities. However, their remarkable performance is accompanied by heavy computation costs, which makes them unsuitable for real-time applications. In this paper, we propose a family of high-speed vision transformers named EfficientViT. We find that the speed of existing transformer models is commonly bounded by memory-inefficient operations, especially the tensor reshaping and element-wise functions in MHSA. Therefore, we design a new building block with a sandwich layout, i.e., using a single memory-bound MHSA between efficient FFN layers, which improves memory efficiency while enhancing channel communication. Moreover, we discover that the attention maps share high similarities across heads, leading to computational redundancy. To address this, we present a cascaded group attention module feeding attention heads with different splits of the full feature, which not only saves computation cost but also improves attention diversity. Comprehensive experiments demonstrate that EfficientViT outperforms existing efficient models, striking a good trade-off between speed and accuracy. For instance, our EfficientViT-M5 surpasses MobileNetV3-Large by 1.9% in accuracy, while getting 40.4% and 45.2% higher throughput on Nvidia V100 GPU and Intel Xeon CPU, respectively. Compared to the recent efficient model MobileViT-XXS, EfficientViT-M2 achieves 1.8% superior accuracy, while running 5.8x/3.7x faster on the GPU/CPU, and 7.4x faster when converted to ONNX format. Code and models are available at https://github.com/microsoft/Cream/tree/main/EfficientViT. Comment: CVPR 202
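
    Feeding each attention head a different split of the feature channels can be sketched as follows. This is a minimal sketch under stated assumptions, not the EfficientViT module: the query, key and value projections are taken to be the identity, the final output projection is omitted, and the cascade is modeled by adding each head's output to the next head's input split, which is one reading of "cascaded" that the abstract does not spell out. All function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def self_attention(tokens):
    """Single-head attention with identity Q/K/V projections (a simplification)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, tokens)) for c in range(d)])
    return out


def cascaded_group_attention(tokens, num_heads):
    """Give each head one channel split; pass each head's output to the next head."""
    dim = len(tokens[0])
    assert dim % num_heads == 0, "channel count must divide evenly across heads"
    width = dim // num_heads
    head_outputs = []
    previous = None
    for h in range(num_heads):
        # Each head only sees its own slice of the channels.
        split = [t[h * width:(h + 1) * width] for t in tokens]
        if previous is not None:
            # Cascade (assumed rule): refine this split with the previous head's output.
            split = [[a + b for a, b in zip(s, p)] for s, p in zip(split, previous)]
        previous = self_attention(split)
        head_outputs.append(previous)
    # Concatenate the per-head outputs back to the full channel width.
    return [sum((head[i] for head in head_outputs), []) for i in range(len(tokens))]


# Three tokens with four channels each, split across two heads.
tokens = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.1, 0.0, 0.2], [0.3, 0.3, 0.1, 0.0]]
output = cascaded_group_attention(tokens, num_heads=2)
```

    Each head attends over dim / num_heads channels instead of the full width, which is where the computation saving claimed in the abstract comes from.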

    WH²D²N²: distributed AI-enabled OK-ASN service for Web of Things

    Model data-driven ontology and knowledge presentation for evolving semantic Asian social networks (OK-ASN) is a critical strategy for web of things (WoT) services. Meanwhile, Deep Neural Network (DNN)-based OK-ASN service in WoT is growing rapidly. However, most DNN-based services cannot fully utilize the potential of WoT, as heterogeneity exists in WoT. Therefore, this article proposes a novel framework called Web-based Heterogeneous Hierarchical Distributed Deep Neural Network (WH2D2N2) to deploy the DNNs for OK-ASN services on WoT, overcoming the heterogeneity. The architecture of the system and the designed Edge-Cloud-Joint execution scheme utilize heterogeneous devices to make DNN inference ubiquitous and output two types of results to meet various requirements. To bring robustness to OK-ASN services, a global scheduling scheme is designed to arrange the workflow dynamically. The results of our experiments demonstrate the efficiency of the execution scheme and the global scheduling in the system.

    An adhesive cellulose nanocrystal-reinforced nanocomposite hydrogel electrolyte for supercapacitor applications

    Hydrogel electrolytes have been applied in various energy storage devices, including supercapacitors. However, they still suffer from disadvantages such as low mechanical performance and poor adhesion at the interfaces between electrolytes and electrodes. Herein, an adhesive hydrogel electrolyte with promising mechanical strength and electrochemical performance was designed by introducing hydrophobic carbon chains as long-range physical cross-linkers and cellulose nanocrystal (CNC) as biopolymer nano-reinforcement, and soak-loading liquid electrolytes such as KOH into the hydrogel matrix. The hydrogel electrolyte loaded with 1 M KOH demonstrated the best tensile stress of 362.31 kPa and an elongation of 2479 %, and exhibited self-repairability by applying stimuli on the cut interface. The hydrogel electrolyte showed excellent adhesion on various surfaces, including nonconductive and conductive materials such as cardboard, leather, carbon film, and carbon cloth. Regarding electrochemical properties, the hydrogel electrolyte showed the largest conductivity of 0.207 ± 0.005 S/cm when soak-loaded in 1 M KOH for 24 h. Moreover, the hydrogel electrolyte exhibited promising electrochemical performance when assembled into coin-cell supercapacitors using free-standing activated carbon sheets as electrodes. A capacitance of 67.31 F/g at 0.05 A/g and almost 100 % capacitance retention at 0.1 A/g after 2200 cycles were achieved.