
10 research outputs found

Transcending the Limit of Local Window: Advanced Super-Resolution Transformer with Adaptive Token Dictionary

Author: Gu Shuhang
Li Yawei
Zhang Leheng
Zhao Xiaorui
Zhou Xingyu
Publication date: 18/01/2024

Single Image Super-Resolution is a classic computer vision problem that involves estimating high-resolution (HR) images from low-resolution (LR) ones. Although deep neural networks (DNNs), especially Transformers for super-resolution, have seen significant advancements in recent years, challenges remain, particularly the limited receptive field caused by window-based self-attention. To address this issue, we introduce a group of auxiliary Adaptive Token Dictionaries to the SR Transformer and establish an ATD-SR method. The introduced token dictionary can learn prior information from training data and adapt the learned prior to a specific test image through an adaptive refinement step. The refinement strategy not only provides global information to all input tokens but also groups image tokens into categories. Based on the category partitions, we further propose a category-based self-attention mechanism designed to leverage distant but similar tokens for enhancing input features. The experimental results show that our method achieves the best performance on various single image super-resolution benchmarks.
Comment: 15 pages, 9 figures

arXiv.org e-Print Archive

Empowering Collaborative Filtering with Principled Adversarial Contrastive Loss

Author: Cai Zhibo
Chua Tat-Seng
Sheng Leheng
Wang Xiang
Zhang An
Publication date: 28/10/2023

Contrastive Learning (CL) has achieved impressive performance in self-supervised learning tasks, showing superior generalization ability. Inspired by this success, adopting CL in collaborative filtering (CF) has become prevalent in semi-supervised top-K recommendation. The basic idea is to routinely conduct heuristic-based data augmentation and apply contrastive losses (e.g., InfoNCE) on the augmented views. Yet some CF-specific challenges make this adoption suboptimal, such as the out-of-distribution issue, the risk of false negatives, and the nature of top-K evaluation. These challenges require the CL-based CF scheme to focus more on mining hard negatives and distinguishing false negatives from the vast unlabeled user-item interactions, in order to obtain informative contrast signals. Worse still, there is limited understanding of contrastive loss in CF methods, especially with respect to its generalization ability. To bridge the gap, we delve into the reasons underpinning the success of contrastive loss in CF, and propose a principled Adversarial InfoNCE loss (AdvInfoNCE), a variant of InfoNCE specially tailored for CF methods. AdvInfoNCE adaptively explores and assigns hardness to each negative instance in an adversarial fashion and further utilizes a fine-grained hardness-aware ranking criterion to strengthen the recommender's generalization ability. Training CF models with AdvInfoNCE, we validate its effectiveness on both synthetic and real-world benchmark datasets, thus showing its ability to mitigate out-of-distribution problems. Given the theoretical guarantees and empirical superiority of AdvInfoNCE over most contrastive loss functions, we advocate its adoption as a standard loss in recommender systems, particularly for out-of-distribution tasks. Code is available at https://github.com/LehengTHU/AdvInfoNCE.
Comment: Accepted to NeurIPS 202

arXiv.org e-Print Archive
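
For readers unfamiliar with the InfoNCE loss that the abstract above builds on, the following is a minimal sketch of the plain (non-adversarial) InfoNCE loss for a single anchor. It is not the authors' AdvInfoNCE: the per-negative hardness weighting described in the abstract is not shown. The function names, the use of cosine similarity, and the temperature value of 0.1 are assumptions made for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Plain InfoNCE loss for one anchor (hypothetical helper).

    loss = -log( exp(s_pos / t) / (exp(s_pos / t) + sum_j exp(s_neg_j / t)) )
    """
    pos_logit = cosine(anchor, positive) / temperature
    logits = [pos_logit] + [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - pos_logit
```

The loss is close to zero when the positive is far more similar to the anchor than any negative, and it grows when a negative is more similar than the positive, which is why hard negatives dominate the training signal.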

Video Super-Resolution Transformer with Masked Inter&Intra-Frame Attention

Author: Gu Shuhang
Li Leida
Wang Keze
Zhang Leheng
Zhao Xiaorui
Zhou Xingyu
Publication date: 29/03/2024

Recently, Vision Transformers have achieved great success in recovering missing details in low-resolution sequences, i.e., the video super-resolution (VSR) task. Despite their superior VSR accuracy, the heavy computational burden and large memory footprint hinder the deployment of Transformer-based VSR models on constrained devices. In this paper, we address this issue by proposing a novel feature-level masked processing framework: VSR with Masked Intra and inter frame Attention (MIA-VSR). The core of MIA-VSR is leveraging feature-level temporal continuity between adjacent frames to reduce redundant computations and make more rational use of previously enhanced SR features. Concretely, we propose an intra-frame and inter-frame attention block which takes the respective roles of past features and input features into consideration and only exploits previously enhanced features to provide supplementary information. In addition, an adaptive block-wise mask prediction module is developed to skip unimportant computations according to feature similarity between adjacent frames. We conduct detailed ablation studies to validate our contributions and compare the proposed method with recent state-of-the-art VSR approaches. The experimental results demonstrate that MIA-VSR improves memory and computation efficiency over state-of-the-art methods without trading off PSNR accuracy. The code is available at https://github.com/LabShuHangGU/MIA-VSR.
Comment: Accepted by CVPR 202

arXiv.org e-Print Archive

Design and Application of Intelligent Transportation Multi-Source Data Collaboration Framework Based on Digital Twins

Author: Dingding Han
Leheng Fang
Xiaobo Zhang
Xihou Zhang
Publication venue: MDPI AG
Publication date: 02/02/2023

Growing urban traffic problems mean that transportation systems require large amounts of data. To address the current problems of redundant data types and low coordination rates in intelligent transportation systems (ITS), this paper proposes an improved digital twin architecture applicable to ITS. Based on this architecture, a framework for dynamic and static data collaboration in ITS is constructed. For each collaboration method, the paper describes the method and its scope, and designs the framework and interfaces for data mapping. Finally, the effectiveness of the framework is verified through case studies that mine the spatiotemporal distribution characteristics of data, capture human travel characteristics, and visualize intersections using digital twins. This paper provides a new data fusion idea for digital twin systems in ITS, and the framework covers all data types in digital twin systems for cross-integration analysis.

Multidisciplinary Digital Publishing Institute

Design and Application of Intelligent Transportation Multi-Source Data Collaboration Framework Based on Digital Twins

Author: Dingding Han
Leheng Fang
Xiaobo Zhang
Xihou Zhang
Publication venue: MDPI AG
Publication date: 01/02/2023

Directory of Open Access Journals

Experimental Study on the Comparison between Network Microstructure Titanium Matrix Composites and Ti6Al4V on EDM Milling

Author: Leheng Zhang
Sirui Gong
Yizhou Hu
Zhenlong Wang
Publication venue: MDPI AG
Publication date: 01/05/2024

Network microstructure titanium matrix composites (NMTMCs), featuring Ti6Al4V as the matrix and network-distributed TiB whiskers (TiBw) as reinforcement, exhibit remarkable potential for diverse applications due to their superior physical properties. Because titanium matrix composites are difficult to machine, electrical discharge machining (EDM) stands as one of the preferred machining techniques for NMTMCs. Nevertheless, the compromised surface quality and the recast layer significantly impact the performance of the workpiece machined by EDM. Therefore, to enhance the surface quality and restrain the defects of NMTMCs, this study conducted comparative EDM milling experiments between NMTMCs and Ti6Al4V to analyze the effects of discharge capacitance, charging current, and pulse interval on the surface roughness, recast layer thickness, recast layer uniformity, and surface microcrack density of both materials. The results indicated that machining energy significantly influences workpiece surface quality. Furthermore, comparative experiments exploring the influence of network reinforcement on EDM milling revealed that NMTMCs have a higher melting point, leading to an accumulation phenomenon in low-energy machining where the reinforcement could not be completely removed. The residual reinforcement in the recast layer had an adsorption effect on molten metal, affecting the thermal conductivity and uniformity within the recast layer. Finally, specific guidelines are put forward for optimizing the material's surface roughness, recast layer thickness, and uniformity, along with minimizing microcrack density; these attain a processing effect that features a roughness of Ra 0.9 μm, an average recast layer thickness of 6 μm with a range of 8 μm, and a surface microcrack density of 0.08 μm⁻¹.

Directory of Open Access Journals

Enhancement of Imaging Quality of Interferenceless Coded Aperture Correlation Holography Based on Physics-Informed Deep Learning

Author: Leheng Li
Lili Qi
Rui Xiong
Xiangchao Zhang
Xiangqian Jiang
Xinyang Ma
Publication venue: MDPI AG
Publication date: 01/12/2022

Interferenceless coded aperture correlation holography (I-COACH) was recently introduced for recording incoherent holograms without two-wave interference. In I-COACH, the light radiated from an object is modulated by a pseudo-randomly-coded phase mask and recorded as a hologram by a digital camera without interfering with any other beams. The image reconstruction is conducted by correlating the object hologram with the point spread hologram. However, the image reconstructed by the conventional correlation algorithm suffers from serious background noise, which leads to poor imaging quality. In this work, via an effective combination of speckle correlation and a neural network, we propose a high-quality reconstruction strategy based on physics-informed deep learning. Specifically, this method takes the autocorrelation of the speckle image as the input of the network, and switches from establishing a direct mapping between the object and the image to a mapping between the autocorrelations of the two. This method improves the interpretability of neural networks through prior physics knowledge, thereby reducing the data dependence and computational cost. In addition, once a final model is obtained, the image reconstruction can be completed with one camera exposure. Experimental results demonstrate that the background noise can be effectively suppressed, and the resolution of the reconstructed images can be improved by a factor of three.

Directory of Open Access Journals
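
The abstract above feeds the autocorrelation of the speckle image to the network instead of the raw image. As a rough illustration of that preprocessing step only, here is a minimal sketch of a mean-subtracted circular 2D autocorrelation computed directly from its definition. The function name, the mean subtraction, and the circular (wrap-around) boundary handling are assumptions, not details taken from the paper; a practical pipeline would likely use an FFT rather than this quadratic-time loop.

```python
def circular_autocorrelation(image):
    """Mean-subtracted circular autocorrelation of a 2D list of numbers.

    result[dy][dx] = sum over (y, x) of c[y][x] * c[(y + dy) % rows][(x + dx) % cols],
    where c is the image with its mean removed.
    """
    rows, cols = len(image), len(image[0])
    mean = sum(sum(row) for row in image) / (rows * cols)
    centered = [[value - mean for value in row] for row in image]

    result = [[0.0] * cols for _ in range(rows)]
    for dy in range(rows):
        for dx in range(cols):
            total = 0.0
            for y in range(rows):
                for x in range(cols):
                    total += centered[y][x] * centered[(y + dy) % rows][(x + dx) % cols]
            result[dy][dx] = total
    return result
```

One property that makes the autocorrelation a convenient network input is that it does not change when the image is circularly shifted, so the position of the object within the frame drops out of the representation.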

Flexible Image Reconstruction in the Orbital Angular Momentum Holography with Binarized Airy Lens

Author: Feili Wang
He Yuan
Leheng Li
Rui Xiong
Xiangchao Zhang
Xiangqian Jiang
Xinyang Ma
Publication venue: MDPI AG
Publication date: 01/06/2022

Orbital angular momentum (OAM) holography has opened a path to ultrahigh-capacity holographic information systems. However, the practical applicability of OAM holography is limited by the complicated optical setup and by image intensity and position that cannot be adjusted. Here, a decoding method is proposed that uses a binarized phase map derived from an autofocusing Airy beam. By adjusting the parameters of the phase map, the position and intensity distribution of the reconstructed image become flexibly adjustable. In addition, the cross-talk between different image channels can be effectively reduced thanks to the abrupt autofocusing capability of Airy beams. As a result, the quality and practicability of OAM holography can be greatly enhanced.

Directory of Open Access Journals

Bi-DAINet: Bi-Directional Discard-Accept-Integrate Network for salient object detection

Author: Bo Jin
Cuili Yao
Leheng Li
Lin Feng
Yiwei Liu
Yuqiu Kong
Publication venue: Elsevier BV

Crossref