Statistical Analysis of a Posteriori Channel and Noise Distribution Based on HARQ Feedback
In response to a comment on one of our manuscripts, this work studies the
posterior channel and noise distributions conditioned on the NACKs and ACKs of
all previous transmissions in a HARQ system using statistical approaches. Our
main result is that, unless the coherence interval (in time or frequency) is
large, as under the block-fading assumption, the posterior distribution of the
channel and noise either remains almost identical to the prior distribution,
or it mostly follows the same class of distribution as the prior one. In the
latter case, the difference between the posterior and prior distributions can
be modeled as a parameter mismatch, which has little impact on certain types
of applications.
Comment: 15 pages, 2 figures, 4 tables
High-Performance Matrix Multiplication: Hierarchical Data Structures, Optimized Kernel Routines, and Qualitative Performance Modeling
The optimal implementation of matrix multiplication on modern computer architectures is of great importance for scientific and engineering applications. However, achieving optimal performance for matrix multiplication has been continuously challenged both by the ever-widening performance gap between the processor and the memory hierarchy and by the introduction of new architectural features in modern architectures. The conventional way of dealing with these challenges benefits significantly from the blocking algorithm, which improves data locality in the cache memory, and from highly tuned inner kernel routines, which in turn exploit the architectural aspects of the specific processor to deliver near-peak performance. A state-of-the-art improvement of the blocking algorithm is the self-tuning approach, which applies heroic combinatorial optimization over parameter spaces. Other recent research approaches include one that explicitly blocks for the TLB (Translation Lookaside Buffer) and a hierarchical formulation that employs memory-friendly Morton ordering (a space-filling curve methodology). This thesis compares and contrasts the TLB-blocking-based and Morton-order-based methods for dense matrix multiplication, and offers a qualitative model to explain the performance behavior. Comparisons to the performance of a self-tuning library and the vendor library are also offered for the Alpha architecture. The practical benchmark experiments demonstrate that neither conventional blocking-based implementations nor the self-tuning libraries achieve consistently high performance in dense matrix multiplication for relatively large square matrices. Instead, architectural constraints and issues evidently restrict the critical path and the options available for optimal performance, so that the relatively simple strategy and framework presented in this study offer higher and flatter overall performance. Interestingly, maximal inner kernel efficiency is not a guarantee of globally minimal multiplication time. Also, efficient and flat performance is possible at all problem sizes that fit in main memory, in contrast to the jagged performance curves often observed with blocking and self-tuned blocking libraries.
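The two ingredients discussed above can be illustrated compactly. The sketch below, in Python for readability, shows a Morton (Z-order) index that interleaves the bits of the row and column indices so that spatially adjacent blocks stay close in memory, and a straightforward cache-blocked multiply; the block size is an arbitrary assumption, not a tuned value from the thesis.

```python
import numpy as np

def morton_encode(i: int, j: int, bits: int = 16) -> int:
    """Interleave the bits of (i, j) into a single Z-order index."""
    z = 0
    for b in range(bits):
        z |= ((i >> b) & 1) << (2 * b + 1)
        z |= ((j >> b) & 1) << (2 * b)
    return z

def blocked_matmul(A: np.ndarray, B: np.ndarray, bs: int = 64) -> np.ndarray:
    """C = A @ B computed block by block to improve cache reuse."""
    n = A.shape[0]
    C = np.zeros((n, n), dtype=A.dtype)
    for i0 in range(0, n, bs):
        for k0 in range(0, n, bs):          # this order reuses A's block across j0
            for j0 in range(0, n, bs):
                C[i0:i0+bs, j0:j0+bs] += (
                    A[i0:i0+bs, k0:k0+bs] @ B[k0:k0+bs, j0:j0+bs]
                )
    return C

print(morton_encode(3, 5))  # 27: bits of i and j interleaved
A = np.random.rand(256, 256)
B = np.random.rand(256, 256)
assert np.allclose(blocked_matmul(A, B), A @ B)
```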
Side4Video: Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning
Large pre-trained vision models achieve impressive success in computer
vision. However, fully fine-tuning large models for downstream tasks,
particularly in video understanding, can be prohibitively computationally
expensive. Recent studies turn their focus towards efficient image-to-video
transfer learning. Nevertheless, existing efficient fine-tuning methods pay
little attention to training memory usage and to transferring larger models to
the video domain. In this paper, we present a novel Spatial-Temporal Side
Network for memory-efficient fine-tuning of large image models for video
understanding, named Side4Video. Specifically, we introduce a lightweight
spatial-temporal side network attached to the frozen vision model, which
avoids backpropagation through the heavy pre-trained model and utilizes
multi-level spatial features from the original image model. This extremely
memory-efficient architecture reduces memory usage by 75% compared to previous
adapter-based methods. In this way, we can transfer a huge ViT-E (4.4B
parameters), 14x larger than ViT-L (304M), to video understanding tasks. Our
approach achieves remarkable performance on various video datasets across
unimodal and cross-modal tasks (i.e., action recognition and text-video
retrieval), especially on Something-Something V1 & V2 (67.3% & 74.6%),
Kinetics-400 (88.6%), MSR-VTT (52.3%), MSVD (56.1%) and VATEX (68.8%). We
release our code at https://github.com/HJYao00/Side4Video.
Comment: Technical report
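A schematic of the mechanism described above, as a hedged PyTorch sketch rather than the released Side4Video code: the backbone is frozen and its intermediate features are detached, so the backward pass never traverses the heavy pre-trained model, which is where the memory saving comes from. All module shapes and the class count are illustrative.

```python
import torch
import torch.nn as nn

class FrozenBackbone(nn.Module):
    """Stand-in for a frozen pre-trained image model with multi-level outputs."""
    def __init__(self, dim=768, depth=4):
        super().__init__()
        self.blocks = nn.ModuleList(nn.Linear(dim, dim) for _ in range(depth))
        for p in self.parameters():
            p.requires_grad_(False)

    def forward(self, x):
        feats = []
        for blk in self.blocks:
            x = torch.relu(blk(x))
            feats.append(x.detach())  # detached: no gradient flows into backbone
        return feats

class SideNetwork(nn.Module):
    """Lightweight trainable branch fusing the multi-level backbone features."""
    def __init__(self, dim=768, side_dim=96, depth=4, num_classes=174):
        super().__init__()
        self.proj = nn.ModuleList(nn.Linear(dim, side_dim) for _ in range(depth))
        self.mix = nn.ModuleList(nn.Linear(side_dim, side_dim) for _ in range(depth))
        self.head = nn.Linear(side_dim, num_classes)

    def forward(self, feats):
        s = torch.zeros(feats[0].shape[0], self.head.in_features)
        for proj, mix, f in zip(self.proj, self.mix, feats):
            s = torch.relu(mix(s + proj(f)))  # side state updated level by level
        return self.head(s)

backbone, side = FrozenBackbone(), SideNetwork()
logits = side(backbone(torch.randn(2, 768)))  # a batch of 2 frame features
logits.sum().backward()  # gradients exist only in the side network's parameters
```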
Lorentz Quantum Computer
A theoretical model of computation is proposed based on Lorentz quantum
mechanics. Besides the standard qubits, this model has an additional bit, which
we call the hyperbolic bit (or hybit for short). A set of basic logic gates
is constructed and proved to be universal. As an application, a search
algorithm is designed for this computer model and is found to be exponentially
faster than Grover's search algorithm.
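A small numerical sketch of the setting, under the assumption (standard in Lorentz quantum mechanics) that hybit states live in a space with the indefinite inner product ⟨u, v⟩ = u†ηv, η = diag(1, −1), and that gates preserve this metric (hyperbolic rather than unitary transformations). The paper's specific gate set and search algorithm are not reproduced here.

```python
import numpy as np

eta = np.diag([1.0, -1.0])  # indefinite (Minkowski-like) metric

def lorentz_inner(u, v):
    """Indefinite inner product <u, v> = u^dagger eta v."""
    return u.conj() @ eta @ v

def hyperbolic_gate(w):
    """A Lorentz 'rotation' L satisfying L^dagger eta L = eta."""
    return np.array([[np.cosh(w), np.sinh(w)],
                     [np.sinh(w), np.cosh(w)]])

psi = np.array([1.0, 0.0])        # hybit state with <psi, psi> = +1
L = hyperbolic_gate(0.7)
phi = L @ psi

print(lorentz_inner(psi, psi))    # +1
print(lorentz_inner(phi, phi))    # still +1: the gate preserves the metric
assert np.allclose(L.conj().T @ eta @ L, eta)
```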
Revisiting Classifier: Transferring Vision-Language Models for Video Recognition
Transferring knowledge from task-agnostic pre-trained deep models for
downstream tasks is an important topic in computer vision research. Along with
the growth of computational capacity, we now have open-source vision-language
pre-trained models at large scale in terms of both model architecture and
amount of data. In this study, we focus on transferring knowledge for video
classification tasks. Conventional methods randomly initialize the linear
classifier head for vision classification, but they leave the usage of the text
encoder for downstream visual recognition tasks unexplored. In this paper, we
revise the role of the linear classifier and replace it with knowledge
transferred from the pre-trained model. We utilize the well-pretrained
language model to generate good semantic targets for efficient transfer
learning. The empirical study shows that our method improves both the
performance and the training speed of video classification, with a negligible
change in the model. Our simple yet effective tuning paradigm achieves
state-of-the-art performance and efficient training on various video
recognition scenarios, i.e., zero-shot, few-shot, and general recognition. In
particular, our paradigm achieves the state-of-the-art accuracy of 87.8% on
Kinetics-400, and also surpasses previous methods by 20~50% absolute top-1
accuracy under zero-shot and few-shot settings on five popular video datasets.
Code and models can be found at https://github.com/whwu95/Text4Vis.
Comment: Accepted by AAAI-2023. Camera Ready Version
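The core replacement described above can be sketched in a few lines: rather than a randomly initialized head, the classifier weights are taken from a text encoder's embeddings of the class names. The `text_encoder` below is a hypothetical stand-in for a frozen vision-language text tower; this is an illustration of the idea, not the released Text4Vis code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def text_encoder(class_names):
    # Placeholder: a real implementation would tokenize each name and run a
    # frozen text tower; here we just return random unit-norm embeddings.
    return F.normalize(torch.randn(len(class_names), 512), dim=-1)

class TextInitializedClassifier(nn.Module):
    """Classifier whose weights come from text embeddings of the class names."""
    def __init__(self, class_names, freeze=True):
        super().__init__()
        with torch.no_grad():
            W = text_encoder(class_names)         # (num_classes, dim)
        self.weight = nn.Parameter(W, requires_grad=not freeze)

    def forward(self, video_features):            # (batch, dim), L2-normalized
        return video_features @ self.weight.t()   # cosine-similarity logits

clf = TextInitializedClassifier(["archery", "juggling", "surfing"])
feats = F.normalize(torch.randn(4, 512), dim=-1)
print(clf(feats).shape)  # torch.Size([4, 3])
```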
In-situ electrochemical fabrication of natural contacts on single nanowires
We report a template-based in-situ electrochemical method for fabricating
natural electric contacts on single nanowires using a pair of cross-patterned
electrodes. Such electric contacts are highly stable upon thermal cycling
between room temperature and millikelvin temperatures. Direct imaging of the
single-nanowire contacts using scanning electron microscopy is also
demonstrated.
Comment: 13 pages, 4 figures