Benign Overfitting and Noisy Features
Modern machine learning often operates in the regime where the number of
parameters is much higher than the number of data points, with zero training
loss and yet good generalization, thereby contradicting the classical
bias-variance trade-off. This \textit{benign overfitting} phenomenon has
recently been characterized using so-called \textit{double descent} curves
where the risk undergoes another descent (in addition to the classical U-shaped
learning curve when the number of parameters is small) as we increase the
number of parameters beyond a certain threshold. In this paper, we examine the
conditions under which \textit{Benign Overfitting} occurs in the random feature
(RF) models, i.e. in a two-layer neural network with fixed first layer weights.
We adopt a new view of random features and show that \textit{benign overfitting}
arises due to the noise residing in such features (the noise may already be
present in the data and propagate to the features, or it may be added by the
user to the features directly), which plays an important implicit
regularization role in the phenomenon.
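As a minimal illustration of the setting (our own sketch, not the paper's construction), the snippet below fits a minimum-norm interpolator in an overparameterized random-feature model with fixed first-layer weights: training loss is numerically zero, yet the test error of the interpolating solution can still be computed and stays finite. All sizes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: n points, d input dimensions, linear target plus label noise.
n, d, n_features = 50, 5, 500
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

# Random feature map: fixed first-layer weights W, ReLU activation
# (a two-layer network where only the second layer is trained).
W = rng.normal(size=(d, n_features)) / np.sqrt(d)
Phi = np.maximum(X @ W, 0.0)

# Overparameterized regime (n_features >> n): the pseudoinverse gives the
# minimum-norm interpolator, so the training loss is (numerically) zero.
theta = np.linalg.pinv(Phi) @ y
train_mse = np.mean((Phi @ theta - y) ** 2)

# Test error of the same interpolating solution on fresh noiseless data.
X_test = rng.normal(size=(200, d))
y_test = X_test @ w_true
Phi_test = np.maximum(X_test @ W, 0.0)
test_mse = np.mean((Phi_test @ theta - y_test) ** 2)
```

Whether this interpolator generalizes well is exactly what the paper's noise-as-implicit-regularization analysis addresses.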
Performance Analysis of Asynchronous NB-IoT Up-link Systems
The Third Generation Partnership Project (3GPP) published LTE Release 13, which standardized a new radio access network (RAN) called Narrowband Internet of Things (NB-IoT). Such networks, particularly designed for massive machine-type communications (mMTC), inherit their functionalities from the existing LTE systems with slight differences and operate in a narrow frequency band of 180 kHz, consisting of one resource block (RB) of 12 LTE subcarriers.
This thesis focuses mainly on single-tone in-band transmission with one 15 kHz subcarrier of the NB-IoT RB in the middle of the LTE RBs. The aim of this thesis is to examine the performance of both NB-IoT transmission and LTE transmission after certain enhancements of the NB-IoT transmitter. These enhancements include time-domain windowing and filtering. A nonlinear power amplifier model for the NB-IoT transmitter is also included in the study. It is worth mentioning that NB-IoT and LTE signals are transmitted together through asynchronous channels to evaluate the effect of noise and inter-carrier interference (ICI). In order to compare the effects of different modulation schemes, 4-QAM and 64-QAM are both considered for LTE transmission. Filters are designed to suppress the spectral sidelobes of transmitted signals to reduce the interference due to asynchronous operation. Moreover, transmissions with a one-subcarrier-wide guard band between the active NB-IoT and LTE subcarriers, and without a guard band, are both examined from a bit error-rate (BER) perspective.
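As a rough illustration of why time-domain windowing suppresses spectral sidelobes in asynchronous coexistence (a generic OFDM sketch with illustrative parameters, not the thesis's exact transmitter), the snippet below tapers the edges of one OFDM symbol with a raised-cosine ramp and compares the power leaked far out of the active band:

```python
import numpy as np

def ofdm_symbol(n_fft, active, rng):
    """One baseband OFDM symbol with QPSK on the given active subcarriers."""
    X = np.zeros(n_fft, dtype=complex)
    bits = rng.choice([-1.0, 1.0], size=(2, len(active)))
    X[active] = (bits[0] + 1j * bits[1]) / np.sqrt(2)
    return np.fft.ifft(X)

def edge_taper(sym, n_roll):
    """Raised-cosine taper on both symbol edges (a Tukey-style window).

    Real systems extend the cyclic prefix to cover the roll-off region;
    that detail is omitted to keep the sketch short."""
    win = np.ones(len(sym))
    ramp = 0.5 * (1.0 - np.cos(np.pi * np.arange(n_roll) / n_roll))
    win[:n_roll] = ramp
    win[-n_roll:] = ramp[::-1]
    return sym * win

def far_oob_power(sym, n_dft=4096):
    """Mean spectral power far from the active band (near Nyquist)."""
    spec = np.abs(np.fft.fft(sym, n_dft)) ** 2
    return spec[n_dft // 2 - 256 : n_dft // 2 + 256].mean()

rng = np.random.default_rng(7)
sym = ofdm_symbol(256, np.arange(1, 13), rng)  # 12 subcarriers, like one RB
plain = far_oob_power(sym)
tapered = far_oob_power(edge_taper(sym, 32))
# Tapering lowers the sidelobe power seen by a neighbouring asynchronous system.
```

The rectangular-windowed symbol's sidelobes decay slowly, which is precisely what causes ICI between asynchronous NB-IoT and LTE signals; the taper (or an explicit filter) trades a little symbol-edge distortion for much faster sidelobe decay.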
Data Augmentation Vision Transformer for Fine-grained Image Classification
Recently, the vision transformer (ViT) has made breakthroughs in image
recognition. Its multi-head self-attention mechanism (MSA) can extract
discriminative labeling information from different pixel blocks to improve image
classification accuracy. However, the classification markers in its deep layers
tend to ignore local features between layers. In addition, the embedding layer
splits the input into fixed-size pixel blocks, which inevitably introduces
additional image noise. To this end, we study a data augmentation vision
transformer (DAVT) and propose a data augmentation method of attention cropping,
which uses attention weights as the guide to crop images and improves the
ability of the network to learn critical features. Secondly, we also propose a
hierarchical attention selection (HAS) method, which improves the learning of
discriminative markers between levels by filtering and fusing labels between
levels. Experimental results show that the accuracy of this method on two
general datasets, CUB-200-2011 and Stanford Dogs, is better than that of
existing mainstream methods, and its accuracy is 1.4\% and 1.6\% higher than the
original ViT, respectively.
Comment: IEEE Signal Processing Letters
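A toy sketch of attention-guided cropping in the spirit described above (our own simplification; the threshold, patch size, and function names are illustrative, not the paper's):

```python
import numpy as np

def attention_crop(image, attn, patch=16, keep=0.5):
    """Crop an image to the bounding box of high-attention patches.

    image: (H, W, C) array; attn: (H // patch, W // patch) attention map.
    Patches whose attention exceeds `keep` * max are retained, and the crop
    is the tight bounding box around them, snapped to patch boundaries."""
    mask = attn >= keep * attn.max()
    rows, cols = np.where(mask)
    r0, r1 = rows.min() * patch, (rows.max() + 1) * patch
    c0, c1 = cols.min() * patch, (cols.max() + 1) * patch
    return image[r0:r1, c0:c1]

# Toy example: attention concentrated on the four centre patches of a
# 64x64 image divided into 16x16 patches.
img = np.zeros((64, 64, 3))
attn = np.zeros((4, 4))
attn[1:3, 1:3] = 1.0
crop = attention_crop(img, attn)
# crop.shape == (32, 32, 3)
```

The cropped region would then be resized and fed back through the network so it learns from the critical features the attention map highlighted.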
Continual Learning in Open-vocabulary Classification with Complementary Memory Systems
We introduce a method for flexible continual learning in open-vocabulary
image classification, drawing inspiration from the complementary learning
systems observed in human cognition. We propose a "tree probe" method, an
adaptation of lazy learning principles, which enables fast learning from new
examples with competitive accuracy to batch-trained linear models. Further, we
propose a method to combine predictions from a CLIP zero-shot model and the
exemplar-based model, using the zero-shot estimated probability that a sample's
class is within any of the exemplar classes. We test in data incremental, class
incremental, and task incremental settings, as well as the ability to perform
flexible inference on varying subsets of zero-shot and learned categories. Our
proposed method achieves a good balance of learning speed, target task
effectiveness, and zero-shot effectiveness.Comment: In revie
A Lightweight Reconstruction Network for Surface Defect Inspection
Currently, most deep learning methods cannot cope with the scarcity of
industrial product defect samples and the significant variation in their
characteristics. This paper proposes an unsupervised defect detection algorithm
based on a reconstruction network, which is realized using only a large number
of easily obtained defect-free sample data. The network includes two parts:
image reconstruction and surface defect area detection. The reconstruction
network is designed through a fully convolutional autoencoder with a
lightweight structure. Only a small number of normal samples are used for
training, so that the reconstruction network generates a defect-free
reconstructed image. A function combining structural loss and loss
is proposed as the loss function of the reconstruction network to solve the
problem of poor detection of irregular texture surface defects. Further, the
residual between the reconstructed image and the image under test is taken as
the possible defect region, and conventional image operations can then locate
the fault. The unsupervised defect detection algorithm of the
proposed reconstruction network is used on multiple defect image sample sets.
Compared with other similar algorithms, the results show that the unsupervised
defect detection algorithm of the reconstructed network has strong robustness
and accuracy.
Comment: Journal of Mathematical Imaging and Vision (JMIV)
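The residual step can be caricatured as follows (illustrative threshold; a real pipeline would add the conventional morphology and connected-component filtering the abstract alludes to):

```python
import numpy as np

def defect_mask(test_img, recon_img, thresh=0.2):
    """Locate defect candidates from the reconstruction residual.

    The autoencoder, trained only on normal samples, reproduces normal
    texture well but fails on defects, so pixels whose absolute residual
    exceeds `thresh` are flagged as possible defect regions."""
    residual = np.abs(test_img.astype(float) - recon_img.astype(float))
    return residual > thresh

# Toy example: a "defect" patch the autoencoder failed to reproduce.
clean = np.zeros((8, 8))       # stand-in for the defect-free reconstruction
defective = clean.copy()
defective[2:4, 2:4] = 1.0      # defect present only in the test image
mask = defect_mask(defective, clean)
# mask.sum() == 4, the four defective pixels
```

This is why only defect-free samples are needed for training: localization comes entirely from where reconstruction fails at test time.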
Lifelong Sequential Modeling with Personalized Memorization for User Response Prediction
User response prediction, which models the user preference w.r.t. the
presented items, plays a key role in online services. After two decades of
rapid development, the accumulated user behavior sequences on mature Internet
service platforms have become extremely long, reaching back to each user's
first registration. Each user not only has intrinsic tastes, but also keeps
changing her personal interests over her lifetime. Hence, it is challenging to
handle such
lifelong sequential modeling for each individual user. Existing methodologies
for sequential modeling are only capable of dealing with relatively recent user
behaviors, which leaves huge space for modeling long-term especially lifelong
sequential patterns to facilitate user modeling. Moreover, one user's behavior
may be attributed to various previous behaviors within her whole online
activity history, i.e., long-term dependency with multi-scale sequential
patterns. In order to tackle these challenges, in this paper, we propose a
Hierarchical Periodic Memory Network for lifelong sequential modeling with
personalized memorization of sequential patterns for each user. The model also
adopts a hierarchical and periodical updating mechanism to capture multi-scale
sequential patterns of user interests while supporting the evolving user
behavior logs. The experimental results over three large-scale real-world
datasets have demonstrated the advantages of our proposed model with
significant improvement in user response prediction performance against the
state of the art.
Comment: SIGIR 2019. Reproducible codes and datasets:
https://github.com/alimamarankgroup/HPM
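A toy caricature of hierarchical, periodic memory updating (our own simplification for intuition only, not the paper's architecture; the linked repository contains the real model): a fine-grained memory updates on every behavior, while a coarse memory periodically absorbs the fine summary, capturing longer-scale patterns.

```python
import numpy as np

class PeriodicMemory:
    """Two-scale user memory: the fine slot updates on every behavior,
    the coarse slot absorbs the fine summary every `period` behaviors."""

    def __init__(self, dim, period, alpha=0.5):
        self.fine = np.zeros(dim)
        self.coarse = np.zeros(dim)
        self.period, self.alpha, self.t = period, alpha, 0

    def update(self, behavior_vec):
        a = self.alpha
        # Exponential update at the fine scale, once per behavior.
        self.fine = a * self.fine + (1 - a) * behavior_vec
        self.t += 1
        if self.t % self.period == 0:
            # Periodic fold of the fine summary into the coarse memory.
            self.coarse = a * self.coarse + (1 - a) * self.fine

mem = PeriodicMemory(dim=4, period=3)
for _ in range(6):
    mem.update(np.ones(4))
# After 6 behaviors the coarse memory has been updated exactly twice.
```

Stacking more levels with longer periods gives a hierarchy in which each level tracks user interests at a different timescale, which is the intuition behind multi-scale sequential patterns.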