Architecture, Dataset and Model-Scale Agnostic Data-free Meta-Learning
The goal of data-free meta-learning is to learn useful prior knowledge from a
collection of pre-trained models without accessing their training data.
However, existing works solve the problem only in parameter space, which (i)
ignores the rich data knowledge contained in the pre-trained models; (ii)
cannot scale to large-scale pre-trained models; and (iii) can only meta-learn
pre-trained models that share the same network architecture. To address these issues,
we propose a unified framework, dubbed PURER, which contains: (1) ePisode
cUrriculum inveRsion (ECI) during data-free meta training; and (2) invErsion
calibRation following inner loop (ICFIL) during meta testing. During meta
training, we propose ECI to perform pseudo episode training so that the meta
model learns to adapt quickly to new, unseen tasks. Specifically, we progressively synthesize a
sequence of pseudo episodes by distilling the training data from each
pre-trained model. The ECI adaptively increases the difficulty level of pseudo
episodes according to the real-time feedback of the meta model. We formulate
the optimization process of meta training with ECI as an adversarial form in an
end-to-end manner. During meta testing, we further propose a simple
plug-and-play supplement, ICFIL, used only at meta-test time to narrow the gap
between the meta-training and meta-testing task distributions. Extensive experiments
in various real-world scenarios demonstrate the superior performance of our method.
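The abstract stays high level, but the core inversion step behind pseudo-episode generation can be sketched: synthetic inputs are optimized so that a frozen pre-trained model assigns them chosen pseudo-labels, and the resulting samples form a support/query episode. The PyTorch sketch below shows only that basic step; the function names, the total-variation prior, and all hyperparameters are illustrative assumptions, and the curriculum that ECI builds from the meta model's real-time feedback is omitted.

```python
import torch
import torch.nn.functional as F

def invert_pseudo_episode(pretrained, num_classes, shots=5, steps=200, lr=0.1,
                          image_shape=(3, 32, 32), device="cpu"):
    """Distill a pseudo episode from a frozen pre-trained model (illustrative sketch).

    Synthetic images are optimized so the frozen model assigns them the pseudo-labels
    we picked. The difficulty scheduling of ECI (driven by the meta model's feedback)
    is intentionally left out.
    """
    pretrained.eval()
    labels = torch.arange(num_classes, device=device).repeat_interleave(shots)
    images = torch.randn(len(labels), *image_shape, device=device, requires_grad=True)
    opt = torch.optim.Adam([images], lr=lr)

    for _ in range(steps):
        opt.zero_grad()
        logits = pretrained(images)
        # Classification loss pulls the synthetic images toward the pseudo-labels;
        # the total-variation term is a simple smoothness prior on the images.
        loss = F.cross_entropy(logits, labels) + 1e-4 * total_variation(images)
        loss.backward()
        opt.step()
    return images.detach(), labels

def total_variation(x):
    # Mean absolute difference between neighboring pixels (vertical + horizontal).
    return (x[..., 1:, :] - x[..., :-1, :]).abs().mean() + \
           (x[..., :, 1:] - x[..., :, :-1]).abs().mean()
```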
Symmetric Pruning in Quantum Neural Networks
Many fundamental properties of a quantum system are captured by its
Hamiltonian and ground state. Despite the significance of ground state
preparation (GSP), this task is classically intractable for large-scale
Hamiltonians. Quantum neural networks (QNNs), which harness the power of modern
quantum machines, have emerged as a leading protocol for tackling this issue. As
such, how to enhance the performance of QNNs has become a crucial topic in GSP.
Empirical evidence has shown that QNNs with handcrafted symmetric ansatzes generally
exhibit better trainability than those with asymmetric ansatzes, yet a
theoretical explanation has been lacking. To fill this knowledge gap,
here we propose the effective quantum neural tangent kernel (EQNTK) and connect
this concept with over-parameterization theory to quantify the convergence of
QNNs towards the global optima. We find that the advantage of symmetric
ansatzes stems from their large EQNTK value combined with a low effective dimension,
which requires only a few parameters and a shallow quantum circuit to reach the
over-parameterization regime that permits a benign loss landscape and fast
convergence. Guided by EQNTK, we further devise a symmetric pruning (SP) scheme
to automatically tailor a symmetric ansatz from an over-parameterized and
asymmetric one, which greatly improves the performance of QNNs when explicit
symmetry information of the Hamiltonian is unavailable. Extensive numerical
simulations are conducted to validate the analytical results of EQNTK and the
effectiveness of SP.
Comment: Accepted to International Conference on Learning Representations (ICLR) 202
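The abstract does not reproduce the EQNTK definition, but the link between a large tangent-kernel value and fast convergence can be made concrete with a schematic gradient-flow argument on the GSP cost. The construction below is a generic quantum-NTK-style quantity, not necessarily the paper's exact EQNTK.

```latex
% Schematic only: a generic QNTK-style quantity for the GSP cost.
\[
  C(\boldsymbol{\theta}) = \langle 0 |\, U^{\dagger}(\boldsymbol{\theta})\, H\, U(\boldsymbol{\theta})\, | 0 \rangle,
  \qquad
  K(\boldsymbol{\theta}) = \sum_{l} \left( \frac{\partial C}{\partial \theta_{l}} \right)^{\!2}.
\]
% Under gradient flow on the parameters, the cost decreases at a rate set by K:
\[
  \dot{\theta}_{l} = -\,\frac{\partial C}{\partial \theta_{l}}
  \;\;\Longrightarrow\;\;
  \frac{\mathrm{d}C}{\mathrm{d}t}
    = \sum_{l} \frac{\partial C}{\partial \theta_{l}}\, \dot{\theta}_{l}
    = -\,K(\boldsymbol{\theta}).
\]
```

On this reading, an ansatz whose kernel value is large while only a few eigen-directions of the corresponding kernel matrix are non-negligible (a low effective dimension) converges quickly with few parameters and shallow depth, which is the regime the abstract attributes to symmetric ansatzes.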
Assessing the wind energy potential of China in considering its variability/intermittency
While wind energy has seen massive deployment over the last decades, its intermittency has hindered its use and led to curtailment. Quantifying and mitigating the intermittency/variability of wind energy is imperative for both the research community and industry, yet no consensus methods exist. The present study makes a first attempt to quantify the cost of the variability/intermittency of wind energy when paired with a battery energy storage system, aiming to comprehensively assess the spatial distribution of exploitable wind energy in China. Taking into account the abundance of wind resources, land-use type, landforms, and the variability of wind energy, the study finds that the most abundant wind resources are located on the Tibetan Plateau, in the Hexi Corridor, and in Inner Mongolia. In the near future, wind farms equipped with the advanced energy storage technology projected for 2030 or 2050 could provide stable wind energy at market-comparable prices, lower than the current price of coal-fired electricity (about 0.5 CNY/kWh). It is worth noting that the variability of wind energy on the Qinghai-Tibet Plateau could demand a large storage capacity and therefore lead to unaffordable costs. The proposed methodology can be applied to other regions worldwide, and the results could serve as a scientific foundation for policy makers planning wind power development in mainland China.
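To illustrate how a variability cost could enter such an assessment, the sketch below sizes the battery energy needed to firm an hourly wind-power series to a constant output and then adds an annualized storage cost onto a wind LCOE. The firming rule, the function names, and every cost figure are assumptions for illustration only, not the study's actual model or numbers.

```python
import numpy as np

def firming_storage_requirement(wind_mw, target_mw=None, efficiency=0.9):
    """Estimate the usable storage (MWh) needed to smooth an hourly wind series
    to a constant target output. A simple firming heuristic, not the study's model."""
    wind_mw = np.asarray(wind_mw, dtype=float)
    if target_mw is None:
        target_mw = wind_mw.mean()
    soc, min_soc, max_soc = 0.0, 0.0, 0.0   # state of charge in MWh
    for p in wind_mw:
        surplus = p - target_mw             # MW over one hour -> MWh
        soc += surplus * efficiency if surplus > 0 else surplus
        min_soc, max_soc = min(min_soc, soc), max(max_soc, soc)
    return max_soc - min_soc                # energy swing the battery must cover

def levelized_cost_with_storage(wind_lcoe, storage_mwh, annual_energy_mwh,
                                storage_cost_per_mwh=150_000, lifetime_years=10):
    """Add an annualized storage cost to a wind LCOE (CNY/kWh).
    All cost figures here are placeholders, not values from the study."""
    annual_storage_cost = storage_cost_per_mwh * storage_mwh / lifetime_years
    return wind_lcoe + annual_storage_cost / (annual_energy_mwh * 1000.0)
```

Feeding in a year of hourly output for a candidate site would then give a first-order estimate of how much its variability inflates the delivered cost per kWh.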
Continual Learning From a Stream of APIs
Continual learning (CL) aims to learn new tasks without forgetting previous
tasks. However, existing CL methods require a large amount of raw data, which
is often unavailable due to copyright considerations and privacy risks.
Instead, stakeholders usually release pre-trained machine learning models as a
service (MLaaS), which users can access via APIs. This paper considers two
practical-yet-novel CL settings: data-efficient CL (DECL-APIs) and data-free CL
(DFCL-APIs), which achieve CL from a stream of APIs with partial or no raw
data. Performing CL under these two new settings poses several challenges:
unavailable full raw data, unknown model parameters, heterogeneous models of
arbitrary architecture and scale, and catastrophic forgetting of previous APIs.
To overcome these issues, we propose a novel data-free cooperative continual
distillation learning framework that distills knowledge from a stream of APIs
into a CL model by generating pseudo data, just by querying APIs. Specifically,
our framework includes two cooperative generators and one CL model, forming
their training as an adversarial game. We first use the CL model and the
current API as fixed discriminators to train generators via a derivative-free
method. Generators adversarially generate hard and diverse synthetic data to
maximize the response gap between the CL model and the API. Next, we train the
CL model by minimizing the gap between the responses of the CL model and the
black-box API on synthetic data, to transfer the API's knowledge to the CL
model. Furthermore, we propose a new regularization term based on network
similarity to prevent catastrophic forgetting of previous APIs. Our method
performs comparably to classic CL with full raw data on MNIST and SVHN in
the DFCL-APIs setting. In the DECL-APIs setting, our method achieves 0.97x,
0.75x, and 0.69x the performance of classic CL on CIFAR10, CIFAR100, and
MiniImageNet, respectively.
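A minimal sketch of the adversarial game described above, under several assumptions: the black-box API is a callable returning class probabilities, the response gap is measured with a KL divergence, and the generator is updated with a simple NES-style zeroth-order estimator standing in for the paper's derivative-free method. The names (api_query, response_gap, and so on) are hypothetical, and the cooperative use of two generators and the network-similarity regularizer are omitted.

```python
import torch
import torch.nn.functional as F

def response_gap(cl_model, api_query, synthetic_x):
    """Mean KL gap between the CL model and the black-box API on synthetic data."""
    with torch.no_grad():
        api_probs = api_query(synthetic_x)          # API returns class probabilities
    cl_log_probs = F.log_softmax(cl_model(synthetic_x), dim=1)
    return F.kl_div(cl_log_probs, api_probs, reduction="batchmean")

def generator_step_zeroth_order(generator, cl_model, api_query, z,
                                sigma=0.01, n_dirs=8, lr=1e-3):
    """Derivative-free (NES-style) update: push the generator toward samples that
    MAXIMIZE the CL-model/API gap, i.e. hard and diverse synthetic data."""
    base = torch.nn.utils.parameters_to_vector(generator.parameters())
    grad_est = torch.zeros_like(base)
    for _ in range(n_dirs):
        eps = torch.randn_like(base)
        torch.nn.utils.vector_to_parameters(base + sigma * eps, generator.parameters())
        with torch.no_grad():
            gap = response_gap(cl_model, api_query, generator(z))
        grad_est += gap.item() * eps
    grad_est /= (n_dirs * sigma)
    # Gradient ascent on the estimated gap.
    torch.nn.utils.vector_to_parameters(base + lr * grad_est, generator.parameters())

def cl_model_step(cl_model, api_query, generator, z, optimizer):
    """Distillation step: the CL model minimizes its gap to the API on synthetic data."""
    optimizer.zero_grad()
    loss = response_gap(cl_model, api_query, generator(z).detach())
    loss.backward()
    optimizer.step()
```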
BadLabel: A Robust Perspective on Evaluating and Enhancing Label-noise Learning
Label-noise learning (LNL) aims to increase the model's generalization given
training data with noisy labels. To facilitate practical LNL algorithms,
researchers have proposed different label noise types, ranging from
class-conditional to instance-dependent noises. In this paper, we introduce a
novel label noise type called BadLabel, which can degrade the performance of
existing LNL algorithms by a large margin. BadLabel is crafted
based on the label-flipping attack against standard classification, where
specific samples are selected and their labels are flipped to other labels so
that the loss values of clean and noisy labels become indistinguishable. To
address the challenge posed by BadLabel, we further propose a robust LNL method
that perturbs the labels in an adversarial manner at each epoch to make the
loss values of clean and noisy labels again distinguishable. Once we select a
small set of (mostly) clean labeled data, we can apply the techniques of
semi-supervised learning to train the model accurately. Empirically, our
experimental results demonstrate that existing LNL algorithms are vulnerable to
the newly introduced BadLabel noise type, while our proposed robust LNL method
can effectively improve the generalization performance of the model under
various types of label noise. The new dataset of noisy labels and the source
code of the robust LNL algorithms are available at
https://github.com/zjfheart/BadLabels
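One plausible reading of how such noise could be crafted, sketched in PyTorch: randomly selected samples have their labels flipped to the wrong class on which a standard-trained model's loss is smallest, so the flipped samples no longer stand out by loss value. The selection rule, the noise rate, and the function name are illustrative assumptions rather than the paper's exact attack.

```python
import torch
import torch.nn.functional as F

def craft_lowloss_flip_noise(model, x, y, noise_rate=0.4):
    """Simplified BadLabel-style noise: flip selected labels to the wrong class the
    model already scores highest, so the noisy samples keep a small loss value.
    Not the paper's exact attack; selection and optimization details are omitted."""
    model.eval()
    with torch.no_grad():
        logits = model(x)                                        # [N, C]
    n_flip = int(noise_rate * len(y))
    flip_idx = torch.randperm(len(y), device=y.device)[:n_flip]  # random selection
    noisy_y = y.clone()
    masked = logits[flip_idx].clone()
    masked.scatter_(1, y[flip_idx].unsqueeze(1), float("-inf"))  # exclude the true class
    noisy_y[flip_idx] = masked.argmax(dim=1)                     # most confident wrong class
    return noisy_y
```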
Unleashing the Potential of Regularization Strategies in Learning with Noisy Labels
In recent years, research on learning with noisy labels has focused on
devising novel algorithms that can achieve robustness to noisy training labels
while generalizing to clean data. These algorithms often incorporate
sophisticated techniques, such as noise modeling, label correction, and
co-training. In this study, we demonstrate that a simple baseline using
cross-entropy loss, combined with widely used regularization strategies such as
learning rate decay, model weight averaging, and data augmentation, can
outperform state-of-the-art methods. Our findings suggest that employing a
combination of regularization strategies can be more effective than intricate
algorithms in tackling the challenges of learning with noisy labels. While some
of these regularization strategies have been utilized in previous noisy label
learning research, their full potential has not been thoroughly explored. Our
results encourage a reevaluation of benchmarks for learning with noisy labels
and prompt reconsideration of the role of specialized learning algorithms
designed for training with noisy labels.
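A sketch of what such a baseline might look like in PyTorch: plain cross-entropy training combined with cosine learning-rate decay, a running average of the model weights, and standard augmentations applied through the training data pipeline. The abstract does not give the exact recipe, so every hyperparameter, the scheduler, and the averaging scheme below are illustrative.

```python
import torch
import torch.nn.functional as F
from torch.optim.swa_utils import AveragedModel
from torchvision import transforms

# Standard augmentations of the kind the abstract refers to; this transform would
# be passed to the training dataset when building train_loader.
train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

def train_simple_baseline(model, train_loader, epochs=100, lr=0.1, device="cpu"):
    """Cross-entropy training plus the regularizers named in the abstract:
    cosine learning-rate decay and an averaged copy of the weights.
    No noise modeling, label correction, or co-training."""
    model.to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr,
                                momentum=0.9, weight_decay=5e-4)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    averaged = AveragedModel(model)              # running average of model weights

    for _ in range(epochs):
        model.train()
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = F.cross_entropy(model(x), y)  # plain cross-entropy on noisy labels
            loss.backward()
            optimizer.step()
            averaged.update_parameters(model)
        scheduler.step()
    return averaged                              # evaluate with the averaged weights
```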