Shakeout: A New Approach to Regularized Deep Neural Network Training
Recent years have witnessed the success of deep neural networks in a wide
range of practical problems. Dropout has played an essential role in
many successful deep neural networks, by inducing regularization in the model
training. In this paper, we present a new regularized training approach:
Shakeout. Instead of randomly discarding units as Dropout does at the training
stage, Shakeout randomly chooses to enhance or reverse each unit's contribution
to the next layer. This minor modification of Dropout has the statistical
trait: the regularizer induced by Shakeout adaptively combines L₀, L₁ and L₂
regularization terms. Our classification experiments with representative
deep architectures on image datasets MNIST, CIFAR-10 and ImageNet show that
Shakeout deals with over-fitting effectively and outperforms Dropout. We
empirically demonstrate that Shakeout leads to sparser weights under both
unsupervised and supervised settings. Shakeout also leads to the grouping
effect of the input units in a layer. Considering that the weights reflect the
importance of connections, Shakeout is superior to Dropout, which is valuable
for deep model compression. Moreover, we demonstrate that Shakeout can
effectively reduce the instability of the training process of the deep
architecture. Comment: Appears in T-PAMI 2018
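As a rough illustration of the operation described above, here is a minimal
NumPy sketch of one Shakeout draw. It assumes the parameterization in which a
unit's contribution is reversed to -c·sign(w) with probability tau and
otherwise enhanced to (w + c·tau·sign(w)) / (1 - tau); the names tau and c are
illustrative, and the paper should be consulted for the exact form. Setting
c = 0 recovers inverted Dropout.

```python
import numpy as np

def shakeout_weights(W, tau=0.5, c=0.1, rng=np.random.default_rng()):
    """One stochastic draw of Shakeout-perturbed weights (training time).

    Sketch only: with probability tau an input unit's contribution is
    reversed (-c * sign(w)); otherwise it is enhanced
    ((w + c * tau * sign(w)) / (1 - tau)). c = 0 gives inverted dropout.
    """
    s = np.sign(W)
    # One Bernoulli draw per input unit (column), shared by its weights,
    # mirroring how dropout masks whole units rather than single weights.
    keep = rng.random(W.shape[1]) > tau
    enhanced = (W + c * tau * s) / (1.0 - tau)
    reversed_ = -c * s
    return np.where(keep[None, :], enhanced, reversed_)

# The perturbation is unbiased: averaging many draws recovers W.
W = np.array([[0.5, -1.0], [2.0, 0.3]])
print(np.mean([shakeout_weights(W) for _ in range(20000)], axis=0).round(2))
```

The averaging check shows the sense in which the random perturbation leaves
the expected contribution unchanged while still penalizing the weights.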
Regularization in deep neural networks
University of Technology Sydney, Faculty of Engineering and Information Technology. Recent years have witnessed the great success of deep learning. As deep architectures become larger and deeper, they easily overfit to relatively small amounts of data. Regularization has proved to be an effective way to reduce overfitting in traditional statistical learning. In the context of deep learning, special designs are required to regularize the training process. We first proposed a new regularization technique named "Shakeout" to improve the generalization ability of deep neural networks beyond Dropout, by introducing a combination of L₀, L₁, and L₂ regularization effects into network training. We then considered the unsupervised domain adaptation setting, where the source-domain data is labelled and the target-domain data is unlabelled. We proposed "deep adversarial attention alignment" to regularize the behavior of the convolutional layers. Such regularization reduces the domain shift that exists in the convolutional layers from the start, which previous works have ignored, and leads to superior adaptation results.
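As a minimal sketch of the combined-penalty idea (my illustration, not thesis
code; the L₀ component induced by Shakeout is implicit in its stochastic
training and has no closed-form weight penalty, so only the L₁/L₂ part is
written out, with illustrative coefficients l1 and l2):

```python
import torch
from torch import nn

def penalized_loss(model, data_loss, l1=1e-5, l2=1e-4):
    """Data loss plus an explicit combined L1 + L2 weight penalty."""
    reg = sum(l1 * p.abs().sum() + l2 * p.pow(2).sum()
              for p in model.parameters())
    return data_loss + reg

# Toy usage: a linear model on random data.
model = nn.Linear(10, 2)
x, y = torch.randn(4, 10), torch.randint(0, 2, (4,))
penalized_loss(model, nn.functional.cross_entropy(model(x), y)).backward()
```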
Prediction of Supernova Rates in Known Galaxy-galaxy Strong-lens Systems
We propose a new strategy of finding strongly-lensed supernovae (SNe) by
monitoring known galaxy-scale strong-lens systems. Strongly lensed SNe are
potentially powerful tools for the study of cosmology, galaxy evolution, and
stellar populations, but they are extremely rare. By targeting known strongly
lensed star-forming galaxies, our strategy significantly boosts the detection
efficiency for lensed SNe compared to a blind search. As a reference sample, we
compile 128 galaxy-galaxy strong-lens systems from the Sloan Lens ACS
Survey (SLACS), the SLACS for the Masses Survey, and the Baryon Oscillation
Spectroscopic Survey Emission-Line Lens Survey. Within this sample, we estimate
the rates of strongly-lensed Type Ia SN (SNIa) and core-collapse SN (CCSN) to
be and events per year, respectively. The lensed
SN images are expected to be widely separated with a median separation of 2
arcsec. Assuming a conservative fiducial lensing magnification factor of 5 for
the most highly magnified SN image, we forecast that a monitoring program with
a single-visit depth of 24.7 mag (5σ point source, r band) and a
cadence of 5 days can detect 0.49 strongly-lensed SNIa event and 2.1
strongly-lensed CCSN events per year within this sample. Our proposed
targeted-search strategy is particularly useful for prompt and efficient
identifications and follow-up observations of strongly-lensed SN candidates. It
also allows telescopes with small fields of view and limited time to
efficiently discover strongly-lensed SNe with a pencil-beam scanning strategy. Comment: 14 pages, 5 figures, ApJ in press
The Correspondence between Convergence Peaks from Weak Lensing and Massive Dark Matter Haloes
Convergence peaks, constructed from galaxy shape measurements in weak
lensing, are a powerful probe of cosmology, as the peaks can be connected with
the underlying dark matter haloes. However, the power of the convergence peak
statistic is limited by the noise in galaxy shape measurements, the
signal-to-noise ratio of the peaks, and the contribution of the projected mass
from large-scale structures along the line of sight (LOS). In this paper we use
a ray-tracing simulation on a curved sky to investigate the correspondence
between convergence peaks and the dark matter haloes along the LOS. We find
that, in the case of no noise and for source galaxies at a fixed redshift, a
large fraction of the peaks above the adopted signal-to-noise (SNR) threshold
are related to more than one massive halo above the adopted mass limit. Those
massive haloes dominate the contribution to high peaks, with the remaining
contribution coming from large-scale structures. On the
other hand, the peak distribution is skewed by the noise in galaxy shape
measurements, especially for lower-SNR peaks. In the noisy field, where the
shape noise is modelled as a Gaussian distribution, a large fraction of the
high peaks are true peaks, and that fraction decreases for lower peaks.
Furthermore, we find that the highest peaks are dominated by very massive
haloes. Comment: 13 pages, 11 figures, 4 tables, accepted for publication in MNRAS.
Our mock galaxy catalog is available upon request by email to the author.
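As a toy illustration of the peak statistic (not the paper's curved-sky
pipeline), the sketch below smooths a synthetic convergence map, adds Gaussian
shape noise as modelled in the abstract, and counts local maxima above a
signal-to-noise threshold nu; the grid size, smoothing scales, injected halo,
and noise level are all illustrative choices.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

rng = np.random.default_rng(42)

# Synthetic convergence field: smoothed Gaussian "large-scale structure"
# plus one injected massive-halo peak at the map centre.
kappa = gaussian_filter(rng.normal(0.0, 0.02, (512, 512)), sigma=8)
kappa[256, 256] += 0.15

# Gaussian shape noise, then the smoothing applied before peak finding.
sigma_noise = 0.01
noisy = kappa + rng.normal(0.0, sigma_noise, kappa.shape)
smoothed = gaussian_filter(noisy, sigma=2)

def find_peaks(snr_map, nu=3.0):
    """Return (row, col) positions of local maxima with S/N >= nu."""
    local_max = maximum_filter(snr_map, size=5) == snr_map
    return np.argwhere(local_max & (snr_map >= nu))

snr = smoothed / smoothed.std()  # crude normalisation by the field's rms
peaks = find_peaks(snr, nu=3.0)
print(f"{len(peaks)} peaks with nu >= 3 in the mock map")
```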
SLCA: Slow Learner with Classifier Alignment for Continual Learning on a Pre-trained Model
The goal of continual learning is to improve the performance of recognition
models in learning from sequentially arriving data. Although most existing works are
established on the premise of learning from scratch, growing efforts have been
devoted to incorporating the benefits of pre-training. However, how to
adaptively exploit the pre-trained knowledge for each incremental task while
maintaining its generalizability remains an open question. In this work, we
present an extensive analysis of continual learning on a pre-trained model
(CLPM), and attribute the key challenge to a progressive overfitting problem.
Observing that selectively reducing the learning rate can almost resolve this
issue in the representation layer, we propose a simple but extremely effective
approach named Slow Learner with Classifier Alignment (SLCA), which further
improves the classification layer by modeling the class-wise distributions and
aligning the classification layers in a post-hoc fashion. Across a variety of
scenarios, our proposal provides substantial improvements for CLPM (e.g., up to
49.76%, 50.05%, 44.69% and 40.16% on Split CIFAR-100, Split ImageNet-R, Split
CUB-200 and Split Cars-196, respectively), and thus outperforms
state-of-the-art approaches by a large margin. Based on such a strong baseline,
critical factors and promising directions are analyzed in-depth to facilitate
subsequent research. Comment: 11 pages, 8 figures, accepted by ICCV 2023
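To make the two named ingredients concrete, below is a hedged PyTorch sketch
under my reading of the abstract (not the authors' released code): a much
smaller learning rate for the pre-trained representation than for the
classifier (the "slow learner"), and a post-hoc alignment step that models
each class's features as a Gaussian and refits the classification layer on
features sampled from those distributions. The module shapes, learning rates,
and sample counts are placeholders.

```python
import torch
from torch import nn

backbone = nn.Sequential(nn.Linear(128, 128), nn.ReLU())  # stand-in for a pre-trained model
classifier = nn.Linear(128, 10)

# (1) Slow learner: a much smaller learning rate on the representation
# than on the classification layer.
optimizer = torch.optim.SGD([
    {"params": backbone.parameters(), "lr": 1e-4},
    {"params": classifier.parameters(), "lr": 1e-2},
], momentum=0.9)

@torch.no_grad()
def class_stats(features, labels, num_classes):
    """Per-class feature mean and diagonal variance (Gaussian model)."""
    return [(features[labels == c].mean(0), features[labels == c].var(0) + 1e-6)
            for c in range(num_classes)]

def align_classifier(stats, steps=100, n_per_class=64):
    """(2) Classifier alignment: refit on features sampled class-wise."""
    opt = torch.optim.SGD(classifier.parameters(), lr=1e-2)
    for _ in range(steps):
        xs = [mu + var.sqrt() * torch.randn(n_per_class, mu.numel())
              for mu, var in stats]
        ys = torch.arange(len(stats)).repeat_interleave(n_per_class)
        loss = nn.functional.cross_entropy(classifier(torch.cat(xs)), ys)
        opt.zero_grad(); loss.backward(); opt.step()

# Usage after each task: collect features, then re-align the classifier.
feats = backbone(torch.randn(640, 128)).detach()
labels = torch.randint(0, 10, (640,))
align_classifier(class_stats(feats, labels, 10))
```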