Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square Root Normalization
Global covariance pooling in convolutional neural networks has achieved
impressive improvement over the classical first-order pooling. Recent works
have shown matrix square root normalization plays a central role in achieving
state-of-the-art performance. However, existing methods depend heavily on
eigendecomposition (EIG) or singular value decomposition (SVD), suffering from
inefficient training due to limited support of EIG and SVD on GPU. To
address this problem, we propose an iterative matrix square root
normalization method for fast end-to-end training of global covariance pooling
networks. At the core of our method is a meta-layer designed with loop-embedded
directed graph structure. The meta-layer consists of three consecutive
nonlinear structured layers, which perform pre-normalization, coupled matrix
iteration and post-compensation, respectively. Our method is much faster than
EIG- or SVD-based ones, since it involves only matrix multiplications, which are
well suited to parallel implementation on GPU. Moreover, the proposed network with ResNet
architecture can converge in far fewer epochs, further accelerating network
training. On large-scale ImageNet, we achieve competitive performance superior
to existing counterparts. By finetuning our models pre-trained on ImageNet, we
establish state-of-the-art results on three challenging fine-grained
benchmarks. The source code and network models will be available at
http://www.peihuali.org/iSQRT-COV
Comment: Accepted to CVPR 201
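The three stages named in the abstract (pre-normalization, coupled matrix iteration, post-compensation) can be illustrated with a minimal sketch. It assumes trace normalization and the Newton-Schulz coupled iteration, neither of which the abstract spells out; the helper names and the iteration count are hypothetical, and plain Python lists stand in for GPU tensors.

```python
# Sketch only: assumes trace pre-normalization and Newton-Schulz iteration.
# All function names here are illustrative, not taken from the authors' code.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def scale(a, s):
    """Multiply every entry of a matrix by the scalar s."""
    return [[s * x for x in row] for row in a]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def isqrt_cov(sigma, num_iters=5):
    """Approximate the square root of an SPD matrix using only matrix products."""
    n = len(sigma)
    trace = sum(sigma[i][i] for i in range(n))
    # Pre-normalization: divide by the trace so the iteration converges.
    y = scale(sigma, 1.0 / trace)
    z = identity(n)
    for _ in range(num_iters):
        zy = matmul(z, y)
        # t = 0.5 * (3I - Z Y)
        t = [[0.5 * ((3.0 if i == j else 0.0) - zy[i][j]) for j in range(n)]
             for i in range(n)]
        # Coupled update: Y tends to A^(1/2), Z tends to A^(-1/2).
        y, z = matmul(y, t), matmul(t, z)
    # Post-compensation: undo the trace scaling.
    return scale(y, trace ** 0.5)

sigma = [[4.0, 1.0], [1.0, 3.0]]
root = isqrt_cov(sigma, num_iters=10)
print(matmul(root, root))  # should be close to sigma
```

Because each step is a matrix product, the loop maps naturally onto batched GPU kernels, which is the speed argument the abstract makes against EIG and SVD.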
Learning Deep SPD Visual Representation for Image Classification
Symmetric positive definite (SPD) visual representations are effective because they capture high-order statistics to describe images. Reliable and efficient calculation of an SPD matrix representation from small-sized feature maps with a high number of channels in a CNN is a challenging issue. This thesis presents three novel methods to address this challenge. The first method, called Relation Dropout (ReDro), is inspired by the fact that the eigendecomposition of a block-diagonal matrix can be obtained efficiently by eigendecomposing each block separately. Thus, instead of using a full covariance matrix as in the literature, this thesis randomly groups the channels and forms a covariance matrix per group. ReDro is inserted as an additional layer preceding the matrix normalisation step, and the random grouping is made transparent to all subsequent layers. ReDro can be seen as a dropout-related regularisation which discards some pair-wise channel relationships across each group. The second method, called FastCOV, exploits the intrinsic connection between the eigensystems of XX^T and X^TX. Specifically, it computes a position-wise covariance matrix upon convolutional feature maps instead of the typical channel-wise covariance matrix. As the spatial size of feature maps is usually much smaller than the channel number, conducting eigendecomposition of the position-wise covariance matrix avoids rank deficiency and is faster than decomposing the channel-wise covariance matrix. The eigenvalues and eigenvectors of the normalised channel-wise covariance matrix can be retrieved through the connection between the XX^T and X^TX eigensystems. The third method, iSICE, deals with reliable covariance estimation from small-sized and high-dimensional CNN feature maps. It exploits the prior structure of the covariance matrix to estimate a sparse inverse covariance, an approach developed in the literature to deal with the covariance matrix's small-sample issue.
Given a covariance matrix, this thesis iteratively minimises its log-likelihood penalised by a sparsity term using gradient descent. The resultant representation characterises partial correlation instead of the indirect correlation characterised in the covariance representation. As experimentally demonstrated, all three proposed methods improve image classification performance, and the first two also reduce the computational cost of learning large SPD visual representations.
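The connection between the XX^T and X^TX eigensystems that FastCOV relies on can be checked numerically. The sketch below is illustrative only: it assumes X has channels as rows and spatial positions as columns, ignores mean-centering and scaling, and uses power iteration (a stand-in chosen here, not the thesis's solver) to get the leading eigenpair of the small matrix. If (λ, v) is an eigenpair of X^TX with λ > 0, then u = Xv / sqrt(λ) is a unit eigenvector of XX^T with the same eigenvalue.

```python
import math

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def top_eigenpair(sym, iters=500):
    """Power iteration for the leading eigenpair of a symmetric PSD matrix."""
    v = [1.0] * len(sym)
    for _ in range(iters):
        w = matvec(sym, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(a * b for a, b in zip(v, matvec(sym, v)))
    return lam, v

# Hypothetical feature matrix: 4 channels (rows) x 2 spatial positions (columns).
X = [[1.0, 2.0],
     [0.0, 1.0],
     [3.0, 1.0],
     [2.0, 0.0]]

small = matmul(transpose(X), X)   # 2 x 2, cheap to decompose
large = matmul(X, transpose(X))   # 4 x 4, rank-deficient

lam, v = top_eigenpair(small)
# Lift the small-matrix eigenvector to an eigenvector of the large matrix.
u = [x / math.sqrt(lam) for x in matvec(X, v)]

residual = [a - lam * b for a, b in zip(matvec(large, u), u)]
print(max(abs(r) for r in residual))  # should be near zero
```

Only the smaller matrix is ever decomposed; the larger matrix is built here purely to verify the lifted eigenvector.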
A Robust Adaptive Stochastic Gradient Method for Deep Learning
Stochastic gradient algorithms have been the main focus of large-scale optimization
problems and have led to important successes in the recent advancement of deep
learning algorithms. The convergence of SGD depends on the careful choice of
learning rate and the amount of noise in stochastic estimates of the
gradients. In this paper, we propose an adaptive learning rate algorithm, which
utilizes stochastic curvature information of the loss function for
automatically tuning the learning rates. The information about the element-wise
curvature of the loss function is estimated from the local statistics of the
stochastic first-order gradients. We further propose a new variance reduction
technique to speed up the convergence. In our experiments with deep neural
networks, we obtained better performance compared to the popular stochastic
gradient algorithms.
Comment: IJCNN 2017 Accepted Paper, An extension of our paper, "ADASECANT:
Robust Adaptive Secant Method for Stochastic Gradient
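The idea of tuning per-parameter learning rates from element-wise curvature estimated out of first-order gradients can be sketched with a plain secant approximation. This is a deliberately simplified, deterministic illustration and not the paper's algorithm: it omits the stochastic statistics and the variance reduction step, and the function names, the test objective and the defaults are all hypothetical.

```python
def secant_sgd(grad_fn, theta, base_lr=1e-3, steps=5, eps=1e-12):
    """Gradient descent with per-element learning rates set to 1 / |curvature|,
    where curvature is estimated by the secant ratio (delta grad / delta theta)."""
    theta = list(theta)
    lrs = [base_lr] * len(theta)   # fallback rate until a secant estimate exists
    prev_theta, prev_grad = None, None
    for _ in range(steps):
        grad = grad_fn(theta)
        if prev_grad is not None:
            for i in range(len(theta)):
                d_theta = theta[i] - prev_theta[i]
                d_grad = grad[i] - prev_grad[i]
                # Skip the update when the differences are too small to trust.
                if abs(d_theta) > eps and abs(d_grad) > eps:
                    lrs[i] = abs(d_theta / d_grad)
        prev_theta, prev_grad = list(theta), list(grad)
        theta = [t - lr * g for t, lr, g in zip(theta, lrs, grad)]
    return theta, lrs

# Hypothetical badly conditioned quadratic: f(theta) = 0.5 * sum(c_i * theta_i^2).
CURV = [1.0, 10.0, 100.0]

def quad_grad(theta):
    return [c * t for c, t in zip(CURV, theta)]

theta, lrs = secant_sgd(quad_grad, [1.0, 1.0, 1.0])
print(theta, lrs)
```

On this quadratic the secant ratio recovers each curvature c_i exactly, so every coordinate gets a step of about 1 / c_i regardless of conditioning. With noisy minibatch gradients the raw ratio would be unreliable, which is presumably why the abstract speaks of local statistics rather than single differences.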
Assessing Islamic Sharia compliance in the design of a technology park concept Muslim cemetery
Muslim cemeteries in Malaysia have undergone many reforms. One of these is the construction of Raudhatul Sakinah, a cemetery set within a park. In the initial phase, two cemeteries were chosen as construction sites: the KL-Karak Muslim Cemetery and the Bukit Kiara Muslim Cemetery, both managed by the Jabatan Agama Islam Wilayah Persekutuan (JAWI). Following this, a group of researchers from UTHM made improvements by designing a Muslim cemetery based on a technology park concept, using a Geographical Information System (GIS) application as added value in a more systematic process for building Muslim cemeteries. The study site is the Muslim Cemetery in Parit Raja, Batu Pahat. This development requires a clear guideline so that the work remains within the bounds of Islamic Sharia compliance. This paper therefore assesses whether the design of this technology park concept Muslim cemetery is in line with the rulings of Islamic Sharia. The study uses interviews, observation and library research. The analysis identifies three aspects that must be taken into account when designing a technology park concept cemetery: the purpose of burying the deceased, the purpose of visiting graves, and the form of structures built on the burial site. The findings will help ensure that the design of the technology park concept cemetery complies with Islamic Sharia and is accepted and used by the whole Muslim community in Malaysia.