Convergence Theory of Learning Over-parameterized ResNet: A Full Characterization

Authors: Chen Wei, Liu Tie-Yan, Yi Mingyang, Yu Da, Zhang Huishuai
Publication date: 12/07/2019

ResNet structure has achieved great empirical success since its debut. Recent work established the convergence of learning over-parameterized ResNet with a scaling factor $\tau=1/L$ on the residual branch, where $L$ is the network depth. However, it is not clear how learning ResNet behaves for other values of $\tau$. In this paper, we fully characterize the convergence theory of gradient descent for learning over-parameterized ResNet with different values of $\tau$. Specifically, hiding logarithmic factors and constant coefficients, we show that for $\tau\le 1/\sqrt{L}$ gradient descent is guaranteed to converge to the global minima, and in particular when $\tau\le 1/L$ the convergence is independent of the network depth. Conversely, we show that for $\tau>L^{-\frac{1}{2}+c}$, the forward output grows at least at rate $L^c$ in expectation, and learning then fails because of gradient explosion for large $L$. This means the bound $\tau\le 1/\sqrt{L}$ is sharp for learning ResNet with arbitrary depth. To the best of our knowledge, this is the first work that studies learning ResNet with the full range of $\tau$.

Comment: 31 pages
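The dependence of the forward output on the residual scaling factor can be illustrated with a small simulation. The sketch below is not the paper's model or proof setup: it assumes a toy residual update h <- h + tau * ReLU(W h) with freshly sampled Gaussian weights at each layer, and the function name, width, and initialization are hypothetical choices made only for illustration.

```python
import math
import random


def forward_norm_growth(depth, tau, width=32, seed=0):
    """Return ||h_L|| / ||h_0|| for a toy residual stack (illustrative only)."""
    rng = random.Random(seed)
    # Random input, rescaled to unit norm so the final norm is the growth ratio.
    h = [rng.gauss(0.0, 1.0) for _ in range(width)]
    norm0 = math.sqrt(sum(x * x for x in h))
    h = [x / norm0 for x in h]
    std = math.sqrt(2.0 / width)  # He-style initialization (an assumption)
    for _ in range(depth):
        # Residual branch: ReLU(W h) with fresh Gaussian weights per layer.
        branch = []
        for _ in range(width):
            row = [rng.gauss(0.0, std) for _ in range(width)]
            pre = sum(w * x for w, x in zip(row, h))
            branch.append(max(pre, 0.0))
        # Skip connection plus the branch scaled by tau.
        h = [x + tau * b for x, b in zip(h, branch)]
    return math.sqrt(sum(x * x for x in h))


if __name__ == "__main__":
    L = 64
    for name, tau in [("1/L", 1.0 / L),
                      ("1/sqrt(L)", L ** -0.5),
                      ("L^(-1/4)", L ** -0.25)]:
        print(name, forward_norm_growth(L, tau))
```

Running the script prints the norm ratio for three scalings at depth 64. In this toy setting the ratio should be far larger for tau = L^(-1/4), which lies above the 1/sqrt(L) threshold, than for tau = 1/L.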

arXiv.org e-Print Archive

Revisiting the morphology and systematic placement of the enigmatic Cretaceous ommatid beetle Bukhkalius lindae (Coleoptera: Archostemata: Ommatidae)

Authors: Cai Chen-Yang, Huang Di-Ying, Li Yan-Da
Publication venue: Universidade de São Paulo (USP), Museu de Zoologia (MZUSP).
Publication date: 18/02/2021

The systematic position of Bukhkalius lindae (Jarzembowski et al. 2017) from mid-Cretaceous Burmese amber is revised based on a re-examination of the type specimen. Our observation confirms that B. lindae lacks unique apomorphies of Tetraphalerus Waterhouse, in which it was originally placed, including complete antennal grooves and mushroom-shaped micro-tubercles. It is well justified to place it into a separate genus, i.e., Bukhkalius Kirejtshuk & Jarzembowski, 2020. The incomplete but distinct antennal grooves and the Tetraphalerus-like mouthparts of B. lindae suggest that it has a close affinity to Tetraphalerus. Therefore, B. lindae may represent a missing link bridging the morphological gap between the extant ommatid subfamilies Tetraphalerinae and Ommatinae.

Cadernos Espinosanos (E-Journal)

Spatial Classification With Limited Observations Based On Physics-Aware Structural Constraint

Authors: Chen Haiquan, He Wenchong, Jiang Zhe, Sainju Arpan Man, Yan Da
Publication date: 03/04/2020

Spatial classification with limited feature observations has been a challenging problem in machine learning. The problem exists in applications where only a subset of sensors are deployed at certain spots or partial responses are collected in field surveys. Existing research mostly focuses on addressing incomplete or missing data, e.g., data cleaning and imputation, or classification models that allow for missing feature values or that model missing features as hidden variables in the EM algorithm. These methods, however, assume that incomplete feature observations only happen on a small subset of samples, and thus cannot solve problems where the vast majority of samples have missing feature observations. To address this issue, we recently proposed a new approach that incorporates a physics-aware structural constraint into the model representation. Our approach assumes that a spatial contextual feature is observed for all sample locations and establishes a spatial structural constraint from the underlying spatial contextual feature map. We design efficient algorithms for model parameter learning and class inference. This paper extends our recent approach by allowing feature values of samples in each class to follow a multi-modal distribution. We propose learning algorithms for the extended model with multi-modal distribution. Evaluations on real-world hydrological applications show that our approach significantly outperforms baseline methods in classification accuracy, and the multi-modal extension is more robust than our earlier single-modal version, especially when the feature distribution in training samples is multi-modal. Computational experiments show that the proposed solution is computationally efficient on large datasets.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications