Convergence Theory of Learning Over-parameterized ResNet: A Full Characterization
The ResNet structure has achieved great empirical success since its debut. Recent
work established the convergence of learning over-parameterized ResNet with a
scaling factor $\tau = 1/L$ on the residual branch, where $L$ is the network
depth. However, it is not clear how learning ResNet behaves for other values of
$\tau$. In this paper, we fully characterize the convergence theory of gradient
descent for learning over-parameterized ResNet with different values of $\tau$.
Specifically, hiding logarithmic factors and constant coefficients, we show
that for $\tau \le 1/\sqrt{L}$ gradient descent is guaranteed to converge to the
global minima, and in particular, when $\tau \le 1/L$, the convergence is independent
of the network depth. Conversely, we show that for $\tau > L^{-\frac{1}{2}+c}$,
the forward output grows at least at rate $L^c$ in expectation, and learning then
fails because of gradient explosion for large $L$. This means the
bound $\tau \le 1/\sqrt{L}$ is sharp for learning ResNet with arbitrary depth.
To the best of our knowledge, this is the first work that studies learning
ResNet over the full range of $\tau$.
Comment: 31 pages
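The role of the residual-branch scaling factor can be illustrated with a small numerical sketch. The block below is not the paper's setup: it assumes a toy residual stack of the form h <- h + tau * relu(W h), with width 32, fresh Gaussian weights per layer, and He-style initialization, all of which are choices made here for illustration. It only shows how the ratio of output norm to input norm at initialization changes with tau for a fixed depth.

```python
import math
import random


def forward_growth(depth, tau, width=32, seed=0):
    """Return ||h_L|| / ||h_0|| for a toy stack h <- h + tau * relu(W h).

    The block form, width, and initialization are illustrative assumptions,
    not details taken from the abstract.
    """
    rng = random.Random(seed)
    h = [rng.gauss(0.0, 1.0) for _ in range(width)]
    norm0 = math.sqrt(sum(x * x for x in h))
    std = math.sqrt(2.0 / width)  # He-style standard deviation (assumption)
    for _ in range(depth):
        branch = []
        for _ in range(width):
            # One row of a fresh random weight matrix W for this layer.
            row = [rng.gauss(0.0, std) for _ in range(width)]
            pre = sum(w * x for w, x in zip(row, h))
            branch.append(max(pre, 0.0))  # ReLU
        # Residual update with the branch scaled by tau.
        h = [x + tau * b for x, b in zip(h, branch)]
    return math.sqrt(sum(x * x for x in h)) / norm0


L = 64
for name, tau in [("1/L", 1 / L), ("1/sqrt(L)", 1 / math.sqrt(L)), ("1", 1.0)]:
    ratio = forward_growth(L, tau)
    print(f"tau = {name:>9}: output/input norm ratio = {ratio:.3g}")
```

In this toy setting the ratio stays of order one for tau = 1/L and becomes very large for tau = 1. The sketch says nothing about training dynamics or about the exact thresholds stated in the abstract.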
Revisiting the morphology and systematic placement of the enigmatic Cretaceous ommatid beetle Bukhkalius lindae (Coleoptera: Archostemata: Ommatidae)
The systematic position of Bukhkalius lindae (Jarzembowski et al. 2017) from mid-Cretaceous Burmese amber is revised based on a re-examination of the type specimen. Our observations confirm that B. lindae lacks the unique apomorphies of Tetraphalerus Waterhouse, in which it was originally placed, including complete antennal grooves and mushroom-shaped micro-tubercles. Its placement in a separate genus, i.e., Bukhkalius Kirejtshuk & Jarzembowski, 2020, is therefore well justified. The incomplete but distinct antennal grooves and the Tetraphalerus-like mouthparts of B. lindae suggest that it has a close affinity to Tetraphalerus. Therefore, B. lindae may represent a missing link bridging the morphological gap between the extant ommatid subfamilies Tetraphalerinae and Ommatinae.
Spatial Classification With Limited Observations Based On Physics-Aware Structural Constraint
Spatial classification with limited feature observations has been a
challenging problem in machine learning. The problem exists in applications
where only a subset of sensors are deployed at certain spots or partial
responses are collected in field surveys. Existing research mostly focuses on
addressing incomplete or missing data, e.g., data cleaning and imputation,
classification models that allow for missing feature values or model missing
features as hidden variables in the EM algorithm. These methods, however,
assume that incomplete feature observations only happen on a small subset of
samples, and thus cannot solve problems where the vast majority of samples have
missing feature observations. To address this issue, we recently proposed a new
approach that incorporates a physics-aware structural constraint into the model
representation. Our approach assumes that a spatial contextual feature is
observed for all sample locations and establishes a spatial structural constraint
from the underlying spatial contextual feature map. We design efficient
algorithms for model parameter learning and class inference. This paper extends
our recent approach by allowing feature values of samples in each class to
follow a multi-modal distribution. We propose learning algorithms for the
extended model with a multi-modal distribution. Evaluations on real-world
hydrological applications show that our approach significantly outperforms
baseline methods in classification accuracy, and that the multi-modal extension is
more robust than our earlier single-modal version, especially when the feature
distribution in training samples is multi-modal. Computational experiments show
that the proposed solution is computationally efficient on large datasets.