Benign Oscillation of Stochastic Gradient Descent with Large Learning Rates
In this work, we theoretically investigate the generalization properties of
neural networks (NN) trained by stochastic gradient descent (SGD) algorithm
with large learning rates. Under such a training regime, we find that the
oscillation of the NN weights caused by large-learning-rate SGD
training turns out to be beneficial to the generalization of the NN, which
potentially improves over the same NN trained by SGD with small learning rates
that converges more smoothly. In view of this finding, we call such a
phenomenon "benign oscillation". Our theory towards demystifying such a
phenomenon builds upon the feature learning perspective of deep learning.
Specifically, we consider a feature-noise data generation model that consists
of (i) weak features which have a small $\ell_2$-norm and appear in each data
point; (ii) strong features which have a larger $\ell_2$-norm but only appear
in a certain fraction of all data points; and (iii) noise. We prove that NNs
trained by oscillating SGD with a large learning rate can effectively learn the
weak features in the presence of those strong features. In contrast, NNs
trained by SGD with a small learning rate can only learn the strong features
but make little progress in learning the weak features. Consequently, when it
comes to the new testing data which consist of only weak features, the NN
trained by oscillating SGD with a large learning rate could still make correct
predictions consistently, while the NN trained by small learning rate SGD
fails. Our theory sheds light on how large learning rate training benefits the
generalization of NNs. Experimental results demonstrate our finding on "benign
oscillation".
Comment: 63 pages, 10 figures
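The feature-noise data model described in this abstract can be illustrated with a minimal sketch. It is not the authors' construction: the dimension, the norms, the fraction of points carrying the strong feature, the Gaussian noise, and the names `make_sample` and `make_dataset` are all assumptions chosen for illustration. The weak feature is placed on coordinate 0 and the strong feature on coordinate 1.

```python
import random

def make_sample(rng, dim=8, weak_norm=0.5, strong_norm=5.0,
                strong_frac=0.3, noise_std=0.1):
    """Draw one (x, y, has_strong) triple from a toy feature-noise model.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    y = rng.choice([-1, 1])                      # binary label
    has_strong = rng.random() < strong_frac      # strong feature appears only sometimes
    x = [rng.gauss(0.0, noise_std) for _ in range(dim)]  # (iii) noise
    x[0] += y * weak_norm                        # (i) weak feature, present in every point
    if has_strong:
        x[1] += y * strong_norm                  # (ii) strong feature, larger norm
    return x, y, has_strong

def make_dataset(n, seed=0, **kwargs):
    """Draw n samples with a fixed seed for reproducibility."""
    rng = random.Random(seed)
    return [make_sample(rng, **kwargs) for _ in range(n)]

if __name__ == "__main__":
    data = make_dataset(5)
    for x, y, has_strong in data:
        print(y, has_strong, [round(v, 2) for v in x[:2]])
```

A test set "with only weak features", in the sense of the abstract, would correspond to calling `make_dataset` with `strong_frac=0.0` in this toy setup.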
Fenton Reagent Oxidation and Decolorizing Reaction Kinetics of Reactive Red SBE
Fenton reagent was employed to treat and decolorize Reactive Red SBE wastewater, with the reaction monitored by on-line spectrophotometry. The effects of initial FeSO4 concentration, initial H2O2 concentration, pH, Reactive Red SBE concentration, and temperature on the decolorization of Reactive Red SBE were investigated. The results show that the Fenton oxidation process follows pseudo-first-order kinetics in the first stage, with a reaction activation energy of 2.608 kJ/mol. The decolorizing reaction rate constant (k) increases with rising FeSO4 concentration, H2O2 concentration, and temperature, but decreases with rising Reactive Red SBE concentration; the optimum pH is 3. The rate constant k is linearly correlated with both the initial FeSO4 concentration and the initial H2O2 concentration.
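The two quantities named in this abstract, a pseudo-first-order rate constant and an activation energy, are conventionally obtained from ln(C0/C) = k·t and from the Arrhenius relation ln k = ln A − Ea/(R·T). The sketch below shows that standard arithmetic only; it is not the authors' procedure, the function names are hypothetical, and any numbers fed to it in the usage example are synthetic rather than data from the paper.

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def fit_pseudo_first_order(times, concentrations):
    """Least-squares slope through the origin of ln(C0/C) versus t.

    Assumes times[0] == 0 so that concentrations[0] is C0. Absorbance can be
    used in place of concentration if it is proportional to concentration.
    """
    c0 = concentrations[0]
    ys = [math.log(c0 / c) for c in concentrations]
    num = sum(t * y for t, y in zip(times, ys))
    den = sum(t * t for t in times)
    return num / den

def activation_energy(k1, temp1, k2, temp2):
    """Two-point Arrhenius estimate of Ea in J/mol (temperatures in kelvin)."""
    return R * math.log(k2 / k1) / (1.0 / temp1 - 1.0 / temp2)

if __name__ == "__main__":
    # Synthetic decay curve with k = 0.2 per time unit (illustrative only).
    times = [0.0, 1.0, 2.0, 3.0, 4.0]
    conc = [math.exp(-0.2 * t) for t in times]
    print(fit_pseudo_first_order(times, conc))
```

With rate constants measured at more than two temperatures, a linear fit of ln k against 1/T would normally replace the two-point estimate.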
A Robust and Constrained Multi-Agent Reinforcement Learning Framework for Electric Vehicle AMoD Systems
Electric vehicles (EVs) play critical roles in autonomous mobility-on-demand
(AMoD) systems, but their unique charging patterns increase the model
uncertainties in AMoD systems (e.g. state transition probability). Since there
usually exists a mismatch between the training and test (true) environments,
incorporating model uncertainty into system design is of critical importance in
real-world applications. However, model uncertainties have not been considered
explicitly in the existing literature on EV AMoD system rebalancing, and addressing them remains
an urgent and challenging task. In this work, we design a robust and
constrained multi-agent reinforcement learning (MARL) framework with transition
kernel uncertainty for the EV rebalancing and charging problem. We then propose
a robust and constrained MARL algorithm (ROCOMA) that trains a robust EV
rebalancing policy to balance the supply-demand ratio and the charging
utilization rate across the whole city under state transition uncertainty.
Experiments show that ROCOMA can learn an effective and robust rebalancing
policy. It outperforms non-robust MARL methods when there are model
uncertainties. It increases the system fairness by 19.6% and decreases the
rebalancing costs by 75.8%.
Comment: 8 pages
Phase retrieval from single biomolecule diffraction pattern
In this paper, we propose the SPR (sparse phase retrieval) method, which is a
new phase retrieval method for coherent x-ray diffraction imaging (CXDI).
Conventional phase retrieval methods effectively solve the problem for high
signal-to-noise ratio measurements, but would not be sufficient for single
biomolecular imaging which is expected to be realized with femto-second x-ray
free electron laser pulses. The SPR method is based on Bayesian statistics.
It does not need to set the object boundary constraint that is required by the
commonly used hybrid input-output (HIO) method; instead, a prior distribution is
defined with an exponential distribution and used for the estimation.
Simulation results demonstrate that the proposed method reconstructs the
electron density under noisy conditions even when some central pixels are masked.
Comment: 13 pages, 13 figures, submitted for a journal
GDL-DS: A Benchmark for Geometric Deep Learning under Distribution Shifts
Geometric deep learning (GDL) has gained significant attention in various
scientific fields, chiefly for its proficiency in modeling data with intricate
geometric structures. Yet, very few works have delved into its capability of
tackling the distribution shift problem, a prevalent challenge in many relevant
applications. To bridge this gap, we propose GDL-DS, a comprehensive benchmark
designed for evaluating the performance of GDL models in scenarios with
distribution shifts. Our evaluation datasets cover diverse scientific domains
from particle physics and materials science to biochemistry, and encapsulate a
broad spectrum of distribution shifts including conditional, covariate, and
concept shifts. Furthermore, we study three levels of information access from
the out-of-distribution (OOD) testing data, including no OOD information, only
OOD features without labels, and OOD features with a few labels. Overall, our
benchmark results in 30 different experiment settings, and evaluates 3 GDL
backbones and 11 learning algorithms in each setting. A thorough analysis of
the evaluation results is provided, poised to illuminate insights for GDL
researchers and domain practitioners who are to use GDL in their applications.
Comment: Code and data are available at https://github.com/Graph-COM/GDL_D
Application of Machine Learning Method to Model-Based Library Approach to Critical Dimension Measurement by CD-SEM
The model-based library (MBL) method has already been established for the
accurate measurement of critical dimension (CD) of semiconductor linewidth from
a critical dimension scanning electron microscope (CD-SEM) image. In this work,
the MBL method has been further investigated by combining CD-SEM image
simulation with a neural network algorithm. The secondary electron linescan
profiles were first calculated by a Monte Carlo simulation method, which gives
the dependence of the linescan profiles on the selected values of various
geometrical parameters (e.g., top CD, sidewall angle and height) for Si and Au
trapezoidal line structures. Machine learning methods have then been
applied to predict the linescan profiles from a randomly selected training
set of the calculated profiles. The predicted results agree very well with the
calculated profiles, with standard deviations of 0.1% and 6% for the relative
error distributions of Si and Au line structures, respectively. This result
shows that the machine learning methods can be practically applied to the MBL
method for the purpose of reducing the library size, accelerating the
construction of the MBL database and enriching the content of an available MBL
database.