Weight Prediction Boosts the Convergence of AdamW
In this paper, we introduce weight prediction into the AdamW optimizer to
boost its convergence when training deep neural network (DNN) models. In
particular, before each mini-batch training step, we predict the future weights
according to the update rule of AdamW and then use the predicted future
weights for both the forward pass and backward propagation. In this way, the
AdamW optimizer always uses the gradients w.r.t. the future weights instead
of the current weights to update the DNN parameters, which lets AdamW
achieve better convergence. Our proposal is simple and straightforward to
implement but effective in boosting the convergence of DNN training. We
performed extensive experimental evaluations on image classification and
language modeling tasks to verify the effectiveness of our proposal. The
experimental results validate that our proposal can boost the convergence of
AdamW and achieve better accuracy than AdamW when training DNN models.
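
The idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `adamw_with_weight_prediction`, the `lookahead` parameter, and the choice to predict the future weights by reusing the cached AdamW moments are all assumptions made for illustration, and the toy quadratic objective stands in for a DNN loss.

```python
import math

def adamw_with_weight_prediction(grad_fn, w0, steps=300, lr=0.05,
                                 beta1=0.9, beta2=0.999, eps=1e-8,
                                 weight_decay=0.01, lookahead=1):
    """Toy AdamW loop that evaluates gradients at predicted future weights.

    Assumption: the future weights are approximated by applying the current
    AdamW step (from cached moments) `lookahead` times to the current weights.
    """
    w = [float(x) for x in w0]
    m = [0.0] * len(w)  # first-moment estimate
    v = [0.0] * len(w)  # second-moment estimate

    def adamw_delta(weights, t):
        # AdamW step: bias-corrected moments plus decoupled weight decay.
        delta = []
        for wi, mi, vi in zip(weights, m, v):
            m_hat = mi / (1.0 - beta1 ** t)
            v_hat = vi / (1.0 - beta2 ** t)
            delta.append(lr * (m_hat / (math.sqrt(v_hat) + eps) + weight_decay * wi))
        return delta

    for t in range(1, steps + 1):
        if t == 1:
            # No moments are cached yet, so there is nothing to predict from.
            w_pred = list(w)
        else:
            delta = adamw_delta(w, t - 1)
            w_pred = [wi - lookahead * di for wi, di in zip(w, delta)]

        # Forward and backward pass at the predicted weights.
        g = grad_fn(w_pred)

        # Standard AdamW moment updates, then the step on the *current* weights.
        m = [beta1 * mi + (1.0 - beta1) * gi for mi, gi in zip(m, g)]
        v = [beta2 * vi + (1.0 - beta2) * gi * gi for vi, gi in zip(v, g)]
        delta = adamw_delta(w, t)
        w = [wi - di for wi, di in zip(w, delta)]
    return w


# Toy objective f(w) = 0.5 * ||w - target||^2, whose gradient is w - target.
target = [1.0, -2.0]

def quadratic_grad(w):
    return [wi - ti for wi, ti in zip(w, target)]

w_final = adamw_with_weight_prediction(quadratic_grad, [0.0, 0.0])
```

Setting `lookahead=0` reduces the loop to plain AdamW, which makes it easy to compare the two variants on the same objective.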
AdaPlus: Integrating Nesterov Momentum and Precise Stepsize Adjustment on AdamW Basis
This paper proposes an efficient optimizer called AdaPlus, which integrates
Nesterov momentum and precise stepsize adjustment on the basis of AdamW. AdaPlus
combines the advantages of AdamW, Nadam, and AdaBelief and, in particular, does
not introduce any extra hyper-parameters. We perform extensive experimental
evaluations on three machine learning tasks to validate the effectiveness of
AdaPlus. The experimental results show that AdaPlus (i) is the adaptive
method that performs most comparably to (and even slightly better than) SGD with
momentum on image classification tasks and (ii) outperforms other
state-of-the-art optimizers on language modeling tasks and exhibits the
highest stability when training GANs. The experiment code of AdaPlus is
available at: https://github.com/guanleics/AdaPlus
Capacity-Achieving Iterative LMMSE Detection for MIMO-NOMA Systems
This paper considers iterative Linear Minimum Mean Square Error (LMMSE)
detection for uplink Multiuser Multiple-Input and Multiple-Output (MU-MIMO)
systems with Non-Orthogonal Multiple Access (NOMA). Iterative LMMSE
detection greatly reduces the system computational complexity by dividing the
overall processing into many low-complexity distributed calculations. However,
it is generally considered to be sub-optimal and to achieve relatively poor
performance. In this paper, we first present the matching conditions and area
theorems for the iterative detection of MIMO-NOMA systems. Based on the
proposed matching conditions and area theorems, the achievable rate region of
iterative LMMSE detection is analysed. We prove that a properly designed
iterative LMMSE detector can achieve (i) the optimal sum capacity of
MU-MIMO systems, (ii) all the maximal extreme points in the capacity region of
MU-MIMO systems, and (iii) the whole capacity region of two-user MIMO systems.
Comment: 6 pages, 5 figures, accepted by IEEE ICC 2016, 23-27 May 2016, Kuala
Lumpur, Malaysia
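
For orientation, the linear estimator at the core of LMMSE detection can be sketched as follows. This is only the standard one-shot LMMSE estimate, not the iterative detector or the rate analysis from the paper; the function name `lmmse_estimate`, the unit-power symbol assumption, and the example channel matrix are hypothetical choices for illustration.

```python
import numpy as np

def lmmse_estimate(H, y, noise_var):
    """One-shot LMMSE estimate of x from y = H x + n.

    Assumptions: the transmitted symbols are uncorrelated with unit power and
    the noise is white with variance `noise_var`, so the estimate is
    x_hat = (H^H H + noise_var * I)^{-1} H^H y.
    """
    num_users = H.shape[1]
    gram = H.conj().T @ H + noise_var * np.eye(num_users)
    # Solve the linear system instead of forming an explicit inverse.
    return np.linalg.solve(gram, H.conj().T @ y)


# Hypothetical 4-antenna receiver serving 2 single-antenna users.
H = np.array([[1.0, 0.5j],
              [0.2, 1.0],
              [0.3j, -0.4],
              [1.0, 1.0]], dtype=complex)
x = np.array([1.0 + 1.0j, -1.0 + 1.0j]) / np.sqrt(2)  # unit-power QPSK symbols
y = H @ x                                             # noiseless observation
x_hat = lmmse_estimate(H, y, noise_var=1e-9)
```

In an iterative receiver, an estimate of this kind would typically be refined by exchanging soft information with the per-user decoders; that exchange is not modelled here.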
Research on the Improvement of Calculation Method for the Interference Assembly of Locomotive Traction Gear
Interference assembly is the main method of connecting the traction gear to the shaft, and the choice of interference plays a critical role in the design of the traction gear. The traditional method of calculating the interference of the traction gear oversimplifies the mathematical model. Its error exceeds the acceptable range, so the traditional method is not suitable for the design of the web structure. In this paper, we propose an improved algorithm for solving the interference of the traction gear by combining classical elastic mechanics theory with the finite element segmentation technique. The results from our improved algorithm are compared with those from the traditional method, and the finite element simulation data are compared with the experimental results. Both comparisons verify the rationality and feasibility of our algorithm. Our research provides a theoretical reference and practical guidance for selecting the range of interference.
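
As background for the kind of calculation this abstract refers to, the sketch below evaluates the classical thick-walled-cylinder (Lamé) formula for the contact pressure of a press fit between a shaft and a hub of the same material. It is assumed here to be representative of a simplified textbook model and is not the improved algorithm proposed in the paper; the function name and the numerical values are hypothetical.

```python
def press_fit_pressure(e_modulus, interference, d, d_outer, d_inner=0.0):
    """Contact pressure of a press fit from the classical Lame equations.

    Assumptions: shaft and hub share the same elastic modulus, both are
    uniform cylinders, `interference` is diametral, and all lengths use the
    same unit (e.g. mm with the modulus in MPa gives a pressure in MPa).
    d is the nominal fit diameter, d_outer the hub outer diameter, and
    d_inner the shaft bore diameter (0 for a solid shaft).
    """
    if not (d_inner < d < d_outer):
        raise ValueError("expected d_inner < d < d_outer")
    geometry = ((d_outer**2 - d**2) * (d**2 - d_inner**2)
                / (d_outer**2 - d_inner**2))
    return e_modulus * interference / (2.0 * d**3) * geometry


# Hypothetical steel joint: 200 mm fit diameter, 400 mm hub outer diameter,
# solid shaft, 0.2 mm diametral interference, E = 210000 MPa.
pressure_mpa = press_fit_pressure(210000.0, 0.2, 200.0, 400.0)
```

Because the formula treats the hub as a uniform ring, it cannot capture the varying stiffness of a web-type gear body, which is consistent with the limitation the abstract attributes to the traditional method.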