6,101 research outputs found

    Weight Prediction Boosts the Convergence of AdamW

    In this paper, we introduce weight prediction into the AdamW optimizer to boost its convergence when training deep neural network (DNN) models. In particular, ahead of each mini-batch training step, we predict the future weights according to the update rule of AdamW and then use the predicted future weights for both the forward pass and backward propagation. In this way, the AdamW optimizer always uses the gradients w.r.t. the future weights, instead of the current weights, to update the DNN parameters, which lets it achieve better convergence. Our proposal is simple and straightforward to implement, yet effective in boosting the convergence of DNN training. We performed extensive experimental evaluations on image classification and language modeling tasks to verify the effectiveness of our proposal. The experimental results validate that our proposal boosts the convergence of AdamW and achieves better accuracy than AdamW when training DNN models.
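    The idea in this abstract can be sketched on a one-dimensional toy problem. The snippet below is only an illustration under stated assumptions, not the authors' implementation: the function names, the `steps_ahead` parameter, and the choice to predict the future weight by reusing the currently stored moment estimates are all hypothetical, and the abstract does not specify these details.

```python
import math


def adamw_step(w, m, v, t, lr, beta1, beta2, eps, weight_decay):
    """Size of one AdamW update for weight w, given stored moments m, v at step t >= 1."""
    m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)  # bias-corrected second moment
    return lr * (m_hat / (math.sqrt(v_hat) + eps) + weight_decay * w)


def train_with_weight_prediction(grad_fn, w0, num_steps=500, steps_ahead=1,
                                 lr=0.1, beta1=0.9, beta2=0.999,
                                 eps=1e-8, weight_decay=0.0):
    """AdamW on a scalar weight, with gradients taken at a predicted future weight.

    Assumption (hypothetical): the future weight is predicted by applying the
    AdamW rule `steps_ahead` times with the moments stored from the last step.
    """
    w, m, v = w0, 0.0, 0.0
    for t in range(1, num_steps + 1):
        if t == 1:
            w_pred = w  # no moment history yet, so nothing to predict from
        else:
            w_pred = w - steps_ahead * adamw_step(
                w, m, v, t - 1, lr, beta1, beta2, eps, weight_decay)
        g = grad_fn(w_pred)  # gradient w.r.t. the predicted weight
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        # The update itself is applied to the current weight, not the prediction.
        w = w - adamw_step(w, m, v, t, lr, beta1, beta2, eps, weight_decay)
    return w


# Toy objective f(w) = 0.5 * (w - 3)^2, whose gradient is (w - 3).
w_final = train_with_weight_prediction(lambda w: w - 3.0, w0=0.0)
```

    Setting `steps_ahead=0` in this sketch reduces it to plain AdamW, which gives a simple baseline for comparing the two variants on the same objective.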

    AdaPlus: Integrating Nesterov Momentum and Precise Stepsize Adjustment on AdamW Basis

    This paper proposes an efficient optimizer called AdaPlus, which integrates Nesterov momentum and precise stepsize adjustment on the basis of AdamW. AdaPlus combines the advantages of AdamW, Nadam, and AdaBelief and, in particular, does not introduce any extra hyper-parameters. We perform extensive experimental evaluations on three machine learning tasks to validate the effectiveness of AdaPlus. The experimental results show that AdaPlus (i) is the adaptive method whose performance is most comparable to (and even slightly better than) SGD with momentum on image classification tasks, and (ii) outperforms other state-of-the-art optimizers on language modeling tasks and shows the highest stability when training GANs. The experiment code of AdaPlus is available at: https://github.com/guanleics/AdaPlus

    Capacity-Achieving Iterative LMMSE Detection for MIMO-NOMA Systems

    This paper considers iterative Linear Minimum Mean Square Error (LMMSE) detection for uplink Multiuser Multiple-Input Multiple-Output (MU-MIMO) systems with Non-Orthogonal Multiple Access (NOMA). Iterative LMMSE detection greatly reduces the system's computational complexity by decomposing the overall processing into many low-complexity distributed calculations. However, it is generally considered sub-optimal and is thought to achieve relatively poor performance. In this paper, we first present the matching conditions and area theorems for iterative detection in MIMO-NOMA systems. Based on these matching conditions and area theorems, the achievable rate region of iterative LMMSE detection is analysed. We prove that, when properly designed, iterative LMMSE detection can achieve (i) the optimal sum capacity of MU-MIMO systems, (ii) all the maximal extreme points in the capacity region of MU-MIMO systems, and (iii) the whole capacity region of two-user MIMO systems. Comment: 6 pages, 5 figures, accepted by IEEE ICC 2016, 23-27 May 2016, Kuala Lumpur, Malaysia

    Research on the Improvement of Calculation Method for the Interference Assembly of Locomotive Traction Gear

    Interference assembly is the main method for connecting the traction gear to the shaft, and the choice of interference plays a critical role in the design of the traction gear. The traditional method for calculating the interference of the traction gear oversimplifies the mathematical model. Its error falls outside the acceptable range, so the old method is not suitable for designing the web structure. In this paper, we propose an improved algorithm for calculating the interference of the traction gear by combining classical elastic mechanics theory with the finite element segmentation technique. The results of our improved algorithm are compared with those of the traditional method, and the finite element simulation data are compared with the experimental results. Both comparisons verify the rationality and feasibility of our algorithm. Our research provides a theoretical reference and practical guidance for selecting the range of interference.
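    For context, a simplified interference calculation can be sketched with the textbook Lamé thick-walled cylinder result. This is an assumption for illustration only: the abstract does not say which formula the traditional method uses, and the function name, parameter names, and numeric values below are hypothetical. The sketch assumes a solid shaft and a plain cylindrical hub of the same material, which ignores the web structure the abstract is concerned with.

```python
def interference_pressure(delta, d, hub_outer_d, youngs_modulus):
    """Contact pressure of an interference fit (textbook Lamé formula).

    Assumes a solid shaft and a plain cylindrical hub made of the same
    material. delta is the diametral interference, d the nominal fit
    diameter, hub_outer_d the hub outer diameter. With lengths in mm and
    the modulus in MPa, the result is in MPa.
    """
    return youngs_modulus * delta / (2.0 * d) * (1.0 - (d / hub_outer_d) ** 2)


# Hypothetical values: 0.05 mm interference on a 200 mm fit diameter,
# 400 mm hub outer diameter, steel with E = 210000 MPa.
p = interference_pressure(delta=0.05, d=200.0, hub_outer_d=400.0,
                          youngs_modulus=210000.0)
```

    Because the formula treats the gear as a uniform ring, it cannot account for a thin web between hub and rim, which is the kind of simplification the abstract identifies as a source of error.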