1,418 research outputs found

    AdaLomo: Low-memory Optimization with Adaptive Learning Rate

    Large language models have achieved remarkable success, but their extensive parameter size necessitates substantial memory for training, thereby setting a high threshold. While the recently proposed low-memory optimization (LOMO) reduces memory footprint, its optimization technique, akin to stochastic gradient descent, is sensitive to hyper-parameters and exhibits suboptimal convergence, failing to match the performance of the prevailing optimizer for large language models, AdamW. Through empirical analysis of the Adam optimizer, we found that, compared to momentum, the adaptive learning rate is more critical for bridging the gap. Building on this insight, we introduce the low-memory optimization with adaptive learning rate (AdaLomo), which offers an adaptive learning rate for each parameter. To maintain memory efficiency, we employ non-negative matrix factorization for the second-order moment estimation in the optimizer state. Additionally, we suggest the use of a grouped update normalization to stabilize convergence. Our experiments with instruction-tuning and further pre-training demonstrate that AdaLomo achieves results on par with AdamW, while significantly reducing memory requirements, thereby lowering the hardware barrier to training large language models. Comment: Fix some typ
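The abstract does not give the update formulas, so the following is only a minimal sketch of the two ideas it names: a factored (row and column) estimate of the second moment, and a normalization that caps the size of the update for one parameter group. It assumes an Adafactor-style factorization; the function names, `beta`, `eps`, and `clip` are hypothetical and are not taken from the paper.

```python
import numpy as np


def factored_second_moment(row_stat, col_stat, grad, beta=0.999):
    """Update row/column statistics and rebuild a rank-1 second-moment estimate.

    Assumption: Adafactor-style factorization. Only m + n numbers are stored
    for an m x n parameter instead of the full m x n second-moment matrix.
    """
    sq = grad ** 2
    row_stat = beta * row_stat + (1.0 - beta) * sq.sum(axis=1)
    col_stat = beta * col_stat + (1.0 - beta) * sq.sum(axis=0)
    # Rank-1 reconstruction: outer product divided by the total mass.
    v_hat = np.outer(row_stat, col_stat) / (row_stat.sum() + 1e-30)
    return row_stat, col_stat, v_hat


def normalized_update(grad, v_hat, eps=1e-30, clip=1.0):
    """Scale the gradient per element, then cap the RMS of the whole update.

    Assumption: "grouped update normalization" is modeled here as RMS
    clipping applied to one parameter matrix treated as one group.
    """
    update = grad / np.sqrt(v_hat + eps)
    rms = np.sqrt(np.mean(update ** 2))
    return update / max(1.0, rms / clip)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    g = rng.normal(size=(4, 3))
    r, c, v = factored_second_moment(np.zeros(4), np.zeros(3), g)
    print(normalized_update(g, v).shape)
```

The memory saving comes from storing `row_stat` and `col_stat` only; `v_hat` can be rebuilt whenever the parameter is updated.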

    Estudio de las formas del mejoramiento del parque aeronáutico y logístico de Shanghai [Study of ways to improve the aeronautical and logistics park of Shanghai]

    Affiliation: LV, Kai. Universidad de Buenos Aires. Facultad de Ciencias Económicas. Buenos Aires, Argentina

    Modeling Viscosity of High Titania Slag

    The TiO2-FeO-Ti2O3 slag system is the dominant system for industrial high-titania slag production. In the present work, viscosities of the TiO2-FeO and TiO2-FeO-Ti2O3 systems were experimentally determined using the concentric rotating cylinder method under an argon atmosphere. A viscosity model suitable for the TiO2-FeO-Ti2O3 slag system was then established by modifying the Vogel-Fulcher-Tammann (VFT) equation. The experimental results indicate that completely melted high-titania slags exhibit a very low viscosity of around 0.8 dPa s, with negligible dependence on temperature and composition. However, the viscosity increases dramatically as the temperature falls slightly below the critical temperature. In addition, increasing the FeO content was found to lower the critical temperature markedly, while the addition of Ti2O3 raises it. The developed model can predict the viscosities of the TiO2-FeO-Ti2O3 and TiO2-FeO systems over wide ranges of composition and temperature within experimental uncertainties. The average relative error of the present model calculation is < 18.82 pct, which is better than that of previously developed models for silicate slags reported in the literature. An iso-viscosity distribution diagram was made for the TiO2-FeO-Ti2O3 slag system, which can serve as a roadmap for the ilmenite smelting reduction process as well as the high-titania slag tapping process.
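As an illustration of the quantities named in this abstract, the sketch below evaluates the standard (unmodified) VFT form, ln η = A + B / (T − T0), and computes an average relative error between calculated and measured viscosities. The paper's modification of the VFT equation is not given in the abstract, so this is not its model; the coefficients `a`, `b`, `t0` and all sample values are placeholders, not fitted or measured data.

```python
import math


def vft_viscosity(temperature, a, b, t0):
    """Standard VFT viscosity: eta = exp(a + b / (temperature - t0)).

    a, b, t0 are hypothetical fit coefficients; temperature and t0 share
    the same unit, and the form is only valid for temperature > t0.
    """
    if temperature <= t0:
        raise ValueError("VFT form requires temperature > t0")
    return math.exp(a + b / (temperature - t0))


def average_relative_error_pct(calculated, measured):
    """Mean of |calculated - measured| / measured, expressed in percent."""
    if len(calculated) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(c - m) / m for c, m in zip(calculated, measured))
    return 100.0 * total / len(measured)


if __name__ == "__main__":
    # Placeholder coefficients and "measurements", for illustration only.
    temps = [1900.0, 1950.0, 2000.0]
    calc = [vft_viscosity(t, a=-2.0, b=1000.0, t0=1500.0) for t in temps]
    meas = [1.0 * x for x in calc]  # stand-in values, not experimental data
    print(calc, average_relative_error_pct(calc, meas))
```

With a positive `b`, the computed viscosity rises steeply as the temperature approaches `t0` from above, which is the general shape the VFT form is used to capture.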

    Research on the influence of virtual modeling and testing–based rubber track system on vibration performance of engineering vehicles

    The rubber track system can be quickly swapped onto the tyres, exerting a smaller ground pressure while generating greater adhesion, to solve the problems vehicles face in traversing rough and difficult terrain. This paper discusses the influence of the rubber track system on the ride comfort of engineering vehicles with rigid suspension. First, a multi-body dynamic model of the rubber track system and a mathematical model of the contact between the ground and the track are established, and macro commands are then programmed to add the many complex contact forces. The correctness of the simulation model is validated by obstacle testing of a physical prototype. The ride comfort of the engineering vehicle when equipped with the rubber track system is explored using multi-body dynamics and real vehicle tests. The research shows that a flexible roller wheel system can significantly improve the ride comfort of the engineering vehicle when compared to wheeled vehicles. When the vehicle speed is low, the weighted root-mean-square acceleration of the wheeled vehicle and the tracked vehicle is almost the same. It is also verified that the ride comfort of steel-chain tracked vehicles is worse than that of rubber tracked vehicles, due to the polygon effect. Through multi-body dynamics simulation of the virtual prototype, the ride comfort of vehicles can be predicted and evaluated, saving the cost of testing and of obtaining actual experimental data, which is of great significance for the research and development of vehicles.

    Link Prediction Based on Common-Neighbors for Dynamic Social Network

    Link prediction is an important issue in social networks. Most existing methods aim to predict interactions between individuals in static networks, ignoring the dynamic nature of social networks. This paper proposes a link prediction method that considers the dynamic topology of social networks. Given a snapshot of a social network at time t (or the network evolution between t1 and t2), we seek to accurately predict the edges that will be added during the interval from time t (or t2) to a given future time t′. Our approach utilizes three metrics: the time-varied weight, the change degree of common neighbors, and the intimacy between common neighbors. Moreover, we redefine the common neighbors by finding them within two hops. Experiments on DBLP show that our method achieves better results.
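The abstract does not define its three metrics, so the sketch below covers only the static building block: a classic common-neighbor count for a candidate edge, plus one possible reading of "common neighbors within two hops", in which each endpoint's neighborhood is widened to every node reachable in at most two steps. The function names and the toy graph are hypothetical, and the time-varied weight, change degree, and intimacy terms are not modeled.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def neighbors_within_two_hops(adj, node):
    """Return all nodes reachable from `node` in one or two steps, excluding it."""
    one_hop = set(adj.get(node, set()))
    reachable = set(one_hop)
    for n in one_hop:
        reachable |= adj.get(n, set())
    reachable.discard(node)
    return reachable


def common_neighbor_score(adj, u, v, two_hops=False):
    """Count shared neighbors of u and v; optionally widen to two hops.

    Assumption: the two-hop variant is one interpretation of the abstract,
    not the paper's exact definition.
    """
    if two_hops:
        nu = neighbors_within_two_hops(adj, u)
        nv = neighbors_within_two_hops(adj, v)
    else:
        nu = set(adj.get(u, set()))
        nv = set(adj.get(v, set()))
    return len((nu & nv) - {u, v})


if __name__ == "__main__":
    toy = build_adjacency([("a", "b"), ("b", "c"), ("c", "d"), ("a", "e"), ("e", "d")])
    print(common_neighbor_score(toy, "a", "c"))
    print(common_neighbor_score(toy, "a", "c", two_hops=True))
```

Ranking all non-adjacent pairs by such a score and taking the top entries is the usual way a neighbor-based predictor proposes future edges.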

    An Approach to Mismatched Disturbance Rejection Control for Continuous-Time Uncontrollable Systems

    This paper focuses on optimal mismatched disturbance rejection control for linear continuous-time uncontrollable systems. Unlike previous studies, it introduces a new quadratic performance index that transforms mismatched disturbance rejection control into a linear quadratic tracking problem, so that the regulated state can track a reference trajectory while the influence of the disturbance is minimized. The necessary and sufficient conditions for solvability, and the disturbance rejection controller, are obtained by solving a forward-backward differential equation over a finite horizon. A sufficient condition for system stability is obtained over an infinite horizon under a detectability condition. This paper details our novel approach for transforming disturbance rejection into a linear quadratic tracking problem. Two examples demonstrate the effectiveness of the proposed method. Comment: arXiv admin note: substantial text overlap with arXiv:2209.0701