    Inefficiency of K-FAC for Large Batch Size Training

    In stochastic optimization, using large batch sizes during training can leverage parallel resources to produce faster wall-clock training times per training epoch. However, for both training loss and testing error, recent results analyzing large batch Stochastic Gradient Descent (SGD) have found sharp diminishing returns beyond a certain critical batch size. In the hopes of addressing this, it has been suggested that the Kronecker-Factored Approximate Curvature (K-FAC) method allows for greater scalability to large batch sizes for non-convex machine learning problems such as neural network optimization, as well as greater robustness to variation in model hyperparameters. Here, we perform a detailed empirical analysis of large batch size training for both K-FAC and SGD, evaluating performance in terms of both wall-clock time and aggregate computational cost. Our main results are twofold: first, we find that both K-FAC and SGD do not have ideal scalability behavior beyond a certain batch size, and that K-FAC does not exhibit improved large-batch scalability behavior compared to SGD; and second, we find that K-FAC, in addition to requiring more hyperparameters to tune, suffers from hyperparameter sensitivity similar to that of SGD. We discuss extensive results using ResNet and AlexNet on CIFAR-10 and SVHN, respectively, as well as more general implications of our findings.

    Uncertainty Analysis on Risk Assessment of Water Inrush in Karst Tunnels

    An improved attribute recognition method is reviewed and discussed for evaluating the risk of water inrush in karst tunnels. Because the geology and hydrogeology are complex, the methodology addresses the uncertainties in the evaluation indices and the attribute measure, which can be described by probability distributions. The values of the evaluation indices and attribute measure were sampled as random numbers generated by Monte Carlo simulation, and an attribute measure belt was used instead of the linear attribute measure function. Considering these uncertainties, the probability distributions of the four risk grades are calculated from the Monte Carlo samples. According to the probability distribution, the risk level can be analyzed under different confidence coefficients. The results of the improved method are more accurate and feasible than those derived from the attribute recognition model. Finally, the improved attribute recognition method was applied and verified in the Longmenshan tunnel in China.
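
The Monte Carlo step described in this abstract can be illustrated with a minimal sketch. It is not the authors' attribute recognition model: it assumes a single evaluation index with a normal distribution and three hypothetical thresholds that split the index range into four risk grades. The names `grade_of` and `grade_probabilities`, and all numeric values, are illustrative assumptions.

```python
import random
from collections import Counter
from typing import List, Sequence


def grade_of(score: float, thresholds: Sequence[float]) -> int:
    """Map a score to a grade 1..len(thresholds)+1 using ascending thresholds."""
    for grade, upper in enumerate(thresholds, start=1):
        if score < upper:
            return grade
    return len(thresholds) + 1


def grade_probabilities(mean: float, std: float, thresholds: Sequence[float],
                        n_samples: int = 10_000, seed: int = 0) -> List[float]:
    """Estimate the probability of each grade by sampling the uncertain index.

    Assumption: the index is normally distributed; the abstract does not say
    which distributions the authors use.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    counts = Counter(
        grade_of(rng.gauss(mean, std), thresholds) for _ in range(n_samples)
    )
    n_grades = len(thresholds) + 1
    return [counts.get(g, 0) / n_samples for g in range(1, n_grades + 1)]


if __name__ == "__main__":
    # Hypothetical thresholds giving four risk grades on a 0..1 index.
    probs = grade_probabilities(mean=0.5, std=0.1, thresholds=[0.25, 0.5, 0.75])
    for grade, p in enumerate(probs, start=1):
        print(f"grade {grade}: {p:.3f}")
```

Reporting a probability per grade, rather than a single grade, is what allows the risk level to be read off at a chosen confidence level.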

    High-efficiency Algorithm for the Most Unfavourable Load Case Combinations of Multilayered Frame-Type Wharf Structure

    The wharf, which was built in the Three Gorges Reservoir of China, is constructed as a layered frame-type structure to adapt to large water level fluctuations that exceed 30 m. These large fluctuations cause the frame-type structure to exhibit a considerably higher number of load case combinations than traditional marine high-piled wharfs. To estimate the most adverse combined internal force and the corresponding unfavourable load case combinations of significant components for multilayered frame-type wharf structures in the Three Gorges Reservoir of China, a high-efficiency algorithm is developed in this study. This algorithm transforms the computation of load case combinations into a matrix operations process implemented by computer programming. By applying the proposed algorithm, the number of load case combinations for eight significant components of the frame-type wharf, including piles, columns, beams, braces and berthing components, is reduced to a total of 21 from the original quantity of more than six billion. This high-efficiency algorithm can provide powerful technical support for evaluating the bearing capability of multilayered frame-type wharfs in the Three Gorges Reservoir of China.

    Bulk Density Adjustment of Resin-Based Equivalent Material for Geomechanical Model Test

    An equivalent material is of significance to the simulation of prototype rock in geomechanical model tests. Researchers attempt to ensure that the bulk density of the equivalent material is equal to that of the prototype rock. In this work, barite sand was used to increase the bulk density of a resin-based equivalent material. The variation law of the bulk density was revealed in the simulation of prototype rocks of different bulk densities. Over 300 specimens were made for uniaxial compression tests. Test results indicated that the substitution of quartz sand by barite sand had no apparent influence on the uniaxial compressive strength and elastic modulus of the specimens but could increase the bulk density, according to the proportion of coarse aggregate. An ideal linear relationship was found between the barite sand substitution ratio and the bulk density. The relationship between the bulk density and the usage of coarse aggregate and barite sand was also presented. The test results provided an insight into the bulk density adjustment of resin-based equivalent materials.

    Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT

    Transformer based architectures have become de-facto models used for a range of Natural Language Processing tasks. In particular, the BERT based models achieved significant accuracy gain for GLUE tasks, CoNLL-03 and SQuAD. However, BERT based models have a prohibitive memory footprint and latency. As a result, deploying BERT based models in resource constrained environments has become a challenging task. In this work, we perform an extensive analysis of fine-tuned BERT models using second order Hessian information, and we use our results to propose a novel method for quantizing BERT models to ultra low precision. In particular, we propose a new group-wise quantization scheme, and we use a Hessian based mixed-precision method to compress the model further. We extensively test our proposed method on BERT downstream tasks of SST-2, MNLI, CoNLL-03, and SQuAD. We can achieve comparable performance to baseline with at most 2.3% performance degradation, even with ultra-low precision quantization down to 2 bits, corresponding to up to 13× compression of the model parameters, and up to 4× compression of the embedding table as well as activations. Among all tasks, we observed the highest performance loss for BERT fine-tuned on SQuAD. By probing into the Hessian based analysis as well as visualization, we show that this is related to the fact that the current training/fine-tuning strategy of BERT does not converge for SQuAD.
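
To make the idea of group-wise quantization concrete, here is a minimal sketch of generic uniform quantization in which each group of weights gets its own range. It is not the Q-BERT scheme itself, whose details are not given in this abstract. The function names, the min/max range choice, and the example weights are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def quantize_group(values: Sequence[float], bits: int) -> Tuple[List[int], float, float]:
    """Uniformly quantize one group to `bits` bits over its own [min, max] range."""
    levels = (1 << bits) - 1          # number of steps between min and max
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize_group(codes: Sequence[int], scale: float, lo: float) -> List[float]:
    """Map integer codes back to approximate real values."""
    return [lo + c * scale for c in codes]


def groupwise_quantize(weights: Sequence[float], group_size: int, bits: int) -> List[float]:
    """Quantize and dequantize `weights`, using a separate range per group."""
    out: List[float] = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        codes, scale, lo = quantize_group(group, bits)
        out.extend(dequantize_group(codes, scale, lo))
    return out


def max_abs_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two sequences."""
    return max(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    # Hypothetical weights with two very different value ranges.
    w = [0.0, 0.1, 0.2, 0.3, 10.0, 20.0, 30.0, 40.0]
    per_tensor = groupwise_quantize(w, group_size=len(w), bits=2)
    per_group = groupwise_quantize(w, group_size=4, bits=2)
    print("single range error:", max_abs_error(w, per_tensor))
    print("group-wise error:  ", max_abs_error(w, per_group))
```

When value ranges differ widely across a tensor, giving each group its own scale keeps the small-magnitude weights from collapsing onto a single quantization level, at the cost of storing one scale and offset per group.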

    A multiscale model for the oxide ion conducting and proton conducting solid oxide cells

    Solid oxide cells (SOCs) are high-efficiency energy conversion devices that operate at high temperature. However, the key reaction mechanisms governing the overall performance of SOCs are not well understood. Here, we develop a multiscale model combining density functional theory calculations, transition state theory and continuum modeling to elucidate the essential reaction steps and predict the performance of the device. Density functional theory calculations are used to obtain the free energy barriers for different reaction steps, transition state theory is used to predict the reaction rate constants for each step based on the free energy barriers, and the continuum theory utilizes the reaction rate constants to obtain the voltage loss-current density relations. We apply the methodology to both the oxide ion-conducting SOCs and the proton-conducting SOCs. The proposed multiscale model yields quantitative agreement with the voltage loss-current density data from experiments. The results indicate that, for the oxygen electrode reactions in the Lanthanum Strontium Cobalt Ferrite (La_{1-x}Sr_{x}Co_{1-y}Fe_{y}O_{3-\delta} or LSCF) based oxide ion-conducting SOCs, the reaction step involving the splitting of the surface oxygen molecules into oxide ions under SOFC mode and the combination of surface oxide ions into oxygen molecules under SOEC mode is the rate limiting reaction step, and the diffusion of oxide ions in bulk LSCF is the rate limiting diffusion step. For the Pt/Y-doped BaZrO_3/Ag based proton-conducting SOFC, the cathode reactions are the rate-limiting steps.
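
The link from a free energy barrier to a rate constant can be sketched with the standard Eyring form of transition state theory, k = (k_B T / h) exp(-ΔG‡ / (R T)). The abstract does not say which form or prefactor the authors use, so this is an assumption, as are the function name, the transmission coefficient of 1, and the example barrier values.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J*s
R = 8.314462618       # molar gas constant, J/(mol*K)


def eyring_rate_constant(barrier_j_per_mol: float, temperature_k: float) -> float:
    """Rate constant (1/s) from a free energy barrier using the Eyring equation.

    Assumes a transmission coefficient of 1 and a barrier given per mole.
    """
    prefactor = K_B * temperature_k / H
    return prefactor * math.exp(-barrier_j_per_mol / (R * temperature_k))


if __name__ == "__main__":
    # Hypothetical barriers (J/mol) at a hypothetical operating temperature.
    temperature = 1000.0
    for barrier in (50e3, 100e3, 150e3):
        k = eyring_rate_constant(barrier, temperature)
        print(f"barrier {barrier / 1e3:.0f} kJ/mol -> k = {k:.3e} 1/s")
```

Because the barrier enters through an exponential, the step with the highest barrier tends to dominate the overall rate, which is why comparing barriers across steps helps identify a rate limiting step.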

    Low rank approximation in simulations of quantum algorithms
