1,360 research outputs found

    EFFECTS OF RUNNING HABITS ON MORPHOLOGY AND PLANTAR FLEXION TORQUE OF MEDIAL GASTROCNEMIUS-ACHILLES TENDON UNIT

    This study explores the effects of running habits on the morphology of the medial gastrocnemius-Achilles tendon unit (gMTU) and on plantar flexion torque, in order to reveal the adaptive changes associated with different running habits. Male habitual distance runners with a forefoot strike pattern (FFS, n=10), male habitual distance runners with a rearfoot strike pattern (RFS, n=10), and male non-runners (NR, n=10) were recruited. The Mindray M7 Super ultrasonography system was used to measure the morphological characteristics of the gMTU, and a dynamometer was used to determine plantar flexion torque. One-way ANOVA and a nonparametric test were used for analysis, with the significance level set at 0.05. Significant between-group differences were detected in muscle fascicle length (FL) (p < 0.05), normalized FL (p < 0.05), and pennation angle (PA) (p < 0.01), while no significant differences were observed in the other parameters. Specifically, FL and normalized FL were greater in FFS than in NR (p < 0.05), while PA was smaller in FFS than in NR (p < 0.05). These results suggest that long-term running with an FFS pattern could induce a greater contraction velocity and more efficient force transmission of the medial gastrocnemius (MG).
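As a rough illustration of the statistical comparison described above (not the authors' actual analysis pipeline; the fascicle-length values and group arrays below are hypothetical), the three-group test at a 0.05 significance level could be sketched as follows:

```python
# Hypothetical sketch of the between-group comparison described in the abstract.
# Fascicle-length values (cm) for the three groups are made up for illustration.
import numpy as np
from scipy import stats

ffs = np.array([6.1, 6.4, 5.9, 6.3, 6.0, 6.2, 6.5, 6.1, 6.3, 6.4])  # forefoot strikers
rfs = np.array([5.8, 5.6, 5.9, 5.7, 6.0, 5.8, 5.7, 5.9, 5.6, 5.8])  # rearfoot strikers
nr  = np.array([5.4, 5.5, 5.3, 5.6, 5.4, 5.2, 5.5, 5.3, 5.6, 5.4])  # non-runners

# One-way ANOVA across the three groups (alpha = 0.05, as in the study)
f_stat, p_value = stats.f_oneway(ffs, rfs, nr)
print(f"ANOVA: F = {f_stat:.2f}, p = {p_value:.4f}")

# Non-parametric alternative (Kruskal-Wallis) when normality is doubtful
h_stat, p_kw = stats.kruskal(ffs, rfs, nr)
print(f"Kruskal-Wallis: H = {h_stat:.2f}, p = {p_kw:.4f}")
```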

    The Implicit Regularization of Dynamical Stability in Stochastic Gradient Descent

    In this paper, we study the implicit regularization of stochastic gradient descent (SGD) through the lens of {\em dynamical stability} (Wu et al., 2018). We start by revising existing stability analyses of SGD, showing how the Frobenius norm and the trace of the Hessian relate to different notions of stability. Notably, if a global minimum is linearly stable for SGD, then the trace of the Hessian must be less than or equal to $2/\eta$, where $\eta$ denotes the learning rate. By contrast, for gradient descent (GD), stability imposes a similar constraint but only on the largest eigenvalue of the Hessian. We then turn to analyzing the generalization properties of these stable minima, focusing specifically on two-layer ReLU networks and diagonal linear networks. Notably, we establish the {\em equivalence} between these metrics of sharpness and certain parameter norms for the two models, which allows us to show that the stable minima of SGD provably generalize well. By contrast, the stability-induced regularization of GD is provably too weak to ensure satisfactory generalization. This discrepancy provides an explanation of why SGD often generalizes better than GD. Note that the learning rate (LR) plays a pivotal role in the strength of the stability-induced regularization: as the LR increases, the regularization effect becomes more pronounced, elucidating why SGD with a larger LR consistently demonstrates superior generalization. Additionally, numerical experiments are provided to support our theoretical findings.
    Comment: ICML 2023 camera-ready
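As a hedged numerical sketch of the stability criterion quoted above (the least-squares model, data, and learning rate here are illustrative, not from the paper), the trace condition $\mathrm{tr}(H) \leq 2/\eta$ for SGD and the corresponding largest-eigenvalue condition for GD can be checked directly when the Hessian is available in closed form:

```python
# Minimal numerical sketch of the SGD linear-stability condition tr(H) <= 2/eta
# for a quadratic (least-squares) loss; the data and learning rate are synthetic.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 10
X = rng.normal(size=(n, d))

# For L(theta) = (1/2n) * ||X theta - y||^2 the Hessian is H = X^T X / n,
# independent of theta, so a global minimum is linearly stable for SGD
# (in the sense above) only if tr(H) <= 2/eta.
H = X.T @ X / n
trace_H = np.trace(H)

eta = 0.1  # learning rate
print(f"tr(H) = {trace_H:.3f}, 2/eta = {2/eta:.3f}")
print("trace condition (SGD) satisfied:", trace_H <= 2 / eta)

# For full-batch GD the analogous condition involves only the largest eigenvalue.
lambda_max = np.linalg.eigvalsh(H).max()
print("largest-eigenvalue condition (GD) satisfied:", lambda_max <= 2 / eta)
```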

    When does SGD favor flat minima? A quantitative characterization via linear stability

    The observation that stochastic gradient descent (SGD) favors flat minima has played a fundamental role in understanding the implicit regularization of SGD and guiding the tuning of hyperparameters. In this paper, we provide a quantitative explanation of this striking phenomenon by relating the particular noise structure of SGD to its \emph{linear stability} (Wu et al., 2018). Specifically, we consider training over-parameterized models with square loss. We prove that if a global minimum $\theta^*$ is linearly stable for SGD, then it must satisfy $\|H(\theta^*)\|_F \leq O(\sqrt{B}/\eta)$, where $\|H(\theta^*)\|_F$, $B$, and $\eta$ denote the Frobenius norm of the Hessian at $\theta^*$, the batch size, and the learning rate, respectively. Otherwise, SGD will escape from that minimum \emph{exponentially} fast. Hence, for minima accessible to SGD, the flatness -- as measured by the Frobenius norm of the Hessian -- is bounded independently of the model size and sample size. The key to obtaining these results is exploiting the particular geometry awareness of SGD noise: 1) the noise magnitude is proportional to the loss value; 2) the noise directions concentrate in the sharp directions of the local landscape. This property of SGD noise provably holds for linear networks and random feature models (RFMs) and is empirically verified for nonlinear networks. Moreover, the validity and practical relevance of our theoretical findings are justified by extensive numerical experiments.
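The flatness bound stated above can be illustrated numerically; the following is a minimal sketch under a square-loss least-squares model whose Hessian is available in closed form (the data, batch size, and learning rate are made up, and the $O(\cdot)$ constant is omitted):

```python
# Illustrative check of the flatness scale ||H(theta*)||_F <= O(sqrt(B)/eta)
# for a least-squares problem; data, batch size, and learning rate are synthetic.
import numpy as np

rng = np.random.default_rng(1)
n, d = 500, 20
X = rng.normal(size=(n, d))

# Square loss => Hessian H = X^T X / n at any global minimum theta*.
H = X.T @ X / n
frob_norm = np.linalg.norm(H, ord="fro")

B, eta = 32, 0.05          # batch size and learning rate
bound = np.sqrt(B) / eta   # the sqrt(B)/eta scale from the abstract (constant omitted)

print(f"||H||_F = {frob_norm:.3f}, sqrt(B)/eta = {bound:.3f}")
print("within the stability-induced flatness scale:", frob_norm <= bound)
```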

    A Hybrid BP-EP-VMP Approach to Joint Channel Estimation and Decoding for FTN Signaling over Frequency Selective Fading Channels

    This paper deals with low-complexity joint channel estimation and decoding for faster-than-Nyquist (FTN) signaling over frequency selective fading channels. The inter-symbol interference (ISI) imposed by FTN signaling and that imposed by the frequency selective channel are intentionally separated to fully exploit the known structure of the FTN-induced ISI. The colored noise resulting from sampling faster than in a Nyquist signaling system is approximated by an autoregressive process. A Forney-style factor graph representation of the FTN system is developed and Gaussian message passing is performed on the graph. Expectation propagation (EP) is employed to approximate the message from the channel decoder by a Gaussian distribution. Since the inner product between FTN symbols and channel coefficients cannot be handled by belief propagation (BP), we propose to perform variational message passing (VMP) on an equivalent soft node in the factor graph to tackle this problem. Simulation results demonstrate that the proposed low-complexity hybrid BP-EP-VMP algorithm outperforms existing methods for FTN systems. Compared with the Nyquist counterpart, FTN signaling with the proposed algorithm is able to increase the transmission rate by over 40%, with only negligible BER performance loss.
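One concrete ingredient mentioned above is approximating the colored noise produced by faster-than-Nyquist sampling with an autoregressive process. The following is a hedged sketch of such a fit via the Yule-Walker equations (the noise sequence, filter, and AR order are illustrative and not the paper's exact procedure):

```python
# Hypothetical sketch: fit an AR(p) model to colored-noise samples via the
# Yule-Walker equations; the noise generation and AR order are illustrative only.
import numpy as np

rng = np.random.default_rng(2)
# Colored noise: white noise passed through a short FIR filter, standing in for
# the correlated samples produced by faster-than-Nyquist sampling.
white = rng.normal(size=5000)
colored = np.convolve(white, [0.5, 0.3, 0.2], mode="valid")

p = 2  # assumed AR order
# Empirical autocovariances r[0..p]
r = np.array([np.dot(colored[:len(colored) - k], colored[k:]) / len(colored)
              for k in range(p + 1)])

# Solve the Yule-Walker system R a = r[1..p] for the AR coefficients.
R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
a = np.linalg.solve(R, r[1:])
sigma2 = r[0] - a @ r[1:]  # innovation (driving-noise) variance

print("AR coefficients:", np.round(a, 3))
print("innovation variance:", round(float(sigma2), 4))
```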

    Data Augmentation Vision Transformer for Fine-grained Image Classification

    Recently, the vision transformer (ViT) has made breakthroughs in image recognition. Its multi-head self-attention (MSA) mechanism can extract discriminative information from different pixel blocks to improve image classification accuracy. However, the classification tokens in its deep layers tend to ignore local features between layers. In addition, splitting the input into fixed-size pixel blocks at the embedding layer inevitably introduces additional image noise. To this end, we study a data augmentation vision transformer (DAVT) and propose an attention-cropping data augmentation method, which uses attention weights as a guide to crop images and improves the network's ability to learn critical features. Secondly, we propose a hierarchical attention selection (HAS) method, which improves the learning of discriminative labels between levels by filtering and fusing labels across levels. Experimental results show that the accuracy of this method on two general datasets, CUB-200-2011 and Stanford Dogs, is better than existing mainstream methods, and its accuracy is 1.4\% and 1.6\% higher than the original ViT, respectively.
    Comment: IEEE Signal Processing Letters
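The attention-guided cropping idea described above can be sketched roughly as follows (this is not the DAVT implementation; the attention map, threshold, and tensor shapes are hypothetical, and a real ViT would supply `attn_map` from its CLS-token attention):

```python
# Rough sketch of attention-guided cropping: use a ViT-style attention map to
# locate the most attended image region and crop it. Shapes and threshold are made up.
import numpy as np

def attention_crop(image, attn_map, threshold=0.5):
    """image: (H, W, C) array; attn_map: (h, w) attention weights over patches."""
    H, W, _ = image.shape
    h, w = attn_map.shape

    # Normalise the attention map to [0, 1] and keep only highly attended patches.
    attn = (attn_map - attn_map.min()) / (np.ptp(attn_map) + 1e-8)
    mask = attn >= threshold

    ys, xs = np.nonzero(mask)
    if len(ys) == 0:          # fall back to the full image if nothing passes
        return image

    # Map patch coordinates back to pixel coordinates and crop the bounding box.
    y0, y1 = ys.min() * H // h, (ys.max() + 1) * H // h
    x0, x1 = xs.min() * W // w, (xs.max() + 1) * W // w
    return image[y0:y1, x0:x1]

# Example: a 224x224 image with a 14x14 patch-attention grid
img = np.zeros((224, 224, 3))
attn = np.random.default_rng(3).random((14, 14))
print(attention_crop(img, attn).shape)
```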