2 research outputs found

    Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization

    Full text link
    It is well known that stochastic gradient noise (SGN) acts as implicit regularization for deep learning and is essential for both the optimization and the generalization of deep networks. Some works have attempted to simulate SGN artificially by injecting random noise into training. However, it turned out that such injected simple random noise cannot work as well as SGN, which is anisotropic and parameter-dependent. To simulate SGN at low computational cost and without changing the learning rate or batch size, we propose Positive-Negative Momentum (PNM), a powerful alternative to conventional Momentum in classic optimizers. The PNM method maintains two approximately independent momentum terms, so the magnitude of SGN can be controlled explicitly by adjusting the difference between them. We theoretically prove a convergence guarantee and a generalization advantage of PNM over Stochastic Gradient Descent (SGD). By incorporating PNM into two conventional optimizers, SGD with Momentum and Adam, our extensive experiments empirically verify a significant advantage of the PNM-based variants over the corresponding conventional Momentum-based optimizers.
    Comment: ICML 2021; 20 pages; 13 figures; Keywords: deep learning theory, optimizer, momentum, generalization, gradient noise
    Code: https://github.com/zeke-xie/Positive-Negative-Momentum
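    The abstract names the mechanism but not the update rule, so here is a minimal NumPy sketch of one plausible stochastic PNM step. The buffer-alternation scheme, the beta1**2 decay, and the sqrt((1+beta0)**2 + beta0**2) rescaling follow the paper and its released code (linked above), but the function name pnm_step and the hyperparameter defaults are illustrative assumptions, not the authors' API.

    import numpy as np

    def pnm_step(theta, grad, pos_m, neg_m, lr=0.1, beta0=1.0, beta1=0.9):
        # Refresh the "positive" buffer with the current minibatch gradient;
        # each buffer is touched only every other step, so it decays by
        # beta1**2 per update (roughly beta1 per wall-clock step).
        pos_m = beta1 ** 2 * pos_m + (1.0 - beta1 ** 2) * grad
        # Weighted difference of the two momenta: the negative weight on the
        # stale buffer injects anisotropic, parameter-dependent noise whose
        # magnitude grows with beta0; the denominator rescales the step size.
        scale = np.sqrt((1.0 + beta0) ** 2 + beta0 ** 2)
        update = ((1.0 + beta0) * pos_m - beta0 * neg_m) / scale
        return theta - lr * update, pos_m

    # Usage: alternate which buffer is refreshed so the two momentum terms
    # stay approximately independent (each averages disjoint minibatches).
    theta = np.zeros(4)
    m_a, m_b = np.zeros(4), np.zeros(4)
    for step in range(100):
        grad = np.random.randn(4)   # stand-in for a minibatch gradient
        theta, m_a = pnm_step(theta, grad, m_a, m_b)
        m_a, m_b = m_b, m_a         # swap roles for the next step

    Increasing beta0 widens the gap between the two momenta and hence the injected noise magnitude, which is the explicit noise-control knob the abstract refers to.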

    Individualized music induces theta-gamma phase-amplitude coupling in patients with disorders of consciousness

    Get PDF
    Objective: This study aimed to determine whether patients with disorders of consciousness (DoC) can show neural entrainment to individualized music, exploring the cross-modal influence of music on patients with DoC through phase-amplitude coupling (PAC). It further assessed the efficacy of individualized or preferred music (PM) versus relaxing music (RM) in affecting patient outcomes, and examined the role of cross-modal influences in determining these outcomes.
    Methods: Thirty-two patients with DoC [17 with vegetative state/unresponsive wakefulness syndrome (VS/UWS) and 15 with minimally conscious state (MCS)], alongside 16 healthy controls (HCs), were recruited. Neural activity in the frontal–parietal network was recorded with scalp electroencephalography (EEG) during baseline (BL), RM, and PM. Cerebral-acoustic coherence (CACoh) was computed to assess participants' ability to track the music, while PAC was used to evaluate its cross-modal influence. Three months after the intervention, patient outcomes were followed up using the Coma Recovery Scale-Revised (CRS-R).
    Results: HCs and patients with MCS showed higher CACoh than VS/UWS patients within the musical pulse frequency (p = 0.016, p = 0.045; p < 0.001, p = 0.048, for RM and PM, respectively, following Bonferroni correction). Only theta-gamma PAC showed a significant interaction between group and music condition (F(2,44) = 2.685, p = 0.036). For HCs, theta-gamma PAC in the frontal–parietal network was stronger in the PM condition than in the RM (p = 0.016) and BL (p < 0.001) conditions. For patients with MCS, theta-gamma PAC was stronger in PM than in BL (p = 0.040), while no difference was observed across the three conditions in patients with VS/UWS. Additionally, MCS patients whose outcomes improved after 3 months exhibited evident neural responses to preferred music (p = 0.019), and the ratio of theta-gamma coupling change in PM relative to BL predicted clinical outcomes in MCS patients (r = 0.992, p < 0.001).
    Conclusion: Individualized music may serve as a potential therapeutic method for patients with DoC through cross-modal influences that rely on enhanced theta-gamma PAC within the consciousness-related network.
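    The abstract does not specify the PAC estimator or the exact band edges, so the sketch below uses one common choice: the mean-vector-length modulation index computed from Hilbert-transformed, band-passed EEG. The theta (4–8 Hz) and gamma (30–45 Hz) ranges and the function names are illustrative assumptions, not the study's actual pipeline.

    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    def bandpass(x, lo, hi, fs, order=4):
        # Zero-phase Butterworth band-pass: filtfilt avoids phase distortion,
        # which matters when the phase itself is the quantity of interest.
        b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        return filtfilt(b, a, x)

    def theta_gamma_pac(eeg, fs, theta=(4.0, 8.0), gamma=(30.0, 45.0)):
        # Mean-vector-length index: project the gamma envelope onto the theta
        # phase; a long resultant vector means gamma amplitude concentrates
        # at a preferred theta phase, i.e. strong phase-amplitude coupling.
        phase = np.angle(hilbert(bandpass(eeg, *theta, fs)))
        amp = np.abs(hilbert(bandpass(eeg, *gamma, fs)))
        return np.abs(np.mean(amp * np.exp(1j * phase)))

    # Usage on a synthetic signal with gamma bursts locked to theta peaks.
    fs = 250.0
    t = np.arange(0.0, 10.0, 1.0 / fs)
    theta_wave = np.sin(2 * np.pi * 6 * t)
    sig = theta_wave + 0.3 * (1.0 + theta_wave) * np.sin(2 * np.pi * 40 * t)
    print(theta_gamma_pac(sig, fs))   # clearly above an uncoupled signal

    Because the gamma envelope in the synthetic signal rises and falls with the theta cycle, the index is substantially larger than it would be for uncoupled noise, which is the contrast the study's group comparisons rest on.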
