3 research outputs found
Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization
It is well-known that stochastic gradient noise (SGN) acts as implicit
regularization for deep learning and is essentially important for both
optimization and generalization of deep networks. Some works attempted to
artificially simulate SGN by injecting random noise to improve deep learning.
However, it turned out that the injected simple random noise cannot work as
well as SGN, which is anisotropic and parameter-dependent. For simulating SGN
at low computational costs and without changing the learning rate or batch
size, we propose the Positive-Negative Momentum (PNM) approach that is a
powerful alternative to conventional Momentum in classic optimizers. The
introduced PNM method maintains two approximately independent momentum terms.
Then, we can control the magnitude of SGN explicitly by adjusting the momentum
difference. We theoretically prove the convergence guarantee and the
generalization advantage of PNM over Stochastic Gradient Descent (SGD). By
incorporating PNM into the two conventional optimizers, SGD with Momentum and
Adam, our extensive experiments empirically verified the significant advantage
of the PNM-based variants over the corresponding conventional Momentum-based
optimizers.
Comment: ICML 2021; 20 pages; 13 figures; Key Words: deep learning theory,
optimizer, momentum, generalization, gradient noise
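The abstract describes the mechanism only in words: two momentum terms are kept, and their weighted difference sets how much gradient noise reaches the update. A minimal sketch of one way this could look is given below. The specific recursion (each buffer refreshed every second step with decay `beta1 ** 2`) and the normalization by `sqrt((1 + beta0) ** 2 + beta0 ** 2)` are assumptions recalled from the paper's algorithm, not stated in this abstract, and the function and parameter names (`pnm_sgd`, `grad_fn`, `beta0`, `beta1`) are hypothetical. It should be checked against the authors' released code before any real use.

```python
import math


def pnm_sgd(grad_fn, theta0, lr=0.1, beta0=1.0, beta1=0.9, steps=200):
    """Illustrative positive-negative momentum loop on a list of floats.

    Assumed update (not taken from the abstract):
        m_t       = beta1^2 * m_{t-2} + (1 - beta1^2) * g_t
        theta_t+1 = theta_t - lr / norm * ((1 + beta0) * m_t - beta0 * m_{t-1})
    with norm = sqrt((1 + beta0)^2 + beta0^2).
    """
    theta = list(theta0)
    m_two_back = [0.0] * len(theta)  # plays the role of m_{t-2}
    m_one_back = [0.0] * len(theta)  # plays the role of m_{t-1}
    norm = math.sqrt((1.0 + beta0) ** 2 + beta0 ** 2)
    decay = beta1 ** 2

    for _ in range(steps):
        g = grad_fn(theta)
        # Refresh the buffer that was last updated two steps ago.
        m_now = [decay * m + (1.0 - decay) * gi for m, gi in zip(m_two_back, g)]
        # Positive weight on the fresh buffer, negative weight on the other one.
        theta = [
            t - lr / norm * ((1.0 + beta0) * a - beta0 * b)
            for t, a, b in zip(theta, m_now, m_one_back)
        ]
        # Shift the buffers so the roles alternate on the next step.
        m_two_back, m_one_back = m_one_back, m_now
    return theta


# Usage on a noiseless quadratic f(theta) = 0.5 * sum(theta_i^2), whose gradient is theta.
final_theta = pnm_sgd(lambda th: list(th), [1.0, -2.0], lr=0.1, steps=500)
```

In this sketch, setting `beta0 = 0` removes the negative term and leaves a plain two-buffer momentum update, while larger `beta0` increases the weight on the momentum difference, which is the knob the abstract refers to.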
Individualized music induces theta-gamma phase-amplitude coupling in patients with disorders of consciousness
Objective: This study aimed to determine whether patients with disorders of consciousness (DoC) could experience neural entrainment to individualized music, and explored the cross-modal influences of music on patients with DoC through phase-amplitude coupling (PAC). Furthermore, the study assessed the efficacy of individualized music, or preferred music (PM), versus relaxing music (RM) in impacting patient outcomes, and examined the role of cross-modal influences in determining these outcomes.
Methods: Thirty-two patients with DoC [17 with vegetative state/unresponsive wakefulness syndrome (VS/UWS) and 15 with minimally conscious state (MCS)], alongside 16 healthy controls (HCs), were recruited for this study. Neural activity in the frontal–parietal network was recorded using scalp electroencephalography (EEG) during baseline (BL), RM, and PM. Cerebral-acoustic coherence (CACoh) was used to investigate participants’ ability to track music, and PAC was used to evaluate the cross-modal influences of music. Three months post-intervention, the outcomes of patients with DoC were followed up using the Coma Recovery Scale-Revised (CRS-R).
Results: HCs and patients with MCS showed higher CACoh compared to VS/UWS patients within musical pulse frequency (p = 0.016, p = 0.045; p < 0.001, p = 0.048, for RM and PM, respectively, following Bonferroni correction). Only theta-gamma PAC demonstrated a significant interaction effect between groups and music conditions (F(2,44) = 2.685, p = 0.036). For HCs, the theta-gamma PAC in the frontal–parietal network was stronger in the PM condition compared to the RM (p = 0.016) and BL condition (p < 0.001). For patients with MCS, the theta-gamma PAC was stronger in the PM than in the BL (p = 0.040), while no difference was observed among the three music conditions in patients with VS/UWS. Additionally, we found that MCS patients who showed improved outcomes after 3 months exhibited evident neural responses to preferred music (p = 0.019). Furthermore, the ratio of theta-gamma coupling changes in PM relative to BL could predict clinical outcomes in MCS patients (r = 0.992, p < 0.001).
Conclusion: Individualized music may serve as a potential therapeutic method for patients with DoC through cross-modal influences, which rely on enhanced theta-gamma PAC within the consciousness-related network.
Positive-negative momentum: manipulating stochastic gradient noise to improve generalization
It is well-known that stochastic gradient noise (SGN) acts as implicit regularization for deep learning and is essentially important for both optimization and generalization of deep networks. Some works attempted to artificially simulate SGN by injecting random noise to improve deep learning. However, it turned out that the injected simple random noise cannot work as well as SGN, which is anisotropic and parameter-dependent. For simulating SGN at low computational costs and without changing the learning rate or batch size, we propose the Positive-Negative Momentum (PNM) approach that is a powerful alternative to conventional Momentum in classic optimizers. The introduced PNM method maintains two approximately independent momentum terms. Then, we can control the magnitude of SGN explicitly by adjusting the momentum difference. We theoretically prove the convergence guarantee and the generalization advantage of PNM over Stochastic Gradient Descent (SGD). By incorporating PNM into the two conventional optimizers, SGD with Momentum and Adam, our extensive experiments empirically verified the significant advantage of the PNM-based variants over the corresponding conventional Momentum-based optimizers. Code: https://github.com/zeke-xie/Positive-Negative-Momentum