
    Spatial-Temporal Fusion Graph Neural Networks for Traffic Flow Forecasting

    Spatial-temporal forecasting of traffic flow is a challenging task because of complicated spatial dependencies and dynamic trends in the temporal patterns of different roads. Existing frameworks typically utilize a given spatial adjacency graph and sophisticated mechanisms to model spatial and temporal correlations. However, the limited representation of a given spatial graph with incomplete adjacent connections may restrict those models from learning effective spatial-temporal dependencies. To overcome these limitations, this paper proposes Spatial-Temporal Fusion Graph Neural Networks (STFGNN) for traffic flow forecasting. STFGNN can effectively learn hidden spatial-temporal dependencies through a novel fusion operation over various spatial and temporal graphs, which are generated by a data-driven method. Meanwhile, by integrating this fusion graph module and a novel gated convolution module into a unified layer, STFGNN can handle long sequences. Experimental results on several public traffic datasets demonstrate that our method consistently achieves state-of-the-art performance against other baselines.
    Comment: 8 pages, 3 figures, to be published in AAAI 2021. arXiv admin note: text overlap with arXiv:1903.00919 by other authors
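    The fusion operation is the central idea here: spatial and temporal graphs are combined into one larger graph, so a single graph convolution can mix information across both roads and time steps. Below is a minimal sketch of one plausible reading of that construction; the block layout, the identity links between adjacent time steps, and the parameter k are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

def build_fusion_graph(A_spatial: np.ndarray, A_temporal: np.ndarray, k: int = 3) -> np.ndarray:
    """Assemble a (k*N, k*N) spatial-temporal fusion graph (hedged sketch).

    Diagonal blocks alternate spatial and temporal adjacency; identity
    blocks link each node to itself at neighbouring time steps so that
    information can flow along time as well as along roads.
    """
    n = A_spatial.shape[0]
    fused = np.zeros((k * n, k * n))
    eye = np.eye(n)
    for i in range(k):
        # Middle slice carries the spatial view, the others the temporal view.
        block = A_spatial if i == k // 2 else A_temporal
        fused[i*n:(i+1)*n, i*n:(i+1)*n] = block
        if i + 1 < k:  # connect the same node across adjacent time steps
            fused[i*n:(i+1)*n, (i+1)*n:(i+2)*n] = eye
            fused[(i+1)*n:(i+2)*n, i*n:(i+1)*n] = eye
    return fused
```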

    Who Is the Rightful Owner? Young Children’s Ownership Judgments in Different Transfer Contexts

    This study examined whether Chinese preschoolers understand that ownership can be transferred in different contexts. The participants were 3- to 5-year-old Chinese children (n = 96) and adults (n = 34). Using four scenarios that involved different transfer types (giving, stealing, losing, and abandoning), participants were asked four questions about ownership. The results indicated that preschoolers' ability to distinguish legitimate from illegitimate ownership transfers improved with age. Three-year-olds understood that ownership cannot be transferred in the stealing context, but an appropriate understanding of ownership was not attained until age 4 in the giving context and age 5 in the losing and abandoning contexts, at which point performance was similar to that of adults. In addition to the first possessor bias (a tendency to judge the first possessor as the owner) found in previous studies, 3-year-olds also displayed a loan bias (a tendency to believe that everything transferred should be returned). The findings suggest that the developmental trajectories of preschoolers' understanding of ownership transfers vary across contexts, which may relate to children's ability to consider the role of intent in determining ownership and to parents' disciplinary behavior. Both cross-cultural similarities and differences are discussed.

    Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization

    It is well known that stochastic gradient noise (SGN) acts as implicit regularization for deep learning and is essential to both the optimization and the generalization of deep networks. Some works have attempted to artificially simulate SGN by injecting random noise to improve deep learning. However, it turns out that simple injected random noise cannot work as well as SGN, which is anisotropic and parameter-dependent. To simulate SGN at low computational cost and without changing the learning rate or batch size, we propose the Positive-Negative Momentum (PNM) approach, a powerful alternative to conventional momentum in classic optimizers. PNM maintains two approximately independent momentum terms, so the magnitude of SGN can be controlled explicitly by adjusting the momentum difference. We theoretically prove the convergence guarantee and the generalization advantage of PNM over Stochastic Gradient Descent (SGD). By incorporating PNM into two conventional optimizers, SGD with Momentum and Adam, our extensive experiments empirically verify the significant advantage of the PNM-based variants over the corresponding conventional momentum-based optimizers.
    Comment: ICML 2021; 20 pages; 13 figures; keywords: deep learning theory, optimizer, momentum, generalization, gradient noise
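    To make the mechanism concrete, here is a hedged sketch of a single PNM update in PyTorch: two buffers are fed with gradients from alternating iterations (so they are approximately independent), and the step combines them with weights (1 + beta0) and -beta0, so beta0 scales the injected noise magnitude. The coefficient conventions and the omission of the paper's normalization are simplifying assumptions.

```python
import torch

def pnm_step(params, grads, state, lr=0.1, beta1=0.9, beta0=1.0):
    """One Positive-Negative Momentum update (hedged sketch).

    Each momentum buffer only sees every other gradient, hence beta1**2
    as the decay; the positive-negative combination keeps the expected
    update direction while amplifying the noise component via beta0.
    """
    t = state.setdefault("t", 0)
    pos = state.setdefault("pos", [torch.zeros_like(p) for p in params])
    neg = state.setdefault("neg", [torch.zeros_like(p) for p in params])
    active, passive = (pos, neg) if t % 2 == 0 else (neg, pos)
    with torch.no_grad():
        for p, g, m_a, m_p in zip(params, grads, active, passive):
            m_a.mul_(beta1 ** 2).add_(g, alpha=1 - beta1 ** 2)  # update active buffer
            update = (1 + beta0) * m_a - beta0 * m_p            # positive-negative combination
            p.add_(update, alpha=-lr)
    state["t"] = t + 1
```

    Setting beta0 = 0 recovers plain momentum, which is a convenient sanity check when experimenting with this sketch.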

    Amata: An Annealing Mechanism for Adversarial Training Acceleration

    Despite their empirical success in various domains, deep neural networks have been shown to be vulnerable to maliciously perturbed input data that can greatly degrade their performance; such perturbations are known as adversarial attacks. To counter adversarial attacks, adversarial training, formulated as a form of robust optimization, has been demonstrated to be effective. However, adversarial training incurs substantial computational overhead compared with standard training. To reduce this cost, we propose an annealing mechanism, Amata, that lowers the overhead associated with adversarial training. The proposed Amata is provably convergent, well motivated from the lens of optimal control theory, and can be combined with existing acceleration methods to further enhance performance. On standard datasets, Amata achieves similar or better robustness with around 1/3 to 1/2 of the computational time of traditional methods. In addition, Amata can be incorporated into other adversarial training acceleration algorithms (e.g., YOPO, Free, Fast, and ATTA), leading to further reductions in computational time on large-scale problems.
    Comment: accepted by AAAI
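    The abstract does not spell out the schedule, but the core idea of annealing adversarial training can be sketched as follows: spend few, coarse PGD steps per batch early in training and anneal toward many, fine steps later. The linear ramp and the step-size rule below are illustrative assumptions, not Amata's actual schedule.

```python
import math

def annealed_pgd_schedule(epoch, total_epochs, min_steps=1, max_steps=10, eps=8 / 255):
    """Hedged sketch of an annealing schedule for adversarial training.

    Early epochs use a cheap, rough adversary (few PGD steps); later
    epochs anneal toward an expensive, strong one (many PGD steps).
    """
    frac = epoch / max(1, total_epochs - 1)
    steps = min_steps + math.ceil(frac * (max_steps - min_steps))
    alpha = 2.0 * eps / steps  # coarser perturbation steps early, finer steps late
    return steps, alpha
```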

    Towards Making Deep Transfer Learning Never Hurt

    Transfer learning has frequently been used to improve deep neural network training by incorporating the weights of pre-trained networks as the starting point of optimization and for regularization. While deep transfer learning usually boosts performance with better accuracy and faster convergence, transferring weights from an inappropriate network hurts the training procedure and may even lead to lower accuracy. In this paper, we view deep transfer learning as minimizing a linear combination of the empirical loss and a regularizer based on pre-trained weights, where the regularizer can restrict the training procedure from lowering the empirical loss when the two terms have conflicting descent directions (i.e., derivatives). Following this view, we propose a novel strategy for making regularization-based Deep Transfer learning Never Hurt (DTNH) that, at each training iteration, computes the derivatives of the two terms separately and then re-estimates a new descent direction that does not hurt empirical loss minimization while preserving the regularization effect of the pre-trained weights. Extensive experiments have been conducted with common transfer learning regularizers, such as L2-SP and knowledge distillation, on a wide range of deep transfer learning benchmarks including Caltech, MIT Indoor 67, CIFAR-10, and ImageNet. The empirical results show that the proposed descent direction estimation strategy DTNH consistently improves deep transfer learning based on all of the above regularizers, even when transferring pre-trained weights from inappropriate networks, yielding 0.1%--7% higher accuracy than state-of-the-art regularizers across all experiments.
    Comment: 10 pages
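    A minimal sketch of the "never hurt" direction estimation, under the assumption that the re-estimation works like a gradient projection: when the regularizer's gradient conflicts with the empirical-loss gradient, the conflicting component is removed so the combined step cannot increase the empirical loss to first order. The exact DTNH estimator may differ from this rule.

```python
import torch

def dtnh_direction(g_emp: torch.Tensor, g_reg: torch.Tensor) -> torch.Tensor:
    """Hedged sketch of a 'never hurt' descent direction.

    The two gradients are computed separately; if they conflict
    (negative inner product), the component of g_reg along g_emp is
    projected away before the two are combined.
    """
    dot = torch.dot(g_emp.flatten(), g_reg.flatten())
    if dot < 0:  # conflicting directions: strip the harmful component
        g_reg = g_reg - dot / g_emp.norm().pow(2).clamp_min(1e-12) * g_emp
    return g_emp + g_reg
```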

    Individualized music induces theta-gamma phase-amplitude coupling in patients with disorders of consciousness

    Objective: This study aimed to determine whether patients with disorders of consciousness (DoC) can experience neural entrainment to individualized music, exploring the cross-modal influences of music on patients with DoC through phase-amplitude coupling (PAC). The study also assessed the efficacy of individualized or preferred music (PM) versus relaxing music (RM) in impacting patient outcomes, and examined the role of cross-modal influences in determining these outcomes.
    Methods: Thirty-two patients with DoC [17 with vegetative state/unresponsive wakefulness syndrome (VS/UWS) and 15 with minimally conscious state (MCS)], alongside 16 healthy controls (HCs), were recruited for this study. Neural activities in the frontal-parietal network were recorded using scalp electroencephalography (EEG) during baseline (BL), RM, and PM. Cerebral-acoustic coherence (CACoh) was used to investigate participants' ability to track music, while PAC was used to evaluate the cross-modal influences of music. Three months post-intervention, the outcomes of patients with DoC were followed up using the Coma Recovery Scale-Revised (CRS-R).
    Results: HCs and patients with MCS showed higher CACoh than VS/UWS patients within the musical pulse frequency (p = 0.016, p = 0.045; p < 0.001, p = 0.048, for RM and PM, respectively, following Bonferroni correction). Only theta-gamma PAC demonstrated a significant interaction effect between groups and music conditions (F(2,44) = 2.685, p = 0.036). For HCs, theta-gamma PAC in the frontal-parietal network was stronger in the PM condition than in the RM (p = 0.016) and BL (p < 0.001) conditions. For patients with MCS, theta-gamma PAC was stronger in PM than in BL (p = 0.040), while no difference was observed among the three conditions in patients with VS/UWS. Additionally, MCS patients who showed improved outcomes after 3 months exhibited evident neural responses to preferred music (p = 0.019), and the ratio of theta-gamma coupling changes in PM relative to BL predicted clinical outcomes in MCS patients (r = 0.992, p < 0.001).
    Conclusion: Individualized music may serve as a potential therapeutic method for patients with DoC through cross-modal influences, which rely on enhanced theta-gamma PAC within the consciousness-related network.
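    For readers unfamiliar with PAC, the sketch below shows a standard way to compute theta-gamma phase-amplitude coupling from a single EEG channel: band-pass into theta (phase) and gamma (amplitude) bands, extract both via the Hilbert transform, and summarize with the mean-vector-length modulation index. The band edges, filter order, and MI definition are common defaults, not necessarily the estimator used in this study.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def theta_gamma_pac(eeg: np.ndarray, fs: float) -> float:
    """Hedged sketch of theta-gamma phase-amplitude coupling.

    Returns the mean-vector-length modulation index: how strongly the
    gamma envelope is modulated by the phase of the theta rhythm.
    """
    def bandpass(x, lo, hi):
        b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        return filtfilt(b, a, x)

    theta_phase = np.angle(hilbert(bandpass(eeg, 4.0, 8.0)))  # phase of theta band
    gamma_amp = np.abs(hilbert(bandpass(eeg, 30.0, 45.0)))    # gamma envelope
    # Mean vector length: amplitude-weighted phase coherence.
    return float(np.abs(np.mean(gamma_amp * np.exp(1j * theta_phase))))
```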

    Spatial-temporal fusion graph neural networks for traffic flow forecasting

    Spatial-temporal forecasting of traffic flow is a challenging task because of complicated spatial dependencies and dynamic trends in the temporal patterns of different roads. Existing frameworks typically utilize a given spatial adjacency graph and sophisticated mechanisms to model spatial and temporal correlations. However, the limited representation of a given spatial graph with incomplete adjacent connections may restrict those models from learning effective spatial-temporal dependencies. Furthermore, existing methods fall short on complicated spatial-temporal data: they usually utilize separate modules for spatial and temporal correlations, or they use only independent components that capture either localized or global heterogeneous dependencies. To overcome these limitations, this paper proposes novel Spatial-Temporal Fusion Graph Neural Networks (STFGNN) for traffic flow forecasting. First, a data-driven method of generating a "temporal graph" is proposed to compensate for correlations that the spatial graph may not reflect. STFGNN can effectively learn hidden spatial-temporal dependencies through a novel fusion operation over various spatial and temporal graphs, treating different time periods in parallel. Meanwhile, by integrating this fusion graph module and a novel gated convolution module into a unified layer, STFGNN can handle long sequences by learning more spatial-temporal dependencies as layers are stacked. Experimental results on several public traffic datasets demonstrate that our method consistently achieves state-of-the-art performance against other baselines.
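    The "temporal graph" is generated purely from data: nodes whose historical series behave alike get connected even when they are not road neighbours. A minimal sketch of that idea follows, using plain correlation plus a sparsity threshold; the paper's actual similarity measure and threshold may differ, so treat both as illustrative assumptions.

```python
import numpy as np

def temporal_graph(series: np.ndarray, sparsity: float = 0.01) -> np.ndarray:
    """Hedged sketch of a data-driven 'temporal graph'.

    series: (N, T) array of historical traffic readings per node.
    Connects node pairs whose z-scored series are most correlated,
    keeping only the top fraction of links given by `sparsity`.
    """
    z = (series - series.mean(axis=1, keepdims=True)) / (series.std(axis=1, keepdims=True) + 1e-8)
    sim = z @ z.T / series.shape[1]        # node-by-node correlation
    np.fill_diagonal(sim, 0.0)
    k = max(1, int(sparsity * sim.size))   # keep the top-k strongest links
    thresh = np.sort(sim.ravel())[-k]
    return (sim >= thresh).astype(float)
```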