25 research outputs found

    Spatial-Temporal Fusion Graph Neural Networks for Traffic Flow Forecasting

    Spatial-temporal data forecasting of traffic flow is a challenging task because of complicated spatial dependencies and dynamic trends of temporal patterns between different roads. Existing frameworks typically utilize a given spatial adjacency graph and sophisticated mechanisms for modeling spatial and temporal correlations. However, limited representations of the given spatial graph structure, with incomplete adjacent connections, may restrict the effective spatial-temporal dependency learning of those models. To overcome those limitations, our paper proposes Spatial-Temporal Fusion Graph Neural Networks (STFGNN) for traffic flow forecasting. STFGNN can effectively learn hidden spatial-temporal dependencies through a novel fusion operation over various spatial and temporal graphs, which are generated by a data-driven method. Meanwhile, by integrating this fusion graph module and a novel gated convolution module into a unified layer, STFGNN can handle long sequences. Experimental results on several public traffic datasets demonstrate that our method consistently achieves state-of-the-art performance compared with other baselines.
    Comment: 8 pages, 3 figures, to be published in AAAI 2021. arXiv admin note: text overlap with arXiv:1903.00919 by other authors

    Who Is the Rightful Owner? Young Children's Ownership Judgments in Different Transfer Contexts

    This study aimed to examine whether Chinese preschoolers understand that ownership can be transferred in different contexts. The study participants were 3- to 5-year-old Chinese children (n = 96) and adults (n = 34). With four scenarios that contained different transfer types (giving, stealing, losing, and abandoning), participants were asked four questions about ownership. The results indicated that preschoolers' ability to distinguish legitimate ownership transfers from illegitimate ownership transfers improved with age. Three-year-olds understood that ownership cannot be transferred in a stealing context, but the appropriate understanding of ownership was not attained until 4 years old in a giving context and 5 years old in losing and abandoning contexts, which is similar to the adults' performance. In addition to the first possessor bias (a tendency to judge the first possessor as the owner) found in previous studies, 3-year-olds also displayed a loan bias (a tendency to believe everything that is transferred should be returned) in the study. The findings suggest that the developmental trajectories of preschoolers' understanding of ownership transfers varied across different contexts, which may relate to children's ability to consider the role of intent in determining ownership and parents' disciplinary behavior. Both cross-cultural similarities and differences are discussed.

    Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization

    It is well known that stochastic gradient noise (SGN) acts as implicit regularization for deep learning and is essential for both the optimization and the generalization of deep networks. Some works attempted to artificially simulate SGN by injecting random noise to improve deep learning. However, it turned out that injected simple random noise cannot work as well as SGN, which is anisotropic and parameter-dependent. To simulate SGN at low computational cost and without changing the learning rate or batch size, we propose the Positive-Negative Momentum (PNM) approach, a powerful alternative to conventional Momentum in classic optimizers. The PNM method maintains two approximately independent momentum terms, so that the magnitude of SGN can be controlled explicitly by adjusting the momentum difference. We theoretically prove the convergence guarantee and the generalization advantage of PNM over Stochastic Gradient Descent (SGD). By incorporating PNM into two conventional optimizers, SGD with Momentum and Adam, our extensive experiments empirically verify the significant advantage of the PNM-based variants over the corresponding conventional Momentum-based optimizers.
    Comment: ICML 2021; 20 pages; 13 figures; Key Words: deep learning theory, optimizer, momentum, generalization, gradient noise
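    As a rough illustration of the mechanism described in the abstract, the sketch below keeps two momentum buffers that are updated on alternating steps and combines them with a positive and a negative coefficient; the gap between the two buffers is what injects extra, controllable gradient noise. The function name, hyperparameter names, and update details are illustrative assumptions, not the authors' exact implementation.

    ```python
    import numpy as np

    def pnm_sgd(grad_fn, w, lr=0.1, beta=0.9, beta0=1.0, steps=100):
        """Toy Positive-Negative Momentum sketch (illustrative, not the
        paper's exact algorithm): two momentum buffers are updated on
        alternating steps, and the descent direction combines them with
        coefficients (1 + beta0) and -beta0. The buffer difference acts
        as parameter-dependent noise whose size is set by beta0."""
        m = [np.zeros_like(w), np.zeros_like(w)]
        for t in range(steps):
            cur, other = t % 2, (t + 1) % 2
            # only the current buffer sees this step's gradient
            m[cur] = beta * m[cur] + (1 - beta) * grad_fn(w)
            d = (1 + beta0) * m[cur] - beta0 * m[other]
            w = w - lr * d
        return w

    # minimize f(w) = 0.5 * ||w||^2, whose gradient is w itself
    w_final = pnm_sgd(lambda w: w, np.array([5.0, -3.0]))
    ```

    On this noiseless quadratic both buffers converge to the same value, the negative term cancels, and the iterate contracts toward the minimum; with minibatch gradients the two buffers differ and the difference supplies the extra noise.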

    Amata: An Annealing Mechanism for Adversarial Training Acceleration

    Despite their empirical success in various domains, deep neural networks have been revealed to be vulnerable to maliciously perturbed input data that severely degrades their performance; such perturbations are known as adversarial attacks. To counter adversarial attacks, adversarial training, formulated as a form of robust optimization, has been demonstrated to be effective. However, adversarial training incurs substantial computational overhead compared with standard training. To reduce this cost, we propose an annealing mechanism, Amata, that lowers the overhead associated with adversarial training. The proposed Amata is provably convergent, well motivated from the lens of optimal control theory, and can be combined with existing acceleration methods to further enhance performance. It is demonstrated that on standard datasets, Amata can achieve similar or better robustness with around 1/3 to 1/2 of the computational time of traditional methods. In addition, Amata can be incorporated into other adversarial training acceleration algorithms (e.g., YOPO, Free, Fast, and ATTA), which leads to further reductions in computational time on large-scale problems.
    Comment: accepted by AAAI
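    The annealing idea can be sketched as a schedule that starts adversarial training with cheap, coarse inner maximization and gradually strengthens it over the course of training. The linear schedule, the function name, and the 1.25 budget factor below are illustrative assumptions, not the paper's exact annealing rule.

    ```python
    def amata_schedule(epoch, total_epochs, k_min=1, k_max=10, eps=8 / 255):
        """Illustrative annealing schedule for adversarial training:
        early epochs use few PGD inner steps (cheap, coarse inner
        maximization), late epochs use many (accurate, expensive).
        Returns (inner step count k, per-step size alpha)."""
        frac = epoch / max(total_epochs - 1, 1)
        k = round(k_min + frac * (k_max - k_min))
        # shrink the per-step size as k grows, keeping the total inner
        # perturbation budget (roughly 1.25 * eps) about constant
        alpha = 1.25 * eps / k
        return k, alpha

    k0, a0 = amata_schedule(0, 10)   # first epoch: 1 big inner step
    k9, a9 = amata_schedule(9, 10)   # last epoch: 10 small inner steps
    ```

    The total inner-loop work across training is then roughly the average of `k_min` and `k_max` per batch, rather than `k_max` throughout, which is where the 1/3-to-1/2 time saving would come from under this kind of schedule.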

    Towards Making Deep Transfer Learning Never Hurt

    Transfer learning has frequently been used to improve deep neural network training by incorporating the weights of a pre-trained network as the starting point of optimization for regularization. While deep transfer learning can usually boost performance with better accuracy and faster convergence, transferring weights from inappropriate networks hurts the training procedure and may even lead to lower accuracy. In this paper, we view deep transfer learning as minimizing a linear combination of the empirical loss and a regularizer based on pre-trained weights, where the regularizer may keep the training procedure from lowering the empirical loss when the two terms have conflicting descent directions (i.e., derivatives). Following this view, we propose a novel strategy for making regularization-based Deep Transfer learning Never Hurt (DTNH): at each training iteration, it computes the derivatives of the two terms separately, then re-estimates a new descent direction that does not hurt empirical loss minimization while preserving the regularization effects of the pre-trained weights. Extensive experiments have been conducted using common transfer learning regularizers, such as L2-SP and knowledge distillation, on top of a wide range of deep transfer learning benchmarks, including Caltech, MIT Indoor 67, CIFAR-10, and ImageNet. The empirical results show that the proposed descent direction estimation strategy DTNH always improves the performance of deep transfer learning tasks based on all the above regularizers, even when transferring pre-trained weights from inappropriate networks. In all cases, DTNH improves on state-of-the-art regularizers, with 0.1%--7% higher accuracy across the experiments.
    Comment: 10 pages
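    The descent-direction re-estimation can be sketched as a gradient-projection step: when the regularizer's gradient conflicts with the empirical gradient (negative inner product), its conflicting component is removed so that the combined step never increases the empirical loss to first order. This is a sketch of the general idea under that assumption; the paper's exact re-estimation rule may differ.

    ```python
    import numpy as np

    def dtnh_direction(g_emp, g_reg):
        """Illustrative DTNH-style direction re-estimation: if the
        regularizer gradient g_reg conflicts with the empirical-loss
        gradient g_emp (their inner product is negative), subtract the
        conflicting component of g_reg before summing, so the combined
        direction has a non-negative inner product with g_emp."""
        dot = np.dot(g_emp, g_reg)
        if dot < 0:
            # project out the part of g_reg that opposes g_emp
            g_reg = g_reg - (dot / np.dot(g_emp, g_emp)) * g_emp
        return g_emp + g_reg

    # conflicting case: raw sum [0, 1] would stall the empirical loss,
    # the re-estimated direction keeps a descent component along g_emp
    d = dtnh_direction(np.array([1.0, 0.0]), np.array([-1.0, 1.0]))
    ```

    When the two gradients already agree, the function reduces to the ordinary sum, so the regularization effect is preserved whenever it does not fight the empirical objective.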

    Spatial-temporal fusion graph neural networks for traffic flow forecasting

    Spatial-temporal data forecasting of traffic flow is a challenging task because of complicated spatial dependencies and dynamic trends of temporal patterns between different roads. Existing frameworks typically utilize a given spatial adjacency graph and sophisticated mechanisms for modeling spatial and temporal correlations. However, limited representations of the given spatial graph structure, with incomplete adjacent connections, may restrict the effective spatial-temporal dependency learning of those models. Furthermore, existing methods fall short when handling complicated spatial-temporal data: they usually utilize separate modules for spatial and temporal correlations, or they only use independent components to capture localized or global heterogeneous dependencies. To overcome those limitations, our paper proposes a novel Spatial-Temporal Fusion Graph Neural Networks (STFGNN) model for traffic flow forecasting. First, a data-driven method of generating a "temporal graph" is proposed to compensate for correlations that the spatial graph may not reflect. STFGNN can effectively learn hidden spatial-temporal dependencies through a novel fusion operation over various spatial and temporal graphs, treated for different time periods in parallel. Meanwhile, by integrating this fusion graph module and a novel gated convolution module into a unified layer, STFGNN can handle long sequences by learning more spatial-temporal dependencies as layers are stacked. Experimental results on several public traffic datasets demonstrate that our method consistently achieves state-of-the-art performance compared with other baselines.
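    A minimal sketch of the two graph-construction ideas above: a data-driven temporal graph built from series similarity (plain correlation here, as a cheap stand-in for whatever similarity a real system might use), and a block-matrix fusion of the spatial and temporal graphs so that a single graph operation can mix both kinds of dependency. Function names, the top-k rule, and the fusion layout are illustrative assumptions, not the paper's exact construction.

    ```python
    import numpy as np

    def temporal_graph(X, top_k=3):
        """Data-driven temporal graph: connect each node to the top_k
        nodes whose historical series are most similar to its own.
        X has shape (T, N): T time steps, N road-network nodes."""
        C = np.corrcoef(X.T)           # (N, N) pairwise series similarity
        np.fill_diagonal(C, -np.inf)   # exclude self-similarity
        A_t = np.zeros_like(C)
        for i in range(C.shape[0]):
            A_t[i, np.argsort(C[i])[-top_k:]] = 1.0
        return A_t

    def fuse_graphs(A_s, A_t):
        """Fuse the spatial graph A_s and temporal graph A_t into one
        block adjacency over two copies of the node set; identity
        blocks tie each node to its own copy, so one graph operation
        propagates along both kinds of edges."""
        n = A_s.shape[0]
        eye = np.eye(n)
        return np.block([[A_s, eye], [eye, A_t]])

    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 5))   # 50 time steps, 5 toy nodes
    A_t = temporal_graph(X, top_k=3)
    F = fuse_graphs(np.eye(5), A_t)    # (10, 10) fused adjacency
    ```

    The fused matrix can then be fed to any standard graph-convolution layer; stacking such layers is what lets spatial and temporal dependencies mix over longer horizons.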

    The Role of Social Value Orientation in Chinese Adolescents' Moral Emotion Attribution

    Previous studies have explored the role of cognitive factors and sympathy in children's development of moral emotion attribution, but the effect of personal dispositional factors on adolescents' moral emotion expectancy has been neglected. In this study, we address this issue by testing adolescents' moral emotion attribution with different social value orientations (SVO). Eight hundred and eighty Chinese adolescents were classified into proselfs, prosocials and mixed types in SVO and asked to indicate their moral emotions in four moral contexts (prosocial, antisocial, failing to act prosocially (FAP) and resisting antisocial impulse (RAI)). The findings revealed an obvious contextual effect in adolescents' moral emotion attribution and the effect depends on SVO. Prosocials evaluated more positively than proselfs and mixed types in the prosocial and RAI contexts, but proselfs evaluated more positively than prosocials and mixed types in the antisocial and FAP contexts. The findings indicate that individual differences in adolescents' moral emotion attribution have roots in their social value orientation, and suggest the role of dispositional factors in the processing of moral emotion.